The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.
If it does, then it will call in to iput on the inode while the dentry is
no longer able to be found by the umounting process. If iput takes a
while, generic_shutdown_super could get all the way though
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.
Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose. The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.
The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry. As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.
This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems. shrink_dcache_memory isn't called it on behalf of any
filesystem, and so is careful of everything.
shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.
If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon. To try to make sure that some
work gets done, a limited number of dnetries which are untouchable are
skipped over while choosing the dentry to work on.
I believe this race was first found by Kirill Korotaev.
Cc: Jan Blunck <jblunck@suse.de>
Acked-by: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch removes the steal_locks() function.
steal_locks() doesn't work correctly with any filesystem that does it's own
lock management, including NFS, CIFS, etc.
In addition it has weird semantics on local filesystems in case tasks
sharing file-descriptor tables are doing POSIX locking operations in
parallel to execve().
The steal_locks() function has an effect on applications doing:
clone(CLONE_FILES)
/* in child */
lock
execve
lock
POSIX locks acquired before execve (by "child", "parent" or any further
task sharing files_struct) will after the execve be owned exclusively by
"child".
According to Chris Wright some LSB/LTP kind of suite triggers without the
stealing behavior, but there's no known real-world application that would
also fail.
Apps using NPTL are not affected, since all other threads are killed before
execve.
Apps using LinuxThreads are only affected if they
- have multiple threads during exec (LinuxThreads doesn't kill other
threads, the app may do it with pthread_kill_other_threads_np())
- rely on POSIX locks being inherited across exec
Both conditions are documented, but not their interaction.
Apps using clone() natively are affected if they
- use clone(CLONE_FILES)
- rely on POSIX locks being inherited across exec
The above scenarios are unlikely, but possible.
If the patch is vetoed, there's a plan B, that involves mostly keeping the
weird stealing semantics, but changing the way lock ownership is handled so
that network and local filesystems work consistently.
That would add more complexity though, so this solution seems to be
preferred by most people.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add the vendor-specific extended capability PCI_CAP_ID_VNDR. It is required
by the Myri-10G Ethernet driver.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a revocation notification method to the key type and calls it whilst
the key's semaphore is still write-locked after setting the revocation
flag.
The patch then uses this to maintain a reference on the task_struct of the
process that calls request_key() for as long as the authorisation key
remains unrevoked.
This fixes a potential race between two processes both of which have
assumed the authority to instantiate a key (one may have forked the other
for example). The problem is that there's no locking around the check for
revocation of the auth key and the use of the task_struct it points to, nor
does the auth key keep a reference on the task_struct.
Access to the "context" pointer in the auth key must thenceforth be done
with the auth key semaphore held. The revocation method is called with the
target key semaphore held write-locked and the search of the context
process's keyrings is done with the auth key semaphore read-locked.
The check for the revocation state of the auth key just prior to searching
it is done after the auth key is read-locked for the search. This ensures
that the auth key can't be revoked between the check and the search.
The revocation notification method is added so that the context task_struct
can be released as soon as instantiation happens rather than waiting for
the auth key to be destroyed, thus avoiding the unnecessary pinning of the
requesting process.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce SELinux hooks to support the access key retention subsystem
within the kernel. Incorporate new flask headers from a modified version
of the SELinux reference policy, with support for the new security class
representing retained keys. Extend the "key_alloc" security hook with a
task parameter representing the intended ownership context for the key
being allocated. Attach security information to root's default keyrings
within the SELinux initialization routine.
Has passed David's testsuite.
Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This moves the usb class devices that control the usbfs nodes to show up
in the proper place in the larger device tree.
No userspace changes is needed, this is compatible due to the symlinks
generated by the driver core.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This will allow for us to give endpoints a major/minor to create a
"usbfs2-like" way to access endpoints directly from userspace in an
easier manner than the current usbfs provides us.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Move <linux/usb_input.h> to <linux/usb/input.h> and remove some
redundant includes.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves header files for controller-specific platform data
from <linux/usb_XXX.h> to <linux/usb/XXX.h> to start reducing
some clutter.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves <linux/usb_cdc.h> to <linux/usb/cdc.h> to reduce some of the
clutter of usb header files.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as699) adds usb_reset_composite_device(), a routine for
sending a USB port reset to a device with multiple interfaces owned by
different drivers. Drivers are notified about impending and completed
resets through two new methods in the usb_driver structure.
The patch modifieds the usbfs ioctl code to make it use the new routine
instead of usb_reset_device(). Follow-up patches will modify the hub,
usb-storage, and usbhid drivers so they can utilize this new API.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
During the recent "isa drivers using platform devices" discussion it was
pointed out that (ALSA) ISA drivers ran into the problem of not having
the option to fail driver load (device registration rather) upon not
finding their hardware due to a probe() error not being passed up
through the driver model. In the course of that, I suggested a seperate
ISA bus might be best; Russell King agreed and suggested this bus could
use the .match() method for the actual device discovery.
The attached does this. For this old non (generically) discoverable ISA
hardware only the driver itself can do discovery so as a difference with
the platform_bus, this isa_bus also distributes match() up to the driver.
As another difference: these devices only exist in the driver model due
to the driver creating them because it might want to drive them, meaning
that all device creation has been made internal as well.
The usage model this provides is nice, and has been acked from the ALSA
side by Takashi Iwai and Jaroslav Kysela. The ALSA driver module_init's
now (for oldisa-only drivers) become:
static int __init alsa_card_foo_init(void)
{
return isa_register_driver(&snd_foo_isa_driver, SNDRV_CARDS);
}
static void __exit alsa_card_foo_exit(void)
{
isa_unregister_driver(&snd_foo_isa_driver);
}
Quite like the other bus models therefore. This removes a lot of
duplicated init code from the ALSA ISA drivers.
The passed in isa_driver struct is the regular driver struct embedding a
struct device_driver, the normal probe/remove/shutdown/suspend/resume
callbacks, and as indicated that .match callback.
The "SNDRV_CARDS" you see being passed in is a "unsigned int ndev"
parameter, indicating how many devices to create and call our methods with.
The platform_driver callbacks are called with a platform_device param;
the isa_driver callbacks are being called with a "struct device *dev,
unsigned int id" pair directly -- with the device creation completely
internal to the bus it's much cleaner to not leak isa_dev's by passing
them in at all. The id is the only thing we ever want other then the
struct device * anyways, and it makes for nicer code in the callbacks as
well.
With this additional .match() callback ISA drivers have all options. If
ALSA would want to keep the old non-load behaviour, it could stick all
of the old .probe in .match, which would only keep them registered after
everything was found to be present and accounted for. If it wanted the
behaviour of always loading as it inadvertently did for a bit after the
changeover to platform devices, it could just not provide a .match() and
do everything in .probe() as before.
If it, as Takashi Iwai already suggested earlier as a way of following
the model from saner buses more closely, wants to load when a later bind
could conceivably succeed, it could use .match() for the prerequisites
(such as checking the user wants the card enabled and that port/irq/dma
values have been passed in) and .probe() for everything else. This is
the nicest model.
To the code...
This exports only two functions; isa_{,un}register_driver().
isa_register_driver() register's the struct device_driver, and then
loops over the passed in ndev creating devices and registering them.
This causes the bus match method to be called for them, which is:
int isa_bus_match(struct device *dev, struct device_driver *driver)
{
struct isa_driver *isa_driver = to_isa_driver(driver);
if (dev->platform_data == isa_driver) {
if (!isa_driver->match ||
isa_driver->match(dev, to_isa_dev(dev)->id))
return 1;
dev->platform_data = NULL;
}
return 0;
}
The first thing this does is check if this device is in fact one of this
driver's devices by seeing if the device's platform_data pointer is set
to this driver. Platform devices compare strings, but we don't need to
do that with everything being internal, so isa_register_driver() abuses
dev->platform_data as a isa_driver pointer which we can then check here.
I believe platform_data is available for this, but if rather not, moving
the isa_driver pointer to the private struct isa_dev is ofcourse fine as
well.
Then, if the the driver did not provide a .match, it matches. If it did,
the driver match() method is called to determine a match.
If it did _not_ match, dev->platform_data is reset to indicate this to
isa_register_driver which can then unregister the device again.
If during all this, there's any error, or no devices matched at all
everything is backed out again and the error, or -ENODEV, is returned.
isa_unregister_driver() just unregisters the matched devices and the
driver itself.
More global points/questions...
- I'm introducing include/linux/isa.h. It was available but is ofcourse
a somewhat generic name. Moving more isa stuff over to it in time is
ofcourse fine, so can I have it please? :)
- I'm using device_initcall() and added the isa.o (dependent on
CONFIG_ISA) after the base driver model things in the Makefile. Will
this do, or I really need to stick it in drivers/base/init.c, inside
#ifdef CONFIG_ISA? It's working fine.
Lastly -- I also looked, a bit, into integrating with PnP. "Old ISA"
could be another pnp_protocol, but this does not seem to be a good
match, largely due to the same reason platform_devices weren't -- the
devices do not have a life of their own outside the driver, meaning the
pnp_protocol {get,set}_resources callbacks would need to callback into
driver -- which again means you first need to _have_ that driver. Even
if there's clean way around that, you only end up inventing fake but
valid-form PnP IDs and generally catering to the PnP layer without any
practical advantages over this very simple isa_bus. The thing I also
suggested earlier about the user echoing values into /sys to set up the
hardware from userspace first is... well, cute, but a horrible idea from
a user standpoint.
Comments ofcourse appreciated. Hope it's okay. As said, the usage model
is nice at least.
Signed-off-by: Rene Herman <rene.herman@keyaccess.nl>
This patch (as721) makes dev_info and related macros print the device's
bus name if the device doesn't have a driver, instead of printing just a
blank. If the device isn't on a bus either... well, then it does leave
a blank space. But it will be easier for someone else to change if they
want.
Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is the first step in moving class_device to being replaced by
struct device. It allows struct device to export a dev_t and makes it
easy to dynamically create and destroy struct device as long as they are
associated with a specific class.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
To have a home for all hypervisors, this patch creates /sys/hypervisor.
A new config option SYS_HYPERVISOR is introduced, which should to be set
by architecture dependent hypervisors (e.g. s390 or Xen).
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
allow sysdev_class adding attribute. Next patch will use the new API to
add an attribute under /sys/device/system/cpu/.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Let tty_register_device() return a pointer to the class device it creates.
This allows registrants to add their own sysfs files under the class
device node.
Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce __iowrite64_copy. It will be used by the Myri-10G Ethernet
driver to post requests to the NIC. This driver will be submitted soon.
__iowrite64_copy copies to I/O memory in units of 64 bits when possible (on
64 bit architectures). It reverts to __iowrite32_copy on 32 bit
architectures.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.infradead.org/hdrcleanup-2.6: (63 commits)
[S390] __FD_foo definitions.
Switch to __s32 types in joystick.h instead of C99 types for consistency.
Add <sys/types.h> to headers included for userspace in <linux/input.h>
Move inclusion of <linux/compat.h> out of user scope in asm-x86_64/mtrr.h
Remove struct fddi_statistics from user view in <linux/if_fddi.h>
Move user-visible parts of drivers/s390/crypto/z90crypt.h to include/asm-s390
Revert include/media changes: Mauro says those ioctls are only used in-kernel(!)
Include <linux/types.h> and use __uXX types in <linux/cramfs_fs.h>
Use __uXX types in <linux/i2o_dev.h>, include <linux/ioctl.h> too
Remove private struct dx_hash_info from public view in <linux/ext3_fs.h>
Include <linux/types.h> and use __uXX types in <linux/affs_hardblocks.h>
Use __uXX types in <linux/divert.h> for struct divert_blk et al.
Use __u32 for elf_addr_t in <asm-powerpc/elf.h>, not u32. It's user-visible.
Remove PPP_FCS from user view in <linux/ppp_defs.h>, remove __P mess entirely
Use __uXX types in user-visible structures in <linux/nbd.h>
Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
Use __uXX types for S390 DASD volume label definitions which are user-visible
S390 BIODASDREADCMB ioctl should use __u64 not u64 type.
Remove unneeded inclusion of <linux/time.h> from <linux/ufs_fs.h>
Fix private integer types used in V4L2 ioctls.
...
Manually resolve conflict in include/linux/mtd/physmap.h
* git://git.infradead.org/~dwmw2/rbtree-2.6:
[RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Update UML kernel/physmem.c to use rb_parent() accessor macro
[RBTREE] Update hrtimers to use rb_parent() accessor macro.
[RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
[RBTREE] Merge colour and parent fields of struct rb_node.
[RBTREE] Remove dead code in rb_erase()
[RBTREE] Update JFFS2 to use rb_parent() accessor macro.
[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
[RBTREE] Update key.c to use rb_parent() accessor macro.
[RBTREE] Update ext3 to use rb_parent() accessor macro.
[RBTREE] Change rbtree off-tree marking in I/O schedulers.
[RBTREE] Add accessor macros for colour and parent fields of rb_node
* git://git.infradead.org/mtd-2.6: (199 commits)
[MTD] NAND: Fix breakage all over the place
[PATCH] NAND: fix remaining OOB length calculation
[MTD] NAND Fixup NDFC merge brokeness
[MTD NAND] S3C2410 driver cleanup
[MTD NAND] s3c24x0 board: Fix clock handling, ensure proper initialisation.
[JFFS2] Check CRC32 on dirent and data nodes each time they're read
[JFFS2] When retiring nextblock, allocate a node_ref for the wasted space
[JFFS2] Mark XATTR support as experimental, for now
[JFFS2] Don't trust node headers before the CRC is checked.
[MTD] Restore MTD_ROM and MTD_RAM types
[MTD] assume mtd->writesize is 1 for NOR flashes
[MTD NAND] Fix s3c2410 NAND driver so it at least _looks_ like it compiles
[MTD] Prepare physmap for 64-bit-resources
[JFFS2] Fix more breakage caused by janitorial meddling.
[JFFS2] Remove stray __exit from jffs2_compressors_exit()
[MTD] Allow alternate JFFS2 mount variant for root filesystem.
[MTD] Disconnect struct mtd_info from ABI
[MTD] replace MTD_RAM with MTD_GENERIC_TYPE
[MTD] replace MTD_ROM with MTD_GENERIC_TYPE
[MTD] remove a forgotten MTD_XIP
...
Following problems are addressed:
- wrong status caused early break out of nand_wait()
- removed the bogus status check in nand_wait() which
is a relict of the abandoned support for interrupted
erase.
- status check moved to the correct place in read_oob
- oob support for syndrom based ecc with strange layouts
- use given offset in the AUTOOOB based oob operations
Partially based on a patch from Vitaly Vool <vwool@ru.mvista.com>
Thanks to Savin Zlobec <savin@epico.si> for tracking down the
status problem.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In this implementation, audit registers inotify watches on the parent
directories of paths specified in audit rules. When audit's inotify
event handler is called, it updates any affected rules based on the
filesystem event. If the parent directory is renamed, removed, or its
filesystem is unmounted, audit removes all rules referencing that
inotify watch.
To keep things simple, this implementation limits location-based
auditing to the directory entries in an existing directory. Given
a path-based rule for /foo/bar/passwd, the following table applies:
passwd modified -- audit event logged
passwd replaced -- audit event logged, rules list updated
bar renamed -- rule removed
foo renamed -- untracked, meaning that the rule now applies to
the new location
Audit users typically want to have many rules referencing filesystem
objects, which can significantly impact filtering performance. This
patch also adds an inode-number-based rule hash to mitigate this
situation.
The patch is relative to the audit git tree:
http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary
and uses the inotify kernel API:
http://lkml.org/lkml/2006/6/1/145
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch adds audit support to POSIX message queues. It applies cleanly to
the lspp.b15 branch of Al Viro's git tree. There are new auxiliary data
structures, and collection and emission routines in kernel/auditsc.c. New hooks
in ipc/mqueue.c collect arguments from the syscalls.
I tested the patch by building the examples from the POSIX MQ library tarball.
Build them -lrt, not against the old MQ library in the tarball. Here's the URL:
http://www.geocities.com/wronski12/posix_ipc/libmqueue-4.41.tar.gz
Do auditctl -a exit,always -S for mq_open, mq_timedsend, mq_timedreceive,
mq_notify, mq_getsetattr. mq_unlink has no new hooks. Please see the
corresponding userspace patch to get correct output from auditd for the new
record types.
[fixes folded]
Signed-off-by: George Wilson <ltcgcw@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The following patch addresses most of the issues with the IPC_SET_PERM
records as described in:
https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
and addresses the comments I received on the record field names.
To summarize, I made the following changes:
1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
record is emitted in the failure case as well as the success case.
This matches the behavior in sys_shmctl(). I could simplify the
code in sys_msgctl() and semctl_down() slightly but it would mean
that in some error cases we could get an IPC_SET_PERM record
without an IPC record and that seemed odd.
2. No change to the IPC record type, given no feedback on the backward
compatibility question.
3. Removed the qbytes field from the IPC record. It wasn't being
set and when audit_ipc_obj() is called from ipcperms(), the
information isn't available. If we want the information in the IPC
record, more extensive changes will be necessary. Since it only
applies to message queues and it isn't really permission related, it
doesn't seem worth it.
4. Removed the obj field from the IPC_SET_PERM record. This means that
the kern_ipc_perm argument is no longer needed.
5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and
igid fields with ouid and ogid in the IPC record.
I tested this with the lspp.22 kernel on an x86_64 box. I believe it
applies cleanly on the latest kernel.
-- ljk
Signed-off-by: Linda Knippers <linda.knippers@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Allow callers to remove watches from their event handler via
inotify_remove_watch_locked(). This functionality can be used to
achieve IN_ONESHOT-like functionality for a subset of events in the
mask.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add inotify_init_watch() so caller can use inotify_watch refcounts
before calling inotify_add_watch().
Add inotify_find_watch() to find an existing watch for an (ih,inode)
pair. This is similar to inotify_find_update_watch(), but does not
update the watch's mask if one is found.
Add inotify_rm_watch() to remove a watch via the watch pointer instead
of the watch descriptor.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
When an inotify event includes a dentry name, also include the inode
associated with that name.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The following series of patches introduces a kernel API for inotify,
making it possible for kernel modules to benefit from inotify's
mechanism for watching inodes. With these patches, inotify will
maintain for each caller a list of watches (via an embedded struct
inotify_watch), where each inotify_watch is associated with a
corresponding struct inode. The caller registers an event handler and
specifies for which filesystem events their event handler should be
called per inotify_watch.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued. It
appears that this is an unintended consequence of relinquinshing the queue
lock while transmitting. That in turn is needed for devices that spend a
lot of time in their transmit routine.
There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).
Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.
The solution here is to steal a bit from net_device to prevent this.
BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
[ETHTOOL]: Fix UFO typo
[SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
[SCTP]: Send only 1 window update SACK per message.
[SCTP]: Don't do CRC32C checksum over loopback.
[SCTP] Reset rtt_in_progress for the chunk when processing its sack.
[SCTP]: Reject sctp packets with broadcast addresses.
[SCTP]: Limit association max_retrans setting in setsockopt.
[PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
[IPV6]: Sum real space for RTAs.
[IRDA]: Use put_unaligned() in irlmp_do_discovery().
[BRIDGE]: Add support for NETIF_F_HW_CSUM devices
[NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
[TG3]: Convert to non-LLTX
[TG3]: Remove unnecessary tx_lock
[TCP]: Add tcp_slow_start_after_idle sysctl.
[BNX2]: Update version and reldate
[BNX2]: Use CPU native page size
[BNX2]: Use compressed firmware
[BNX2]: Add firmware decompression
[BNX2]: Allow WoL settings on new 5708 chips
...
Manual fixup for conflict in drivers/net/tulip/winbond-840.c
Trying to suspend/resume with console messages flying all around is
doomed to failure, when the devices that the messages are trying to
go to are being shut down.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Andrew Victor
This patch includes a number of updates to the AT91RM9200 serial driver.
Changes include:
1. Conversion to a platform_driver. [Ivan Kokshaysky]
2. Replaced all references to AT91RM9200 with AT91. This driver can now
also be used for the AT91SAM9216.
3. Allow TIOCM_LOOP to configure local loopback mode.
4. Cleaned up the 'read_status_mask' usage and interrupt handler code.
[Chip Coldwell]
5. Suspend/resume support. [David Brownell]
There are a few 'unused variable' warning when compiling this - I
removed the new DMA support to keep this first patch simpler.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>