Commit Graph

802 Commits

Author SHA1 Message Date
Linus Torvalds
cf59001235 Merge master.kernel.org:/pub/scm/linux/kernel/git/aia21/ntfs-2.6 2005-08-16 09:31:28 -07:00
Trond Myklebust
65e4308d25 [PATCH] NFS: Ensure we always update inode->i_mode when doing O_EXCL creates
When the client performs an exclusive create and opens the file for writing,
a Netapp filer will first create the file using the mode 01777. It does this
since an NFSv3/v4 exclusive create cannot immediately set the mode bits.
The 01777 mode then gets put into the inode->i_mode. After the file creation
is successful, we then do a setattr to change the mode to the correct value
(as per the NFS spec).

The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so
the latter retains the 01777 mode. A bit later, the VFS notices this, and calls
remove_suid(). This of course now resets the file mode to inode->i_mode & 0777.
Hey presto, the file mode on the server is now magically changed to 0777. Duh...

Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 09:30:58 -07:00
Trond Myklebust
58fcb8df0b [PATCH] NFS: Ensure ACL xdr code doesn't overflow.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 08:52:11 -07:00
Anton Altaparmakov
e74589ac25 NTFS: Fix bug in mft record writing where we forgot to set the device in
the buffers when mapping them after the VM had discarded them.
      Thanks to Martin MOKREJŠ for the bug report.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-08-16 16:38:28 +01:00
John McCutchan
89204c40a0 [PATCH] inotify: add MOVE_SELF event
This adds a MOVE_SELF event to inotify.  It is sent whenever the inode
you are watching is moved.  We need this event so that we can catch
something like this:

 - app1:
	watch /etc/mtab

 - app2:
	cp /etc/mtab /tmp/mtab-work
	mv /etc/mtab /etc/mtab~
	mv /tmp/mtab-work /etc/mtab

app1 still thinks it's watching /etc/mtab but it's actually watching
/etc/mtab~.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-15 09:50:31 -07:00
Robert Love
0bf955ce98 [PATCH] inotify: fix idr_get_new_above usage
We are saving the wrong thing in ->last_wd.  We want the wd, not the
return value.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-15 09:48:31 -07:00
Steve French
27876d02b3 [PATCH] CIFS: Fix path name conversion for long filenames
Fix path name conversion for long filenames when mapchars mount option
was specified at mount time.

Signed-off-by: Steve French (sfrench@us.ibm.com)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-14 15:27:24 -07:00
Steve French
d024709deb [PATCH] CIFS: Fix missing entries in search results
Fix missing entries in search results when very long file names and more
than 50 (or so) of such long search entries in the directory.

FindNext could send corrupt last byte of resume name when resume key was
a few hundred bytes long file name or longer.

Fixes Samba Bug # 2932

Signed-off-by: Steve French (sfrench@us.ibm.com)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-14 15:27:24 -07:00
Jan Kara
1b0a74d1c0 [PATCH] Fix error handling in reiserfs
Initialize key object ID in inode so that we don't try to remove the inode
when we fail on some checks even before we manage to allocate something.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-13 21:54:13 -07:00
Dave Kleikamp
2d610b80e9 Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-08-10 11:15:13 -05:00
Dave Kleikamp
8a9cd6d676 JFS: Fix race in txLock
TxAnchor.anon_list is protected by jfsTxnLock (TXN_LOCK), but there was
a place in txLock() that was removing an entry from the list without holding
the spinlock.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-08-10 11:14:39 -05:00
John McCutchan
7a91bf7f5c [PATCH] fsnotify_name/inoderemove
The patch below unhooks fsnotify from vfs_unlink & vfs_rmdir.  It
introduces two new fsnotify calls, that are hooked in at the dcache
level.  This not only more closely matches how the VFS layer works, it
also avoids the problem with locking and inode lifetimes.

The two functions are

 - fsnotify_nameremove -- called when a directory entry is going away.
   It notifies the PARENT of the deletion.  This is called from
   d_delete().

 - inoderemove -- called when the files inode itself is going away.  It
   notifies the inode that is being deleted.  This is called from
   dentry_iput().

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-08 11:53:47 -07:00
Miklos Szeredi
68b47139ea [PATCH] namespace.c: fix bind mount from foreign namespace
I'm resending this patch, because I still believe it's the correct fix.

Tested before/after applying the patch with a test application
available from:

  http://www.inf.bme.hu/~mszeredi/nstest.c

Bind mount from a foreign namespace results in an un-removable mount.
The reason is that mnt->mnt_namespace is copied from the old mount in
clone_mnt().  Because of this check_mnt() in sys_umount() will fail.

The solution is to set mnt->mnt_namespace to current->namespace in
clone_mnt().  clone_mnt() is either called from do_loopback() or
copy_tree().  copy_tree() is called from do_loopback() or
copy_namespace().

When called (directly or indirectly) from do_loopback(), always
current->namspace is being modified: check_mnt(nd->mnt).  So setting
mnt->mnt_namespace to current->namspace is the right thing to do.

When called from copy_namespace(), the setting of mnt_namespace is
irrelevant, since mnt_namespace is reset later in that function for
all copied mounts.

Jamie said:

  This patch is correct.  The old code was buggy for more fundamental and
  serious reason: it broke the invariant that a tree of vfsmnts all have the
  same value of mnt_namespace (and the same for the mnt_list list).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Jamie Lokier <jamie@shareable.org>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-07 10:00:38 -07:00
Andrew Morton
e525e153c7 [PATCH] __bio_clone() dead comment
Remove a very wrong comment.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-07 10:00:38 -07:00
Linus Torvalds
fab5a60a29 Check input buffer size in zisofs
This uses the new deflateBound() thing to sanity-check the input to the
zlib decompressor before we even bother to start reading in the blocks.

Problem noted by Tim Yamin <plasmaroo@gentoo.org>
2005-08-06 09:42:06 -07:00
John McCutchan
0c3dba1534 [PATCH] Clean up inotify delete race fix
This avoids the whole #ifdef mess by just getting a copy of
dentry->d_inode before d_delete is called - that makes the codepaths the
same for the INOTIFY/DNOTIFY cases as for the regular no-notify case.
I've been running this under a Gnome session for the last 10 minutes.
Inotify is being used extensively.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:37:39 -07:00
Dave Kleikamp
a5c96cab8f Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-08-04 15:56:15 -05:00
John McCutchan
e234f35c54 [PATCH] inotify delete race fix
The included patch fixes a problem where a inotify client would receive a
delete event before the file was actually deleted.  The bug affects both
dnotify & inotify.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 13:11:15 -07:00
Robert Love
3de11748c1 [PATCH] inotify: update help text
The inotify help text still refers to the character device.  Update it.

Fixes kernel bug #4993.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 13:11:15 -07:00
Roman Zippel
74f9c9c258 [PATCH] hfs: don't reference missing page
If there was a read error, the bnode might miss some pages, so skip them.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 21:38:00 -07:00
Roman Zippel
f76d28d235 [PATCH] hfs: don't dirty unchanged inode
If inode size hasn't changed, don't do anything further in truncate, which
also prevents a dirty inode, what might upset some readonly devices quite
badly.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 21:38:00 -07:00
Dave Kleikamp
30db1ae864 JFS: Check for invalid inodes in jfs_delete_inode
Some error paths may iput an invalid inode with i_nlink=0.  jfs should
not try to actually delete such an inode.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-08-01 16:54:26 -05:00
John McCutchan
b9c55d29e9 [PATCH] inotify: fix race between the kernel and user space
When you rm a watch, an IN_IGNORED event is sent down the event queue
with the watch descriptor that you just rm'd.

If you then add a watch you could get the ignored watch's wd and if you
haven't read the entire event queue, user space will think that it's
newly created watch was just ignored.

To avoid this problem we just use idr_get_new_above instead of
idr_get_new.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 09:16:53 -07:00
John McCutchan
7544953685 [PATCH] inotify: fix file deletion by rename detection
When a file is moved over an existing file that you are watching,
inotify won't send you a DELETE_SELF event and it won't unref the inode
until the inotify instance is closed by the application.

Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 09:16:53 -07:00
Maneesh Soni
9ca1eb3282 [PATCH] sysfs: fix sysfs_setattr
o sysfs_dirent's s_mode field should also be updated in sysfs_setattr(), else
  there could be inconsistency in the two fields. s_mode is used while
  ->readdir so as not to bring in the inode to cache.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 13:12:49 -07:00
Maneesh Soni
bc062b1b5c [PATCH] sysfs: fix sysfs_chmod_file
o sysfs_chmod_file() must update the new iattr field in sysfs_dirent else
  the mode change will not be persistent in case of inode evacuation from
  cache.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 13:12:49 -07:00
Paolo 'Blaisorblade' Giarrusso
a2d76bd8fa [PATCH] uml: implement hostfs syncing
Actually implement the hostfs "sync" method.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28 21:46:05 -07:00
Andrew Morton
a5453be48e [PATCH] bio_clone fix
Fix bug introduced in 2.6.11-rc2: when we clone a BIO we need to copy over the
current index into it as well.

It corrupts data with some MD setups.

See http://bugzilla.kernel.org/show_bug.cgi?id=4946

Huuuuuuuuge thanks to Matthew Stapleton <matthew4196@gmail.com> for doggedly
chasing this one down.

Acked-by: Jens Axboe <axboe@suse.de>
Cc: <linux-raid@vger.kernel.org>
Cc: <dm-devel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28 08:38:59 -07:00
Dave Kleikamp
da28c12089 Merge with /home/shaggy/git/linus-clean/
/home/shaggy/git/linus-clean/
/home/shaggy/git/linus-clean/

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-28 09:03:36 -05:00
Linus Torvalds
49302d0c42 Merge head 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/shaggy/jfs-2.6 2005-07-27 16:42:22 -07:00
Jesper Juhl
77933d7276 [PATCH] clean up inline static vs static inline
`gcc -W' likes to complain if the static keyword is not at the beginning of
the declaration.  This patch fixes all remaining occurrences of "inline
static" up with "static inline" in the entire kernel tree (140 occurrences in
47 files).

While making this change I came across a few lines with trailing whitespace
that I also fixed up, I have also added or removed a blank line or two here
and there, but there are no functional changes in the patch.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:20 -07:00
Olaf Hering
44456d37b5 [PATCH] turn many #if $undefined_string into #ifdef $undefined_string
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Andreas Gruenbacher
02b775696f [PATCH] reiserfs doesn't use mbcache
reiserfs doesn't use the mbcache, so this can go.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:07 -07:00
Andreas Gruenbacher
8c52ab42c1 [PATCH] mbcache: Remove unused mb_cache_shrink parameter
The cache parameter to mb_cache_shrink isn't used.  We may as well remove
it.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:07 -07:00
Peter Staubach
c293621bbf [PATCH] stale POSIX lock handling
I believe that there is a problem with the handling of POSIX locks, which
the attached patch should address.

The problem appears to be a race between fcntl(2) and close(2).  A
multithreaded application could close a file descriptor at the same time as
it is trying to acquire a lock using the same file descriptor.  I would
suggest that that multithreaded application is not providing the proper
synchronization for itself, but the OS should still behave correctly.

SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
a file descriptor is closed, that all POSIX locks on the file, owned by the
process which closed the file descriptor, should be released.

The trick here is when those locks are released.  The current code releases
all locks which exist when close is processing, but any locks in progress
are handled when the last reference to the open file is released.

There are three cases to consider.

One is the simple case, a multithreaded (mt) process has a file open and
races to close it and acquire a lock on it.  In this case, the close will
release one reference to the open file and when the fcntl is done, it will
release the other reference.  For this situation, no locks should exist on
the file when both the close and fcntl operations are done.  The current
system will handle this case because the last reference to the open file is
being released.

The second case is when the mt process has dup(2)'d the file descriptor.
The close will release one reference to the file and the fcntl, when done,
will release another, but there will still be at least one more reference
to the open file.  One could argue that the existence of a lock on the file
after the close has completed is okay, because it was acquired after the
close operation and there is still a way for the application to release the
lock on the file, using an existing file descriptor.

The third case is when the mt process has forked, after opening the file
and either before or after becoming an mt process.  In this case, each
process would hold a reference to the open file.  For each process, this
degenerates to first case above.  However, the lock continues to exist
until both processes have released their references to the open file.  This
lock could block other lock requests.

The changes to release the lock when the last reference to the open file
aren't quite right because they would allow the lock to exist as long as
there was a reference to the open file.  This is too long.

The new proposed solution is to add support in the fcntl code path to
detect a race with close and then to release the lock which was just
acquired when such as race is detected.  This causes locks to be released
in a timely fashion and for the system to conform to the POSIX semantic
specification.

This was tested by instrumenting a kernel to detect the handling locks and
then running a program which generates case #3 above.  A dangling lock
could be reliably generated.  When the changes to detect the close/fcntl
race were added, a dangling lock could no longer be generated.

Cc: Matthew Wilcox <willy@debian.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:06 -07:00
Carsten Otte
0cfc11ed45 [PATCH] fix xip sparse file handling in ext2
Oliver Paukstadt from our test department is testing the xip patches in
Linus' git-tree.  He found a problem that shows when reading a file that
contains sparse blocks (holes) on a -o xip mounted ext2 filesystem: the
BUG_ON() in fs/ext2/xip.c:40 triggers where it should not.  The problem was
introduced by a cleanup in my previous patch, this patch fixes it.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:53 -07:00
Ian Kent
104e49fc1e [PATCH] autofs4: fix infamous "Busy inodes after umount ..." message
If the automount daemon receives a signal which causes it to sumarily
terminate the autofs4 module leaks dentries.  The same problem exists with
detached mount requests without the warning.

This patch cleans these dentries at umount.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:51 -07:00
Jan Kara
ab6862e6da [PATCH] ext3: drop quota references before releasing inode
We must drop references to quota structures before releasing the inode.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:50 -07:00
Jan Kara
c7e9a52ef0 [PATCH] ext2: drop quota reference before releasing inode
We must drop references to quota structures before releasing the inode.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:50 -07:00
Jeff Mahoney
b3bb8afd96 [PATCH] reiserfs: fix deadlock in inode creation failure path w/ default ACL
reiserfs_new_inode() can call iput() with the xattr lock held.  This will
cause a deadlock to occur when reiserfs_delete_xattrs() is called to clean
up.

The following patch releases the lock and reacquires it after the iput.
This is safe because interaction with xattrs is complete, and the relock is
just to balance out the release in the caller.

The locking needs some reworking to be more sane, but that's more intrusive
and I was just looking to fix this bug.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:50 -07:00
Nigel Cunningham
ef2a701d44 [PATCH] Fix missing refrigerator invocation in jffs2
Here's a patch to fix a missing refrigerator call in jffs2.

Signed-off-by: Nigel Cunningham <nigel@suspend2.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:49 -07:00
Dave Kleikamp
6de7dc2c4c Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-27 12:50:08 -05:00
Dave Kleikamp
cbc3d65ebc JFS: Improve sync barrier processing
Under heavy load, hot metadata pages are often locked by non-committed
transactions, making them difficult to flush to disk.  This prevents
the sync point from advancing past a transaction that had modified the
page.

There is a point during the sync barrier processing where all
outstanding transactions have been committed to disk, but no new
transaction have been allowed to proceed.  This is the best time
to write the metadata.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-27 09:17:57 -05:00
Andrew Morton
89373de7dd [PATCH] inotify: fix oops fix
Cc: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:34:18 -07:00
Robert Love
e5ca844a9d [PATCH] inotify: check retval in init
Check for (unlikely) errors in the filesystem initialization stuff in
our module_init() function.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:22 -07:00
Robert Love
1b2ccf0cc1 [PATCH] inotify: change default limits
Change default inotify limits: Maximum instances per user to 128 and
maximum events per queue to 16k.  The max instances used to be 128; the
change to 8 was a mistake.  Memory consumption is fine.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:22 -07:00
Robert Love
5eb22cbcdb [PATCH] inotify: exit path cleanups
Handle error out paths better.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:22 -07:00
Robert Love
783bc29bbc [PATCH] inotify: oops fix
Bug fix: Ensure that the fd passed to inotify_add_watch() and
inotify_rm_watch() belongs to inotify.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:37:21 -07:00
Robert Love
33ea2f52b8 [PATCH] inotify: use fget_light
As an optimization, use fget_light() and fput_light() where possible.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:31:57 -07:00
Robert Love
b680716ed2 [PATCH] inotify: misc. cleanup
Miscellaneous invariant clean up, comment fixes, and so on.  Trivial
stuff.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 13:31:57 -07:00
Dave Kleikamp
18190cc08d JFS: Fix i_blocks accounting when allocation fails
A failure in dbAlloc caused a directory's i_blocks to be incorrectly
incremented, causing jfs_fsck to find the inode to be corrupt.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-26 09:29:13 -05:00
Dave Kleikamp
c2783f3a62 JFS: Don't set log_SYNCBARRIER when log->active == 0
If a metadata page is kept active, it is possible that the sync barrier logic
continues to trigger, even if all active transactions have been phyically
written to the journal.  This can cause a hang, since the completion of the
journal I/O is what unsets the sync barrier flag to allow new transactions
to be created.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-25 08:58:54 -05:00
Dave Kleikamp
c40c202493 JFS: Fix typo in last patch
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-22 11:08:44 -05:00
Dave Kleikamp
21d1ee8b37 Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-19 13:46:53 -05:00
Linus Torvalds
af6ea9ca23 Merge master.kernel.org:/pub/scm/linux/kernel/git/aia21/ntfs-2.6 2005-07-16 11:47:51 -07:00
Thomas Gleixner
2c4eec9802 Merge with rsync://fileserver/linux 2005-07-16 09:20:01 +02:00
Carsten Otte
afa597ba20 [PATCH] execute-in-place fixes
This patch includes feedback from Andrew and Christoph. Thanks for
taking time to review.

Use of empty_zero_page was eliminated to fix compilation for architectures
that don't have it.

This patch removes setting pages up-to-date in ext2_get_xip_page and all
bug checks to verify that the page is indeed up to date.  Setting the page
state on mapping to userland is bogus.  None of the code patchs involved
with these pages in mm cares about the page state.

still on my ToDo list: identify a place outside second extended where
__inode_direct_access should reside

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-15 09:54:50 -07:00
Qu Fuping
3d9b1cdd24 JFS: fsync wrong behavior when I/O failure occurs
This is half of a patch that Qu Fuping submitted in April.  The first part
was applied to fs/mpage.c in 2.6.12-rc4.

jfs_fsync should return error, but it doesn't wait for the metadata page to
be uptodate, e.g.:
jfs_fsync->jfs_commit_inode->txCommit->diWrite->read_metapage->
__get_metapage->read_cache_page reads a page from disk. Because read is
async, when read_cache_page: err = filler(data, page), filler will not
return error, it just submits I/O request and returns. So, page is not
uptodate.  Checking only if(IS_ERROR(mp->page)) is not enough, we should
add "|| !PageUptodate(mp->page)"

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-15 10:36:08 -05:00
Dave Kleikamp
56d1254917 JFS: Remove assert statement in dbJoin & return -EIO instead
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-15 09:43:36 -05:00
Thomas Gleixner
5d157885f3 [JFFS2] Fix node allocation leak
In the rare case of failing to write the cleanmarker
the allocated node was not freed.

Pointed out by Forrest Zhao
Initial cleanup by Joern Engel

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-15 08:14:44 +02:00
Dave Kleikamp
00be3e7e5c JFS: Remove bogus WARN_ON statement and some dead code
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-14 15:15:39 -05:00
Paolo 'Blaisorblade' Giarrusso
a0d43df931 [PATCH] uml: hostfs: unuse ROOT_DEV
Minimal patch removing uses of ROOT_DEV; next patch unexports it.  I've
opposed this, but I've planned to reintroduce the functionality without using
ROOT_DEV.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Paolo 'Blaisorblade' Giarrusso
8e0a218124 [PATCH] uml: fix hppfs error path
Fix the error message to refer to the error code, i.e.  err, not count, plus
add some cosmetical fixes.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Anton Altaparmakov
c514720716 Automatic merge with /usr/src/ntfs-2.6.git. 2005-07-13 23:09:23 +01:00
Linus Torvalds
3720bd8b1e Merge master.kernel.org:/pub/scm/linux/kernel/git/tglx/mtd-2.6 2005-07-13 12:19:30 -07:00
Steve Dickson
7ee91ec14b [PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64
Allow the setting of NLM timeouts and grace periods through the proc and
sysclt interfaces on x86_64 architectures

Signed-off-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Anton Altaparmakov
88bd5121d6 [PATCH] Fix soft lockup due to NTFS: VFS part and explanation
Something has changed in the core kernel such that we now get concurrent
inode write outs, one e.g via pdflush and one via sys_sync or whatever.
This causes a nasty deadlock in ntfs.  The only clean solution
unfortunately requires a minor vfs api extension.

First the deadlock analysis:

Prerequisive knowledge: NTFS has a file $MFT (inode 0) loaded at mount
time.  The NTFS driver uses the page cache for storing the file contents as
usual.  More interestingly this file contains the table of on-disk inodes
as a sequence of MFT_RECORDs.  Thus NTFS driver accesses the on-disk inodes
by accessing the MFT_RECORDs in the page cache pages of the loaded inode
$MFT.

The situation: VFS inode X on a mounted ntfs volume is dirty.  For same
inode X, the ntfs_inode is dirty and thus corresponding on-disk inode,
which is as explained above in a dirty PAGE_CACHE_PAGE belonging to the
table of inodes ($MFT, inode 0).

What happens:

Process 1: sys_sync()/umount()/whatever...  calls __sync_single_inode() for
$MFT -> do_writepages() -> write_page for the dirty page containing the
on-disk inode X, the page is now locked -> ntfs_write_mst_block() which
clears PageUptodate() on the page to prevent anyone else getting hold of it
whilst it does the write out (this is necessary as the on-disk inode needs
"fixups" applied before the write to disk which are removed again after the
write and PageUptodate is then set again).  It then analyses the page
looking for dirty on-disk inodes and when it finds one it calls
ntfs_may_write_mft_record() to see if it is safe to write this on-disk
inode.  This then calls ilookup5() to check if the corresponding VFS inode
is in icache().  This in turn calls ifind() which waits on the inode lock
via wait_on_inode whilst holding the global inode_lock.

Process 2: pdflush results in a call to __sync_single_inode for the same
VFS inode X on the ntfs volume.  This locks the inode (I_LOCK) then calls
write-inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() of
the page (in page cache of table of inodes $MFT, inode 0) containing the
on-disk inode.  This page has PageUptodate() clear because of Process 1
(see above) so read_cache_page() blocks when tries to take the page lock
for the page so it can call ntfs_read_page().

Thus Process 1 is holding the page lock on the page containing the on-disk
inode X and it is waiting on the inode X to be unlocked in ifind() so it
can write the page out and then unlock the page.

And Process 2 is holding the inode lock on inode X and is waiting for the
page to be unlocked so it can call ntfs_readpage() or discover that
Process 1 set PageUptodate() again and use the page.

Thus we have a deadlock due to ifind() waiting on the inode lock.

The only sensible solution: NTFS does not care whether the VFS inode is
locked or not when it calls ilookup5() (it doesn't use the VFS inode at
all, it just uses it to find the corresponding ntfs_inode which is of
course attached to the VFS inode (both are one single struct); and it uses
the ntfs_inode which is subject to its own locking so I_LOCK is irrelevant)
hence we want a modified ilookup5_nowait() which is the same as ilookup5()
but it does not wait on the inode lock.

Without such functionality I would have to keep my own ntfs_inode cache in
the NTFS driver just so I can find ntfs_inodes independent of their VFS
inodes which would be slow, memory and cpu cycle wasting, and incredibly
stupid given the icache already exists in the VFS.

Below is a patch that does the ilookup5_nowait() implementation in
fs/inode.c and exports it.

ilookup5_nowait.diff:

Introduce ilookup5_nowait() which is basically the same as ilookup5() but
it does not wait on the inode's lock (i.e. it omits the wait_on_inode()
done in ifind()).

This is needed to avoid a nasty deadlock in NTFS.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Robert Love
9a556e8908 [PATCH] inotify: misc cleanup
Really simple, basic cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Robert Love
0399cb08c5 [PATCH] inotify: move sysctl
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from
"/proc/sys/fs".  Also some related cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Ian Dall
59192ed9e7 JFS: Need to be root to create files with security context
It turns out this is due to some inverted logic in xattr.c

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 09:15:18 -05:00
Dave Kleikamp
6211502d7e JFS: Allow security.* xattrs to be set on symlinks
All of the different xattr namespaces have different rules.
user.* and ACL's are not allowed on symlinks, and since these were the
first xattrs implemented, I assumed there was no need to support xattrs
on symlinks.  This one-line patch should fix it.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 09:07:53 -05:00
Dave Kleikamp
f7f24758ac Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 08:57:38 -05:00
Thomas Gleixner
1b3035b7fc Merge with rsync://fileserver/linux 2005-07-13 10:45:00 +02:00
Robert Love
0eeca28300 [PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

        * dnotify requires the opening of one fd per each directory
          that you intend to watch. This quickly results in too many
          open files and pins removable media, preventing unmount.
        * dnotify is directory-based. You only learn about changes to
          directories. Sure, a change to a file in a directory affects
          the directory, but you are then forced to keep a cache of
          stat structures.
        * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

        * inotify's interface is a system call that returns a fd, not SIGIO.
	  You get a single fd, which is select()-able.
        * inotify has an event that says "the filesystem that the item
          you were watching is on was unmounted."
        * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

See Documentation/filesystems/inotify.txt.

Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:38:38 -07:00
Linus Torvalds
bd4c625c06 reiserfs: run scripts/Lindent on reiserfs code
This was a pure indentation change, using:

	scripts/Lindent fs/reiserfs/*.c include/linux/reiserfs_*.h

to make reiserfs match the regular Linux indentation style.  As Jeff
Mahoney <jeffm@suse.com> writes:

 The ReiserFS code is a mix of a number of different coding styles, sometimes
 different even from line-to-line. Since the code has been relatively stable
 for quite some time and there are few outstanding patches to be applied, it
 is time to reformat the code to conform to the Linux style standard outlined
 in Documentation/CodingStyle.

 This patch contains the result of running scripts/Lindent against
 fs/reiserfs/*.c and include/linux/reiserfs_*.h. There are places where the
 code can be made to look better, but I'd rather keep those patches separate
 so that there isn't a subtle by-hand hand accident in the middle of a huge
 patch. To be clear: This patch is reformatting *only*.

 A number of patches may follow that continue to make the code more consistent
 with the Linux coding style.

 Hans wasn't particularly enthusiastic about these patches, but said he
 wouldn't really oppose them either.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:21:28 -07:00
Jeff Mahoney
7fa94c8868 [PATCH] reiserfs: fix up case where indent misreads the code
indent(1) doesn't know how to handle the "do not compile" error. It results
 in the item_ops array declaration being indented a tab stop in when it should
 not be. This patch replaces it with a #error that describes why it's failing.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:22:35 -07:00
Brian King
7da6844cf7 [PATCH] cdev: cdev_put oops
While fixing an oops in the st driver in a dirty release path, I
encountered an oops in cdev_put for cdevs allocated using cdev_alloc.  If
cdev_del is called when the cdev kobject still has an open user, when the
last cdev_put is called, the cdev_put will call kobject_put, which will end
up ultimately releasing the cdev in cdev_dynamic_release.  Patch fixes the
oops by preventing cdev_put from accessing freed memory.

Signed-off-by: Brian King <brking@us.ibm.com>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:02 -07:00
Jan Kara
50a5223428 [PATCH] ext2: fix mount options parting
Restore old set of ext2 mount options when remounting of a filesystem
fails.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Jan Kara
08c6a96fd7 [PATCH] ext3: fix options parsing
Fix a problem with ext3 mount option parsing.  When remount of a filesystem
fails, old options are now restored.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Roland McGrath
5323125031 [PATCH] reset real_timer target on exec leader change
When a noninitial thread does exec, it becomes the new group leader.  If
there is a ITIMER_REAL timer running, it points at the old group leader and
when it fires it can follow a stale pointer.  The timer data needs to be
reset to point at the exec'ing thread that is becoming the group leader.
This has to synchronize with any concurrent firing of the timer to make
sure that it_real_fn can never run when the data points to a thread that
might have been reaped already.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Artem B. Bityuckiy
4120db4719 [PATCH] bugfix: two read_inode() calls without clear_inode() call between
Bug symptoms
~~~~~~~~~~~~
For the same inode VFS calls read_inode() twice and doesn't call
clear_inode() between the two read_inode() invocations.

Bug description
~~~~~~~~~~~~~~~
Suppose we have an inode which has zero reference count but is still in
the inode cache. Suppose kswapd invokes shrink_icache_memory() to free
some RAM. In prune_icache() inodes are removed from i_hash. prune_icache
() is then going to call clear_inode(), but drops the inode_lock
spinlock before this. If in this moment another task calls iget() for an
inode which was just removed from i_hash by prune_icache(), then iget()
invokes read_inode() for this inode, because it is *already removed*
from i_hash.

The end result is: we call iget(#N) then iput(#N); inode #N has zero
i_count now and is in the inode cache; kswapd starts. kswapd removes the
inode #N from i_hash ans is preempted; we call iget(#N) again;
read_inode() is invoked as the result; but we expect clear_inode()
before.

Fix
~~~~~~~
To fix the bug I remove inodes from i_hash later, when clear_inode() is
actually called. I remove them from i_hash under spinlock protection.
Since the i_state is set to I_FREEING, it is safe to do this. The others
will sleep waiting for the inode state change.

I also postpone removing inodes from i_sb_list. It is not compulsory to
do so but I do it for readability reasons. Inodes are added/removed to
the lists together everywhere in the code and there is no point to
change this rule. This is harmless because the only user of i_sb_list
which somehow may interfere with me (invalidate_list()) is excluded by
the iprune_sem mutex.

The same race is possible in invalidate_list() so I do the same for it.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:00:59 -07:00
Miklos Szeredi
168a9fd6a1 [PATCH] __wait_on_freeing_inode fix
This patch fixes queer behavior in __wait_on_freeing_inode().

If I_LOCK was not set it called yield(), effectively busy waiting for the
removal of the inode from the hash.  This change was introduced within
"[PATCH] eliminate inode waitqueue hashtable" Changeset 1.1938.166.16 last
october by wli.

The solution is to restore the old behavior, of unconditionally waiting on
the waitqueue.  It doesn't matter if I_LOCK is not set initally, the task
will go to sleep, and wake up when wake_up_inode() is called from
generic_delete_inode() after removing the inode from the hash chain.

Comment is also updated to better reflect current behavior.

This condition is very hard to trigger normally (simultaneous clear_inode()
with iget()) so probably only heavy stress testing can reveal any change of
behavior.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:00:59 -07:00
Todd Poynor
a98a5d04f4 Merge with rsync://fileserver/linux 2005-07-13 00:58:44 +02:00
Todd Poynor
751382dd5c [JFFS2] Avoid compiler warnings when JFFS2_FS_WRITEBUFFER=n
Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-13 00:03:19 +02:00
Artem B. Bityuckiy
b62205986a [JFFS2] Init locks early during mount
In case of a mount error locks might be uninitialized but
accessed by the resulting call to jffs2_kill_sb().

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-13 00:02:33 +02:00
Artem B. Bityuckiy
e4fef66189 [JFFS2] Rename function and update comments
We recently changed the method of collecting and sorting of
tmp_dnode objects to use a temporary RB-tree instead of a
temporary list. Rename function and update comments.

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:58:26 +02:00
Artem B. Bityuckiy
86ffc0d5f5 [JFFS2] Remove needless variable initialization
Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:56:40 +02:00
Artem B. Bityuckiy
336d2ff711 [JFFS2] Avoid alloc/dealloc for zero sized nodes
Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:55:05 +02:00
Linus Torvalds
200d481f28 Merge master.kernel.org:/pub/scm/linux/kernel/git/tglx/mtd-2.6 2005-07-11 10:18:18 -07:00
NeilBrown
e34ac862ee [PATCH] nfsd4: fix fh_expire_type
After discussion at the recent NFSv4 bake-a-thon, I realized that my
assumption that NFS4_FH_PERSISTENT required filehandles to persist was a
misreading of the spec.  This also fixes an interoperability problem with the
Solaris client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:11 -07:00
NeilBrown
4c4cd222ee [PATCH] nfsd4: check lock type against openmode.
We shouldn't be allowing, e.g., write locks on files not open for read.  To
enforce this, we add a pointer from the lock stateid back to the open stateid
it came from, so that the check will continue to be correct even after the
open is upgraded or downgraded.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:11 -07:00
NeilBrown
3a4f98bbf4 [PATCH] nfsd4: clean up nfs4_preprocess_seqid_op
As long as we're here, do some miscellaneous cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:10 -07:00
NeilBrown
f8816512fc [PATCH] nfsd4: clarify close_lru handling
The handling of close_lru in preprocess_stateid_op was a source of some
confusion here recently.  Try to make the logic a little clearer, by renaming
find_openstateowner_id to make its purpose clearer and untangling some
unnecessarily complicated goto's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:10 -07:00
NeilBrown
52fd004e29 [PATCH] nfsd4: renew lease on seqid modifying operations
nfs4_preprocess_seqid_op is called by NFSv4 operations that imply an implicit
renewal of the client lease.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
b700949b78 [PATCH] nfsd4: return better error on io incompatible with open mode
from RFC 3530:
"Share reservations are established by OPEN operations and by their
nature are mandatory in that when the OPEN denies READ or WRITE
operations, that denial results in such operations being rejected
with error NFS4ERR_LOCKED."

(Note that share_denied is really only a legal error for OPEN.)

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
444c2c07c2 [PATCH] nfsd4: always update stateid on open
An OPEN from the same client/open stateowner requires a stateid update because
of the share/deny access update.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
e66770cd7b [PATCH] nfsd4: relax new lock seqid check
We're insisting that the lock sequence id field passed in the
open_to_lockowner struct always be zero.  This is probably thanks to the
sentence in rfc3530: "The first request issued for any given lock_owner is
issued with a sequence number of zero."

But there doesn't seem to be any problem with allowing initial sequence
numbers other than zero.  And currently this is causing lock reclaims from the
Linux client to fail.

In the spirit of "be liberal in what you accept, conservative in what you
send", we'll relax the check (and patch the Linux client as well).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
7fb64cee34 [PATCH] nfsd4: seqid comments
Add some comments on the use of so_seqid, in an attempt to avoid some of the
confusion outlined in the previous patch....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
bd9aac523b [PATCH] nfsd4: fix open_reclaim seqid
The sequence number we store in the sequence id is the last one we received
from the client.  So on the next operation we'll check that the client gives
us the next higher number.

We increment sequence id's at the last moment, in encode, so that we're sure
of knowing the right error return.  (The decision to increment the sequence id
depends on the exact error returned.)

However on the *first* use of a sequence number, if we set the sequence number
to the one received from the client and then let the increment happen on
encode, we'll be left with a sequence number one to high.

For that reason, ENCODE_SEQID_OP_TAIL only increments the sequence id on
*confirmed* stateowners.

This creates a problem for open reclaims, which are confirmed on first use.
Therefore the open reclaim code, as a special exception, *decrements* the
sequence id, cancelling out the undesired increment on encode.  But this
prevents the sequence id from ever being incremented in the case where
multiple reclaims are sent with the same openowner.  Yuch!

We could add another exception to the open reclaim code, decrementing the
sequence id only if this is the first use of the open owner.

But it's simpler by far to modify the meaning of the op_seqid field: instead
of representing the previous value sent by the client, we take op_seqid, after
encoding, to represent the *next* sequence id that we expect from the client.
This eliminates the need for special-case handling of the first use of a
stateowner.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
893f87701c [PATCH] nfsd4: comment indentation
Yeah, it's trivial, but this drives me up the wall....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
3751517731 [PATCH] nfsd4: stop overusing RECLAIM_BAD
A misreading of the spec lead us to convert all errors on open and lock
reclaims to RECLAIM_BAD.  This causes problems--for example, a reboot within
the grace period could lead to reclaims with stale stateid's, and we'd like to
return STALE errors in those cases.

What rfc3530 actually says about RECLAIM_BAD: "The reclaim provided by the
client does not match any of the server's state consistency checks and is
bad." I'm assuming that "state consistency checks" refers to checks for
consistency with the state recorded to stable storage, and that the error
should be reserved for that case.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
0dd395dc76 [PATCH] nfsd4: ERR_GRACE should bump seqid on lock
A GRACE or NOGRACE response to a lock request should also bump the sequence
id.  So we delay the handling of grace period errors till after we've found
the relevant owner.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
b648330a1d [PATCH] nfsd4: ERR_GRACE should bump seqid on open
The GRACE and NOGRACE errors should bump the sequence id on open.  So we delay
the handling of these errors until nfsd4_process_open2, at which point we've
set the open owner, so the encode routine will be able to bump the sequence
id.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
0fa822e452 [PATCH] nfsd4: fix release_lockowner
We oops in list_for_each_entry(), because release_stateowner frees something
on the list we're traversing.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
67be431350 [PATCH] nfsd4: prevent multiple unlinks of recovery directories
Make sure we don't try to delete client recovery directories multiple times;
fixes some spurious error messages.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
cdc5524e8a [PATCH] nfsd4: lookup_one_len takes i_sem
Oops, this lookup_one_len needs the i_sem.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
NeilBrown
a6ccbbb886 [PATCH] nfsd4: fix sync'ing of recovery directory
We need to fsync the recovery directory after writing to it, but we weren't
doing this correctly.  (For example, we weren't taking the i_sem when calling
->fsync().)

Just reuse the existing nfsd fsync code instead.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
NeilBrown
463090294e [PATCH] nfsd4: reboot recovery fix
We need to remove the recovery directory here too.  (This chunk just got lost
somehow in the process of commuting the reboot recovery patches past the other
patches.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
Miklos Szeredi
751c404b8f [PATCH] namespace: rename _mntput to mntput_no_expire
This patch renames _mntput() to something a little more descriptive:
mntput_no_expire().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
55e700b924 [PATCH] namespace: rename mnt_fslink to mnt_expire
This patch renames vfsmount->mnt_fslink to something a little more
descriptive: vfsmount->mnt_expire.

Signed-off-by: Mike Waychison <michael.waychison@sun.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
732dbef606 [PATCH] dcookies.c: use proper refcounting functions
Dcookies shouldn't play with the internals of dentry and vfsmnt
refcounting.  It defeats grepping, and is prone to break if implementation
details change.

In addition the function doesn't even seem to be performance critical: it
calls kmem_cache_alloc().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
484e389c63 [PATCH] set mnt_namespace in the correct place
This patch sets ->mnt_namespace where it's actually added to the
namespace.

Previously mnt_namespace was set in do_kern_mount() even if the filesystem
was never added to any process's namespace (most kernel-internal
filesystems).

This discrepancy doesn't actually cause any problems, but it's cleaner if
mnt_namespace is NULL for these non exported filesystems.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
ac0811538b [PATCH] namespace.c: fix mnt_namespace zeroing for expired mounts
This patch clears mnt_namespace in an expired mount.

If mnt_namespace is not cleared, it's possible to attach a new mount to the
already detached mount, because check_mnt() can return true.

The effect is a resource leak, since the resulting tree will never be
freed.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
ed42c879b7 [PATCH] namespace.c: fix expiring of detached mount
This patch fixes a bug noticed by Al Viro:

   However, we still have a problem here - just what would
   happen if vfsmount is detached while we were grabbing namespace
   semaphore?  Refcount alone is not useful here - we might be held by
   whoever had detached the vfsmount.  IOW, we should check that it's
   still attached (i.e. that mnt->mnt_parent != mnt).  If it's not -
   just leave it alone, do mntput() and let whoever holds it deal with
   the sucker.  No need to put it back on lists.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
24ca2af1e7 [PATCH] namespace.c: split mark_mounts_for_expiry()
This patch splits the mark_mounts_for_expiry() function.  It's too complex and
too deeply nested, even without the bugfix in the following patch.

Otherwise code is completely the same.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
a4d7027861 [PATCH] namespace.c: cleanup in mark_mounts_for_expiry()
This patch simplifies mark_mounts_for_expiry() by using detach_mnt() instead
of duplicating everything it does.

It should be an equivalent transformation except for righting the dput/mntput
order.

Al Viro said: "Looks sane".

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
1ce88cf466 [PATCH] namespace.c: fix race in mark_mounts_for_expiry()
This patch fixes a race found by Ram in mark_mounts_for_expiry() in
fs/namespace.c.

The bug can only be triggered with simultaneous exiting of a process having
a private namespace, and expiry of a mount from within that namespace.
It's practically impossible to trigger, and I haven't even tried.  But
still, a bug is a bug.

The race happens when put_namespace() is called by another task, while
mark_mounts_for_expiry() is between atomic_read() and get_namespace().  In
that case get_namespace() will be called on an already dead namespace with
unforeseeable results.

The solution was suggested by Al Viro, with his own words:

      Instead of screwing with atomic_read() in there, why don't we
      simply do the following:
      	a) atomic_dec_and_lock() in put_namespace()
      	b) __put_namespace() called without dropping lock
      	c) the first thing done by __put_namespace would be
      struct vfsmount *root = namespace->root;
      namespace->root = NULL;
      spin_unlock(...);
      ....
      umount_tree(root);
      ...
      	d) check in mark_... would be simply namespace && namespace->root.

      And we are all set; no screwing around with atomic_read(), no magic
      at all.  Dying namespace gets NULL ->root.
      All changes of ->root happen under spinlock.
      If under a spinlock we see non-NULL ->mnt_namespace, it won't be
      freed until we drop the lock (we will set ->mnt_namespace to NULL
      under that lock before we get to freeing namespace).
      If under a spinlock we see non-NULL ->mnt_namespace and
      ->mnt_namespace->root, we can grab a reference to namespace and be
      sure that it won't go away.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
202322e6f7 [PATCH] namespace.c: fix mnt_namespace clearing
This patch clears mnt_namespace on unmount.

Not clearing mnt_namespace has two effects:

   1) It is possible to attach a new mount to a detached mount,
      because check_mnt() returns true.

      This means, that when no other references to the detached mount
      remain, it still can't be freed.  This causes a resource leak,
      and possibly un-removable modules.

   2) If mnt_namespace is dereferenced (only in mark_mounts_for_expiry())
      after the namspace has been freed, it can cause an Oops, memory
      corruption, etc.

1) has been tested before and after the patch, 2) is only speculation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
KAMBAROV, ZAUR
7eaae2828d [PATCH] coverity: fs/locks.c flp null check
We're dereferencing `flp' and then we're testing it for NULLness.

Either the compiler accidentally saved us or the existing null-pointer checdk
is redundant.

This defect was found automatically by Coverity Prevent, a static analysis tool.

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:47 -07:00
Ian Kent
682d4fc931 [PATCH] autofs4: mistake in debug print
Fix debugging printk.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Andreas Gruenbacher
ff87b37da9 [PATCH] ext3 xattr: Don't write to the in-inode xattr space of reserved inodes
We are not using the in-inode space for xattrs in reserved inodes because
mkfs.ext3 doesn't initialize it properly.  For those inodes, we set
i_extra_isize to 0.  Make sure that we also don't overwrite the
i_extra_isize field when writing out the inode in that case.  This is for
cleanliness only, and doesn't fix an actual bug.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Christoph Lameter
6c036527a6 [PATCH] mostly_read data section
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.

If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated.  In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.

The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Andreas Gruenbacher
b84c21572d [PATCH] acl kconfig cleanup
Original patch from Matt Mackall <mpm@selenic.com>

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:45 -07:00
Nick Piggin
a39722034a [PATCH] page_uptodate locking scalability
Use a bit spin lock in the first buffer of the page to synchronise asynch
IO buffer completions, instead of the global page_uptodate_lock, which is
showing some scalabilty problems.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:45 -07:00
Paolo 'Blaisorblade' Giarrusso
3f580470ba [PATCH] uml: restore hppfs support
Some time ago a trivial patch broke HPPFS (one var became a pointer, not
all uses were updated).  It wasn't fixed at that time because not very
used, now it's been requested so I've fixed this, and it has been tested
positively (at least partially).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:44 -07:00
Anton Blanchard
cf36680887 [PATCH] move ioprio syscalls into syscalls.h
- Make ioprio syscalls return long, like set/getpriority syscalls.
- Move function prototypes into syscalls.h so we can pick them up in the
  32/64bit compat code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:37 -07:00
Mark Fasheh
cb2c023375 [PATCH] export generic_drop_inode() to modules
OCFS2 wants to mark an inode which has been orphaned by another node so
that during final iput it takes the correct path through the VFS and can
pass through the OCFS2 delete_inode callback.  Since i_nlink can get out of
date with other nodes, the best way I see to accomplish this is by clearing
i_nlink on those inodes at drop_inode time.  Other than this small amount
of work, nothing different needs to happen, so I think it would be cleanest
to be able to just call generic_drop_inode at the end of the OCFS2
drop_inode callback.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:35 -07:00
Artem B. Bityuckiy
b3539219c9 Merge with rsync://fileserver/linux
Update to 2.6.12-rc3
2005-07-06 19:40:38 +02:00
Artem B. Bityuckiy
6430a8def1 [JFFS2] Simplify the tree insert code.
It isn't _normal_ that we allow key collision in rbtrees, 
but it does not matter as long as the two nodes with the same
version are together.

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 18:30:00 +02:00
David Woodhouse
265489f01d [JFFS2] Remove compatibilty cruft for ancient kernels
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 16:12:09 +02:00
David Woodhouse
9dee7503ce [JFFS2] Optimise jffs2_add_tn_to_list
Use an rbtree instead of a simple linked list. We were wasting 
an amazing amount of time in jffs2_add_tn_to_list(). 
Thanks to Artem Bityuckiy and Jarkko Jlavinen  for noticing.

Signed-off-by: David Woodhouse <dwmw2@infradead.org> 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 14:07:54 +02:00
Anton Altaparmakov
07929dcb96 Automatic merge with /usr/src/ntfs-2.6.git. 2005-07-04 14:14:42 +01:00
Andrew Morton
ef6689eff4 [PATCH] fatfs sectioning fix
Fixup for the recent slab leak fix

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 22:29:48 -07:00
Andrew Morton
c60e81ee1c [PATCH] reiserfs: handle_attrs() fix
Fix a use-uninitialised bug.

Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:13 -07:00
Pekka Enberg
8cb681b9c7 [PATCH] freevxfs: minor cleanups
This patch addresses the following minor issues:

  - Typo in printk
  - Redundant casts
  - Use C99 struct initializers instead of memset
  - Parenthesis around return value
  - Use inline instead of __inline__

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Pekka Enberg
1d2cc3b87b [PATCH] freevxfs: remove 2.4 compatability
This patch removes 2.4 compatability header from freevxfs.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Pekka Enberg
ba03bda81e [PATCH] freevxfs: fix buffer_head leak
- fix a buffer_head leak in vxfs_getfsh()

- s/SLAB_KERNEL/GFP_KERNEL/

- check sb_bread() return value

- drop pointless buffer-mapped() test.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Christoph Hellwig
1c71e22e4e [PATCH] udf_find_entry() cleanup
udf_find_entry can never be called with a NULL argument, so we shouldn't
check for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:11 -07:00
Pekka J Enberg
532a39a375 [PATCH] fat: fix slab cache leak
This patch plugs a slab cache leak in fat module initialization.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:11 -07:00
Anton Altaparmakov
c2d9b8387b Automerge with /usr/src/ntfs-2.6.git. 2005-06-30 09:52:20 +01:00
Jeff Mahoney
2949ccf937 [PATCH] reiserfs: enable attrs by default if saf
The following patch enables attrs by default if the reiserfs_attrs_cleared
bit is set in the superblock.  This allows chattr-type attrs to be used
without any further action by the user.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29 21:02:04 -07:00
Jeff Mahoney
869eb76e7b [PATCH] reiserfs: Check if attrs are enabled for attr ioctls
ReiserFS currently will allow the user to set/get attrs for files
regardless if they are enabled.  The patch checks to see if they are
enabled, and returns -NOTTY if they are not.

ext[23] doesn't need this check because attrs are always enabled.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29 21:02:04 -07:00
Mingming Cao
21fe3471c3 [PATCH] ext3: reduce allocate-with-reservation lock latencies
Currently in ext3 block reservation code, the global filesystem reservation
tree lock (rsv_block) is hold during the process of searching for a space
to make a new reservation window, including while scaning the block bitmap
to verify if the avalible window has a free block.  Holding the lock during
bitmap scan is unnecessary and could possibly cause scalability issue and
latency issues.

This patch tries to address this by dropping the lock before scan the
bitmap.  Before that we need to reserve the open window in case someone
else is targetting at the same window.  Question was should we reserve the
whole free reservable space or just the window size we need.  Reserve the
whole free reservable space will possibly force other threads which
intended to do block allocation nearby move to another block group(cause
bad layout).  In this patch, we just reserve the desired size before drop
the lock and scan the block bitmap.  This patch fixed a ext3 reservation
latency issue seen on a cvs check out test.  Patch is tested with many fsx,
tiobench, dbench and untar a kernel test.

Signed-Off-By: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:35 -07:00
KAMBAROV, ZAUR
c7f1721ef2 [PATCH] coverity: fs/ext3/super.c: match_int return check
The return value of  "match_int" is  checked  27 out of 28 times

In lib/parser.c
142  	/**
143  	 * match_int: - scan a decimal representation of an integer from a substring_t
144  	 * @s: substring_t to be scanned
145  	 * @result: resulting integer on success
146  	 *
147  	 * Description: Attempts to parse the &substring_t @s as a decimal integer. On
148  	 * success, sets @result to the integer represented by the string and returns 0.
149  	 * Returns either -ENOMEM or -EINVAL on failure.
150  	 */
151  	int match_int(substring_t *s, int *result)
152  	{
153  		return match_number(s, result, 0);
154  	}

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:34 -07:00
KAMBAROV, ZAUR
ec471dc484 [PATCH] coverity: fs/udf/namei.c null check
"dir" was dereferenced before null check

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:33 -07:00
Sbastien Dugu
c016e2257a [PATCH] aio-retry-fix: fix aio retry work queueing
In the case of buffered AIO, in the aio retry path (aio_run_iocb), when the
retry method returns EIOCBRETRY the kicked iocb is added to the context run
list but is never queued onto the work queue.  The request therefore is
never completed.

This patch fixes that by adding the appropriate call to aio_queue_work in
aio_run_aiocb so that subsequent retries will be handled by the aio worker
thread.

Signed-off-by: Sbastien Dugu <sebastien.dugue@bull.net>
Acked-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:32 -07:00
Christoph Hellwig
334a13ec3d [PATCH] really remove xattr_acl.h
Looks like it sneaked back with the NFS ACL merge..

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:31 -07:00
Pekka J Enberg
687a21cee1 [PATCH] rename wakeup_bdflush to wakeup_pdflush
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:31 -07:00
Wen-chien Jesse Sung
8d451687ca [PATCH] fix semaphore handling in __unregister_chrdev_region
This up() should be down() instead.

Signed-off-by: Wen-chien Jesse Sung <jesse@cola.voip.idv.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:29 -07:00
Jens Axboe
22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00
Dave Kleikamp
b38a3ab3d1 JFS: Code cleanup - getting rid of never-used debug code
I'm finally getting around to cleaning out debug code that I've never used.
There has always been code ifdef'ed out by _JFS_DEBUG_DMAP, _JFS_DEBUG_IMAP,
_JFS_DEBUG_DTREE, and _JFS_DEBUG_XTREE, which I have personally never used,
and I doubt that anyone has since the design stage back in OS/2.  There is
also a function, xtGather, that has never been used, and I don't know why it
was ever there.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-06-27 15:35:37 -05:00
Thomas Gleixner
7ca6448dbf Merge with rsync://fileserver/linux
Update to Linus latest
2005-06-26 23:20:36 +02:00
Anton Altaparmakov
2a322e4c08 Automatic merge with /usr/src/ntfs-2.6.git. 2005-06-26 22:19:40 +01:00
Anton Altaparmakov
ba6d2377c8 NTFS: Fix a nasty deadlock that appeared in recent kernels.
The situation: VFS inode X on a mounted ntfs volume is dirty.  For
      same inode X, the ntfs_inode is dirty and thus corresponding on-disk
      inode, i.e. mft record, which is in a dirty PAGE_CACHE_PAGE belonging
      to the table of inodes, i.e. $MFT, inode 0.
      What happens:
      Process 1: sys_sync()/umount()/whatever...  calls
      __sync_single_inode() for $MFT -> do_writepages() -> write_page for
      the dirty page containing the on-disk inode X, the page is now locked
      -> ntfs_write_mst_block() which clears PageUptodate() on the page to
      prevent anyone else getting hold of it whilst it does the write out.
      This is necessary as the on-disk inode needs "fixups" applied before
      the write to disk which are removed again after the write and
      PageUptodate is then set again.  It then analyses the page looking
      for dirty on-disk inodes and when it finds one it calls
      ntfs_may_write_mft_record() to see if it is safe to write this
      on-disk inode.  This then calls ilookup5() to check if the
      corresponding VFS inode is in icache().  This in turn calls ifind()
      which waits on the inode lock via wait_on_inode whilst holding the
      global inode_lock.
      Process 2: pdflush results in a call to __sync_single_inode for the
      same VFS inode X on the ntfs volume.  This locks the inode (I_LOCK)
      then calls write-inode -> ntfs_write_inode -> map_mft_record() ->
      read_cache_page() for the page (in page cache of table of inodes
      $MFT, inode 0) containing the on-disk inode.  This page has
      PageUptodate() clear because of Process 1 (see above) so
      read_cache_page() blocks when it tries to take the page lock for the
      page so it can call ntfs_read_page().
      Thus Process 1 is holding the page lock on the page containing the
      on-disk inode X and it is waiting on the inode X to be unlocked in
      ifind() so it can write the page out and then unlock the page.
      And Process 2 is holding the inode lock on inode X and is waiting for
      the page to be unlocked so it can call ntfs_readpage() or discover
      that Process 1 set PageUptodate() again and use the page.
      Thus we have a deadlock due to ifind() waiting on the inode lock.
      The solution: The fix is to use the newly introduced
      ilookup5_nowait() which does not wait on the inode's lock and hence
      avoids the deadlock.  This is safe as we do not care about the VFS
      inode and only use the fact that it is in the VFS inode cache and the
      fact that the vfs and ntfs inodes are one struct in memory to find
      the ntfs inode in memory if present.  Also, the ntfs inode has its
      own locking so it does not matter if the vfs inode is locked.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-26 22:12:02 +01:00
Andrew Morton
34f18a9887 [PATCH] jffs2 build fix
Missed conversion in the swsusp cleanup.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-26 08:43:19 -07:00
Linus Torvalds
2031d0f586 Merge Christoph's freeze cleanup patch 2005-06-25 17:16:53 -07:00
Christoph Lameter
3e1d1d28d9 [PATCH] Cleanup patch for process freezing
1. Establish a simple API for process freezing defined in linux/include/sched.h:

   frozen(process)		Check for frozen process
   freezing(process)		Check if a process is being frozen
   freeze(process)		Tell a process to freeze (go to refrigerator)
   thaw_process(process)	Restart process
   frozen_process(process)	Process is frozen now

2. Remove all references to PF_FREEZE and PF_FROZEN from all
   kernel sources except sched.h

3. Fix numerous locations where try_to_freeze is manually done by a driver

4. Remove the argument that is no longer necessary from two function calls.

5. Some whitespace cleanup

6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
   cleared before setting PF_FROZEN, recalc_sigpending does not check
   PF_FROZEN).

This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 17:10:13 -07:00
Domen Puncer
c33ed27126 [PATCH] list_for_each_entry: fs-dquot.c
Make code more readable with list_for_each_entry_safe.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Maximilian Attems <janitor@sternwelten.at>
Signed-off-by: Domen Puncer <domen@coderock.org>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:11 -07:00
Adrian Bunk
db40716377 [PATCH] fs/ncpfs/: remove unused #ifdef USE_OLD_SLOW_DIRECTORY_LISTING code
This patch removes some unused #ifdef USE_OLD_SLOW_DIRECTORY_LISTING
code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:04 -07:00
Adrian Bunk
94c9eca223 [PATCH] fs/jffs/: cleanups
This patch contains the following cleanups:
- make needlessly global functions static
- provide some debugging helper functions only for appropriate
  values of CONFIG_JFFS_FS_VERBOSE

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:04 -07:00
Adrian Bunk
486fd404fb [PATCH] small partitions/msdos cleanups
This patch makes the following changes to the msdos partition code:
- remove CONFIG_NEC98_PARTITION leftovers
- make parse_bsd static

This patch was already ACK'ed by Andries Brouwer.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:59 -07:00
Vivek Goyal
72658e9d50 [PATCH] kdump: Parse elf32 headers and export through /proc/vmcore
o Adds support for parsing core ELF32 headers.
o I am expecting ELF32 support to go away down the line. This patch has been
  introduced for testing purposes as gdb can not parse ELF64 headers for
  i386. When a decent user space solution is available, ELF32 support
  can go away.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal
666bfddbe8 [PATCH] kdump: Access dump file in elf format (/proc/vmcore)
From: "Vivek Goyal" <vgoyal@in.ibm.com>

o Support for /proc/vmcore interface. This interface exports elf core image
  either in ELF32 or ELF64 format, depending on the format in which elf headers
  have been stored by crashed kernel.
o Added support for CONFIG_VMCORE config option.
o Removed the dependency on /proc/kcore.

From: "Eric W. Biederman" <ebiederm@xmission.com>

This patch has been refactored to more closely match the prevailing style in
the affected files.  And to clearly indicate the dependency between
/proc/kcore and proc/vmcore.c

From: Hariprasad Nellitheertha <hari@in.ibm.com>

This patch contains the code that provides an ELF format interface to the
previous kernel's memory post kexec reboot.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Qu Fuping
6283d58e74 [PATCH] reiserfs: do not ignore i/io error on readpage
Reiserfs's readpage does not notice i/o errors.  This patch makes
reiserfs_readpage to return -EIO when i/o error appears.

This patch makes reiserfs to not ignore I/O error on readpage.

Signed-off-by: Qu Fuping <fs@ercist.iscas.ac.cn>
Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:40 -07:00
Hugh Dickins
8ae0b77811 [PATCH] fix fsync(dir) return value for ram-based filesystems
Any filesystem which is using simple_dir_operations will retunr -EINVAL for
fsync() on a directory.  Make it return zero instead.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:38 -07:00
Anton Altaparmakov
af859a42d7 NTFS: Prepare for 2.1.23 release: Update documentation and bump version.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 21:07:27 +01:00
Anton Altaparmakov
4757d7dff6 NTFS: Change ntfs_map_runlist_nolock() to only decompress the mapping pairs
if the requested vcn is inside it.  Otherwise we get into problems
      when we try to map an out of bounds vcn because we then try to map
      the already mapped runlist fragment which causes
      ntfs_mapping_pairs_decompress() to fail and return error.  Update
      ntfs_attr_find_vcn_nolock() accordingly.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:24:08 +01:00
Anton Altaparmakov
fa3be92317 NTFS: Add an extra parameter @last_vcn to ntfs_get_size_for_mapping_pairs()
and ntfs_mapping_pairs_build() to allow the runlist encoding to be
      partial which is desirable when filling holes in sparse attributes.
      Update all callers.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:15:36 +01:00
Anton Altaparmakov
1d58b27b8d NTFS: Change the runlist terminator of the newly allocated cluster(s) to
LCN_ENOENT in ntfs_attr_make_non_resident().  Otherwise the runlist
      code gets confused.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:04:55 +01:00
Anton Altaparmakov
3bd1f4a173 NTFS: Fix several occurences of a bug where we would perform 'var & ~const'
with a 64-bit variable and a int, i.e. 32-bit, constant.  This causes
      the higher order 32-bits of the 64-bit variable to be zeroed.  To fix
      this cast the 'const' to the same 64-bit type as 'var'.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:51:58 +01:00
Anton Altaparmakov
ca8fd7a0c6 NTFS: Detect the case when Windows has been suspended to disk on the volume
to be mounted and if this is the case do not allow (re)mounting
      read-write.  This is done by parsing hiberfil.sys if present.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:31:27 +01:00
Anton Altaparmakov
9f993fe463 NTFS: Fix a bug in address space operations error recovery code paths where
if the runlist was not mapped at all and a mapping error occured we
      would leave the runlist locked on exit to the function so that the
      next access to the same file would try to take the lock and deadlock.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:15:36 +01:00
Anton Altaparmakov
3f2faef00c NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it
is active on the volume and we are mounting read-write or remounting
      from read-only to read-write.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 15:28:56 +01:00
Anton Altaparmakov
38b22b6e9f Automerge with /usr/src/ntfs-2.6.git. 2005-06-25 14:27:27 +01:00
Alexey Dobriyan
75043cb5b3 [PATCH] fs/qnx4/*: fix sparse warnings
This patch fixes sparse warnings in the qnx4fs (and might even make
qnx4fs work on big-endian boxes)

Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Anders Larsen <al@alarsen.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 14:14:24 -07:00
Adrian Bunk
52c1da3953 [PATCH] make various thing static
Another rollup of patches which give various symbols static scope

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:43 -07:00
Carsten Otte
eb6fe0c388 [PATCH] xip: reduce code duplication
This patch reworks filemap_xip.c with the goal to reduce code duplication
from mm/filemap.c.  It applies agains 2.6.12-rc6-mm1.  Instead of
implementing the aio functions, this one implements the synchronous
read/write functions only.  For readv and writev, the generic fallback is
used.  For aio, we rely on the application doing the fallback.  Since our
"synchronous" function does memcpy immediately anyway, there is no
performance difference between using the fallbacks or implementing each
operation.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Carsten Otte
6d79125bba [PATCH] xip: ext2: execute in place
These are the ext2 related parts.  Ext2 now uses the xip_* file operations
along with the get_xip_page aop when mounted with -o xip.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Carsten Otte
ceffc07852 [PATCH] xip: fs/mm: execute in place
- generic_file* file operations do no longer have a xip/non-xip split
- filemap_xip.c implements a new set of fops that require get_xip_page
  aop to work proper. all new fops are exported GPL-only (don't like to
  see whatever code use those except GPL modules)
- __xip_unmap now uses page_check_address, which is no longer static
  in rmap.c, and defined in linux/rmap.h
- mm/filemap.h is now much more clean, plainly having just Linus'
  inline funcs moved here from filemap.c
- fix includes in filemap_xip to make it build cleanly on i386

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Martin Waitz
3d41088fa3 [PATCH] DocBook: update comments
This patch updates some comments to match code changes.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:40 -07:00
NeilBrown
0964a3d3f1 [PATCH] knfsd: nfsd4 reboot dirname fix
Set the recovery directory via /proc/fs/nfsd/nfs4recoverydir.

It may be changed any time, but is used only on startup.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:36 -07:00
NeilBrown
c7b9a45927 [PATCH] knfsd: nfsd4: reboot recovery
This patch adds the code to create and remove client subdirectories from the
recovery directory, as described in the previous patch comment.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:36 -07:00
NeilBrown
190e4fbf96 [PATCH] knfsd: nfsd4: initialize recovery directory
NFSv4 clients are required to know what state they have on the server so that
they can reclaim it on server reboot.  However, it is possible for
pathalogical combinations of server reboots and network partitions to leave a
client in a state where it cannot know whether it has lost its state on the
server.

For this reason, rfc3530 requires that we store some information about clients
to stable storage.

So we maintain a directory /var/lib/nfs/v4recovery with a subdirectory for
each client with active state.  We leave open the possibility of including
files underneath each such subdirectory with information about the client, but
for now the subdirectories are empty.

We create a client subdirectory whenever a client makes its first non-reclaim
open_confirm.

We remove a client subdirectory whenever either
        a) its lease expires, or
	b) the grace period ends without it reclaiming anything.
When handling reclaims, we allow the reclaim if and only if the client doing
the reclaim has a subdirectory.

This patch adds just the code to scan the recovery directory on nfsd startup.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
cb36d63457 [PATCH] knfsd: nfsd4: remove cb_parsed
The cb_parsed field is only used by probe_callback, to determine whether the
callback information has been filled in by setclientid.  But there is no way
that probe_callback() can be called without that having already happened, so
that check is superfluous, as is cb_parsed.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
3e9e3dbe0f [PATCH] knfsd: nfsd4: allow multiple lockowners
>From the language of rfc3530 section 8.1.3 (e.g., the suggestion that a
"process id" might be a reasonable lockowner value) it's conceivable that a
client might want to use the same lockowner string on multiple files, so we may
as well allow that.  We expect each use of open_to_lockowner to create a
distinct seqid stream, though.

For now we're also allowing multiple uses of open_to_lockowner with the same
open, though it seems unlikely clients would actually do that.

Also add a comment reminding myself of some very non-scalable data structures.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
ea1da636e9 [PATCH] knfsd: nfsd4: rename state list fields
Trivial renaming patch:

I can never remember, while looking at various lists relating the nfsd4 state
structures, which are the "heads" and which are items on other lists, or which
structures are actually on the various lists.  The following convention helps
me: given structures foo and bar, with foo containing the head of a list of
bars, use "bars" for the name of the head of the list contained in the struct
foo, and use "per_foo" for the entries in the struct bars.

Already done for struct nfs4_file; go ahead and do it for the other nfsd4
state structures.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
21ab45a480 [PATCH] knfsd: nfsd4: miscellaneous setclientid_confirm cleanup
Minor cleanup, remove some unnecessary printk's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
7c79f7377c [PATCH] knfsd: nfsd4: setclientid_confirm comments
Trivial whitespace and comment fixes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
08e8987c37 [PATCH] knfsd: nfsd4: setclientid_confirm gotoectomy
Change from "goto" to "else if" format in setclientid_confirm.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
22de4d8374 [PATCH] knfsd: nfsd4: fix setclientid_confirm error return
NFS4_INVAL is not a valid error for setclientid_confirm, and INUSE is the more
logical error here anyway.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
1a69c179a2 [PATCH] knfsd: nfsd4: fix setclientid_confirm cases
Setclientid_confirm code confused states 1 and 3 (numbering from the
IMPLEMENTATION section of rfc3530, section 14.2.33).  Fix this.

State 1 allows the client to change the callback channel on the fly.  We don't
implement this currently, so just turn off the callback channel in this case.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
31f4a6c127 [PATCH] knfsd: nfsd4: fix uncomfirmed list
Setclientid code assumes there is only one match in unconfirmed list.
Make sure that assumption holds.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
fd39ca9a80 [PATCH] knfsd: nfsd4: make needlessly global code static
This patch contains the following possible cleanups:

- make needlessly global code static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
a76b4319ca [PATCH] knfsd: nfsd4: grace period end
For the purposes of reboot recovery, we want to do some work during the
transition period at the end of the grace period.  Some of that work must be
guaranteed to have a certain relationship with the end of the grace period, so
we want to control the transition there.

Our approach is to modify the in_grace() checks to consult a global variable
instead of checking the time directly, to schedule the first run of the
laundromat thread at the end of the grace period, and to set the global
end-of-grace-period there.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
28ce6054f1 [PATCH] knfsd: nfsd4: add find_{un}conf_by_str functions to simplify setclientid
Minor setclientid cleanup

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
a55370a3c0 [PATCH] knfsd: nfsd4: reboot hash
For the purposes of reboot recovery we keep a directory with subdirectories
each having a name that is the ascii hex representation of the md5 sum of a
client identifier for an active client.

This adds the code to calculate that name.  We also use it for the purposes of
comparing clients, so if someone ever manages to find two client names that
are md5 collisions, then we'll return clid_inuse to the second.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
7dea9d280c [PATCH] knfsd: nfsd4: setclientid simplification
We can be a little more concise here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
bd0b1e954e [PATCH] knfsd: nfsd4: idmap initialization
Adopt standard kernel style by defining a no-op function instead of putting
ifdef's in the code where the function is called.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
707d4ab7b3 [PATCH] knfsd: nfsd4: remove nfs4_reclaim_init
nfs4_reclaim_init is no longer performing any useful function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
ac4d8ff2a5 [PATCH] knfsd: nfsd4: clean up state initialization
Separate out stuff that needs initialization on startup from stuff that only
needs initialization on module init from static data.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
76a3550ec5 [PATCH] knfsd: nfsd4: rename nfs4_state_init
Somewhat gratuitous rename to simplify following patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
7b190fecfa [PATCH] knfsd: nfsd4: delegation recovery
Allow recovery of delegations after reboot.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
d99a05adf8 [PATCH] knfsd: nfsd4: simplify lease changing
The only way the protocol gives to change the lease time on the fly is to
simulate a reboot.  We don't have that completely right in the current code;
among other things, we should probably put lockd in grace too while we do
this.

For now, let's just keep this simple, and wait till the next time nfsd starts
to register any changes in lease time.  If the administrator really wants to
change the lease time *now*, they can go ahead and bring nfsd down and then
back up again after changing the lease time.

Also remove the "if (reclaim_str_hashtbl_size == 0)" case, a shortcut which
skips the grace period if we know of no clients in need of recovery.  This
isn't going to work well with nlm.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
58da282b73 [PATCH] knfsd: nfsd4: create separate laundromat workqueue
We're running the laundromat work on the default kevent worker thread.  But
the laundromat takes the nfsv4 state semaphore, which is used for way too much
stuff, and the potential for deadlocks is high.  Better to have this on a
separate workqueue.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
dfc8356570 [PATCH] knfsd: nfsd4: nfs4_check_open_reclaim cleanup
Minor cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
5ba266d632 [PATCH] knfsd: nfsd4: fix probe_callback
rpc_create_client was modified recently to do its own (synchronous) NULL ping
of the server.  We'd rather do that on our own, asynchronously, so that we
don't have to block the nfsd thread doing the probe, and so that setclientid
handling (hence, client mounts) can proceed normally whether the callback is
succesful or not.  (We can still function fine without the callback
channel--we just won't be able to give out delegations till it's verified to
work.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
7e06b7f9e9 [PATCH] knfsd: nfs4: hold filp while reading or writing
We're trying to read and write from a struct file that we may not hold a
reference to any more (since a close could be processed as soon as we drop the
state lock).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
46be925fa6 [PATCH] knfsd: lockd: flush signals on shutdown
Silence another annoying "failed to contact portmap (errno -512)" on shutdown.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
13cd21845d [PATCH] nfsd4: reference count struct nfs4_file
Add a struct kref to each nfs4_file and take a reference to it from each
stateid and delegation that refers to it.  The atomicity guarantees are
overkill given that all this stuff is done under the single nfsd4 state lock,
but a) we'd like finer-grained locking some day, and b) this simplifies the
cleanup of the structures a bit, something that has previously been a bit
complicated and bug-prone.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
8beefa2493 [PATCH] nfsd4: rename nfs4_file fields
Trivial renaming patch:

I can never remember, while looking at various lists relating the nfsd4 state
structures, which are the "heads" and which are items on other lists, or which
structures are actually on the various lists.  The following convention helps
me: given structures foo and bar, with foo containing the head of a list of
bars, use "bars" for the name of the head of the list contained in the struct
foo, and use "per_foo" for the entries in the struct bars.

Go ahead and do this for struct nfs4_file.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
6fa305ded4 [PATCH] nfsd4: remove debugging counters
These remaining debugging counters haven't proved that useful.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
5b2d21c196 [PATCH] nfsd4: slabify delegations
Allocate delegations from a slab cache.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
5ac049ac66 [PATCH] nfsd4: slabify stateids
Allocate stateid's from a slab cache.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
e60d4398a7 [PATCH] nfsd4: slabify nfs4_files
The structures the server uses to keep track of various pieces of nfsv4 state
(open files, outstanding delegations, etc.) are likely to be allocated and
deallocated frequently and seem reasonable candidates for slab caches.

While we're at it, the slab code keeps statistics that help catch leaks and
such, so we may as well take this chance to eliminate some debugging counters
that we've been keeping ourselves.

Start with the struct nfs4_file.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
c815afc73e [PATCH] nfsd4: block metadata ops during grace period
We currently return err_grace if a user attempts a non-reclaim open during the
grace period.  But we also need to prevent renames and removes, at least, to
ensure clients have the chance to recover state on files before they are moved
or deleted.

Of course, local users could also do renames and removes during the lease
period, and there's not much we can do about that.  This at least will help
with remote users.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
496400014f [PATCH] nfsd4: fix fh_expire_type
We're returning NFS4_FH_NOEXPIRE_WITH_OPEN | NFS4_FH_VOL_RENAME for the
fh_expire_type attribute.  This is incorrect:
	1. The spec actually only allows NOEXPIRE_WITH_OPEN when
	   VOLATILE_ANY is also set.
	2. Filehandles for open files can expire, if the file is removed
	   and there is a reboot.
	3. Filehandles are only volatile on rename in the nosubtree check
	   case.

Unfortunately, there's no way to indicate that we only expire on remove.  So
our only choice is FH4_VOLATILE_ANY.  Although it's redundant, we also set
FH4_VOL_RENAME in the subtree check case, since subtreecheck does actually
cause problems in practice and it seems possibly useful to give clients some
way to distinguish that case.

Fix a mispelled #define while we're at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
0dd3c19212 [PATCH] nfsd4: support CLAIM_DELEGATE_CUR
Add OPEN claim type NFS4_OPEN_CLAIM_DELEGATE_CUR to nfsd4_open().

A delegation stateid and a name are provided.  OPEN with O_CREAT is not legal
with this claim type; otherwise, use the NFS4_OPEN_CLAIM_NULL code path to
lookup the filename to be opened.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
c44c5eeb2c [PATCH] nfsd4: add open state code for CLAIM_DELEGATE_CUR
State logic for OPEN with claim type CLAIM_DELEGATE_CUR, which the NFSv4
client uses to report local OPENs on a delegated file back to the NFSv4
server.

nfs4_check_deleg() performs input delegation stateid lookup and sanity check.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
567d98292e [PATCH] nfsd4: don't reopen for delegated client
We don't really need to be doing a separate open for every stateid.  And in
the case of an open from a client that already has a delegation on a file, it
unnecessarily results in a delegation recall.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
4a6e43e6d4 [PATCH] nfsd4: nfs4_check_delegmode
Additional minor code reshuffling to prepare for claim_deleg_cur support.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:27 -07:00
NeilBrown
52f4fb4306 [PATCH] nfsd4: find_delegation_file()
Factor out a bit of common code that will be useful elsewhere.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:27 -07:00
Jan Kara
bdd5b29c6b [PATCH] Make reiserfs BUG on too big transaction
Make reiserfs BUG() when somebody tries to start a larger transaction than
it's allowed (currently the code just silently deadlocks).

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:23 -07:00
Jan Kara
556a2a45bc [PATCH] quota: reiserfs: improve quota credit estimates
Use improved credits estimates for quota operations.  Also reserve space
for a quota operation in a transaction only if filesystem was mounted with
some quota option.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:20 -07:00
Jan Kara
1f54587bea [PATCH] quota: ext3: Improve quota credit estimates
Use improved credits estimates for quota operations.  Also reserve a space
for a quota operation in a transaction only if filesystem was mounted with
some quota options.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:20 -07:00
Jan Kara
bd6a1f16ff [PATCH] reiserfs: add checking of journal_begin() return value
Check return values of journal_begin() and journal_end() in the quota code
for reiserfs.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:19 -07:00
Christoph Hellwig
92198f7eaa [PATCH] pass iocb to dio_iodone_t
XFS will have to look at iocb->private to fix aio+dio.  No other filesystem
is using the blockdev_direct_IO* end_io callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:19 -07:00
Sonny Rao
f5f287738b JFS: performance patch
Basically, we saw a large amount of time spent in the
jfs_strfromUCS_le() function, mispredicting the branch inside the
loop, so I just added some unlikely modifiers to the if statements to
re-ordered the code.  Again, these simple changes provided > 2 % on
spec-sfs, so please consider it for inclusion.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-06-23 16:57:56 -05:00
Telemaque Ndizihiwe
fed2fc18a4 [PATCH] sys_open() cleanup
Clean up tortured logic in sys_open().

Signed-off-by: Telemaque Ndizihiwe <telendiz@eircom.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:36 -07:00
Benjamin LaHaise
63e6880918 [PATCH] aio: fix do_sync_(read|write) to properly handle aio retries
When do_sync_(read|write) encounters an aio method that makes use of the
retry mechanism, they fail to correctly retry the operation.  This fixes
that by adding the appropriate sleep and retry mechanism.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:34 -07:00
Anton Altaparmakov
152becd26e [PATCH] Bug in error recovery in fs/buffer.c::__block_prepare_write()
fs/buffer.c::__block_prepare_write() has broken error recovery.  It calls
the get_block() callback with "create = 1" and if that succeeds it
immediately clears buffer_new on the just allocated buffer (which has
buffer_new set).

The bug is that if an error occurs and get_block() returns != 0, we break
from this loop and go into recovery code.  This code has this comment:

/* Error case: */
/*
 * Zero out any newly allocated blocks to avoid exposing stale
 * data.  If BH_New is set, we know that the block was newly
 * allocated in the above loop.
 */

So the intent is obviously good in that it wants to clear just allocated
and hence not zeroed buffers.  However the code recognises allocated
buffers by checking for buffer_new being set.

Unfortunately __block_prepare_write() as discussed above already cleared
buffer_new on all allocated buffers thus no buffers will be cleared during
error recovery and old data will be leaked.

The simplest way I can see to fix this is to make the current recovery code
work by _not_ clearing buffer_new after calling get_block() in
__block_prepare_write().

We cannot safely allow buffer_new buffers to "leak out" of
__block_prepare_write(), thus we simply do a quick loop over the buffers
clearing buffer_new on each of them if it is set just before returning
"success" from __block_prepare_write().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:34 -07:00
Christoph Hellwig
9a59f452ab [PATCH] remove <linux/xattr_acl.h>
This file duplicates <linux/posix_acl_xattr.h>, using slightly different
names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
Christoph Lameter
45778ca819 [PATCH] Remove f_error field from struct file
The following patch removes the f_error field and all checks of f_error.

Trond said:

  f_error was introduced for NFS, and made sense when we were guaranteed
  always to have a file pointer around when write errors occurred.  Since
  then, we have (for various reasons) had to introduce the nfs_open_context in
  order to track the file read/write state, and it made sense to move our
  f_error tracking there too.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
Arnd Bergmann
bb93e3a52f [PATCH] block: add unlocked_ioctl support for block devices
This patch allows block device drivers to convert their ioctl functions to
unlocked_ioctl() like character devices and other subsystems.  All
functions that were called with the BKL held before are still used that
way, but I would not be surprised if it could be removed from the ioctl
functions in drivers/block/ioctl.c themselves.

As a side note, I found that compat_blkdev_ioctl() acquires the BKL as
well, which looks like a bug.  I have checked that every user of
disk->fops->compat_ioctl() in the current git tree gets the BKL itself, so
it could easily be removed from compat_blkdev_ioctl().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:32 -07:00
Pekka Enberg
b030a4dd60 [PATCH] Remove eventpoll macro obfuscation
This patch gets rid of some macro obfuscation from fs/eventpoll.c by
removing slab allocator wrappers and converting macros to static inline
functions.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:30 -07:00
Oleg Nesterov
dfb388bf8a [PATCH] factor out common code in sys_fsync/sys_fdatasync
This patch consolidates sys_fsync and sys_fdatasync.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:29 -07:00
Yoav Zach
ef3daeda7b [PATCH] Don't force O_LARGEFILE for 32 bit processes on ia64
In ia64 kernel, the O_LARGEFILE flag is forced when opening a file.  This
is problematic for execution of 32 bit processes, which are not largefile
aware, either by SW emulation or by HW execution.

For such processes, the problem is two-fold:

1) When trying to open a file that is larger than 4G
   the operation should fail, but it's not
2) Writing to offset larger than 4G should fail, but
   it's not

The proposed patch takes advantage of the way 32 bit processes are
identified in ia64 systems.  Such processes have PER_LINUX32 for their
personality.  With the patch, the ia64 kernel will not enforce the
O_LARGEFILE flag if the current process has PER_LINUX32 set.  The behavior
for all other architectures remains unchanged.

Signed-off-by: Yoav Zach <yoav.zach@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:28 -07:00
Kirill Korotaev
618f06362a [PATCH] O(1) sb list traversing on syncs
This patch removes O(n^2) super block loops in sync_inodes(),
sync_filesystems() etc.  in favour of using __put_super_and_need_restart()
which I introduced earlier.  We faced a noticably long freezes on sb
syncing when there are thousands of super blocks in the system.

Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:27 -07:00
Kirill Korotaev
af4d2ecbf0 [PATCH] Fix of bogus file max limit messages
This patch fixes incorrect and bogus kernel messages that file-max limit
reached when the allocation fails

Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-Off-By: Denis Lunev <den@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Christoph Hellwig
c663e5d80e [PATCH] add some comments to lookup_create()
In a duplicate of lookup_create in the af_unix code Al commented what's
going on nicely, so let's bring that over to lookup_create before the copy
is going away (I'll send a patch soon)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Andreas Dilger
acfa1823d3 [PATCH] Support for dx directories in ext3_get_parent (NFSD)
Henrik Grubbstrom noted:

The 2.6.10 ext3_get_parent attempts to use ext3_find_entry to look up the
entry "..", which fails for dx directories since ".." is not present in the
directory hash table.  The patch below solves this by looking up the dotdot
entry in the dx_root block.

Typical symptoms of the above bug are intermittent claims by nfsd that
files or directories are missing on exported ext3 filesystems.

cf https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D150759 and
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D144556

ext3_get_parent() is IMHO the wrong place to fix this bug as it introduces
a lot of internals from htree into that function.  Instead, I think this
should be fixed in ext3_find_entry() as in the below patch.  This has the
added advantage that it works for any callers of ext3_find_entry() and not
just ext3_lookup_parent().

Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Henrik Grubbstrom <grubba@grubba.org>
Cc: <ext2-devel@lists.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Alan Cox
d6e7114481 [PATCH] setuid core dump
Add a new `suid_dumpable' sysctl:

This value can be used to query and set the core dump mode for setuid
or otherwise protected/tainted binaries. The modes are

0 - (default) - traditional behaviour.  Any process which has changed
    privilege levels or is execute only will not be dumped

1 - (debug) - all processes dump core when possible.  The core dump is
    owned by the current user and no security is applied.  This is intended
    for system debugging situations only.  Ptrace is unchecked.

2 - (suidsafe) - any binary which normally would not be dumped is dumped
    readable by root only.  This allows the end user to remove such a dump but
    not access it directly.  For security reasons core dumps in this mode will
    not overwrite one another or other files.  This mode is appropriate when
    adminstrators are attempting to debug problems in a normal environment.

(akpm:

> > +EXPORT_SYMBOL(suid_dumpable);
>
> EXPORT_SYMBOL_GPL?

No problem to me.

> >  	if (current->euid == current->uid && current->egid == current->gid)
> >  		current->mm->dumpable = 1;
>
> Should this be SUID_DUMP_USER?

Actually the feedback I had from last time was that the SUID_ defines
should go because its clearer to follow the numbers. They can go
everywhere (and there are lots of places where dumpable is tested/used
as a bool in untouched code)

> Maybe this should be renamed to `dump_policy' or something.  Doing that
> would help us catch any code which isn't using the #defines, too.

Fair comment. The patch was designed to be easy to maintain for Red Hat
rather than for merging. Changing that field would create a gigantic
diff because it is used all over the place.

)

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Christoph Hellwig
2fa389c5eb [PATCH] quota: sanitize dentry handling in vfs_quota_on_mount
Use lookup_one_len instead of opencoding a simplified lookup using
lookup_hash with a fake hash.

Also there's no need anymore for the d_invalidate as we have a completely
valid dentry now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Christoph Hellwig
84de856ed3 [PATCH] quota: consolidate code surrounding vfs_quota_on_mount
Move some code duplicated in both callers into vfs_quota_on_mount

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Christoph Hellwig
5f45f1a78f [PATCH] remove duplicate get_dentry functions in various places
Various filesystem drivers have grown a get_dentry() function that's a
duplicate of lookup_one_len, except that it doesn't take a maximum length
argument and doesn't check for \0 or / in the passed in filename.

Switch all these places to use lookup_one_len.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Greg KH <greg@kroah.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Neil Horman
ac20427ef6 [PATCH] add check to /proc/devices read routines
Patch to add check to get_chrdev_list and get_blkdev_list to prevent reads
of /proc/devices from spilling over the provided page if more than 4096
bytes of string data are generated from all the registered character and
block devices in a system

Signed-off-by: Neil Horman <nhorman@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:19 -07:00
Alexander Viro
991114c6fa [PATCH] fix for prune_icache()/forced final iput() races
Based on analysis and a patch from Russ Weight <rweight@us.ibm.com>

There is a race condition that can occur if an inode is allocated and then
released (using iput) during the ->fill_super functions.  The race
condition is between kswapd and mount.

For most filesystems this can only happen in an error path when kswapd is
running concurrently.  For isofs, however, the error can occur in a more
common code path (which is how the bug was found).

The logic here is "we want final iput() to free inode *now* instead of
letting it sit in cache if fs is going down or had not quite come up".  The
problem is with kswapd seeing such inodes in the middle of being killed and
happily taking over.

The clean solution would be to tell kswapd to leave those inodes alone and
let our final iput deal with them.  I.e.  add a new flag
(I_FORCED_FREEING), set it before write_inode_now() there and make
prune_icache() leave those alone.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:17 -07:00
Anton Altaparmakov
3357d4c75f Automatic merge with /usr/src/ntfs-2.6.git. 2005-06-23 11:26:22 +01:00
Trond Myklebust
eadf4598e7 [PATCH] NFS: Add debugging code to NFSv4 readdir
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Manoj Naik
6ebf3656fd [PATCH] NFSv4: Map a couple of NFSv4 errors to EINVAL.
This shows up on running tar over NFSv4.

 Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Manoj Naik
97d312d037 [PATCH] NFSv4: add support for rdattr_error in NFSv4 readdir requests.
Request RDATTR_ERROR as an attribute in readdir to distinguish between a
 directory being within an absent filesystem or one (or more) of its entries.

 Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:43 -04:00