Commit Graph

792 Commits

Author SHA1 Message Date
Dave Kleikamp
00be3e7e5c JFS: Remove bogus WARN_ON statement and some dead code
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-14 15:15:39 -05:00
Paolo 'Blaisorblade' Giarrusso
a0d43df931 [PATCH] uml: hostfs: unuse ROOT_DEV
Minimal patch removing uses of ROOT_DEV; next patch unexports it.  I've
opposed this, but I've planned to reintroduce the functionality without using
ROOT_DEV.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Paolo 'Blaisorblade' Giarrusso
8e0a218124 [PATCH] uml: fix hppfs error path
Fix the error message to refer to the error code, i.e.  err, not count, plus
add some cosmetical fixes.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Anton Altaparmakov
c514720716 Automatic merge with /usr/src/ntfs-2.6.git. 2005-07-13 23:09:23 +01:00
Linus Torvalds
3720bd8b1e Merge master.kernel.org:/pub/scm/linux/kernel/git/tglx/mtd-2.6 2005-07-13 12:19:30 -07:00
Steve Dickson
7ee91ec14b [PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64
Allow the setting of NLM timeouts and grace periods through the proc and
sysclt interfaces on x86_64 architectures

Signed-off-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Anton Altaparmakov
88bd5121d6 [PATCH] Fix soft lockup due to NTFS: VFS part and explanation
Something has changed in the core kernel such that we now get concurrent
inode write outs, one e.g via pdflush and one via sys_sync or whatever.
This causes a nasty deadlock in ntfs.  The only clean solution
unfortunately requires a minor vfs api extension.

First the deadlock analysis:

Prerequisive knowledge: NTFS has a file $MFT (inode 0) loaded at mount
time.  The NTFS driver uses the page cache for storing the file contents as
usual.  More interestingly this file contains the table of on-disk inodes
as a sequence of MFT_RECORDs.  Thus NTFS driver accesses the on-disk inodes
by accessing the MFT_RECORDs in the page cache pages of the loaded inode
$MFT.

The situation: VFS inode X on a mounted ntfs volume is dirty.  For same
inode X, the ntfs_inode is dirty and thus corresponding on-disk inode,
which is as explained above in a dirty PAGE_CACHE_PAGE belonging to the
table of inodes ($MFT, inode 0).

What happens:

Process 1: sys_sync()/umount()/whatever...  calls __sync_single_inode() for
$MFT -> do_writepages() -> write_page for the dirty page containing the
on-disk inode X, the page is now locked -> ntfs_write_mst_block() which
clears PageUptodate() on the page to prevent anyone else getting hold of it
whilst it does the write out (this is necessary as the on-disk inode needs
"fixups" applied before the write to disk which are removed again after the
write and PageUptodate is then set again).  It then analyses the page
looking for dirty on-disk inodes and when it finds one it calls
ntfs_may_write_mft_record() to see if it is safe to write this on-disk
inode.  This then calls ilookup5() to check if the corresponding VFS inode
is in icache().  This in turn calls ifind() which waits on the inode lock
via wait_on_inode whilst holding the global inode_lock.

Process 2: pdflush results in a call to __sync_single_inode for the same
VFS inode X on the ntfs volume.  This locks the inode (I_LOCK) then calls
write-inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() of
the page (in page cache of table of inodes $MFT, inode 0) containing the
on-disk inode.  This page has PageUptodate() clear because of Process 1
(see above) so read_cache_page() blocks when tries to take the page lock
for the page so it can call ntfs_read_page().

Thus Process 1 is holding the page lock on the page containing the on-disk
inode X and it is waiting on the inode X to be unlocked in ifind() so it
can write the page out and then unlock the page.

And Process 2 is holding the inode lock on inode X and is waiting for the
page to be unlocked so it can call ntfs_readpage() or discover that
Process 1 set PageUptodate() again and use the page.

Thus we have a deadlock due to ifind() waiting on the inode lock.

The only sensible solution: NTFS does not care whether the VFS inode is
locked or not when it calls ilookup5() (it doesn't use the VFS inode at
all, it just uses it to find the corresponding ntfs_inode which is of
course attached to the VFS inode (both are one single struct); and it uses
the ntfs_inode which is subject to its own locking so I_LOCK is irrelevant)
hence we want a modified ilookup5_nowait() which is the same as ilookup5()
but it does not wait on the inode lock.

Without such functionality I would have to keep my own ntfs_inode cache in
the NTFS driver just so I can find ntfs_inodes independent of their VFS
inodes which would be slow, memory and cpu cycle wasting, and incredibly
stupid given the icache already exists in the VFS.

Below is a patch that does the ilookup5_nowait() implementation in
fs/inode.c and exports it.

ilookup5_nowait.diff:

Introduce ilookup5_nowait() which is basically the same as ilookup5() but
it does not wait on the inode's lock (i.e. it omits the wait_on_inode()
done in ifind()).

This is needed to avoid a nasty deadlock in NTFS.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Robert Love
9a556e8908 [PATCH] inotify: misc cleanup
Really simple, basic cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Robert Love
0399cb08c5 [PATCH] inotify: move sysctl
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from
"/proc/sys/fs".  Also some related cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Ian Dall
59192ed9e7 JFS: Need to be root to create files with security context
It turns out this is due to some inverted logic in xattr.c

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 09:15:18 -05:00
Dave Kleikamp
6211502d7e JFS: Allow security.* xattrs to be set on symlinks
All of the different xattr namespaces have different rules.
user.* and ACL's are not allowed on symlinks, and since these were the
first xattrs implemented, I assumed there was no need to support xattrs
on symlinks.  This one-line patch should fix it.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 09:07:53 -05:00
Dave Kleikamp
f7f24758ac Merge with /home/shaggy/git/linus-clean/
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-07-13 08:57:38 -05:00
Thomas Gleixner
1b3035b7fc Merge with rsync://fileserver/linux 2005-07-13 10:45:00 +02:00
Robert Love
0eeca28300 [PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

        * dnotify requires the opening of one fd per each directory
          that you intend to watch. This quickly results in too many
          open files and pins removable media, preventing unmount.
        * dnotify is directory-based. You only learn about changes to
          directories. Sure, a change to a file in a directory affects
          the directory, but you are then forced to keep a cache of
          stat structures.
        * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

        * inotify's interface is a system call that returns a fd, not SIGIO.
	  You get a single fd, which is select()-able.
        * inotify has an event that says "the filesystem that the item
          you were watching is on was unmounted."
        * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

See Documentation/filesystems/inotify.txt.

Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:38:38 -07:00
Linus Torvalds
bd4c625c06 reiserfs: run scripts/Lindent on reiserfs code
This was a pure indentation change, using:

	scripts/Lindent fs/reiserfs/*.c include/linux/reiserfs_*.h

to make reiserfs match the regular Linux indentation style.  As Jeff
Mahoney <jeffm@suse.com> writes:

 The ReiserFS code is a mix of a number of different coding styles, sometimes
 different even from line-to-line. Since the code has been relatively stable
 for quite some time and there are few outstanding patches to be applied, it
 is time to reformat the code to conform to the Linux style standard outlined
 in Documentation/CodingStyle.

 This patch contains the result of running scripts/Lindent against
 fs/reiserfs/*.c and include/linux/reiserfs_*.h. There are places where the
 code can be made to look better, but I'd rather keep those patches separate
 so that there isn't a subtle by-hand hand accident in the middle of a huge
 patch. To be clear: This patch is reformatting *only*.

 A number of patches may follow that continue to make the code more consistent
 with the Linux coding style.

 Hans wasn't particularly enthusiastic about these patches, but said he
 wouldn't really oppose them either.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:21:28 -07:00
Jeff Mahoney
7fa94c8868 [PATCH] reiserfs: fix up case where indent misreads the code
indent(1) doesn't know how to handle the "do not compile" error. It results
 in the item_ops array declaration being indented a tab stop in when it should
 not be. This patch replaces it with a #error that describes why it's failing.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:22:35 -07:00
Brian King
7da6844cf7 [PATCH] cdev: cdev_put oops
While fixing an oops in the st driver in a dirty release path, I
encountered an oops in cdev_put for cdevs allocated using cdev_alloc.  If
cdev_del is called when the cdev kobject still has an open user, when the
last cdev_put is called, the cdev_put will call kobject_put, which will end
up ultimately releasing the cdev in cdev_dynamic_release.  Patch fixes the
oops by preventing cdev_put from accessing freed memory.

Signed-off-by: Brian King <brking@us.ibm.com>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:02 -07:00
Jan Kara
50a5223428 [PATCH] ext2: fix mount options parting
Restore old set of ext2 mount options when remounting of a filesystem
fails.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Jan Kara
08c6a96fd7 [PATCH] ext3: fix options parsing
Fix a problem with ext3 mount option parsing.  When remount of a filesystem
fails, old options are now restored.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Roland McGrath
5323125031 [PATCH] reset real_timer target on exec leader change
When a noninitial thread does exec, it becomes the new group leader.  If
there is a ITIMER_REAL timer running, it points at the old group leader and
when it fires it can follow a stale pointer.  The timer data needs to be
reset to point at the exec'ing thread that is becoming the group leader.
This has to synchronize with any concurrent firing of the timer to make
sure that it_real_fn can never run when the data points to a thread that
might have been reaped already.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:01:01 -07:00
Artem B. Bityuckiy
4120db4719 [PATCH] bugfix: two read_inode() calls without clear_inode() call between
Bug symptoms
~~~~~~~~~~~~
For the same inode VFS calls read_inode() twice and doesn't call
clear_inode() between the two read_inode() invocations.

Bug description
~~~~~~~~~~~~~~~
Suppose we have an inode which has zero reference count but is still in
the inode cache. Suppose kswapd invokes shrink_icache_memory() to free
some RAM. In prune_icache() inodes are removed from i_hash. prune_icache
() is then going to call clear_inode(), but drops the inode_lock
spinlock before this. If in this moment another task calls iget() for an
inode which was just removed from i_hash by prune_icache(), then iget()
invokes read_inode() for this inode, because it is *already removed*
from i_hash.

The end result is: we call iget(#N) then iput(#N); inode #N has zero
i_count now and is in the inode cache; kswapd starts. kswapd removes the
inode #N from i_hash ans is preempted; we call iget(#N) again;
read_inode() is invoked as the result; but we expect clear_inode()
before.

Fix
~~~~~~~
To fix the bug I remove inodes from i_hash later, when clear_inode() is
actually called. I remove them from i_hash under spinlock protection.
Since the i_state is set to I_FREEING, it is safe to do this. The others
will sleep waiting for the inode state change.

I also postpone removing inodes from i_sb_list. It is not compulsory to
do so but I do it for readability reasons. Inodes are added/removed to
the lists together everywhere in the code and there is no point to
change this rule. This is harmless because the only user of i_sb_list
which somehow may interfere with me (invalidate_list()) is excluded by
the iprune_sem mutex.

The same race is possible in invalidate_list() so I do the same for it.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:00:59 -07:00
Miklos Szeredi
168a9fd6a1 [PATCH] __wait_on_freeing_inode fix
This patch fixes queer behavior in __wait_on_freeing_inode().

If I_LOCK was not set it called yield(), effectively busy waiting for the
removal of the inode from the hash.  This change was introduced within
"[PATCH] eliminate inode waitqueue hashtable" Changeset 1.1938.166.16 last
october by wli.

The solution is to restore the old behavior, of unconditionally waiting on
the waitqueue.  It doesn't matter if I_LOCK is not set initally, the task
will go to sleep, and wake up when wake_up_inode() is called from
generic_delete_inode() after removing the inode from the hash chain.

Comment is also updated to better reflect current behavior.

This condition is very hard to trigger normally (simultaneous clear_inode()
with iget()) so probably only heavy stress testing can reveal any change of
behavior.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 16:00:59 -07:00
Todd Poynor
a98a5d04f4 Merge with rsync://fileserver/linux 2005-07-13 00:58:44 +02:00
Todd Poynor
751382dd5c [JFFS2] Avoid compiler warnings when JFFS2_FS_WRITEBUFFER=n
Signed-off-by: Todd Poynor <tpoynor@mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-13 00:03:19 +02:00
Artem B. Bityuckiy
b62205986a [JFFS2] Init locks early during mount
In case of a mount error locks might be uninitialized but
accessed by the resulting call to jffs2_kill_sb().

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-13 00:02:33 +02:00
Artem B. Bityuckiy
e4fef66189 [JFFS2] Rename function and update comments
We recently changed the method of collecting and sorting of
tmp_dnode objects to use a temporary RB-tree instead of a
temporary list. Rename function and update comments.

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:58:26 +02:00
Artem B. Bityuckiy
86ffc0d5f5 [JFFS2] Remove needless variable initialization
Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:56:40 +02:00
Artem B. Bityuckiy
336d2ff711 [JFFS2] Avoid alloc/dealloc for zero sized nodes
Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-12 23:55:05 +02:00
Linus Torvalds
200d481f28 Merge master.kernel.org:/pub/scm/linux/kernel/git/tglx/mtd-2.6 2005-07-11 10:18:18 -07:00
NeilBrown
e34ac862ee [PATCH] nfsd4: fix fh_expire_type
After discussion at the recent NFSv4 bake-a-thon, I realized that my
assumption that NFS4_FH_PERSISTENT required filehandles to persist was a
misreading of the spec.  This also fixes an interoperability problem with the
Solaris client.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:11 -07:00
NeilBrown
4c4cd222ee [PATCH] nfsd4: check lock type against openmode.
We shouldn't be allowing, e.g., write locks on files not open for read.  To
enforce this, we add a pointer from the lock stateid back to the open stateid
it came from, so that the check will continue to be correct even after the
open is upgraded or downgraded.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:11 -07:00
NeilBrown
3a4f98bbf4 [PATCH] nfsd4: clean up nfs4_preprocess_seqid_op
As long as we're here, do some miscellaneous cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:10 -07:00
NeilBrown
f8816512fc [PATCH] nfsd4: clarify close_lru handling
The handling of close_lru in preprocess_stateid_op was a source of some
confusion here recently.  Try to make the logic a little clearer, by renaming
find_openstateowner_id to make its purpose clearer and untangling some
unnecessarily complicated goto's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:10 -07:00
NeilBrown
52fd004e29 [PATCH] nfsd4: renew lease on seqid modifying operations
nfs4_preprocess_seqid_op is called by NFSv4 operations that imply an implicit
renewal of the client lease.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
b700949b78 [PATCH] nfsd4: return better error on io incompatible with open mode
from RFC 3530:
"Share reservations are established by OPEN operations and by their
nature are mandatory in that when the OPEN denies READ or WRITE
operations, that denial results in such operations being rejected
with error NFS4ERR_LOCKED."

(Note that share_denied is really only a legal error for OPEN.)

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
444c2c07c2 [PATCH] nfsd4: always update stateid on open
An OPEN from the same client/open stateowner requires a stateid update because
of the share/deny access update.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
e66770cd7b [PATCH] nfsd4: relax new lock seqid check
We're insisting that the lock sequence id field passed in the
open_to_lockowner struct always be zero.  This is probably thanks to the
sentence in rfc3530: "The first request issued for any given lock_owner is
issued with a sequence number of zero."

But there doesn't seem to be any problem with allowing initial sequence
numbers other than zero.  And currently this is causing lock reclaims from the
Linux client to fail.

In the spirit of "be liberal in what you accept, conservative in what you
send", we'll relax the check (and patch the Linux client as well).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
7fb64cee34 [PATCH] nfsd4: seqid comments
Add some comments on the use of so_seqid, in an attempt to avoid some of the
confusion outlined in the previous patch....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
bd9aac523b [PATCH] nfsd4: fix open_reclaim seqid
The sequence number we store in the sequence id is the last one we received
from the client.  So on the next operation we'll check that the client gives
us the next higher number.

We increment sequence id's at the last moment, in encode, so that we're sure
of knowing the right error return.  (The decision to increment the sequence id
depends on the exact error returned.)

However on the *first* use of a sequence number, if we set the sequence number
to the one received from the client and then let the increment happen on
encode, we'll be left with a sequence number one to high.

For that reason, ENCODE_SEQID_OP_TAIL only increments the sequence id on
*confirmed* stateowners.

This creates a problem for open reclaims, which are confirmed on first use.
Therefore the open reclaim code, as a special exception, *decrements* the
sequence id, cancelling out the undesired increment on encode.  But this
prevents the sequence id from ever being incremented in the case where
multiple reclaims are sent with the same openowner.  Yuch!

We could add another exception to the open reclaim code, decrementing the
sequence id only if this is the first use of the open owner.

But it's simpler by far to modify the meaning of the op_seqid field: instead
of representing the previous value sent by the client, we take op_seqid, after
encoding, to represent the *next* sequence id that we expect from the client.
This eliminates the need for special-case handling of the first use of a
stateowner.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:09 -07:00
NeilBrown
893f87701c [PATCH] nfsd4: comment indentation
Yeah, it's trivial, but this drives me up the wall....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
3751517731 [PATCH] nfsd4: stop overusing RECLAIM_BAD
A misreading of the spec lead us to convert all errors on open and lock
reclaims to RECLAIM_BAD.  This causes problems--for example, a reboot within
the grace period could lead to reclaims with stale stateid's, and we'd like to
return STALE errors in those cases.

What rfc3530 actually says about RECLAIM_BAD: "The reclaim provided by the
client does not match any of the server's state consistency checks and is
bad." I'm assuming that "state consistency checks" refers to checks for
consistency with the state recorded to stable storage, and that the error
should be reserved for that case.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
0dd395dc76 [PATCH] nfsd4: ERR_GRACE should bump seqid on lock
A GRACE or NOGRACE response to a lock request should also bump the sequence
id.  So we delay the handling of grace period errors till after we've found
the relevant owner.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
b648330a1d [PATCH] nfsd4: ERR_GRACE should bump seqid on open
The GRACE and NOGRACE errors should bump the sequence id on open.  So we delay
the handling of these errors until nfsd4_process_open2, at which point we've
set the open owner, so the encode routine will be able to bump the sequence
id.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
0fa822e452 [PATCH] nfsd4: fix release_lockowner
We oops in list_for_each_entry(), because release_stateowner frees something
on the list we're traversing.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
67be431350 [PATCH] nfsd4: prevent multiple unlinks of recovery directories
Make sure we don't try to delete client recovery directories multiple times;
fixes some spurious error messages.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
cdc5524e8a [PATCH] nfsd4: lookup_one_len takes i_sem
Oops, this lookup_one_len needs the i_sem.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
NeilBrown
a6ccbbb886 [PATCH] nfsd4: fix sync'ing of recovery directory
We need to fsync the recovery directory after writing to it, but we weren't
doing this correctly.  (For example, we weren't taking the i_sem when calling
->fsync().)

Just reuse the existing nfsd fsync code instead.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
NeilBrown
463090294e [PATCH] nfsd4: reboot recovery fix
We need to remove the recovery directory here too.  (This chunk just got lost
somehow in the process of commuting the reboot recovery patches past the other
patches.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:07 -07:00
Miklos Szeredi
751c404b8f [PATCH] namespace: rename _mntput to mntput_no_expire
This patch renames _mntput() to something a little more descriptive:
mntput_no_expire().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
55e700b924 [PATCH] namespace: rename mnt_fslink to mnt_expire
This patch renames vfsmount->mnt_fslink to something a little more
descriptive: vfsmount->mnt_expire.

Signed-off-by: Mike Waychison <michael.waychison@sun.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
732dbef606 [PATCH] dcookies.c: use proper refcounting functions
Dcookies shouldn't play with the internals of dentry and vfsmnt
refcounting.  It defeats grepping, and is prone to break if implementation
details change.

In addition the function doesn't even seem to be performance critical: it
calls kmem_cache_alloc().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
484e389c63 [PATCH] set mnt_namespace in the correct place
This patch sets ->mnt_namespace where it's actually added to the
namespace.

Previously mnt_namespace was set in do_kern_mount() even if the filesystem
was never added to any process's namespace (most kernel-internal
filesystems).

This discrepancy doesn't actually cause any problems, but it's cleaner if
mnt_namespace is NULL for these non exported filesystems.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:52 -07:00
Miklos Szeredi
ac0811538b [PATCH] namespace.c: fix mnt_namespace zeroing for expired mounts
This patch clears mnt_namespace in an expired mount.

If mnt_namespace is not cleared, it's possible to attach a new mount to the
already detached mount, because check_mnt() can return true.

The effect is a resource leak, since the resulting tree will never be
freed.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
ed42c879b7 [PATCH] namespace.c: fix expiring of detached mount
This patch fixes a bug noticed by Al Viro:

   However, we still have a problem here - just what would
   happen if vfsmount is detached while we were grabbing namespace
   semaphore?  Refcount alone is not useful here - we might be held by
   whoever had detached the vfsmount.  IOW, we should check that it's
   still attached (i.e. that mnt->mnt_parent != mnt).  If it's not -
   just leave it alone, do mntput() and let whoever holds it deal with
   the sucker.  No need to put it back on lists.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
24ca2af1e7 [PATCH] namespace.c: split mark_mounts_for_expiry()
This patch splits the mark_mounts_for_expiry() function.  It's too complex and
too deeply nested, even without the bugfix in the following patch.

Otherwise code is completely the same.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
a4d7027861 [PATCH] namespace.c: cleanup in mark_mounts_for_expiry()
This patch simplifies mark_mounts_for_expiry() by using detach_mnt() instead
of duplicating everything it does.

It should be an equivalent transformation except for righting the dput/mntput
order.

Al Viro said: "Looks sane".

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
1ce88cf466 [PATCH] namespace.c: fix race in mark_mounts_for_expiry()
This patch fixes a race found by Ram in mark_mounts_for_expiry() in
fs/namespace.c.

The bug can only be triggered with simultaneous exiting of a process having
a private namespace, and expiry of a mount from within that namespace.
It's practically impossible to trigger, and I haven't even tried.  But
still, a bug is a bug.

The race happens when put_namespace() is called by another task, while
mark_mounts_for_expiry() is between atomic_read() and get_namespace().  In
that case get_namespace() will be called on an already dead namespace with
unforeseeable results.

The solution was suggested by Al Viro, with his own words:

      Instead of screwing with atomic_read() in there, why don't we
      simply do the following:
      	a) atomic_dec_and_lock() in put_namespace()
      	b) __put_namespace() called without dropping lock
      	c) the first thing done by __put_namespace would be
      struct vfsmount *root = namespace->root;
      namespace->root = NULL;
      spin_unlock(...);
      ....
      umount_tree(root);
      ...
      	d) check in mark_... would be simply namespace && namespace->root.

      And we are all set; no screwing around with atomic_read(), no magic
      at all.  Dying namespace gets NULL ->root.
      All changes of ->root happen under spinlock.
      If under a spinlock we see non-NULL ->mnt_namespace, it won't be
      freed until we drop the lock (we will set ->mnt_namespace to NULL
      under that lock before we get to freeing namespace).
      If under a spinlock we see non-NULL ->mnt_namespace and
      ->mnt_namespace->root, we can grab a reference to namespace and be
      sure that it won't go away.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
Miklos Szeredi
202322e6f7 [PATCH] namespace.c: fix mnt_namespace clearing
This patch clears mnt_namespace on unmount.

Not clearing mnt_namespace has two effects:

   1) It is possible to attach a new mount to a detached mount,
      because check_mnt() returns true.

      This means, that when no other references to the detached mount
      remain, it still can't be freed.  This causes a resource leak,
      and possibly un-removable modules.

   2) If mnt_namespace is dereferenced (only in mark_mounts_for_expiry())
      after the namspace has been freed, it can cause an Oops, memory
      corruption, etc.

1) has been tested before and after the patch, 2) is only speculation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:51 -07:00
KAMBAROV, ZAUR
7eaae2828d [PATCH] coverity: fs/locks.c flp null check
We're dereferencing `flp' and then we're testing it for NULLness.

Either the compiler accidentally saved us or the existing null-pointer checdk
is redundant.

This defect was found automatically by Coverity Prevent, a static analysis tool.

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:47 -07:00
Ian Kent
682d4fc931 [PATCH] autofs4: mistake in debug print
Fix debugging printk.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Andreas Gruenbacher
ff87b37da9 [PATCH] ext3 xattr: Don't write to the in-inode xattr space of reserved inodes
We are not using the in-inode space for xattrs in reserved inodes because
mkfs.ext3 doesn't initialize it properly.  For those inodes, we set
i_extra_isize to 0.  Make sure that we also don't overwrite the
i_extra_isize field when writing out the inode in that case.  This is for
cleanliness only, and doesn't fix an actual bug.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Christoph Lameter
6c036527a6 [PATCH] mostly_read data section
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.

If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated.  In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.

The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Andreas Gruenbacher
b84c21572d [PATCH] acl kconfig cleanup
Original patch from Matt Mackall <mpm@selenic.com>

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:45 -07:00
Nick Piggin
a39722034a [PATCH] page_uptodate locking scalability
Use a bit spin lock in the first buffer of the page to synchronise asynch
IO buffer completions, instead of the global page_uptodate_lock, which is
showing some scalabilty problems.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:45 -07:00
Paolo 'Blaisorblade' Giarrusso
3f580470ba [PATCH] uml: restore hppfs support
Some time ago a trivial patch broke HPPFS (one var became a pointer, not
all uses were updated).  It wasn't fixed at that time because not very
used, now it's been requested so I've fixed this, and it has been tested
positively (at least partially).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:44 -07:00
Anton Blanchard
cf36680887 [PATCH] move ioprio syscalls into syscalls.h
- Make ioprio syscalls return long, like set/getpriority syscalls.
- Move function prototypes into syscalls.h so we can pick them up in the
  32/64bit compat code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:37 -07:00
Mark Fasheh
cb2c023375 [PATCH] export generic_drop_inode() to modules
OCFS2 wants to mark an inode which has been orphaned by another node so
that during final iput it takes the correct path through the VFS and can
pass through the OCFS2 delete_inode callback.  Since i_nlink can get out of
date with other nodes, the best way I see to accomplish this is by clearing
i_nlink on those inodes at drop_inode time.  Other than this small amount
of work, nothing different needs to happen, so I think it would be cleanest
to be able to just call generic_drop_inode at the end of the OCFS2
drop_inode callback.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:35 -07:00
Artem B. Bityuckiy
b3539219c9 Merge with rsync://fileserver/linux
Update to 2.6.12-rc3
2005-07-06 19:40:38 +02:00
Artem B. Bityuckiy
6430a8def1 [JFFS2] Simplify the tree insert code.
It isn't _normal_ that we allow key collision in rbtrees, 
but it does not matter as long as the two nodes with the same
version are together.

Signed-off-by: Artem B. Bityuckiy <dedekind@infradead.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 18:30:00 +02:00
David Woodhouse
265489f01d [JFFS2] Remove compatibilty cruft for ancient kernels
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 16:12:09 +02:00
David Woodhouse
9dee7503ce [JFFS2] Optimise jffs2_add_tn_to_list
Use an rbtree instead of a simple linked list. We were wasting 
an amazing amount of time in jffs2_add_tn_to_list(). 
Thanks to Artem Bityuckiy and Jarkko Jlavinen  for noticing.

Signed-off-by: David Woodhouse <dwmw2@infradead.org> 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-07-06 14:07:54 +02:00
Anton Altaparmakov
07929dcb96 Automatic merge with /usr/src/ntfs-2.6.git. 2005-07-04 14:14:42 +01:00
Andrew Morton
ef6689eff4 [PATCH] fatfs sectioning fix
Fixup for the recent slab leak fix

Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 22:29:48 -07:00
Andrew Morton
c60e81ee1c [PATCH] reiserfs: handle_attrs() fix
Fix a use-uninitialised bug.

Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:13 -07:00
Pekka Enberg
8cb681b9c7 [PATCH] freevxfs: minor cleanups
This patch addresses the following minor issues:

  - Typo in printk
  - Redundant casts
  - Use C99 struct initializers instead of memset
  - Parenthesis around return value
  - Use inline instead of __inline__

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Pekka Enberg
1d2cc3b87b [PATCH] freevxfs: remove 2.4 compatability
This patch removes 2.4 compatability header from freevxfs.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Pekka Enberg
ba03bda81e [PATCH] freevxfs: fix buffer_head leak
- fix a buffer_head leak in vxfs_getfsh()

- s/SLAB_KERNEL/GFP_KERNEL/

- check sb_bread() return value

- drop pointless buffer-mapped() test.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:12 -07:00
Christoph Hellwig
1c71e22e4e [PATCH] udf_find_entry() cleanup
udf_find_entry can never be called with a NULL argument, so we shouldn't
check for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:11 -07:00
Pekka J Enberg
532a39a375 [PATCH] fat: fix slab cache leak
This patch plugs a slab cache leak in fat module initialization.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:11 -07:00
Anton Altaparmakov
c2d9b8387b Automerge with /usr/src/ntfs-2.6.git. 2005-06-30 09:52:20 +01:00
Jeff Mahoney
2949ccf937 [PATCH] reiserfs: enable attrs by default if saf
The following patch enables attrs by default if the reiserfs_attrs_cleared
bit is set in the superblock.  This allows chattr-type attrs to be used
without any further action by the user.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29 21:02:04 -07:00
Jeff Mahoney
869eb76e7b [PATCH] reiserfs: Check if attrs are enabled for attr ioctls
ReiserFS currently will allow the user to set/get attrs for files
regardless if they are enabled.  The patch checks to see if they are
enabled, and returns -NOTTY if they are not.

ext[23] doesn't need this check because attrs are always enabled.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-29 21:02:04 -07:00
Mingming Cao
21fe3471c3 [PATCH] ext3: reduce allocate-with-reservation lock latencies
Currently in ext3 block reservation code, the global filesystem reservation
tree lock (rsv_block) is hold during the process of searching for a space
to make a new reservation window, including while scaning the block bitmap
to verify if the avalible window has a free block.  Holding the lock during
bitmap scan is unnecessary and could possibly cause scalability issue and
latency issues.

This patch tries to address this by dropping the lock before scan the
bitmap.  Before that we need to reserve the open window in case someone
else is targetting at the same window.  Question was should we reserve the
whole free reservable space or just the window size we need.  Reserve the
whole free reservable space will possibly force other threads which
intended to do block allocation nearby move to another block group(cause
bad layout).  In this patch, we just reserve the desired size before drop
the lock and scan the block bitmap.  This patch fixed a ext3 reservation
latency issue seen on a cvs check out test.  Patch is tested with many fsx,
tiobench, dbench and untar a kernel test.

Signed-Off-By: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:35 -07:00
KAMBAROV, ZAUR
c7f1721ef2 [PATCH] coverity: fs/ext3/super.c: match_int return check
The return value of  "match_int" is  checked  27 out of 28 times

In lib/parser.c
142  	/**
143  	 * match_int: - scan a decimal representation of an integer from a substring_t
144  	 * @s: substring_t to be scanned
145  	 * @result: resulting integer on success
146  	 *
147  	 * Description: Attempts to parse the &substring_t @s as a decimal integer. On
148  	 * success, sets @result to the integer represented by the string and returns 0.
149  	 * Returns either -ENOMEM or -EINVAL on failure.
150  	 */
151  	int match_int(substring_t *s, int *result)
152  	{
153  		return match_number(s, result, 0);
154  	}

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:34 -07:00
KAMBAROV, ZAUR
ec471dc484 [PATCH] coverity: fs/udf/namei.c null check
"dir" was dereferenced before null check

Signed-off-by: Zaur Kambarov <zkambarov@coverity.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:33 -07:00
Sbastien Dugu
c016e2257a [PATCH] aio-retry-fix: fix aio retry work queueing
In the case of buffered AIO, in the aio retry path (aio_run_iocb), when the
retry method returns EIOCBRETRY the kicked iocb is added to the context run
list but is never queued onto the work queue.  The request therefore is
never completed.

This patch fixes that by adding the appropriate call to aio_queue_work in
aio_run_aiocb so that subsequent retries will be handled by the aio worker
thread.

Signed-off-by: Sbastien Dugu <sebastien.dugue@bull.net>
Acked-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:32 -07:00
Christoph Hellwig
334a13ec3d [PATCH] really remove xattr_acl.h
Looks like it sneaked back with the NFS ACL merge..

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:31 -07:00
Pekka J Enberg
687a21cee1 [PATCH] rename wakeup_bdflush to wakeup_pdflush
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:31 -07:00
Wen-chien Jesse Sung
8d451687ca [PATCH] fix semaphore handling in __unregister_chrdev_region
This up() should be down() instead.

Signed-off-by: Wen-chien Jesse Sung <jesse@cola.voip.idv.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-28 21:20:29 -07:00
Jens Axboe
22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00
Dave Kleikamp
b38a3ab3d1 JFS: Code cleanup - getting rid of never-used debug code
I'm finally getting around to cleaning out debug code that I've never used.
There has always been code ifdef'ed out by _JFS_DEBUG_DMAP, _JFS_DEBUG_IMAP,
_JFS_DEBUG_DTREE, and _JFS_DEBUG_XTREE, which I have personally never used,
and I doubt that anyone has since the design stage back in OS/2.  There is
also a function, xtGather, that has never been used, and I don't know why it
was ever there.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-06-27 15:35:37 -05:00
Thomas Gleixner
7ca6448dbf Merge with rsync://fileserver/linux
Update to Linus latest
2005-06-26 23:20:36 +02:00
Anton Altaparmakov
2a322e4c08 Automatic merge with /usr/src/ntfs-2.6.git. 2005-06-26 22:19:40 +01:00
Anton Altaparmakov
ba6d2377c8 NTFS: Fix a nasty deadlock that appeared in recent kernels.
The situation: VFS inode X on a mounted ntfs volume is dirty.  For
      same inode X, the ntfs_inode is dirty and thus corresponding on-disk
      inode, i.e. mft record, which is in a dirty PAGE_CACHE_PAGE belonging
      to the table of inodes, i.e. $MFT, inode 0.
      What happens:
      Process 1: sys_sync()/umount()/whatever...  calls
      __sync_single_inode() for $MFT -> do_writepages() -> write_page for
      the dirty page containing the on-disk inode X, the page is now locked
      -> ntfs_write_mst_block() which clears PageUptodate() on the page to
      prevent anyone else getting hold of it whilst it does the write out.
      This is necessary as the on-disk inode needs "fixups" applied before
      the write to disk which are removed again after the write and
      PageUptodate is then set again.  It then analyses the page looking
      for dirty on-disk inodes and when it finds one it calls
      ntfs_may_write_mft_record() to see if it is safe to write this
      on-disk inode.  This then calls ilookup5() to check if the
      corresponding VFS inode is in icache().  This in turn calls ifind()
      which waits on the inode lock via wait_on_inode whilst holding the
      global inode_lock.
      Process 2: pdflush results in a call to __sync_single_inode for the
      same VFS inode X on the ntfs volume.  This locks the inode (I_LOCK)
      then calls write-inode -> ntfs_write_inode -> map_mft_record() ->
      read_cache_page() for the page (in page cache of table of inodes
      $MFT, inode 0) containing the on-disk inode.  This page has
      PageUptodate() clear because of Process 1 (see above) so
      read_cache_page() blocks when it tries to take the page lock for the
      page so it can call ntfs_read_page().
      Thus Process 1 is holding the page lock on the page containing the
      on-disk inode X and it is waiting on the inode X to be unlocked in
      ifind() so it can write the page out and then unlock the page.
      And Process 2 is holding the inode lock on inode X and is waiting for
      the page to be unlocked so it can call ntfs_readpage() or discover
      that Process 1 set PageUptodate() again and use the page.
      Thus we have a deadlock due to ifind() waiting on the inode lock.
      The solution: The fix is to use the newly introduced
      ilookup5_nowait() which does not wait on the inode's lock and hence
      avoids the deadlock.  This is safe as we do not care about the VFS
      inode and only use the fact that it is in the VFS inode cache and the
      fact that the vfs and ntfs inodes are one struct in memory to find
      the ntfs inode in memory if present.  Also, the ntfs inode has its
      own locking so it does not matter if the vfs inode is locked.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-26 22:12:02 +01:00
Andrew Morton
34f18a9887 [PATCH] jffs2 build fix
Missed conversion in the swsusp cleanup.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-26 08:43:19 -07:00
Linus Torvalds
2031d0f586 Merge Christoph's freeze cleanup patch 2005-06-25 17:16:53 -07:00
Christoph Lameter
3e1d1d28d9 [PATCH] Cleanup patch for process freezing
1. Establish a simple API for process freezing defined in linux/include/sched.h:

   frozen(process)		Check for frozen process
   freezing(process)		Check if a process is being frozen
   freeze(process)		Tell a process to freeze (go to refrigerator)
   thaw_process(process)	Restart process
   frozen_process(process)	Process is frozen now

2. Remove all references to PF_FREEZE and PF_FROZEN from all
   kernel sources except sched.h

3. Fix numerous locations where try_to_freeze is manually done by a driver

4. Remove the argument that is no longer necessary from two function calls.

5. Some whitespace cleanup

6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
   cleared before setting PF_FROZEN, recalc_sigpending does not check
   PF_FROZEN).

This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 17:10:13 -07:00
Domen Puncer
c33ed27126 [PATCH] list_for_each_entry: fs-dquot.c
Make code more readable with list_for_each_entry_safe.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Maximilian Attems <janitor@sternwelten.at>
Signed-off-by: Domen Puncer <domen@coderock.org>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:11 -07:00
Adrian Bunk
db40716377 [PATCH] fs/ncpfs/: remove unused #ifdef USE_OLD_SLOW_DIRECTORY_LISTING code
This patch removes some unused #ifdef USE_OLD_SLOW_DIRECTORY_LISTING
code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:04 -07:00
Adrian Bunk
94c9eca223 [PATCH] fs/jffs/: cleanups
This patch contains the following cleanups:
- make needlessly global functions static
- provide some debugging helper functions only for appropriate
  values of CONFIG_JFFS_FS_VERBOSE

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:25:04 -07:00
Adrian Bunk
486fd404fb [PATCH] small partitions/msdos cleanups
This patch makes the following changes to the msdos partition code:
- remove CONFIG_NEC98_PARTITION leftovers
- make parse_bsd static

This patch was already ACK'ed by Andries Brouwer.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:59 -07:00
Vivek Goyal
72658e9d50 [PATCH] kdump: Parse elf32 headers and export through /proc/vmcore
o Adds support for parsing core ELF32 headers.
o I am expecting ELF32 support to go away down the line. This patch has been
  introduced for testing purposes as gdb can not parse ELF64 headers for
  i386. When a decent user space solution is available, ELF32 support
  can go away.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal
666bfddbe8 [PATCH] kdump: Access dump file in elf format (/proc/vmcore)
From: "Vivek Goyal" <vgoyal@in.ibm.com>

o Support for /proc/vmcore interface. This interface exports elf core image
  either in ELF32 or ELF64 format, depending on the format in which elf headers
  have been stored by crashed kernel.
o Added support for CONFIG_VMCORE config option.
o Removed the dependency on /proc/kcore.

From: "Eric W. Biederman" <ebiederm@xmission.com>

This patch has been refactored to more closely match the prevailing style in
the affected files.  And to clearly indicate the dependency between
/proc/kcore and proc/vmcore.c

From: Hariprasad Nellitheertha <hari@in.ibm.com>

This patch contains the code that provides an ELF format interface to the
previous kernel's memory post kexec reboot.

Signed off by Hariprasad Nellitheertha <hari@in.ibm.com>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Qu Fuping
6283d58e74 [PATCH] reiserfs: do not ignore i/io error on readpage
Reiserfs's readpage does not notice i/o errors.  This patch makes
reiserfs_readpage to return -EIO when i/o error appears.

This patch makes reiserfs to not ignore I/O error on readpage.

Signed-off-by: Qu Fuping <fs@ercist.iscas.ac.cn>
Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:40 -07:00
Hugh Dickins
8ae0b77811 [PATCH] fix fsync(dir) return value for ram-based filesystems
Any filesystem which is using simple_dir_operations will retunr -EINVAL for
fsync() on a directory.  Make it return zero instead.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:38 -07:00
Anton Altaparmakov
af859a42d7 NTFS: Prepare for 2.1.23 release: Update documentation and bump version.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 21:07:27 +01:00
Anton Altaparmakov
4757d7dff6 NTFS: Change ntfs_map_runlist_nolock() to only decompress the mapping pairs
if the requested vcn is inside it.  Otherwise we get into problems
      when we try to map an out of bounds vcn because we then try to map
      the already mapped runlist fragment which causes
      ntfs_mapping_pairs_decompress() to fail and return error.  Update
      ntfs_attr_find_vcn_nolock() accordingly.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:24:08 +01:00
Anton Altaparmakov
fa3be92317 NTFS: Add an extra parameter @last_vcn to ntfs_get_size_for_mapping_pairs()
and ntfs_mapping_pairs_build() to allow the runlist encoding to be
      partial which is desirable when filling holes in sparse attributes.
      Update all callers.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:15:36 +01:00
Anton Altaparmakov
1d58b27b8d NTFS: Change the runlist terminator of the newly allocated cluster(s) to
LCN_ENOENT in ntfs_attr_make_non_resident().  Otherwise the runlist
      code gets confused.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 17:04:55 +01:00
Anton Altaparmakov
3bd1f4a173 NTFS: Fix several occurences of a bug where we would perform 'var & ~const'
with a 64-bit variable and a int, i.e. 32-bit, constant.  This causes
      the higher order 32-bits of the 64-bit variable to be zeroed.  To fix
      this cast the 'const' to the same 64-bit type as 'var'.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:51:58 +01:00
Anton Altaparmakov
ca8fd7a0c6 NTFS: Detect the case when Windows has been suspended to disk on the volume
to be mounted and if this is the case do not allow (re)mounting
      read-write.  This is done by parsing hiberfil.sys if present.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:31:27 +01:00
Anton Altaparmakov
9f993fe463 NTFS: Fix a bug in address space operations error recovery code paths where
if the runlist was not mapped at all and a mapping error occured we
      would leave the runlist locked on exit to the function so that the
      next access to the same file would try to take the lock and deadlock.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 16:15:36 +01:00
Anton Altaparmakov
3f2faef00c NTFS: Stamp the transaction log ($UsnJrnl), aka user space journal, if it
is active on the volume and we are mounting read-write or remounting
      from read-only to read-write.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-06-25 15:28:56 +01:00
Anton Altaparmakov
38b22b6e9f Automerge with /usr/src/ntfs-2.6.git. 2005-06-25 14:27:27 +01:00
Alexey Dobriyan
75043cb5b3 [PATCH] fs/qnx4/*: fix sparse warnings
This patch fixes sparse warnings in the qnx4fs (and might even make
qnx4fs work on big-endian boxes)

Signed-off-by: Alexey Dobriyan <adobriyan@mail.ru>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Anders Larsen <al@alarsen.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 14:14:24 -07:00
Adrian Bunk
52c1da3953 [PATCH] make various thing static
Another rollup of patches which give various symbols static scope

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:43 -07:00
Carsten Otte
eb6fe0c388 [PATCH] xip: reduce code duplication
This patch reworks filemap_xip.c with the goal to reduce code duplication
from mm/filemap.c.  It applies agains 2.6.12-rc6-mm1.  Instead of
implementing the aio functions, this one implements the synchronous
read/write functions only.  For readv and writev, the generic fallback is
used.  For aio, we rely on the application doing the fallback.  Since our
"synchronous" function does memcpy immediately anyway, there is no
performance difference between using the fallbacks or implementing each
operation.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Carsten Otte
6d79125bba [PATCH] xip: ext2: execute in place
These are the ext2 related parts.  Ext2 now uses the xip_* file operations
along with the get_xip_page aop when mounted with -o xip.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Carsten Otte
ceffc07852 [PATCH] xip: fs/mm: execute in place
- generic_file* file operations do no longer have a xip/non-xip split
- filemap_xip.c implements a new set of fops that require get_xip_page
  aop to work proper. all new fops are exported GPL-only (don't like to
  see whatever code use those except GPL modules)
- __xip_unmap now uses page_check_address, which is no longer static
  in rmap.c, and defined in linux/rmap.h
- mm/filemap.h is now much more clean, plainly having just Linus'
  inline funcs moved here from filemap.c
- fix includes in filemap_xip to make it build cleanly on i386

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:41 -07:00
Martin Waitz
3d41088fa3 [PATCH] DocBook: update comments
This patch updates some comments to match code changes.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:40 -07:00
NeilBrown
0964a3d3f1 [PATCH] knfsd: nfsd4 reboot dirname fix
Set the recovery directory via /proc/fs/nfsd/nfs4recoverydir.

It may be changed any time, but is used only on startup.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:36 -07:00
NeilBrown
c7b9a45927 [PATCH] knfsd: nfsd4: reboot recovery
This patch adds the code to create and remove client subdirectories from the
recovery directory, as described in the previous patch comment.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:36 -07:00
NeilBrown
190e4fbf96 [PATCH] knfsd: nfsd4: initialize recovery directory
NFSv4 clients are required to know what state they have on the server so that
they can reclaim it on server reboot.  However, it is possible for
pathalogical combinations of server reboots and network partitions to leave a
client in a state where it cannot know whether it has lost its state on the
server.

For this reason, rfc3530 requires that we store some information about clients
to stable storage.

So we maintain a directory /var/lib/nfs/v4recovery with a subdirectory for
each client with active state.  We leave open the possibility of including
files underneath each such subdirectory with information about the client, but
for now the subdirectories are empty.

We create a client subdirectory whenever a client makes its first non-reclaim
open_confirm.

We remove a client subdirectory whenever either
        a) its lease expires, or
	b) the grace period ends without it reclaiming anything.
When handling reclaims, we allow the reclaim if and only if the client doing
the reclaim has a subdirectory.

This patch adds just the code to scan the recovery directory on nfsd startup.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
cb36d63457 [PATCH] knfsd: nfsd4: remove cb_parsed
The cb_parsed field is only used by probe_callback, to determine whether the
callback information has been filled in by setclientid.  But there is no way
that probe_callback() can be called without that having already happened, so
that check is superfluous, as is cb_parsed.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
3e9e3dbe0f [PATCH] knfsd: nfsd4: allow multiple lockowners
>From the language of rfc3530 section 8.1.3 (e.g., the suggestion that a
"process id" might be a reasonable lockowner value) it's conceivable that a
client might want to use the same lockowner string on multiple files, so we may
as well allow that.  We expect each use of open_to_lockowner to create a
distinct seqid stream, though.

For now we're also allowing multiple uses of open_to_lockowner with the same
open, though it seems unlikely clients would actually do that.

Also add a comment reminding myself of some very non-scalable data structures.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
ea1da636e9 [PATCH] knfsd: nfsd4: rename state list fields
Trivial renaming patch:

I can never remember, while looking at various lists relating the nfsd4 state
structures, which are the "heads" and which are items on other lists, or which
structures are actually on the various lists.  The following convention helps
me: given structures foo and bar, with foo containing the head of a list of
bars, use "bars" for the name of the head of the list contained in the struct
foo, and use "per_foo" for the entries in the struct bars.

Already done for struct nfs4_file; go ahead and do it for the other nfsd4
state structures.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:35 -07:00
NeilBrown
21ab45a480 [PATCH] knfsd: nfsd4: miscellaneous setclientid_confirm cleanup
Minor cleanup, remove some unnecessary printk's.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
7c79f7377c [PATCH] knfsd: nfsd4: setclientid_confirm comments
Trivial whitespace and comment fixes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
08e8987c37 [PATCH] knfsd: nfsd4: setclientid_confirm gotoectomy
Change from "goto" to "else if" format in setclientid_confirm.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
22de4d8374 [PATCH] knfsd: nfsd4: fix setclientid_confirm error return
NFS4_INVAL is not a valid error for setclientid_confirm, and INUSE is the more
logical error here anyway.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
1a69c179a2 [PATCH] knfsd: nfsd4: fix setclientid_confirm cases
Setclientid_confirm code confused states 1 and 3 (numbering from the
IMPLEMENTATION section of rfc3530, section 14.2.33).  Fix this.

State 1 allows the client to change the callback channel on the fly.  We don't
implement this currently, so just turn off the callback channel in this case.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:34 -07:00
NeilBrown
31f4a6c127 [PATCH] knfsd: nfsd4: fix uncomfirmed list
Setclientid code assumes there is only one match in unconfirmed list.
Make sure that assumption holds.

From: Fred Isaman
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
fd39ca9a80 [PATCH] knfsd: nfsd4: make needlessly global code static
This patch contains the following possible cleanups:

- make needlessly global code static

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
a76b4319ca [PATCH] knfsd: nfsd4: grace period end
For the purposes of reboot recovery, we want to do some work during the
transition period at the end of the grace period.  Some of that work must be
guaranteed to have a certain relationship with the end of the grace period, so
we want to control the transition there.

Our approach is to modify the in_grace() checks to consult a global variable
instead of checking the time directly, to schedule the first run of the
laundromat thread at the end of the grace period, and to set the global
end-of-grace-period there.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
28ce6054f1 [PATCH] knfsd: nfsd4: add find_{un}conf_by_str functions to simplify setclientid
Minor setclientid cleanup

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
a55370a3c0 [PATCH] knfsd: nfsd4: reboot hash
For the purposes of reboot recovery we keep a directory with subdirectories
each having a name that is the ascii hex representation of the md5 sum of a
client identifier for an active client.

This adds the code to calculate that name.  We also use it for the purposes of
comparing clients, so if someone ever manages to find two client names that
are md5 collisions, then we'll return clid_inuse to the second.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:33 -07:00
NeilBrown
7dea9d280c [PATCH] knfsd: nfsd4: setclientid simplification
We can be a little more concise here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
bd0b1e954e [PATCH] knfsd: nfsd4: idmap initialization
Adopt standard kernel style by defining a no-op function instead of putting
ifdef's in the code where the function is called.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
707d4ab7b3 [PATCH] knfsd: nfsd4: remove nfs4_reclaim_init
nfs4_reclaim_init is no longer performing any useful function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
ac4d8ff2a5 [PATCH] knfsd: nfsd4: clean up state initialization
Separate out stuff that needs initialization on startup from stuff that only
needs initialization on module init from static data.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:32 -07:00
NeilBrown
76a3550ec5 [PATCH] knfsd: nfsd4: rename nfs4_state_init
Somewhat gratuitous rename to simplify following patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
7b190fecfa [PATCH] knfsd: nfsd4: delegation recovery
Allow recovery of delegations after reboot.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
d99a05adf8 [PATCH] knfsd: nfsd4: simplify lease changing
The only way the protocol gives to change the lease time on the fly is to
simulate a reboot.  We don't have that completely right in the current code;
among other things, we should probably put lockd in grace too while we do
this.

For now, let's just keep this simple, and wait till the next time nfsd starts
to register any changes in lease time.  If the administrator really wants to
change the lease time *now*, they can go ahead and bring nfsd down and then
back up again after changing the lease time.

Also remove the "if (reclaim_str_hashtbl_size == 0)" case, a shortcut which
skips the grace period if we know of no clients in need of recovery.  This
isn't going to work well with nlm.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
58da282b73 [PATCH] knfsd: nfsd4: create separate laundromat workqueue
We're running the laundromat work on the default kevent worker thread.  But
the laundromat takes the nfsv4 state semaphore, which is used for way too much
stuff, and the potential for deadlocks is high.  Better to have this on a
separate workqueue.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:31 -07:00
NeilBrown
dfc8356570 [PATCH] knfsd: nfsd4: nfs4_check_open_reclaim cleanup
Minor cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
5ba266d632 [PATCH] knfsd: nfsd4: fix probe_callback
rpc_create_client was modified recently to do its own (synchronous) NULL ping
of the server.  We'd rather do that on our own, asynchronously, so that we
don't have to block the nfsd thread doing the probe, and so that setclientid
handling (hence, client mounts) can proceed normally whether the callback is
succesful or not.  (We can still function fine without the callback
channel--we just won't be able to give out delegations till it's verified to
work.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
7e06b7f9e9 [PATCH] knfsd: nfs4: hold filp while reading or writing
We're trying to read and write from a struct file that we may not hold a
reference to any more (since a close could be processed as soon as we drop the
state lock).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
46be925fa6 [PATCH] knfsd: lockd: flush signals on shutdown
Silence another annoying "failed to contact portmap (errno -512)" on shutdown.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
13cd21845d [PATCH] nfsd4: reference count struct nfs4_file
Add a struct kref to each nfs4_file and take a reference to it from each
stateid and delegation that refers to it.  The atomicity guarantees are
overkill given that all this stuff is done under the single nfsd4 state lock,
but a) we'd like finer-grained locking some day, and b) this simplifies the
cleanup of the structures a bit, something that has previously been a bit
complicated and bug-prone.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
8beefa2493 [PATCH] nfsd4: rename nfs4_file fields
Trivial renaming patch:

I can never remember, while looking at various lists relating the nfsd4 state
structures, which are the "heads" and which are items on other lists, or which
structures are actually on the various lists.  The following convention helps
me: given structures foo and bar, with foo containing the head of a list of
bars, use "bars" for the name of the head of the list contained in the struct
foo, and use "per_foo" for the entries in the struct bars.

Go ahead and do this for struct nfs4_file.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
6fa305ded4 [PATCH] nfsd4: remove debugging counters
These remaining debugging counters haven't proved that useful.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
5b2d21c196 [PATCH] nfsd4: slabify delegations
Allocate delegations from a slab cache.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
5ac049ac66 [PATCH] nfsd4: slabify stateids
Allocate stateid's from a slab cache.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
e60d4398a7 [PATCH] nfsd4: slabify nfs4_files
The structures the server uses to keep track of various pieces of nfsv4 state
(open files, outstanding delegations, etc.) are likely to be allocated and
deallocated frequently and seem reasonable candidates for slab caches.

While we're at it, the slab code keeps statistics that help catch leaks and
such, so we may as well take this chance to eliminate some debugging counters
that we've been keeping ourselves.

Start with the struct nfs4_file.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:29 -07:00
NeilBrown
c815afc73e [PATCH] nfsd4: block metadata ops during grace period
We currently return err_grace if a user attempts a non-reclaim open during the
grace period.  But we also need to prevent renames and removes, at least, to
ensure clients have the chance to recover state on files before they are moved
or deleted.

Of course, local users could also do renames and removes during the lease
period, and there's not much we can do about that.  This at least will help
with remote users.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
496400014f [PATCH] nfsd4: fix fh_expire_type
We're returning NFS4_FH_NOEXPIRE_WITH_OPEN | NFS4_FH_VOL_RENAME for the
fh_expire_type attribute.  This is incorrect:
	1. The spec actually only allows NOEXPIRE_WITH_OPEN when
	   VOLATILE_ANY is also set.
	2. Filehandles for open files can expire, if the file is removed
	   and there is a reboot.
	3. Filehandles are only volatile on rename in the nosubtree check
	   case.

Unfortunately, there's no way to indicate that we only expire on remove.  So
our only choice is FH4_VOLATILE_ANY.  Although it's redundant, we also set
FH4_VOL_RENAME in the subtree check case, since subtreecheck does actually
cause problems in practice and it seems possibly useful to give clients some
way to distinguish that case.

Fix a mispelled #define while we're at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
0dd3c19212 [PATCH] nfsd4: support CLAIM_DELEGATE_CUR
Add OPEN claim type NFS4_OPEN_CLAIM_DELEGATE_CUR to nfsd4_open().

A delegation stateid and a name are provided.  OPEN with O_CREAT is not legal
with this claim type; otherwise, use the NFS4_OPEN_CLAIM_NULL code path to
lookup the filename to be opened.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
c44c5eeb2c [PATCH] nfsd4: add open state code for CLAIM_DELEGATE_CUR
State logic for OPEN with claim type CLAIM_DELEGATE_CUR, which the NFSv4
client uses to report local OPENs on a delegated file back to the NFSv4
server.

nfs4_check_deleg() performs input delegation stateid lookup and sanity check.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
567d98292e [PATCH] nfsd4: don't reopen for delegated client
We don't really need to be doing a separate open for every stateid.  And in
the case of an open from a client that already has a delegation on a file, it
unnecessarily results in a delegation recall.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
4a6e43e6d4 [PATCH] nfsd4: nfs4_check_delegmode
Additional minor code reshuffling to prepare for claim_deleg_cur support.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:27 -07:00
NeilBrown
52f4fb4306 [PATCH] nfsd4: find_delegation_file()
Factor out a bit of common code that will be useful elsewhere.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:27 -07:00
Jan Kara
bdd5b29c6b [PATCH] Make reiserfs BUG on too big transaction
Make reiserfs BUG() when somebody tries to start a larger transaction than
it's allowed (currently the code just silently deadlocks).

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:23 -07:00
Jan Kara
556a2a45bc [PATCH] quota: reiserfs: improve quota credit estimates
Use improved credits estimates for quota operations.  Also reserve space
for a quota operation in a transaction only if filesystem was mounted with
some quota option.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:20 -07:00
Jan Kara
1f54587bea [PATCH] quota: ext3: Improve quota credit estimates
Use improved credits estimates for quota operations.  Also reserve a space
for a quota operation in a transaction only if filesystem was mounted with
some quota options.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:20 -07:00
Jan Kara
bd6a1f16ff [PATCH] reiserfs: add checking of journal_begin() return value
Check return values of journal_begin() and journal_end() in the quota code
for reiserfs.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:19 -07:00
Christoph Hellwig
92198f7eaa [PATCH] pass iocb to dio_iodone_t
XFS will have to look at iocb->private to fix aio+dio.  No other filesystem
is using the blockdev_direct_IO* end_io callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:05:19 -07:00
Sonny Rao
f5f287738b JFS: performance patch
Basically, we saw a large amount of time spent in the
jfs_strfromUCS_le() function, mispredicting the branch inside the
loop, so I just added some unlikely modifiers to the if statements to
re-ordered the code.  Again, these simple changes provided > 2 % on
spec-sfs, so please consider it for inclusion.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
2005-06-23 16:57:56 -05:00
Telemaque Ndizihiwe
fed2fc18a4 [PATCH] sys_open() cleanup
Clean up tortured logic in sys_open().

Signed-off-by: Telemaque Ndizihiwe <telendiz@eircom.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:36 -07:00
Benjamin LaHaise
63e6880918 [PATCH] aio: fix do_sync_(read|write) to properly handle aio retries
When do_sync_(read|write) encounters an aio method that makes use of the
retry mechanism, they fail to correctly retry the operation.  This fixes
that by adding the appropriate sleep and retry mechanism.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:34 -07:00
Anton Altaparmakov
152becd26e [PATCH] Bug in error recovery in fs/buffer.c::__block_prepare_write()
fs/buffer.c::__block_prepare_write() has broken error recovery.  It calls
the get_block() callback with "create = 1" and if that succeeds it
immediately clears buffer_new on the just allocated buffer (which has
buffer_new set).

The bug is that if an error occurs and get_block() returns != 0, we break
from this loop and go into recovery code.  This code has this comment:

/* Error case: */
/*
 * Zero out any newly allocated blocks to avoid exposing stale
 * data.  If BH_New is set, we know that the block was newly
 * allocated in the above loop.
 */

So the intent is obviously good in that it wants to clear just allocated
and hence not zeroed buffers.  However the code recognises allocated
buffers by checking for buffer_new being set.

Unfortunately __block_prepare_write() as discussed above already cleared
buffer_new on all allocated buffers thus no buffers will be cleared during
error recovery and old data will be leaked.

The simplest way I can see to fix this is to make the current recovery code
work by _not_ clearing buffer_new after calling get_block() in
__block_prepare_write().

We cannot safely allow buffer_new buffers to "leak out" of
__block_prepare_write(), thus we simply do a quick loop over the buffers
clearing buffer_new on each of them if it is set just before returning
"success" from __block_prepare_write().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:34 -07:00
Christoph Hellwig
9a59f452ab [PATCH] remove <linux/xattr_acl.h>
This file duplicates <linux/posix_acl_xattr.h>, using slightly different
names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
Christoph Lameter
45778ca819 [PATCH] Remove f_error field from struct file
The following patch removes the f_error field and all checks of f_error.

Trond said:

  f_error was introduced for NFS, and made sense when we were guaranteed
  always to have a file pointer around when write errors occurred.  Since
  then, we have (for various reasons) had to introduce the nfs_open_context in
  order to track the file read/write state, and it made sense to move our
  f_error tracking there too.

Signed-off-by: Christoph Lameter <christoph@lameter.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:33 -07:00
Arnd Bergmann
bb93e3a52f [PATCH] block: add unlocked_ioctl support for block devices
This patch allows block device drivers to convert their ioctl functions to
unlocked_ioctl() like character devices and other subsystems.  All
functions that were called with the BKL held before are still used that
way, but I would not be surprised if it could be removed from the ioctl
functions in drivers/block/ioctl.c themselves.

As a side note, I found that compat_blkdev_ioctl() acquires the BKL as
well, which looks like a bug.  I have checked that every user of
disk->fops->compat_ioctl() in the current git tree gets the BKL itself, so
it could easily be removed from compat_blkdev_ioctl().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:32 -07:00
Pekka Enberg
b030a4dd60 [PATCH] Remove eventpoll macro obfuscation
This patch gets rid of some macro obfuscation from fs/eventpoll.c by
removing slab allocator wrappers and converting macros to static inline
functions.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:30 -07:00
Oleg Nesterov
dfb388bf8a [PATCH] factor out common code in sys_fsync/sys_fdatasync
This patch consolidates sys_fsync and sys_fdatasync.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:29 -07:00
Yoav Zach
ef3daeda7b [PATCH] Don't force O_LARGEFILE for 32 bit processes on ia64
In ia64 kernel, the O_LARGEFILE flag is forced when opening a file.  This
is problematic for execution of 32 bit processes, which are not largefile
aware, either by SW emulation or by HW execution.

For such processes, the problem is two-fold:

1) When trying to open a file that is larger than 4G
   the operation should fail, but it's not
2) Writing to offset larger than 4G should fail, but
   it's not

The proposed patch takes advantage of the way 32 bit processes are
identified in ia64 systems.  Such processes have PER_LINUX32 for their
personality.  With the patch, the ia64 kernel will not enforce the
O_LARGEFILE flag if the current process has PER_LINUX32 set.  The behavior
for all other architectures remains unchanged.

Signed-off-by: Yoav Zach <yoav.zach@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:28 -07:00
Kirill Korotaev
618f06362a [PATCH] O(1) sb list traversing on syncs
This patch removes O(n^2) super block loops in sync_inodes(),
sync_filesystems() etc.  in favour of using __put_super_and_need_restart()
which I introduced earlier.  We faced a noticably long freezes on sb
syncing when there are thousands of super blocks in the system.

Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:27 -07:00
Kirill Korotaev
af4d2ecbf0 [PATCH] Fix of bogus file max limit messages
This patch fixes incorrect and bogus kernel messages that file-max limit
reached when the allocation fails

Signed-Off-By: Kirill Korotaev <dev@sw.ru>
Signed-Off-By: Denis Lunev <den@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Christoph Hellwig
c663e5d80e [PATCH] add some comments to lookup_create()
In a duplicate of lookup_create in the af_unix code Al commented what's
going on nicely, so let's bring that over to lookup_create before the copy
is going away (I'll send a patch soon)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Andreas Dilger
acfa1823d3 [PATCH] Support for dx directories in ext3_get_parent (NFSD)
Henrik Grubbstrom noted:

The 2.6.10 ext3_get_parent attempts to use ext3_find_entry to look up the
entry "..", which fails for dx directories since ".." is not present in the
directory hash table.  The patch below solves this by looking up the dotdot
entry in the dx_root block.

Typical symptoms of the above bug are intermittent claims by nfsd that
files or directories are missing on exported ext3 filesystems.

cf https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D150759 and
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=3D144556

ext3_get_parent() is IMHO the wrong place to fix this bug as it introduces
a lot of internals from htree into that function.  Instead, I think this
should be fixed in ext3_find_entry() as in the below patch.  This has the
added advantage that it works for any callers of ext3_find_entry() and not
just ext3_lookup_parent().

Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Henrik Grubbstrom <grubba@grubba.org>
Cc: <ext2-devel@lists.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Alan Cox
d6e7114481 [PATCH] setuid core dump
Add a new `suid_dumpable' sysctl:

This value can be used to query and set the core dump mode for setuid
or otherwise protected/tainted binaries. The modes are

0 - (default) - traditional behaviour.  Any process which has changed
    privilege levels or is execute only will not be dumped

1 - (debug) - all processes dump core when possible.  The core dump is
    owned by the current user and no security is applied.  This is intended
    for system debugging situations only.  Ptrace is unchecked.

2 - (suidsafe) - any binary which normally would not be dumped is dumped
    readable by root only.  This allows the end user to remove such a dump but
    not access it directly.  For security reasons core dumps in this mode will
    not overwrite one another or other files.  This mode is appropriate when
    adminstrators are attempting to debug problems in a normal environment.

(akpm:

> > +EXPORT_SYMBOL(suid_dumpable);
>
> EXPORT_SYMBOL_GPL?

No problem to me.

> >  	if (current->euid == current->uid && current->egid == current->gid)
> >  		current->mm->dumpable = 1;
>
> Should this be SUID_DUMP_USER?

Actually the feedback I had from last time was that the SUID_ defines
should go because its clearer to follow the numbers. They can go
everywhere (and there are lots of places where dumpable is tested/used
as a bool in untouched code)

> Maybe this should be renamed to `dump_policy' or something.  Doing that
> would help us catch any code which isn't using the #defines, too.

Fair comment. The patch was designed to be easy to maintain for Red Hat
rather than for merging. Changing that field would create a gigantic
diff because it is used all over the place.

)

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:26 -07:00
Christoph Hellwig
2fa389c5eb [PATCH] quota: sanitize dentry handling in vfs_quota_on_mount
Use lookup_one_len instead of opencoding a simplified lookup using
lookup_hash with a fake hash.

Also there's no need anymore for the d_invalidate as we have a completely
valid dentry now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Christoph Hellwig
84de856ed3 [PATCH] quota: consolidate code surrounding vfs_quota_on_mount
Move some code duplicated in both callers into vfs_quota_on_mount

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Christoph Hellwig
5f45f1a78f [PATCH] remove duplicate get_dentry functions in various places
Various filesystem drivers have grown a get_dentry() function that's a
duplicate of lookup_one_len, except that it doesn't take a maximum length
argument and doesn't check for \0 or / in the passed in filename.

Switch all these places to use lookup_one_len.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Greg KH <greg@kroah.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:20 -07:00
Neil Horman
ac20427ef6 [PATCH] add check to /proc/devices read routines
Patch to add check to get_chrdev_list and get_blkdev_list to prevent reads
of /proc/devices from spilling over the provided page if more than 4096
bytes of string data are generated from all the registered character and
block devices in a system

Signed-off-by: Neil Horman <nhorman@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:19 -07:00
Alexander Viro
991114c6fa [PATCH] fix for prune_icache()/forced final iput() races
Based on analysis and a patch from Russ Weight <rweight@us.ibm.com>

There is a race condition that can occur if an inode is allocated and then
released (using iput) during the ->fill_super functions.  The race
condition is between kswapd and mount.

For most filesystems this can only happen in an error path when kswapd is
running concurrently.  For isofs, however, the error can occur in a more
common code path (which is how the bug was found).

The logic here is "we want final iput() to free inode *now* instead of
letting it sit in cache if fs is going down or had not quite come up".  The
problem is with kswapd seeing such inodes in the middle of being killed and
happily taking over.

The clean solution would be to tell kswapd to leave those inodes alone and
let our final iput deal with them.  I.e.  add a new flag
(I_FORCED_FREEING), set it before write_inode_now() there and make
prune_icache() leave those alone.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:17 -07:00
Anton Altaparmakov
3357d4c75f Automatic merge with /usr/src/ntfs-2.6.git. 2005-06-23 11:26:22 +01:00
Trond Myklebust
eadf4598e7 [PATCH] NFS: Add debugging code to NFSv4 readdir
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Manoj Naik
6ebf3656fd [PATCH] NFSv4: Map a couple of NFSv4 errors to EINVAL.
This shows up on running tar over NFSv4.

 Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Manoj Naik
97d312d037 [PATCH] NFSv4: add support for rdattr_error in NFSv4 readdir requests.
Request RDATTR_ERROR as an attribute in readdir to distinguish between a
 directory being within an absent filesystem or one (or more) of its entries.

 Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:43 -04:00
Trond Myklebust
8d0a8a9d0e [PATCH] NFSv4: Clean up nfs4 lock state accounting
Ensure that lock owner structures are not released prematurely.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:42 -04:00
Trond Myklebust
ecdbf769b2 [PATCH] NLM: fix a client-side race on blocking locks.
If the lock blocks, the server may send us a GRANTED message that
 races with the reply to our LOCK request. Make sure that we catch
 the GRANTED by queueing up our request on the nlm_blocked list
 before we send off the first LOCK rpc call.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:42 -04:00
Trond Myklebust
4f15e2b1f4 [PATCH] NLM: cleanup for blocked locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:41 -04:00
Trond Myklebust
80fec4c62e [PATCH] VFS: Ensure that all the on-stack struct file_lock call fl_release_private
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:40 -04:00
Trond Myklebust
3da28eb1c6 [PATCH] NFS: Replace nfs_page insertion sort with a radix sort
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:39 -04:00
Trond Myklebust
c6a556b88a [PATCH] NFS: Make searching and waiting on busy writeback requests more efficient.
Basically copies the VFS's method for tracking writebacks and applies
 it to the struct nfs_page.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:39 -04:00
Trond Myklebust
ab0a3dbedc [PATCH] NFS: Write optimization for short files and small O_SYNC writes.
Use stable writes if we can see that we are only going to put a single
 write on the wire.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:38 -04:00
Trond Myklebust
fe51beecc5 [PATCH] NFS: Ensure that fstat() always returns the correct mtime
Even if the file is open for writes.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:37 -04:00
Trond Myklebust
7d52e86274 [PATCH] NFS: Cleanup of caching code, and slight optimization of writes.
Unless we're doing O_APPEND writes, we really don't care about revalidating
 the file length. Just make sure that we catch any page cache invalidations.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:37 -04:00
Trond Myklebust
951a143b3f [PATCH] NFS: Fix the file size revalidation
Instead of looking at whether or not the file is open for writes before
 we accept to update the length using the server value, we should rather
 be looking at whether or not we are currently caching any writes.

 Failure to do so means in particular that we're not updating the file
 length correctly after obtaining a POSIX or BSD lock.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:36 -04:00
Trond Myklebust
08e9eac42e [PATCH] NFSv4: Fix up races in nfs4_proc_setattr()
If we do not hold a valid stateid that is open for writes, there is little
 point in doing an extra open of the file, as the RFC does not appear to
 mandate this...

 Make setattr use the correct stateid if we're holding mandatory byte
 range locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:35 -04:00
Trond Myklebust
202b50dc12 [PATCH] NFSv4: Ensure that propagate NFSv4 state errors to the reclaim code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:34 -04:00
Trond Myklebust
f0dd2136da [PATCH] NFS: Clean up readdir changes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:34 -04:00
Olivier Galibert
00a9264227 [PATCH] NFS: Hide NFS server-generated readdir cookies from userland
NFSv3 currently returns the unsigned 64-bit cookie directly to
 userspace. The following patch causes the kernel to generate
 loff_t offsets for the benefit of userland.
 The current server-generated READDIR cookie is cached in the
 nfs_open_context instead of in filp->f_pos, so we still end up work
 correctly under directory insertions/deletion.

 Signed-off-by: Olivier Galibert <galibert@pobox.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:33 -04:00
Trond Myklebust
455a396710 [PATCH] NFSv4: Fix an Oops in the callback code.
The changeset "trond.myklebust@fys.uio.no|ChangeSet|20050322152404|16979"
 (RPC: Ensure XDR iovec length is initialized correctly in call_header)
 causes the NFSv4 callback code to BUG() due to an incorrectly initialized
 scratch buffer.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:29 -04:00
Reuben Farrelly
c56c275022 [PATCH] NFSv4: Fix build warning
From: Reuben Farrelly <reuben-lkml@reub.net>

 With gcc-4.0:

 fs/nfs/nfs4proc.c:2976: error: static declaration of
 'nfs4_file_inode_operations' follows non-static declaration
 fs/nfs/nfs4_fs.h:179: error: previous declaration of
 'nfs4_file_inode_operations' was here

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:29 -04:00
Andrew Morton
3e9d41543b [PATCH] NFSv4: empty array fix
Older gcc's don't like this.

 fs/nfs/nfs4proc.c:2194: field `data' has incomplete type

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:28 -04:00
Adrian Bunk
b7ef19560f [PATCH] NFSv4: fs/nfs/nfs4proc.c: small simplification
The Coverity checker noticed that such a simplification was possible.

 Signed-off-by: Adrian Bunk <bunk@stusta.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
Andreas Gruenbacher
213484254c [PATCH] fix nfsacl pointer arithmetic and pg_class initialization bugs
* Pointer arithmetic bug: p is in word units. This fixes a memory
  corruption with big acls.
* Initialize pg_class to prevent a NULL pointer access.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
Trond Myklebust
458818ed76 [PATCH] NFS: Fix up v3 ACL caching code
Initialize the inode cache values correctly.
 Clean up __nfs3_forget_cached_acls()

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:26 -04:00
Andreas Gruenbacher
5c6a9f7d92 [PATCH] NFS: Cache the NFSv3 acls.
Attach acls to inodes in the icache to avoid unnecessary GETACL RPC
 round-trips.  As long as the client doesn't retrieve any acls itself, only the
 default acls of exiting directories and the default and access acls of new
 directories will end up in the cache, which preserves some memory compared to
 always caching the access and default acl of all files.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:25 -04:00
Andreas Gruenbacher
055ffbea05 [PATCH] NFS: Fix handling of the umask when an NFSv3 default acl is present.
NFSv3 has no concept of a umask on the server side: The client applies
 the umask locally, and sends the effective permissions to the server.
 This behavior is wrong when files are created in a directory that has a
 default ACL.  In this case, the umask is supposed to be ignored, and
 only the default ACL determines the file's effective permissions.

 Usually its the server's task to conditionally apply the umask.  But
 since the server knows nothing about the umask, we have to do it on the
 client side.  This patch tries to fetch the parent directory's default
 ACL before creating a new file, computes the appropriate create mode to
 send to the server, and finally sets the new file's access and default
 acl appropriately.

 Many thanks to Buck Huppmann <buchk@pobox.com> for sending the initial
 version of this patch, as well as for arguing why we need this change.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:24 -04:00
Andreas Gruenbacher
b7fa0554cf [PATCH] NFS: Add support for NFSv3 ACLs
This adds acl support fo nfs clients via the NFSACL protocol extension, by
 implementing the getxattr, listxattr, setxattr, and removexattr iops for the
 system.posix_acl_access and system.posix_acl_default attributes.  This patch
 implements a dumb version that uses no caching (and thus adds some overhead).
 (Another patch in this patchset adds caching as well.)

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:24 -04:00
Andreas Gruenbacher
a257cdd0e2 [PATCH] NFSD: Add server support for NFSv3 ACLs.
This adds functions for encoding and decoding POSIX ACLs for the NFSACL
 protocol extension, and the GETACL and SETACL RPCs.  The implementation is
 compatible with NFSACL in Solaris.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Acked-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:23 -04:00
Andreas Gruenbacher
a838cc49d9 [PATCH] NFSD: Add NFS3ERR_NOTSUPP to the nfsd error mapping table
Add the missing NFS3ERR_NOTSUPP error code (defined in NFSv3) to the
 system-to-protocol-error table in nfsd.  The nfsacl extension uses this error
 code.

 Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
 Signed-off-by: Olaf Kirch <okir@suse.de>
 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:21 -04:00
J. Bruce Fields
6a19275ada [PATCH] RPC: [PATCH] improve rpcauthauth_create error returns
Currently we return -ENOMEM for every single failure to create a new auth.
 This is actually accurate for auth_null and auth_unix, but for auth_gss it's a
 bit confusing.

 Allow rpcauth_create (and the ->create methods) to return errors.  With this
 patch, the user may sometimes see an EINVAL instead.  Whee.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:16 -04:00
J. Bruce Fields
e50a1c2e1f [PATCH] NFSv4: client-side caching NFSv4 ACLs
Add nfs4_acl field to the nfs_inode, and use it to cache acls.  Only cache
 acls of size up to a page.  Also prepare for up to a page of acl data even
 when the user doesn't pass in a buffer, as when they want to get the acl
 length to decide what size buffer to allocate.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:15 -04:00
J. Bruce Fields
4b580ee3dc [PATCH] NFSv4: ACL support for the NFSv4 client: write
Client-side write support for NFSv4 ACLs.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:14 -04:00
J. Bruce Fields
23ec6965c2 [PATCH] NFSv4: Client-side xdr for writing NFSv4 acls
Client-side support for NFSv4 acls: xdr encoding and decoding routines for
 writing acls

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
aa1870af92 [PATCH] NFSv4: ACL support for the NFSv4 client: read
Client-side support for NFSv4 ACLs.  Exports the raw xdr code via the
 system.nfs4_acl extended attribute.  It is up to userspace to decode the acl
 (and to provide correctly xdr'd acls on setxattr), and to convert to/from
 POSIX ACLs if desired.

 This patch provides only the read support.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
029d105e66 [PATCH] NFSv4: Client-side xdr for reading NFSv4 acls
Client-side support for NFSv4 acls: xdr encoding and decoding routines for
 reading acls

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:12 -04:00
J. Bruce Fields
9692820696 [PATCH] NFSv4: fix fattr size calculations
Make nfs4 fattr size calculations more explicit, revising them downward a
 bit in the process.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:11 -04:00
J. Bruce Fields
6b3b5496d7 [PATCH] NFSv4: Add {get,set,list}xattr methods for nfs4
Add {get,set,list}xattr methods for nfs4.  The new methods are no-ops, to be
 used by subsequent ACL patch.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:10 -04:00
Trond Myklebust
ada70d9425 [PATCH] NFS: Add hooks to allow common NFS attribute code to clear cached acls
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
J. Bruce Fields
92cfc62cb8 [PATCH] NFS: Allow NFS versions to support different sets of inode operations.
ACL support will require supporting additional inode operations in v4
 (getxattr, setxattr, listxattr).  This patch allows different protocol versions
 to support different inode operations by adding a file_inode_ops to the
 nfs_rpc_ops (to match the existing dir_inode_ops).

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
Trond Myklebust
464a98bd70 [PATCH] NFS: cleanup: shrink struct nfs_open_context
Remove the wait queue, and replace the functions that depended on it
 with wait_on_bit().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:08 -04:00
Trond Myklebust
a656db9987 [PATCH] NFS: Remove unused NFS inode field readdir_timestamp.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:07 -04:00
Trond Myklebust
4ce79717ce [PATCH] NFS: Header file cleanup...
- Move NFSv4 state definitions into a private header file.
 - Clean up gunk in nfs_fs.h

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:06 -04:00
Trond Myklebust
9085bbcb76 [PATCH] NFS: Kill annoying mount version mismatch printks
Ensure that we fix up the missing fields in the nfs_mount_data with
 sane defaults for older versions of mount, and return errors in the
 cases where we cannot.

 Convert a bunch of annoying warnings into dprintks()

 Return -EPROTONOSUPPORT rather than EIO if mount() tries to set NFSv3
 without it actually being compiled in.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:04 -04:00
Trond Myklebust
5ee0ed7d3a [PATCH] RPC: Make rpc_create_client() probe server for RPC program+version support
Ensure that we don't create an RPC client without checking that the server
 does indeed support the RPC program + version that we are trying to set up.

 This enables us to immediately return an error to "mount" if it turns out
 that the server is only supporting NFSv2, when we requested NFSv3 or NFSv4.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:04 -04:00
Trond Myklebust
5b616f5d59 [PATCH] RPC: Make rpc_create_client() destroy the transport on failure.
This saves us a couple of lines of cleanup code for each call.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:03 -04:00
Linus Torvalds
2a5a68b840 Merge rsync://oss.sgi.com/git/xfs-2.6 2005-06-21 19:51:18 -07:00
Jeremy White
9769f4eb3f [PATCH] isofs: show hidden files, add granularity for assoc/hidden files flags
The current isofs treatment of hidden files is flawed in two ways.  First,
it does not provide sufficient granularity; it hides both 'hidden' files
and 'associated' files (resource fork for Mac files).  Second, the default
behavior to completely strip hidden files, while an admirable
implementation of the spec, is a poor choice given the real world use of
hidden files as a poor mans copy protection scheme for MSDOS and Windows
based systems.  A longer description of this is available here:

   http://www.uwsg.iu.edu/hypermail/linux/kernel/0205.3/0267.html

This patch was originally built after a few private conversations with Alan
Cox; I shamefully failed to persist in seeing it go forward, I hope to make
amends now.

This patch introduces granularity by allowing explicit control for both
hidden and associated files.  It also reverses the default so that by
default, hidden files are treated as regular files on the iso9660 file
system.

This allow Wine to process Windows CDs, including those that are hybrid
Mac/Windows CDs properly and completely, without our having to go muck up
peoples fstabs as we do now.  (I have tested this with such a hybrid +
hidden CD and have verified that this patch works as claimed).

Signed-off-by: Jeremy White <jwhite@codeweavers.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
f2966632a1 [PATCH] rock: handle directory overflows
Handle the case where the variable-sized part of a rock-ridge directory entry
overhangs the end of the buffer which we allocated for it.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
642217c17b [PATCH] rock: rename union members
The silly thing does:

	struct foo { ... };
	...
	#define foo 42

so you can no longer refer to `struct foo' in C code.

Rename the structures.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
e595447e17 [PATCH] rock.c: handle corrupted directories
The bug in rock.c is that it's totally trusting of the contents of the
directories.  If the directory says there's a continuation 10000 bytes into
this 4k block then we cheerily poke around in memory we don't own and oops.

So change rock_continue() to apply various sanity checks, at least ensuring
that the offset+length remain within the bounds for the header part of a
struct rock_ridge directory entry.

Note that the kernel can still overindex the buffer due to the variable size
of the rock-ridge directory entries.  We cannot check that in rock_continue()
unless we go parse the directory entry's signature and work out its size.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
9eb7f2c67c [PATCH] isofs: remove debug stuff
isofs/inode.c:

- Remove some crufty leak detection code

- coding style cleanups

- kfree(NULL) is permitted.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:38 -07:00
Andrew Morton
a089221c5e [PATCH] rock: lindent rock.h
So we have a couple of rock-ridge bugs.  First up, rotoroot the poor thing
into something which it is possible to work on.

Feed rock.h through Lindent, tidy a couple of things by hand.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
7373909de4 [PATCH] rock: comment tidies
Be a bit more standard in comment layout.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
ba40aaf043 [PATCH] rock: remove MAYBE_CONTINUE
- remove the MAYBE_CONTINUE macro

- kfree(NULL) is OK.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
76ab07ebc3 [PATCH] rock: remove SETUP_ROCK_RIDGE
- Remove the SETUP_ROCK_RIDGE macro.

- In rock_ridge_symlink_readpage(), rename raw_inode to raw_de.  It points
  at a directory entry, not an inode.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
04f7aa9c7d [PATCH] rock: remove CHECK_CE
Remove the CHECK_CE macro

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:37 -07:00
Andrew Morton
a40ea8f22e [PATCH] rock: remove CONTINUE_DECLS
Remove the CONTINUE_DECLS macro.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
12121714fb [PATCH] rock: remove CHECK_SP
Remove the CHECK_SP macro.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
7fa393a1d3 [PATCH] rock: manual tidies
Fix stuff which Lindent got wrong, rework a few deeply-nested blocks.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Andrew Morton
1d37211638 [PATCH] rock: lindent it
Trying to turn rock.c into something which humans can read so we can fix some
bugs.

Start out by feeding it through scripts/Lindent.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:36 -07:00
Ian Kent
1684b2bba6 [PATCH] autofs4: bad lookup fix
For browsable autofs maps, a mount request that arrives at the same time an
expire is happening can fail to perform the needed mount.

This happens becuase the directory exists and so the revalidate succeeds when
we need it to fail so that lookup is called on the same dentry to do the
mount.  Instead lookup is called on the next path component which should be
whithin the mount, but the parent isn't mounted.

The solution is to allow the revalidate to continue and perform the mount as
no directory creation (at mount time) is needed for browsable mount entries.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Ian Kent
cc9acc8858 [PATCH] autofs4: post expire race fix
At the tail end of an expire it's possible for a process to enter
autofs4_wait, with a waitq type of NFY_NONE but find that the expire is
finished.  In this cause autofs4_wait will try to create a new wait but not
notify the daemon leading to a hang.  As the wait type is meant to delay mount
requests from revalidate or lookup during an expire and the expire is done all
we need to do is check if the dentry is a mountpoint.  If it's not then we're
done.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Ian Kent
9b1e3afd6d [PATCH] autofs4: avoid panic on bind mount of autofs owned directory
While this is not a solution to bind and move mounts on autofs owned
directories it is necessary to fix the trady error handling.

At least it avoids the kernel panic I observed checking out bug #4589.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 19:07:35 -07:00
Gerald Schaefer
8680e22f29 [PATCH] VFS: memory leak in do_kern_mount()
There is a memory leak during mount when CONFIG_SECURITY is enabled and
mount options are specified.

Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:22 -07:00
Darren Hart
1ad539b2bd [PATCH] vm: try_to_free_pages unused argument
try_to_free_pages accepts a third argument, order, but hasn't used it since
before 2.6.0.  The following patch removes the argument and updates all the
calls to try_to_free_pages.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:17 -07:00
Wolfgang Wander
1363c3cd86 [PATCH] Avoiding mmap fragmentation
Ingo recently introduced a great speedup for allocating new mmaps using the
free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
causes huge performance increases in thread creation.

The downside of this patch is that it does lead to fragmentation in the
mmap-ed areas (visible via /proc/self/maps), such that some applications
that work fine under 2.4 kernels quickly run out of memory on any 2.6
kernel.

The problem is twofold:

  1) the free_area_cache is used to continue a search for memory where
     the last search ended.  Before the change new areas were always
     searched from the base address on.

     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region whereas before small holes
     tended to close holes near the base leaving holes far from the base
     large and available for larger requests.

  2) the free_area_cache also is set to the location of the last
     munmap-ed area so in scenarios where we allocate e.g.  five regions of
     1K each, then free regions 4 2 3 in this order the next request for 1K
     will be placed in the position of the old region 3, whereas before we
     appended it to the still active region 1, placing it at the location
     of the old region 2.  Before we had 1 free region of 2K, now we only
     get two free regions of 1K -> fragmentation.

The patch addresses thes issues by introducing yet another cache descriptor
cached_hole_size that contains the largest known hole size below the
current free_area_cache.  If a new request comes in the size is compared
against the cached_hole_size and if the request can be filled with a hole
below free_area_cache the search is started from the base instead.

The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
(earlier posted) leakme.c test program terminates after 50000+ iterations
with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
(as expected) with thread creation, Ingo's test_str02 with 20000 threads
requires 0.7s system time.

Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases we observe: leakme
terminates successfully with 11 distinctive hardly fragmented areas in
/proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.

Now - drumroll ;-) the appended patch works fine with leakme: it ends with
only 7 distinct areas in /proc/self/maps and also thread creation seems
sufficiently fast with 0.71s for 20000 threads.

Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:16 -07:00
Nikita Danilov
295ab93497 [PATCH] mm: add /proc/zoneinfo
Add /proc/zoneinfo file to display information about memory zones.  Useful
to analyze VM behaviour.

Signed-off-by: Nikita Danilov <nikita@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Ingo Molnar
39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
Dean Roehrich
e1a40fa907 [XFS] Handle inode semaphores properly for dmapi queues
SGI-PV: 931572
SGI-Modid: xfs-linux-melb:xfs-kern:189560a

Signed-off-by: Dean Roehrich <roehrich@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-22 10:20:44 +10:00
Greg KH
2c6e5a839f [PATCH] devfs: remove devfs from Kconfig preventing it from being built
Here's a much smaller patch to simply disable devfs from the build.  If
this goes well, and there are no complaints for a few weeks, I'll resend
my big "devfs-die-die-die" series of patches that rip the whole thing
out of the kernel tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 15:41:16 -07:00
Linus Torvalds
9527cc77e2 Merge 'for-linus' branch of rsync://rsync.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 2005-06-21 14:49:35 -07:00
Nathan Scott
ad89d0212e [XFS] Remove some debugging code from quota syscalls.
SGI-PV: 932952
SGI-Modid: xfs-linux-melb:xfs-kern:22929a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:57:57 +10:00
Nathan Scott
754002b4fb [XFS] Merge a few minor fixes to the quota warning code.
SGI-PV: 938145
SGI-Modid: xfs-linux:xfs-kern:22901a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:49:06 +10:00
Nathan Scott
06d10dd9ca [XFS] Merge fixes into realtime quota code, since one/two reported, still
not enabled though.

SGI-PV: 938145
SGI-Modid: xfs-linux:xfs-kern:22900a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:48:47 +10:00
Nathan Scott
77bc5beb59 [XFS] Makes more sense to use the fsxattr interface instead of adding new
ioctls for project IDs.

SGI-PV: 938145
SGI-Modid: xfs-linux:xfs-kern:22899a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:48:04 +10:00
Christoph Hellwig
bd5a876ac4 [XFS] (mostly) remove xfs_inval_cached_pages Since the last round of
direct I/O locking changes it is just a wrapper around
VOP_FLUSHINVAL_PAGES, so it's not nessecary anymore.  Keep a simplified
version for kernels < 2.4.22, as these don't have the changed direct I/O
locking.

SGI-PV: 938064
SGI-Modid: xfs-linux:xfs-kern:194420a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:47:39 +10:00
Christoph Hellwig
d130c14c03 [XFS] simplify ASSERT
SGI-PV: 938063
SGI-Modid: xfs-linux:xfs-kern:194416a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:43:22 +10:00
Christoph Hellwig
7d795ca344 [XFS] consolidate extent item freeing
SGI-PV: 938062
SGI-Modid: xfs-linux:xfs-kern:194415a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:41:19 +10:00
Christoph Hellwig
f898d6c09c [XFS] quiesce the filesystem proper when freezing
SGI-PV: 936977
SGI-Modid: xfs-linux:xfs-kern:193840a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:40:48 +10:00
Christoph Hellwig
48fab6bf5f [XFS] add XFS_INOBT_IS_FREE_DISK
SGI-PV: 928382
SGI-Modid: xfs-linux:xfs-kern:193778a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:40:20 +10:00
Eric Sandeen
6add2c4288 [XFS] Fix up some warning fallout from functions made static
SGI-PV: 936255
SGI-Modid: xfs-linux:xfs-kern:193691a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:39:44 +10:00
Nathan Scott
365ca83d50 [XFS] Add support for project quota inheritance, a merge of Glens changes.
SGI-PV: 932952
SGI-Modid: xfs-linux:xfs-kern:22806a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:39:12 +10:00
Nathan Scott
c8ad20ffeb [XFS] Add support for project quota, based on Dan Knappes earlier work.
SGI-PV: 932952
SGI-Modid: xfs-linux:xfs-kern:22805a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:38:48 +10:00
Christoph Hellwig
8401e9631c [XFS] remove xfs_incore_relse
SGI-PV: 936977
SGI-Modid: xfs-linux:xfs-kern:193409a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:38:03 +10:00
Christoph Hellwig
66f58d236f [XFS] simplify XFS_PURGE_INODE
SGI-PV: 936891
SGI-Modid: xfs-linux:xfs-kern:193408a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:37:43 +10:00
Christoph Hellwig
efa8027804 [XFS] rewrite xfs_iflush_all
SGI-PV: 936890
SGI-Modid: xfs-linux:xfs-kern:193349a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:37:17 +10:00
Christoph Hellwig
ba0f32d460 [XFS] mark various symbols static Patch from Adrian Bunk
SGI-PV: 936255
SGI-Modid: xfs-linux:xfs-kern:192760a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:36:52 +10:00
Christoph Hellwig
4372d6e103 [XFS] Remove dead code. Patch from Adrian Bunk
SGI-PV: 936255
SGI-Modid: xfs-linux:xfs-kern:192759a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:36:00 +10:00
Christoph Hellwig
cf9937c6c6 [XFS] Fix pagebuf slab initialization
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:192756a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:35:24 +10:00
Christoph Hellwig
02de1f0abf [XFS] fix some more compiler warnings in the vnode tracing code
SGI-PV: 934679
SGI-Modid: xfs-linux:xfs-kern:192570a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:33:48 +10:00
Christoph Hellwig
23ea4032c8 [XFS] rename various pagebuf symbols to xfsbuf
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:192348a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 15:14:01 +10:00
Dean Roehrich
6fac0cb46b [XFS] coordinate mmap calls with xfs_dm_punch_hole
SGI-PV: 933551
SGI-Modid: xfs-linux:xfs-kern:190622a

Signed-off-by: Dean Roehrich <roehrich@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 14:07:45 +10:00
Nathan Scott
b74e2159c9 [XFS] Add a get/set interface for XFS project identifiers.
SGI-PV: 932952
SGI-Modid: xfs-linux:xfs-kern:21938a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-06-21 13:21:49 +10:00
Jon Smirl
9d9d27fb65 [PATCH] SYSFS: fix PAGE_SIZE check
Without this change I can't set an attribute exactly PAGE_SIZE in
length. There is no need for zero termination because the interface
uses lengths.

From: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:38 -07:00
Maneesh Soni
8215534ce7 [PATCH] sysfs-iattr: set inode attributes
o Following patch sets the attributes for newly allocated inodes for sysfs
  objects. If the object has non-default attributes, inode attributes are
  set as saved in sysfs_dirent->s_iattr, pointer to struct iattr.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:37 -07:00
Maneesh Soni
988d186de5 [PATCH] sysfs-iattr: add sysfs_setattr
o This adds ->i_op->setattr VFS method for sysfs inodes. The changed
  attribues are saved in the persistent sysfs_dirent structure as a pointer
  to struct iattr. The struct iattr is allocated only for those sysfs_dirent's
  for which default attributes are getting changed. Thanks to Jon Smirl for
  this suggestion.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:37 -07:00
Maneesh Soni
6fa5c828c7 [PATCH] sysfs-iattr: attach sysfs_dirent before new inode
o The following patch makes sure to attach sysfs_dirent to the dentry before
  allocation a new inode through sysfs_create(). This change is done as
  preparatory work for implementing ->i_op->setattr() functionality for
  sysfs objects.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:36 -07:00
Arnd Bergmann
acaefc25d2 [PATCH] libfs: add simple attribute files
Based on the discussion about spufs attributes, this is my suggestion
for a more generic attribute file support that can be used by both
debugfs and spufs.

Simple attribute files behave similarly to sequential files from
a kernel programmers perspective in that a standard set of file
operations is provided and only an open operation needs to
be written that registers file specific get() and set() functions.

These operations are defined as

void foo_set(void *data, u64 val); and
u64 foo_get(void *data);

where data is the inode->u.generic_ip pointer of the file and the
operations just need to make send of that pointer. The infrastructure
makes sure this works correctly with concurrent access and partial
read calls.

A macro named DEFINE_SIMPLE_ATTRIBUTE is provided to further simplify
using the attributes.

This patch already contains the changes for debugfs to use attributes
for its internal file operations.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:30 -07:00
gregkh@suse.de
1db560afe6 [PATCH] class: convert the remaining class_simple users in the kernel to usee the new class api
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:11 -07:00
Dmitry Torokhov
c76d0abd07 [PATCH] sysfs: if show/store is missing return -EIO
sysfs: if attribute does not implement show or store method
       read/write should return -EIO instead of 0 or -EINVAL.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:02 -07:00
Dmitry Torokhov
e3a15db241 [PATCH] sysfs_{create|remove}_link should take const char *
sysfs: make sysfs_{create|remove}_link to take const char * name.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:00 -07:00
Dave Kleikamp
d039ba24f1 Merge with /home/shaggy/git/linus-clean/ 2005-06-20 08:44:00 -05:00
James Bottomley
f1970baf6d [PATCH] Add scatter-gather support for the block layer SG_IO
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20 14:06:52 +02:00
Jens Axboe
b823825e8e [PATCH] Keep the bio end_io parts inside of bio.c for blk_rq_map_kern()
Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20 14:05:27 +02:00
Mike Christie
df46b9a44c [PATCH] Add blk_rq_map_kern()
Add blk_rq_map_kern which takes a kernel buffer and maps it into
a request and bio. This can be used by the dm hw_handlers, old
sg_scsi_ioctl, and one day scsi special requests so all requests
comming into scsi will have bios. All requests having bios
should allow scsi to use scatter lists for all IO and allow it
to use block layer functions.

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-06-20 14:04:44 +02:00
Linus Torvalds
c2a0f5943d Clean up subthread exec
Make sure we re-parent itimers, and use BUG_ON() instead of an explicit
conditional BUG().
2005-06-18 13:06:22 -07:00
Daniel Jacobowitz
5db92850d3 [PATCH] Fix large core dumps with a 32-bit off_t
The ELF core dump code has one use of off_t when writing out segments.
Some of the segments may be passed the 2GB limit of an off_t, even on a
32-bit system, so it's important to use loff_t instead.  This fixes a
corrupted core dump in the bigcore test in GDB's testsuite.

Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-16 09:02:59 -07:00
Trond Myklebust
980802e311 [PATCH] NFS: Ensure that we revalidate the cached file length for llseek(SEEK_END)
This fixes a data corruption error for mail delivery applications that
expect to be able to do posix locking and then append writes on NFS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-13 10:33:02 -07:00
Steve French
f5d9b97ee0 Merge with rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-06-09 14:44:56 -07:00
Steve French
3079ca621e [CIFS] Fix cifs update of page cache. Write at correct offset when out of memory
and add_to_page_cache fails.  

Thanks to Shaggy for pointing out the fix.

Signed-off-by: Steve French (sfrench@us.ibm.com)
Signed-off-by: Shaggy (shaggy@us.ibm.com)
2005-06-09 14:44:07 -07:00
Anton Altaparmakov
364f6c717d Automatic merge with /usr/src/ntfs-2.6.git 2005-06-08 15:45:45 +01:00
Trond Myklebust
1d6757fbff [PATCH] NFS: Fix lookup intent handling
We should never apply a lookup intent to anything other than the last
path component in an open(), create() or access() call.

Introduce the helper nfs_lookup_check_intent() which always returns
zero if LOOKUP_CONTINUE or LOOKUP_PARENT are set, and returns the
intent flags if we're on the last component of the lookup.
By doing so, we fix a bug in open(O_EXCL), where we may end up
optimizing away a real lookup of the parent directory.

Problem noticed by Linda Dunaphant <linda.dunaphant@ccur.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-07 15:53:47 -07:00
Yoshinori Sato
8f5bb0438b [PATCH] binfmt_flat mmap flag fix
Make sure that binfmt_flat passes the correct flags into do_mmap().  nommu's
validate_mmap_request() will simple return -EINVAL if we try and pass it a
flags value of zero.

Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-06 14:57:51 -07:00
Al Viro
d671a1cbf7 [PATCH] namei fixes (19/19)
__do_follow_link() passes potentially worng vfsmount to touch_atime().  It
matters only in (currently impossible) case of symlink mounted on something,
but it's trivial to fix and that actually makes more sense.

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-06 14:42:27 -07:00