Commit Graph

2725 Commits

Author SHA1 Message Date
Ingo Molnar
73d62d83ec [PATCH] splice: comment styles
- capitalize consistently
 - end sentences in one way or another
 - update comment text to match the implementation

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:57:21 +02:00
Jens Axboe
c2058e0611 [PATCH] splice: add Ingo as addition copyright holder
The comment is also somewhat out of date, correct that as well.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:56:34 +02:00
Jens Axboe
49570e9b29 [PATCH] splice: unlikely() optimizations
Also corrects a few comments. Patch mainly from Ingo, changes by me.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:56:09 +02:00
Jens Axboe
6f767b0425 [PATCH] splice: speedups and optimizations
- Kill the local variables that cache ->nrbufs, they just take up space.

- Only set do_wakeup for a real pipe. This is a big win for direct splicing.

- Kill i_mutex lock around ->f_pos update, regular io paths don't do this
  either.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:56 +02:00
Ingo Molnar
923f4f2394 [PATCH] pipe.c/fifo.c code cleanups
more code cleanups after the macro conversion:

 - standardize on 'struct pipe_inode_info *pipe' variable names
 - introduce 'pipe' temporaries to reduce mass inode->i_pipe dereferencing

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:33 +02:00
Ingo Molnar
9aeedfc471 [PATCH] get rid of the PIPE_*() macros
get rid of the PIPE_*() macros. Scripted transformation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:53:10 +02:00
Jens Axboe
7480a90435 [PATCH] splice: speedup __generic_file_splice_read
Using find_get_page() is a lot faster than find_or_create_page(). This
gets splice a lot closer to sendfile() for fd -> socket transfers.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:52:47 +02:00
Jens Axboe
b92ce55893 [PATCH] splice: add direct fd <-> fd splicing support
It's more efficient for sendfile() emulation. Basically we cache an
internal private pipe and just use that as the intermediate area for
pages. Direct splicing is not available from sys_splice(), it is only
meant to be used for sendfile() emulation.

Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
exit for the normal fast path.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-11 13:52:07 +02:00
Nathan Scott
019ff2d57b [XFS] Fix a problem in aligning inode allocations to stripe unit
boundaries.

SGI-PV: 951862
SGI-Modid: xfs-linux-melb:xfs-kern:25726a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:45:05 +10:00
Nathan Scott
8c0b5113a5 [XFS] Fix utime(2) in the case that no times parameter was passed in.
SGI-PV: 949858
SGI-Modid: xfs-linux-melb:xfs-kern:25717a

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:12:45 +10:00
David Chinner
58829e490e [XFS] Fix an inode use-after-free durin an unpin. When reclaiming inodes
that have been unlinked, we may need to execute transactions during
reclaim. By the time the transaction has hit the disk, the linux inode and
xfs vnode may already have been freed so we can't reference them safely.
Use the known xfs inode state to determine if it is safe to reference the
vnode and linux inode during the unpin operation.

SGI-PV: 946321
SGI-Modid: xfs-linux-melb:xfs-kern:25687a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:11:20 +10:00
David Chinner
1fc5d959d8 [XFS] Fix inode reclaim scalability regression. When a filesystem has
millions of inodes cached and has sparse cluster population, removing
inodes from the cluster hash consumes excessive amounts of CPU time.
Reduce the CPU cost by making removal O(1) via use of a double linked list
for the hash chains.

SGI-PV: 951551
SGI-Modid: xfs-linux-melb:xfs-kern:25683a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:11:12 +10:00
Nathan Scott
8272145c05 [XFS] Fix a writepage regression where we accidentally stopped honouring
nonblock mode with the new IO path code (since 2.6.16).

SGI-PV: 951662
SGI-Modid: xfs-linux-melb:xfs-kern:25676a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:10:55 +10:00
Nathan Scott
e50bd16fe4 [XFS] Fix superblock validation regression for the zero imaxpct case.
Thanks to kjamieson for noticing.

SGI-PV: 951661
SGI-Modid: xfs-linux-melb:xfs-kern:25675a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-04-11 15:10:45 +10:00
Linus Torvalds
e38d557896 Merge branch 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
  [PATCH] CONFIGFS_FS must depend on SYSFS
  [PATCH] Bogus NULL pointer check in fs/configfs/dir.c
  ocfs2: Better I/O error handling in heartbeat
  ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
  ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
  ocfs2: catch an invalid ast case in dlmfs
  ocfs2: remove an overly aggressive BUG() in dlmfs
  ocfs2: multi node truncate fix
2006-04-10 16:44:09 -07:00
Eric W. Biederman
de12a7878c [PATCH] de_thread: Don't confuse users do_each_thread.
Oleg Nesterov spotted two interesting bugs with the current de_thread
code.  The simplest is a long standing double decrement of
__get_cpu_var(process_counts) in __unhash_process.  Caused by
two processes exiting when only one was created.

The other is that since we no longer detach from the thread_group list
it is possible for do_each_thread when run under the tasklist_lock to
see the same task_struct twice.  Once on the task list as a
thread_group_leader, and once on the thread list of another
thread.

The double appearance in do_each_thread can cause a double increment
of mm_core_waiters in zap_threads resulting in problems later on in
coredump_wait.

To remedy those two problems this patch takes the simple approach
of changing the old thread group leader into a child thread.
The only routine in release_task that cares is __unhash_process,
and it can be trivially seen that we handle cleaning up a
thread group leader properly.

Since de_thread doesn't change the pid of the exiting leader process
and instead shares it with the new leader process.  I change
thread_group_leader to recognize group leadership based on the
group_leader field and not based on pids.  This should also be
slightly cheaper then the existing thread_group_leader macro.

I performed a quick audit and I couldn't see any user of
thread_group_leader that cared about the difference.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-10 16:36:50 -07:00
Adrian Bunk
65714b9184 [PATCH] CONFIGFS_FS must depend on SYSFS
This patch fixes the a compile error with CONFIG_SYSFS=n

Configfs is creating, as a matter of policy, the /sys/kernel/config
mountpoint.  This means it requires CONFIG_SYSFS.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10 11:17:21 -07:00
Eric Sesterhenn
cbca692c24 [PATCH] Bogus NULL pointer check in fs/configfs/dir.c
We check the "group" pointer after we dereference it.  This check is
bogus, as it cannot be NULL coming in.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-10 11:16:17 -07:00
Ingo Molnar
529565dcb1 [PATCH] splice: add optional input and output offsets
add optional input and output offsets to sys_splice(), for seekable file
descriptors:

 asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
                            int fd_out, loff_t __user *off_out,
                            size_t len, unsigned int flags);

semantics are straightforward: f_pos will be updated with the offset
provided by user-space, before the splice transfer is about to begin.
Providing a NULL offset pointer means the existing f_pos will be used
(and updated in situ).  Providing an offset for a pipe results in
-ESPIPE. Providing an invalid offset pointer results in -EFAULT.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 15:18:58 +02:00
Ingo Molnar
3a326a2ce8 [PATCH] introduce a "kernel-internal pipe object" abstraction
separate out the 'internal pipe object' abstraction, and make it
usable to splice. This cleans up and fixes several aspects of the
internal splice APIs and the pipe code:

 - pipes: the allocation and freeing of pipe_inode_info is now more symmetric
   and more streamlined with existing kernel practices.

 - splice: small micro-optimization: less pointer dereferencing in splice
   methods

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Update XFS for the ->splice_read/->splice_write changes.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 15:18:35 +02:00
Jens Axboe
0b749ce380 [PATCH] splice: be smarter about calling do_page_cache_readahead()
We don't want to call into the read-ahead logic unless we are at the
start of a page, _or_ we have multiple pages to read.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:05:04 +02:00
Jens Axboe
49d0b21be2 [PATCH] splice: optimize the splice buffer mapping
We don't really need to lock down the pages, just make sure they
are uptodate.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:04:41 +02:00
Jens Axboe
16c523ddab [PATCH] splice: cleanup __generic_file_splice_read()
The whole shadow/pages logic got overly complex, and this simpler
approach is actually faster in testing.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:03:58 +02:00
Jens Axboe
c0bd1f650b [PATCH] splice: only call wake_up_interruptible() when we really have to
__wake_up_common() is pretty heavy in the kernel profiles, this brings
it down to a more acceptable level.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:03:32 +02:00
Dave Jones
9aefe431f5 [PATCH] splice: potential !page dereference
We can get to out: with a NULL page, which we probably
don't want to be calling page_cache_release() on.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:02:40 +02:00
Jens Axboe
c7f21e4f5a [PATCH] splice: mark the io page as accessed
We should do that, since we do the LRU manipulation ourselves now. Suggested
by Nick Piggin.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-10 09:01:01 +02:00
Mark Fasheh
a9e2ae3917 ocfs2: Better I/O error handling in heartbeat
Propagate errors received in o2hb_bio_end_io() back to the heartbeat thread
so it can skip re-arming the timer.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 18:03:09 -07:00
Mark Fasheh
2cd9888590 ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:39:43 -07:00
Mark Fasheh
f43e6918c0 ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
Remove the code which attempted to catch it via dlmunlock() return status -
this never happens there.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:37:52 -07:00
Mark Fasheh
cc6eb72595 ocfs2: catch an invalid ast case in dlmfs
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:36:16 -07:00
Mark Fasheh
1f7bc828e3 ocfs2: remove an overly aggressive BUG() in dlmfs
Don't BUG() user_dlm_unblock_lock() on the absence of the USER_LOCK_BLOCKED
flag - this turns out to be a valid case. Make some of the related BUG()
statements print more useful information.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 17:27:43 -07:00
Mark Fasheh
ab0920ce7e ocfs2: multi node truncate fix
Fix ocfs2_truncate_file() so that it forces a truncate_inode_pages() on all
interested nodes in all cases of a truncate(), not just allocation change.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-04-07 16:47:24 -07:00
Linus Torvalds
d69636157a Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] splice: fix page stealing LRU handling.
  [PATCH] splice: page stealing needs to wait_on_page_writeback()
  [PATCH] splice: export generic_splice_sendpage
  [PATCH] splice: add a SPLICE_F_MORE flag
  [PATCH] splice: add comments documenting more of the code
  [PATCH] splice: improve writeback and clean up page stealing
  [PATCH] splice: fix shadow[] filling logic
2006-04-02 14:22:06 -07:00
Jens Axboe
3e7ee3e7b3 [PATCH] splice: fix page stealing LRU handling.
Originally from Nick Piggin, just adapted to the newer branch.

You can't check PageLRU without holding zone->lru_lock.  The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:11:04 +02:00
Jens Axboe
ad8d6f0a78 [PATCH] splice: page stealing needs to wait_on_page_writeback()
Thanks to Andrew for the good explanation of why this is so. akpm writes:

If a page is under writeback and we remove it from pagecache, it's still
going to get written to disk.  But the VFS no longer knows about that page,
nor that this page is about to modify disk blocks.

So there might be scenarios in which those
blocks-which-are-about-to-be-written-to get reused for something else.
When writeback completes, it'll scribble on those blocks.

This won't happen in ext2/ext3-style filesystems in normal mode because the
page has buffers and try_to_release_page() will fail.

But ext2 in nobh mode doesn't attach buffers at all - it just sticks the
page in a BIO, finds some new blocks, points the BIO at those blocks and
lets it rip.

While that write IO's in flight, someone could truncate the file.  Truncate
won't block on the writeout because the page isn't in pagecache any more.
So truncate will the free the blocks from the file under the page's feet.
Then something else can reallocate those blocks.  Then write data to them.

Now, the original write completes, corrupting the filesystem.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:10:32 +02:00
Jens Axboe
059a8f3734 [PATCH] splice: export generic_splice_sendpage
Forgot that one, thanks Jeff. Also move the other EXPORT_SYMBOL
to right below the functions.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:06:05 +02:00
Jens Axboe
b2b39fa478 [PATCH] splice: add a SPLICE_F_MORE flag
This lets userspace indicate whether more data will be coming in a
subsequent splice call.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:41 +02:00
Jens Axboe
83f9135bdd [PATCH] splice: add comments documenting more of the code
Hopefully this will make Andrew a little more happy.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:05:09 +02:00
Jens Axboe
4f6f0bd2ff [PATCH] splice: improve writeback and clean up page stealing
By cleaning up the writeback logic (killing write_one_page() and the manual
set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
just keep it local in pipe_to_file().

This also adds dirty page balancing logic and O_SYNC handling.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:46 +02:00
Jens Axboe
53cd9ae886 [PATCH] splice: fix shadow[] filling logic
Clear the entire range, and don't increment pidx or we keep filling
the same position again and again.

Thanks to KAMEZAWA Hiroyuki.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-04-02 23:04:21 +02:00
Linus Torvalds
a2308b7f08 Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
  [XFS] Provide XFS support for the splice syscall.
  [XFS] Reenable write barriers by default.
  [XFS] Make project quota enforcement return an error code consistent with
  [XFS] Implement the silent parameter to fill_super, previously ignored.
  [XFS] Cleanup comment to remove reference to obsoleted function
2006-04-02 13:11:25 -07:00
Greg Kroah-Hartman
6e0dd741a8 [PATCH] sysfs: zero terminate sysfs write buffers
No one should be writing a PAGE_SIZE worth of data to a normal sysfs
file, so properly terminate the buffer.

Thanks to Al Viro for pointing out my supidity here.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 13:03:31 -07:00
Linus Torvalds
63589ed078 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
  Documentation: fix minor kernel-doc warnings
  BUG_ON() Conversion in drivers/net/
  BUG_ON() Conversion in drivers/s390/net/lcs.c
  BUG_ON() Conversion in mm/slab.c
  BUG_ON() Conversion in mm/highmem.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/signal.c
  BUG_ON() Conversion in kernel/ptrace.c
  BUG_ON() Conversion in ipc/shm.c
  BUG_ON() Conversion in fs/freevxfs/
  BUG_ON() Conversion in fs/udf/
  BUG_ON() Conversion in fs/sysv/
  BUG_ON() Conversion in fs/inode.c
  BUG_ON() Conversion in fs/fcntl.c
  BUG_ON() Conversion in fs/dquot.c
  BUG_ON() Conversion in md/raid10.c
  BUG_ON() Conversion in md/raid6main.c
  BUG_ON() Conversion in md/raid5.c
  Fix minor documentation typo
  BFP->BPF in Documentation/networking/tuntap.txt
  ...
2006-04-02 12:58:45 -07:00
Linus Torvalds
29e350944f splice: add SPLICE_F_NONBLOCK flag
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-02 12:46:35 -07:00
Martin Waitz
a580290c3e Documentation: fix minor kernel-doc warnings
This patch updates the comments to match the actual code.

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:59:55 +02:00
Eric Sesterhenn
7ec7073809 BUG_ON() Conversion in fs/freevxfs/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:41:02 +02:00
Eric Sesterhenn
2c2111c2bd BUG_ON() Conversion in fs/udf/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:40:13 +02:00
Eric Sesterhenn
d6735bfcc9 BUG_ON() Conversion in fs/sysv/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:39:21 +02:00
Eric Sesterhenn
b7542f8c7e BUG_ON() Conversion in fs/inode.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:38:18 +02:00
Eric Sesterhenn
f6298aab2e BUG_ON() Conversion in fs/fcntl.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:37:19 +02:00
Eric Sesterhenn
8abf6a4707 BUG_ON() Conversion in fs/dquot.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-02 13:36:13 +02:00
Adrian Bunk
733f896927 Merge with git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2006-04-02 10:37:38 +02:00
Linus Torvalds
547a77ae62 Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix typo in earlier cifs_unlink change and protect one
  [CIFS] Incorrect signature sent on SMB Read
  [CIFS] Fix unlink oops when indirectly called in rename error path
  [CIFS] Fix two remaining coverity scan tool warnings.
  [CIFS] Set correct lock type on new posix unlock call
  [CIFS] Upate cifs change log
  [CIFS] Fix slow oplock break response when mounts to different
  [CIFS] Workaround various server bugs found in testing at connectathon
  [CIFS] Allow fallback for setting file size to Procom SMB server when
  [CIFS] Make POSIX CIFS Extensions SetFSInfo match exactly what we want
  [CIFS] Move noisy debug message (triggerred by some older servers) from
  [CIFS] Use correct pid on new cifs posix byte range lock call
  [CIFS] Add posix (advisory) byte range locking support to cifs client
  [CIFS] CIFS readdir perf optimizations part 1
  [CIFS] Free small buffers earlier so we exceed the cifs
  [CIFS] Fix large (ie over 64K for MaxCIFSBufSize) buffer case for wrapping
  [CIFS] Convert remaining places in fs/cifs from
  [CIFS] SessionSetup cleanup part 2
  [CIFS] fix compile error (typo) and warning in cifssmb.c
  [CIFS] Cleanup NTLMSSP session setup handling
2006-03-31 21:27:53 -08:00
Eric Sesterhenn
99cee0cd75 BUG_ON() Conversion in fs/sysfs/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:18:38 +02:00
Eric Sesterhenn
5df0d31241 BUG_ON() Conversion in fs/smbfs/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:16:26 +02:00
Eric Sesterhenn
4b4d1cc733 BUG_ON() Conversion in fs/jffs2/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:15:35 +02:00
Eric Sesterhenn
0bf3ba538a BUG_ON() Conversion in fs/hfsplus/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:14:43 +02:00
Eric Sesterhenn
7dddb12c63 BUG_ON() Conversion in fs/exec.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:13:38 +02:00
Eric Sesterhenn
d4569d2e69 BUG_ON() Conversion in fs/direct-io.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-04-01 01:10:13 +02:00
Steve French
06bcfedd05 [CIFS] Fix typo in earlier cifs_unlink change and protect one
extra path.

Since cifs_unlink can also be called from rename path and there
was one report of oops am making the extra check for null inode.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 22:43:50 +00:00
Steve French
e9917a000f [CIFS] Incorrect signature sent on SMB Read
Fixes Samba bug 3621 and kernel.org bug 6147

For servers which require SMB/CIFS packet signing, we were sending the
wrong signature (all zeros) on SMB Read request.  The new cifs routine
to do signatures across an iovec was not complete - and SMB Read, unlike
the new SMBWrite2, did not fall back to the older routine (ie use
SendReceive vs. the more efficient SendReceive2 ie used the older
cifs_sign_smb vs. the disabled  cifs_sign_smb2) for calculating signatures.

This finishes up cifs_sign_smb2/cifs_calc_signature2 so that the callers
of SendReceive2 can get SMB/CIFS packet signatures.

Now that cifs_sign_smb2 is supported, we could start using it in
the write path but this smaller fix does not include the change
to use SMBWrite2 when signatures are required (which when enabled
will make more Writes more efficient and alloc less memory).
Currently Write2 is only used when signatures are not
required at the moment but after more testing we will enable
that as well).

Thanks to James Slepicka and Sam Flory for initial investigation.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 21:22:00 +00:00
Jes Sorensen
30c14e40ed [PATCH] avoid unaligned access when accessing poll stack
Commit 70674f95c0:

  [PATCH] Optimize select/poll by putting small data sets on the stack

resulted in the poll stack being 4-byte aligned on 64-bit architectures,
causing misaligned accesses to elements in the array.

This patch fixes it by declaring the stack in terms of 'long' instead
of 'char'.

Force alignment of poll and select stacks to long to avoid unaligned
access on 64 bit architectures.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:30:48 -08:00
Adrian Bunk
a244e1698a [PATCH] fs/namei.c: make lookup_hash() static
As announced, lookup_hash() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:01 -08:00
Eric W. Biederman
3e7e241f8c [PATCH] dcache: Add helper d_hash_and_lookup
It is very common to hash a dentry and then to call lookup.  If we take fs
specific hash functions into account the full hash logic can get ugly.
Further full_name_hash as an inline function is almost 100 bytes on x86 so
having a non-inline choice in some cases can measurably decrease code size.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:19:00 -08:00
Herbert Poetzl
e4e5d3fc80 [PATCH] cleanup in proc_check_chroot()
proc_check_chroot() does the check in a very unintuitive way (keeping a
copy of the argument, then modifying the argument), and has uncommented
sideeffects.

Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:59 -08:00
Trond Myklebust
993dfa8776 [PATCH] fs/locks.c: Fix sys_flock() race
sys_flock() currently has a race which can result in a double free in the
multi-thread case.

Thread 1			Thread 2

sys_flock(file, LOCK_EX)
				sys_flock(file, LOCK_UN)

If Thread 2 removes the lock from inode->i_lock before Thread 1 tests for
list_empty(&lock->fl_link) at the end of sys_flock, then both threads will
end up calling locks_free_lock for the same lock.

Fix is to make flock_lock_file() do the same as posix_lock_file(), namely
to make a copy of the request, so that the caller can always free the lock.

This also has the side-effect of fixing up a reference problem in the
lockd handling of flock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:56 -08:00
Amy Griffis
7a2bd3f7ef [PATCH] inotify: IN_DELETE events missing
IN_DELETE events are no longer generated for the removal of a file from a
watched directory.

This seems to be a result of clearing DCACHE_INOTIFY_PARENT_WATCHED in
d_delete() directly before calling fsnotify_nameremove().

Assuming the flag doesn't need to be cleared before dentry_iput(), this
should do the trick.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Acked-by: Robert Love <rml@novell.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
OGAWA Hirofumi
094e320d76 [PATCH] fat: kill reserved names
Since these names on old MSDOS is used as device, so, current fat driver
doesn't allow a user to create those names.  But many OSes and even Windows
can create those names actually, now.

This patch removes the reserved name check.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:55 -08:00
Andrew Morton
f79e2abb9b [PATCH] sys_sync_file_range()
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:54 -08:00
Joe Korty
68eef3b479 [PATCH] Simplify proc/devices and fix early termination regression
Make baby-simple the code for /proc/devices.  Based on the proven design
for /proc/interrupts.

This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:

    # dd if=/proc/devices bs=1
    Character devices:
      1 mem
    27+0 records in
    27+0 records out

This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.

[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:53 -08:00
Miklos Szeredi
5ce29646eb [PATCH] locks: don't panic
Don't panic!  Just BUG_ON().

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:52 -08:00
Al Viro
347f217cec [PATCH] uml: __user annotations
__user annotations (hppfs)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-31 12:18:51 -08:00
Steve French
f1682e94a3 Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 15:43:35 +00:00
Jeff Garzik
a0f0678025 [PATCH] splice exports
Woe be unto he who builds their filesystems as modules.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
[ Obscure quote from the infamous geek bible? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 22:16:24 -08:00
Steve French
6910ab30a2 [CIFS] Fix unlink oops when indirectly called in rename error path
under heavy stress.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 03:37:08 +00:00
Steve French
d62e54abca Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-31 03:35:56 +00:00
Nathan Scott
1b895840ce [XFS] Provide XFS support for the splice syscall.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-31 13:08:59 +10:00
Nathan Scott
3bbcc8e397 [XFS] Reenable write barriers by default.
SGI-PV: 912426
SGI-Modid: xfs-linux-melb:xfs-kern:25634a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-31 13:04:56 +10:00
Nathan Scott
9a2a7de268 [XFS] Make project quota enforcement return an error code consistent with
its use.

SGI-PV: 951300
SGI-Modid: xfs-linux-melb:xfs-kern:25633a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-31 13:04:49 +10:00
Nathan Scott
764d1f89a5 [XFS] Implement the silent parameter to fill_super, previously ignored.
SGI-PV: 951299
SGI-Modid: xfs-linux-melb:xfs-kern:25632a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-31 13:04:17 +10:00
Mandy Kirkconnell
4b4fa25ced [XFS] Cleanup comment to remove reference to obsoleted function
xfs_bmap_do_search_extents().

SGI-PV: 951415
SGI-Modid: xfs-linux-melb:xfs-kern:208491a

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-31 13:03:58 +10:00
Jens Axboe
5abc97aa25 [PATCH] splice: add support for SPLICE_F_MOVE flag
This enables the caller to migrate pages from one address space page
cache to another.  In buzz word marketing, you can do zero-copy file
copies!

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Jens Axboe
5274f052e7 [PATCH] Introduce sys_splice() system call
This adds support for the sys_splice system call. Using a pipe as a
transport, it can connect to files or sockets (latter as output only).

From the splice.c comments:

   "splice": joining two ropes together by interweaving their strands.

   This is the "extended pipe" functionality, where a pipe is used as
   an arbitrary in-memory buffer. Think of a pipe as a small kernel
   buffer that you can use to transfer data from one end to the other.

   The traditional unix read/write is extended with a "splice()" operation
   that transfers data buffers to or from a pipe buffer.

   Named by Larry McVoy, original implementation from Linus, extended by
   Jens to support splicing to files and fixing the initial implementation
   bugs.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-30 12:28:18 -08:00
Linus Torvalds
76babde121 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (67 commits)
  [PATCH] powerpc: Remove oprofile spinlock backtrace code
  [PATCH] powerpc: Add oprofile calltrace support to all powerpc cpus
  [PATCH] powerpc: Add oprofile calltrace support
  [PATCH] for_each_possible_cpu: ppc
  [PATCH] for_each_possible_cpu: powerpc
  [PATCH] lock PTE before updating it in 440/BookE page fault handler
  [PATCH] powerpc: Kill _machine and hard-coded platform numbers
  ppc: Fix compile error in arch/ppc/lib/strcase.c
  [PATCH] git-powerpc: WARN was a dumb idea
  [PATCH] powerpc: a couple of trivial compile warning fixes
  powerpc: remove OCP references
  powerpc: Make uImage default build output for MPC8540 ADS
  powerpc: move math-emu over to arch/powerpc
  powerpc: use memparse() for mem= command line parsing
  ppc: fix strncasecmp prototype
  [PATCH] powerpc: make ISA floppies work again
  [PATCH] powerpc: Fix some initcall return values
  [PATCH] powerpc: Workaround for pSeries RTAS bug
  [PATCH] spufs: fix __init/__exit annotations
  [PATCH] powerpc: add hvc backend for rtas
  ...
2006-03-29 11:28:30 -08:00
Linus Torvalds
e71ac6032e Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
  [XFS] Cleanup in XFS after recent get_block_t interface tweaks.
  [XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
  [XFS] A change to inode chunk allocation to try allocating the new chunk
  Fixes a regression from the recent "remove ->get_blocks() support"
  [XFS] Fix compiler warning and small code inconsistencies in compat
  [XFS] We really suck at spulling.  Thanks to Chris Pascoe for fixing all
2006-03-29 08:55:36 -08:00
Oleg Nesterov
aa1757f90b [PATCH] convert sighand_cache to use SLAB_DESTROY_BY_RCU
This patch borrows a clever Hugh's 'struct anon_vma' trick.

Without tasklist_lock held we can't trust task->sighand until we locked it
and re-checked that it is still the same.

But this means we don't need to defer 'kmem_cache_free(sighand)'.  We can
return the memory to slab immediately, all we need is to be sure that
sighand->siglock can't dissapear inside rcu protected section.

To do so we need to initialize ->siglock inside ctor function,
SLAB_DESTROY_BY_RCU does the rest.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:42 -08:00
Oleg Nesterov
8fafabd86f [PATCH] remove add_parent()'s parent argument
add_parent(p, parent) is always called with parent == p->parent, and it makes
no sense to do it differently.  This patch removes this argument.

No changes in affected .o files.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:41 -08:00
Eric W. Biederman
d73d65293e [PATCH] pidhash: kill switch_exec_pids
switch_exec_pids is only called from de_thread by way of exec, and it is
only called when we are exec'ing from a non thread group leader.

Currently switch_exec_pids gives the leader the pid of the thread and
unhashes and rehashes all of the process groups.  The leader is already in
the EXIT_DEAD state so no one cares about it's pids.  The only concern for
the leader is that __unhash_process called from release_task will function
correctly.  If we don't touch the leader at all we know that
__unhash_process will work fine so there is no need to touch the leader.

For the task becomming the thread group leader, we just need to give it the
pid of the old thread group leader, add it to the task list, and attach it
to the session and the process group of the thread group.

Currently de_thread is also adding the task to the task list which is just
silly.

Currently the only leader of __detach_pid besides detach_pid is
switch_exec_pids because of the ugly extra work that was being
performed.

So this patch removes switch_exec_pids because it is doing too much, it is
creating an unnecessary special case in pid.c, duing work duplicated in
de_thread, and generally obscuring what it is going on.

The necessary work is added to de_thread, and it seems to be a little
clearer there what is going on.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Oleg Nesterov
1434261c07 [PATCH] simplify exec from init's subthread
I think it is enough to take tasklist_lock for reading while changing
child_reaper:

	Reparenting needs write_lock(tasklist_lock)

	Only one thread in a thread group can do exec()

	sighand->siglock garantees that get_signal_to_deliver()
	will not see a stale value of child_reaper.

This means that we can change child_reaper earlier, without calling
zap_other_threads() twice.

"child_reaper = current" is a NOOP when init does exec from main thread, we
don't care.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Eric W. Biederman
fef23e7fbb [PATCH] exec: allow init to exec from any thread.
After looking at the problem of init calling exec some more I figured out
an easy way to make the code work.

The actual symptom without out this patch is that all threads will die
except pid == 1, and the thread calling exec.  The thread calling exec will
wait forever for pid == 1 to die.

Since pid == 1 does not install a handler for SIGKILL it will never die.

This modifies the tests for init from current->pid == 1 to the equivalent
current == child_reaper.  And then it causes exec in the ugly case to
modify child_reaper.

The only weird symptom is that you wind up with an init process that
doesn't have the oldest start time on the box.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 18:36:40 -08:00
Paul Mackerras
bac30d1a78 Merge ../linux-2.6 2006-03-29 13:24:50 +11:00
Nathan Scott
c25366680b [XFS] Cleanup in XFS after recent get_block_t interface tweaks.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 10:44:40 +10:00
Mandy Kirkconnell
0b7e56a450 [XFS] Remove unused/obsoleted function: xfs_bmap_do_search_extents()
SGI-PV: 951415
SGI-Modid: xfs-linux-melb:xfs-kern:208490a

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 09:53:03 +10:00
Glen Overby
3ccb8b5f65 [XFS] A change to inode chunk allocation to try allocating the new chunk
contiguous with the most recently allocated chunk.  On a striped
filesystem, this will fill a stripe unit with inodes before allocating new
inodes in another stripe unit.

SGI-PV: 951416
SGI-Modid: xfs-linux-melb:xfs-kern:208488a

Signed-off-by: Glen Overby <overby@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 09:52:28 +10:00
Nathan Scott
3c674e7423 Fixes a regression from the recent "remove ->get_blocks() support"
change.  inode->i_blkbits should be used when making a get_block_t
request of a filesystem instead of dio->blkbits, as that does not
indicate the filesystem block size all the time (depends on request
alignment - see start of __blockdev_direct_IO).

Signed-off-by: Nathan Scott <nathans@sgi.com>
Acked-by: Badari Pulavarty <pbadari@us.ibm.com>
2006-03-29 09:26:15 +10:00
Nathan Scott
e0edd5962b [XFS] Fix compiler warning and small code inconsistencies in compat
ioctl32 land.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25590a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 08:55:47 +10:00
Nathan Scott
c41564b5af [XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all
these typos.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-29 08:55:14 +10:00
Alexey Dobriyan
7f927fcc2f [PATCH] Typo fixes
Fix a lot of typos.  Eyeballed by jmc@ in OpenBSD.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:08 -08:00
Arjan van de Ven
4b6f5d20b0 [PATCH] Make most file operations structs in fs/ const
This is a conversion to make the various file_operations structs in fs/
const.  Basically a regexp job, with a few manual fixups

The goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:06 -08:00
Arjan van de Ven
99ac48f54a [PATCH] mark f_ops const in the inode
Mark the f_ops members of inodes as const, as well as fix the
ripple-through this causes by places that copy this f_ops and then "do
stuff" with it.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
KAMEZAWA Hiroyuki
0a94502277 [PATCH] for_each_possible_cpu: fixes for generic part
replaces for_each_cpu with for_each_possible_cpu().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:05 -08:00
Vadim Lobanov
68c3431ae2 [PATCH] Fold select_bits_alloc/free into caller code.
Remove an unnecessary level of indirection in allocating and freeing select
bits, as per the select_bits_alloc() and select_bits_free() functions.
Both select.c and compat.c are updated.

Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Eric Dumazet
e4a1f129f9 [PATCH] use fget_light() in select/poll
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Andi Kleen
70674f95c0 [PATCH] Optimize select/poll by putting small data sets on the stack
Optimize select and poll by a using stack space for small fd sets

This brings back an old optimization from Linux 2.0.  Using the stack is
faster than kmalloc.  On a Intel P4 system it speeds up a select of a
single pty fd by about 13% (~4000 cycles -> ~3500)

It also saves memory because a daemon hanging in select or poll will
usually save one or two less pages.  This can add up - e.g.  if you have 10
daemons blocking in poll/select you save 40KB of memory.

I did a patch for this long ago, but it was never applied.  This version is
a reimplementation of the old patch that tries to be less intrusive.  I
only did the minimal changes needed for the stack allocation.

The cut off point before external memory is allocated is currently at
832bytes.  The system calls always allocate this much memory on the stack.

These 832 bytes are divided into 256 bytes frontend data (for the select
bitmaps of the pollfds) and the rest of the space for the wait queues used
by the low level drivers.  There are some extreme cases where this won't
work out for select and it falls back to allocating memory too early -
especially with very sparse large select bitmaps - but the majority of
processes who only have a small number of file descriptors should be ok.
[TBD: 832/256 might not be the best split for select or poll]

I suspect more optimizations might be possible, but they would be more
complicated.  One way would be to cache the select/poll context over
multiple system calls because typically the input values should be similar.
 Problem is when to flush the file descriptors out though.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:04 -08:00
Adrian Bunk
f5b95ff010 [PATCH] autofs4: proper prototype for autofs4_dentry_release()
Add a proper prototype for autofs4_dentry_release() to autofs_i.h.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:03 -08:00
Adrian Bunk
a28af471b8 [PATCH] fs/fat/: proper prototypes for two functions
Add proper prototypes for fat_cache_init() and fat_cache_destroy() in
msdos_fs.h.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:03 -08:00
Benjamin Herrenschmidt
e8222502ee [PATCH] powerpc: Kill _machine and hard-coded platform numbers
This removes statically assigned platform numbers and reworks the
powerpc platform probe code to use a better mechanism.  With this,
board support files can simply declare a new machine type with a
macro, and implement a probe() function that uses the flattened
device-tree to detect if they apply for a given machine.

We now have a machine_is() macro that replaces the comparisons of
_machine with the various PLATFORM_* constants.  This commit also
changes various drivers to use the new macro instead of looking at
_machine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28 23:15:54 +11:00
Michael Ellerman
5149fa47ec [PATCH] powerpc: Cope with duplicate node & property names in /proc/device-tree
Various dodgy firmware might give us nodes and/or properties in the device
tree with conflicting names. That's generally ok, except for when we export
the device tree via /proc, so check when we're creating the proc device tree
and munge names accordingly.

Tested on a faked device tree with kexec, would be good if someone with
actual bogus firmware could try it, but just for completeness.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-03-28 16:45:23 +11:00
Jun'ichi Nomura
b4cf1b72ee [PATCH] dm/md dependency tree in sysfs: convert bd_sem to bd_mutex
Convert bd_sem to bd_mutex

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:45:00 -08:00
Jun'ichi Nomura
641dc636b0 [PATCH] dm/md dependency tree in sysfs: bd_claim_by_kobject
Adding bd_claim_by_kobject() function which takes kobject as additional
signature of holder device and creates sysfs symlinks between holder device
and claimed device.  bd_release_from_kobject() is a counterpart of
bd_claim_by_kobject.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:45:00 -08:00
Andrew Morton
100873687d [PATCH] dm-md-dependency-tree-in-sysfs-holders-slaves-subdirectory-tidy
Remove all the CONFIG_SYSFS stuff.  That's supposed to all be implemented up
in header files.

Yes, the CONFIG_SYSFS=n data structures will be a little larger than
necessary, but that's a tradeoff we can decide to make.

Cc: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:59 -08:00
Jun'ichi Nomura
6a4d44c1f1 [PATCH] dm/md dependency tree in sysfs: holders/slaves subdirectory
Creating "slaves" and "holders" directories in /sys/block/<disk> and
creating "holders" directory under /sys/block/<disk>/<partition>

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:59 -08:00
KAMEZAWA Hiroyuki
ec936fc563 [PATCH] for_each_online_pgdat: renaming for_each_pgdat
Replace for_each_pgdat() with for_each_online_pgdat().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:48 -08:00
Adrian Bunk
74cae61ab4 [PATCH] fs/nfsd/export.c,net/sunrpc/cache.c: make needlessly global code static
We can now make some code static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown
baab935ff3 [PATCH] knfsd: Convert sunrpc_cache to use krefs
.. it makes some of the code nicer.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
NeilBrown
f9ecc921b5 [PATCH] knfsd: Use new cache code for name/id lookup caches
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
8d270f7f4c [PATCH] knfsd: Use new cache_lookup for svc_expkey cache
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
4f7774c3a0 [PATCH] knfsd: Use new cache_lookup for svc_export
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:42 -08:00
NeilBrown
7d317f2c9f [PATCH] knfsd: Get rid of 'inplace' sunrpc caches
These were an unnecessary wart.  Also only have one 'DefineSimpleCache..'
instead of two.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
NeilBrown
eab7e2e647 [PATCH] knfsd: Break the hard linkage from svc_expkey to svc_export
Current svc_expkey holds a pointer to the svc_export structure, so updates to
that structure have to be in-place, which is a wart on the whole cache
infrastruct.  So we break that linkage and just do a second lookup.

If this became a performance issue, it would be possible to put a direct link
back in which was only used conditionally.  i.e.  when an object is replaced
in the cache, we set a flag in the old object.  When dereferencing the link
from svc_expkey, if the flag is set, we drop the reference and do a fresh
lookup.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
NeilBrown
efc36aa560 [PATCH] knfsd: Change the store of auth_domains to not be a 'cache'
The 'auth_domain's are simply handles on internal data structures.  They do
not cache information from user-space, and forcing them into the mold of a
'cache' misrepresents their true nature and causes confusion.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Ian Kent
3e7b191980 [PATCH] autofs4: atomic var underflow
Fix accidental underflow of the atomic counter.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Ian Kent
871f94344c [PATCH] autofs4: follow_link missing functionality
This functionality is also need for operation of autofs v5 direct mounts.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Dave Jones
3370c74b2e [PATCH] Remove redundant check from autofs4_put_super
We have to have a valid sbi here, or we'd have oopsed already.  (There's a
dereference of sbi->catatonic a few lines above)

Coverity #740

Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Ian Kent
44d53eb041 [PATCH] autofs4: change AUTOFS_TYP_* AUTOFS_TYPE_*
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:41 -08:00
Ian Kent
5c0a32fc2c [PATCH] autofs4: add new packet type for v5 communications
This patch define a new autofs packet for autofs v5 and updates the waitq.c
functions to handle the additional packet type.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
3a15e2ab5d [PATCH] autofs4: add v5 expire logic
This patch adds expire logic for autofs direct mounts.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
34ca959cfc [PATCH] autofs4: add v5 follow_link mount trigger method
This patch adds a follow_link inode method for the root of an autofs direct
mount trigger.  It also adds the corresponding mount options and updates the
show_mount method.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
051d381259 [PATCH] autofs4: nameidata needs to be up to date for follow_link
In order to be able to trigger a mount using the follow_link inode method the
nameidata struct that is passed in needs to have the vfsmount of the autofs
trigger not its parent.

During a path walk if an autofs trigger is mounted on a dentry, when the
follow_link method is called, the nameidata struct contains the vfsmount and
mountpoint dentry of the parent mount while the dentry that is passed in is
the root of the autofs trigger mount.  I believe it is impossible to get the
vfsmount of the trigger mount, within the follow_link method, when only the
parent vfsmount and the root dentry of the trigger mount are known.

This patch updates the nameidata struct on entry to __do_follow_link if it
detects that it is out of date.  It moves the path_to_nameidata to above
__do_follow_link to facilitate calling it from there.  The dput_path is moved
as well as that seemed sensible.  No changes are made to these two functions.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
e3474a8eb3 [PATCH] autofs4: change may_umount* functions to boolean
Change the functions may_umount and may_umount_tree to boolean functions to
aid code readability.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
90a59c7cf5 [PATCH] autofs4: rename simple_empty_nolock function
Rename the function simple_empty_nolock to __simple_empty in line with kernel
naming conventions.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
e77fbddf77 [PATCH] autofs4: white space cleanup for waitq.c
Whitespace and formating changes to waitq code.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
d7c4a5f108 [PATCH] autofs4: add a show mount options for proc filesystem
Add show_options method to display autofs4 mount options in the proc
filesystem.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:40 -08:00
Ian Kent
862b110f01 [PATCH] autofs4: remove update_atime unused function
Remove the update of i_atime from autofs4 in favour of having VFS update it.
i_atime is never used for expire in autofs4.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
e0a7aae940 [PATCH] autofs4: expire mounts that hold no (extra) references only
Alter the expire semantics that define how "busyness" is determined.
Currently a last_used counter is updated on every revalidate from processes
other than the mount owner process group.

This patch changes that so that an expire candidate is busy only if it has a
reference count greater than the expected minimum, such as when there is an
open file or working directory in use.

This method is the only way that busyness can be established for direct mounts
within the new implementation.  For consistency the expire semantic is made
the same for all mounts.

A side effect of the patch is that mounts which remain mounted unessessarily
in the presence of some GUI programs that scan the filesystem should now
expire.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
1aff3c8b05 [PATCH] autofs4: fix false negative return from expire
Fix the case where an expire returns busy on a tree mount when it is in fact
not busy.  This case was overlooked when the patch to prevent the expiring
away of "scaffolding" directories for tree mounts was applied.

The problem arises when a tree of mounts is a member of a map with other keys.
 The current logic will not expire the tree if any other mount in the map is
busy.  The solution is to maintain a "minimum" use count for each autofs
dentry and compare this to the actual dentry usage count during expire.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
1ce12bad85 [PATCH] autofs4: simplify expire tree traversal
Simplify the expire tree traversal code by using a function from namespace.c
to calculate the next entry in the top down tree traversals carried out during
the expire operation.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
1f5f2c3059 [PATCH] autofs4: expire code readability cleanup
Change the names of the boolean functions autofs4_check_mount and
autofs4_check_tree to autofs4_mount_busy and autofs4_tree_busy respectively
and alters their return codes to suit in order to aid code readabilty.

A couple of white space cleanups are included as well.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
2d753e62b8 [PATCH] autofs4: can't mount due to mount point dir not empty
Addresse a problem where stale dentrys stop mounts from happening.

When a mount point directory is pre-created and a non-existent entry within it
is requested a dentry ends up being created within the mount point directory
which stops future mounts.  The problem is solved by ignoring negative,
unhashed dentrys in the mount point d_subdirs list.

Additionally the apparent cacheing of -ENOENT returns from requests is
removed.  The test on d_time is a tautology and d_time is not initialised and
has an unexpected value.  In short it doesn't do what it's meant to.

The cacheing of failed requests to the daemon is important and will be
followed up later.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
f360ce3be4 [PATCH] autofs4: use libfs routines for readdir
Change readdir routines to use the cursor based routines in libfs.c.  This
removes reliance on old readdir code from 2.4 and should improve efficiency of
readdir in autofs4.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Ian Kent
718c604a28 [PATCH] autofs4: lookup white space cleanup
Whitespace and formating changes to lookup code.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Jeff Dike
86c79cbcee [PATCH] uml: fix hostfs stack corruption
Noted by Oleg Drokin:
We initialized an extra slot of struct kstatfs.spare, sometimes
causing stack corruption.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Oleg Drokin <green@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:39 -08:00
Linus Torvalds
9ae21d1bb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  drivers/char/ftape/lowlevel/fdc-io.c: Correct a comment
  Kconfig help: MTD_JEDECPROBE already supports Intel
  Remove ugly debugging stuff
  do_mounts.c: Minor ROOT_DEV comment cleanup
  BUG_ON() Conversion in drivers/s390/block/dasd_devmap.c
  BUG_ON() Conversion in mm/mempool.c
  BUG_ON() Conversion in mm/memory.c
  BUG_ON() Conversion in kernel/fork.c
  BUG_ON() Conversion in ipc/sem.c
  BUG_ON() Conversion in fs/ext2/
  BUG_ON() Conversion in fs/hfs/
  BUG_ON() Conversion in fs/dcache.c
  BUG_ON() Conversion in fs/buffer.c
  BUG_ON() Conversion in input/serio/hp_sdc_mlc.c
  BUG_ON() Conversion in md/dm-table.c
  BUG_ON() Conversion in md/dm-path-selector.c
  BUG_ON() Conversion in drivers/isdn
  BUG_ON() Conversion in drivers/char
  BUG_ON() Conversion in drivers/mtd/
2006-03-26 09:41:18 -08:00
Artem B. Bityuckiy
3be7d29fb9 Remove ugly debugging stuff
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-26 18:58:31 +02:00
Akinobu Mita
b9a2838cc2 [PATCH] bitops: ntfs: remove generic_ffs()
Now the only user who are using generic_ffs() is ntfs filesystem.  This patch
isolates generic_ffs() as ntfs_ffs() for ntfs.

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:15 -08:00
Roman Zippel
05cfb614dd [PATCH] hrtimers: remove data field
The nanosleep cleanup allows to remove the data field of hrtimer.  The
callback function can use container_of() to get it's own data.  Since the
hrtimer structure is anyway embedded in other structures, this adds no
overhead.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:03 -08:00
Roman Zippel
4dee26b7e2 [PATCH] hrtimers: remove it_real_value calculation from proc/*/stat
Remove the it_real_value from /proc/*/stat, during 1.2.x was the last time it
returned useful data (as it was directly maintained by the scheduler), now
it's only a waste of time to calculate it.  Return 0 instead.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Badari Pulavarty
a0e9285233 [PATCH] ext3: "nobh" writeback support for filesystems blocksize < pagesize
There is no valid reason why we can't support "nobh" option for filesystems
with blocksize != PAGESIZE.

This patch lets them use "nobh" option for writeback mode for blocksize <
pagesize.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Badari Pulavarty
f91a2ad2ed [PATCH] ext3: multi-block get_block()
Mingming Cao recently added multi-block allocation support for ext3,
currently used only by DIO.  I added support to map multiple blocks for
mpage_readpages().  This patch add support for ext3_get_block() to deal
with multi-block mapping.  Basically it renames ext3_direct_io_get_blocks()
as ext3_get_block().

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Andrew Morton
d6859bfca8 [PATCH] ext3: cleanups and WARN_ON()
- Clean up a few little layout things and comments.

- Add a WARN_ON to a case which I was wondering about.

- Tune up some inlines.

Cc: Mingming Cao <cmm@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:02 -08:00
Badari Pulavarty
1d8fa7a2b9 [PATCH] remove ->get_blocks() support
Now that get_block() can handle mapping multiple disk blocks, no need to have
->get_blocks().  This patch removes fs specific ->get_blocks() added for DIO
and makes it users use get_block() instead.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Badari Pulavarty
fa30bd058b [PATCH] map multiple blocks for mpage_readpages()
This patch changes mpage_readpages() and get_block() to get the disk mapping
information for multiple blocks at the same time.

b_size represents the amount of disk mapping that needs to mapped.  On the
successful get_block() b_size indicates the amount of disk mapping thats
actually mapped.  Only the filesystems who care to use this information and
provide multiple disk blocks at a time can choose to do so.

No changes are needed for the filesystems who wants to ignore this.

[akpm@osdl.org: cleanups]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Badari Pulavarty
b0cf2321c6 [PATCH] pass b_size to ->get_block()
Pass amount of disk needs to be mapped to get_block().  This way one can
modify the fs ->get_block() functions to map multiple blocks at the same time.

[akpm@osdl.org: performance tweak]
[akpm@osdl.org: remove unneeded assignments]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Badari Pulavarty
205f87f6b3 [PATCH] change buffer_head.b_size to size_t
Increase the size of the buffer_head b_size field (only) for 64 bit platforms.
Update some old and moldy comments in and around the structure as well.

The b_size increase allows us to perform larger mappings and allocations for
large I/O requests from userspace, which tie in with other changes allowing
the get_block_t() interface to map multiple blocks at once.

Signed-off-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
d48589bfad [PATCH] ext3_get_blocks: Adjust reservation window size for mblocks
Optimize the block reservation and the multiple block allocation: with the
knowledge of the total number of blocks ahead, set or adjust the reservation
window size properly (based on the number of blocks needed) before block
allocation happens: if there isn't any reservation yet, make sure the
reservation window equals to or greater than the number of blocks needed,
before create an reservation window; if a reservation window is already
exists, try to extends the window size to match the number of blocks to
allocate.  This could increase the possibility of completing multiple blocks
allocation in a single request, as blocks are only allocated in the range of
the inode's reservation window.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
faa569763a [PATCH] ext3_get_blocks: Adjust accounting info in ext3_new_blocks()
Update accounting information (quota, boundary checks, free blocks number etc)
in ext3_new_blocks().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
b54e41ec17 [PATCH] ext3_get_blocks: support multiple blocks allocation in ext3_new_block()
Change ext3_try_to_allocate() (called via ext3_new_blocks()) to try to
allocate the requested number of blocks on a best effort basis: After
allocated the first block, it will always attempt to allocate the next few(up
to the requested size and not beyond the reservation window) adjacent blocks
at the same time.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
b47b24781c [PATCH] ext3_get_blocks: multiple block allocation
Add support for multiple block allocation in ext3-get-blocks().

Look up the disk block mapping and count the total number of blocks to
allocate, then pass it to ext3_new_block(), where the real block allocation is
performed.  Once multiple blocks are allocated, prepare the branch with those
just allocated blocks info and finally splice the whole branch into the block
mapping tree.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:01 -08:00
Mingming Cao
89747d369d [PATCH] ext3_get_blocks: Mapping multiple blocks at a once
Currently ext3_get_block() only maps or allocates one block at a time.  This
is quite inefficient for sequential IO workload.

I have posted a early implements a simply multiple block map and allocation
with current ext3.  The basic idea is allocating the 1st block in the existing
way, and attempting to allocate the next adjacent blocks on a best effort
basis.  More description about the implementation could be found here:
http://marc.theaimsgroup.com/?l=ext2-devel&m=112162230003522&w=2

The following the latest version of the patch: break the original patch into 5
patches, re-worked some logicals, and fixed some bugs.  The break ups are:

 [patch 1] Adding map multiple blocks at a time in ext3_get_blocks()
 [patch 2] Extend ext3_get_blocks() to support multiple block allocation
 [patch 3] Implement multiple block allocation in ext3-try-to-allocate
 (called via ext3_new_block()).
 [patch 4] Proper accounting updates in ext3_new_blocks()
 [patch 5] Adjust reservation window size properly (by the given number
 of blocks to allocate) before block allocation to increase the
 possibility of allocating multiple blocks in a single call.

Tests done so far includes fsx,tiobench and dbench.  The following numbers
collected from Direct IO tests (1G file creation/read) shows the system time
have been greatly reduced (more than 50% on my 8 cpu system) with the patches.

 1G file DIO write:
 	2.6.15		2.6.15+patches
 real    0m31.275s	0m31.161s
 user    0m0.000s	0m0.000s
 sys     0m3.384s	0m0.564s

 1G file DIO read:
 	2.6.15		2.6.15+patches
 real    0m30.733s	0m30.624s
 user    0m0.000s	0m0.004s
 sys     0m0.748s	0m0.380s

Some previous test we did on buffered IO with using multiple blocks allocation
and delayed allocation shows noticeable improvement on throughput and system
time.

This patch:

Add support of mapping multiple blocks in one call.

This is useful for DIO reads and re-writes (where blocks are already
allocated), also is in line with Christoph's proposal of using getblocks() in
mpage_readpage() or mpage_readpages().

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Andrew Morton
5515eff811 [PATCH] 2tb-files-add-blkcnt_t-fixes
Cc: Takashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Matthew Dobson
93d2341c75 [PATCH] mempool: use mempool_create_slab_pool()
Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:00 -08:00
Matthew Dobson
0eaae62aba [PATCH] mempool: use common mempool kmalloc allocator
This patch changes several mempool users, all of which are basically just
wrappers around kmalloc(), to use the common mempool_kmalloc/kfree, rather
than their own wrapper function, removing a bunch of duplicated code.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:59 -08:00
Andy Adamson
eb76b3fda1 [PATCH] NFSD4: return conflict lock without races
Update the NFSv4 server to use the new posix_lock_file_conf() interface.
Remove unnecessary (and race-prone) posix_test_file() calls.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
Andy Adamson
5842add2f3 [PATCH] VFS,fs/locks.c,NFSD4: add race_free posix_lock_file_conf() interface
Lockd and the NFSv4 server both exercise a race condition where
posix_test_lock() is called either before or after posix_lock_file() to
deal with a denied lock request due to a conflicting lock.

Remove the race condition for the NFSv4 server by adding a new conflicting
lock parameter to __posix_lock_file() , changing the name to
__posix_lock_file_conf().

Keep posix_lock_file() interface, add posix_lock_conf() interface, both
call __posix_lock_file_conf().

[akpm@osdl.org: Put the EXPORT_SYMBOL() where it belongs]
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
J. Bruce Fields
6dc0fe8f8b [PATCH] VFS,fs/locks.c: cleanup locks_insert_block
BUG instead of handling a case that should never happen.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
Eric Dumazet
fa3536cc14 [PATCH] Use __read_mostly on some hot fs variables
I discovered on oprofile hunting on a SMP platform that dentry lookups were
slowed down because d_hash_mask, d_hash_shift and dentry_hashtable were in
a cache line that contained inodes_stat.  So each time inodes_stats is
changed by a cpu, other cpus have to refill their cache line.

This patch moves some variables to the __read_mostly section, in order to
avoid false sharing.  RCU dentry lookups can go full speed.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:56 -08:00
NeilBrown
2ff28e22bd [PATCH] Make address_space_operations->invalidatepage return void
The return value of this function is never used, so let's be honest and
declare it as void.

Some places where invalidatepage returned 0, I have inserted comments
suggesting a BUG_ON.

[akpm@osdl.org: JBD BUG fix]
[akpm@osdl.org: rework for git-nfs]
[akpm@osdl.org: don't go BUG in block_invalidate_page()]
Signed-off-by: Neil Brown <neilb@suse.de>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
NeilBrown
3978d7179d [PATCH] Make address_space_operations->sync_page return void
The only user ignores the return value, and the only instanace
(block_sync_page) always returns 0...

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Ingo Molnar
353ab6e97b [PATCH] sem2mutex: fs/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Cc: Robert Love <rml@tech9.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:55 -08:00
Steven Rostedt
64a07bd82e [PATCH] protect remove_proc_entry
It has been discovered that the remove_proc_entry has a race in the removing
of entries in the proc file system that are siblings.  There's no protection
around the traversing and removing of elements that belong in the same
subdirectory.

This subdirectory list is protected in other areas by the BKL.  So the BKL was
at first used to protect this area too, but unfortunately, remove_proc_entry
may be called with spinlocks held.  The BKL may schedule, so this was not a
solution.

The final solution was to add a new global spin lock to protect this list,
called proc_subdir_lock.  This lock now protects the list in
remove_proc_entry, and I also went around looking for other areas that this
list is modified and added this protection there too.  Care must be taken
since these locations call several functions that may also schedule.

Since I don't see any location that these functions that modify the
subdirectory list are called by interrupts, the irqsave/restore versions of
the spin lock was _not_ used.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:53 -08:00
Eric Sesterhenn
309be53da6 BUG_ON() Conversion in fs/ext2/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-26 18:27:41 +02:00
Eric Sesterhenn
4d4ef9abe3 BUG_ON() Conversion in fs/hfs/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-26 18:26:51 +02:00
Eric Sesterhenn
28133c7b2b BUG_ON() Conversion in fs/dcache.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-26 18:25:39 +02:00
Eric Sesterhenn
e827f92355 BUG_ON() Conversion in fs/buffer.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-26 18:24:46 +02:00
Linus Torvalds
1b9a391736 Merge branch 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
  [PATCH] fix audit_init failure path
  [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
  [PATCH] sem2mutex: audit_netlink_sem
  [PATCH] simplify audit_free() locking
  [PATCH] Fix audit operators
  [PATCH] promiscuous mode
  [PATCH] Add tty to syscall audit records
  [PATCH] add/remove rule update
  [PATCH] audit string fields interface + consumer
  [PATCH] SE Linux audit events
  [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
  [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
  [PATCH] Fix IA64 success/failure indication in syscall auditing.
  [PATCH] Miscellaneous bug and warning fixes
  [PATCH] Capture selinux subject/object context information.
  [PATCH] Exclude messages by message type
  [PATCH] Collect more inode information during syscall processing.
  [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
  [PATCH] Define new range of userspace messages.
  [PATCH] Filter rule comparators
  ...

Fixed trivial conflict in security/selinux/hooks.c
2006-03-25 09:24:53 -08:00
Linus Torvalds
53846a21c1 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits)
  SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
  SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc
  LOCKD: Make nlmsvc_traverse_shares return void
  LOCKD: nlmsvc_traverse_blocks return is unused
  SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers.
  NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
  SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum
  SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
  SUNRPC: Fix memory barriers for req->rq_received
  NFS: Fix a race in nfs_sync_inode()
  NFS: Clean up nfs_flush_list()
  NFS: Fix a race with PG_private and nfs_release_page()
  NFSv4: Ensure the callback daemon flushes signals
  SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs
  NFS, NLM: Allow blocking locks to respect signals
  NFS: Make nfs_fhget() return appropriate error values
  NFSv4: Fix an oops in nfs4_fill_super
  lockd: blocks should hold a reference to the nlm_file
  NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
  NFSv4: Send the delegation stateid for SETATTR calls
  ...
2006-03-25 09:18:27 -08:00
Andi Kleen
913bd90601 [PATCH] x86_64: Increase the variability of the process stack on 64bit architectures
8MB is not really very random, use 1GB (or more with larger page sizes)
instead.

Also use the low bits of the random generator output now instead of
throwing them away.

Only enabled on x86-64 right now. Other architectures need to add
a suitable STACK_RND_MASK

Cc: mingo@elte.hu
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 09:10:52 -08:00
Linus Torvalds
0c50527379 Merge branch 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
  ocfs2: finally remove MLF* macros
  ocfs2: don't use MLF* in the file system
  ocfs2: don't use MLF* in dlm/ files
  ocfs2: don't use MLF* in cluster/ files
  [PATCH] ocfs2: dlm recovery fixes
  [PATCH] ocfs2: fix hang in dlm lock resource mastery
  ocfs2: use __attribute__ format
2006-03-25 08:50:56 -08:00
Linus Torvalds
1e8c573933 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (21 commits)
  BUG_ON() Conversion in drivers/video/
  BUG_ON() Conversion in drivers/parisc/
  BUG_ON() Conversion in drivers/block/
  BUG_ON() Conversion in sound/sparc/cs4231.c
  BUG_ON() Conversion in drivers/s390/block/dasd.c
  BUG_ON() Conversion in lib/swiotlb.c
  BUG_ON() Conversion in kernel/cpu.c
  BUG_ON() Conversion in ipc/msg.c
  BUG_ON() Conversion in block/elevator.c
  BUG_ON() Conversion in fs/coda/
  BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
  BUG_ON() Conversion in input/serio/hil_mlc.c
  BUG_ON() Conversion in md/dm-hw-handler.c
  BUG_ON() Conversion in md/bitmap.c
  The comment describing how MS_ASYNC works in msync.c is confusing
  rcu: undeclared variable used in documentation
  fix typos "wich" -> "which"
  typo patch for fs/ufs/super.c
  Fix simple typos
  tabify drivers/char/Makefile
  ...
2006-03-25 08:41:09 -08:00
Luke Yang
1ad3dcc09c [PATCH] flat binary loader doesn't check fd table full
In binfmt_flat.c, the flat binary loader should check file descriptor table
and install the fd on the file.

Convert the function to single-exit and fix this bug.

Signed-off-by: "Luke Yang" <luke.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Carsten Otte
6cc6b1226b [PATCH] remove needless check in fs/read_write.c
nr_segs is unsigned long and thus cannot be negative.  We checked against 0
few lines before.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Carsten Otte
5514854812 [PATCH] remove needless check in binfmt_elf.c
Local variable i is unsigned int and thus cannot be negative.

(akpm: unsigneds shouldn't be called `i'.  This value cannot possibly be
negative anyway).

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Chen, Kenneth W
174e27c607 [PATCH] direct-io: bug fix in dio handling write error
There is a bug in direct-io on propagating write error up to the higher I/O
layer.  When performing an async ODIRECT write to a block device, if a
device error occurred (like media error or disk is pulled), the error code
is only propagated from device driver to the DIO layer.  The error code
stops at finished_one_bio().  The aysnc write, however, is supposedly have
a corresponding AIO event with appropriate return code (in this case -EIO).
 Application which waits on the async write event, will hang forever since
such AIO event is lost forever (if such app did not use the timeout option
in io_getevents call.  Regardless, an AIO event is lost).

The discovery of above bug leads to another discovery of potential race
window with dio->result.  The fundamental problem is that dio->result is
overloaded with dual use: an indicator of fall back path for partial dio
write, and an error indicator used in the I/O completion path.  In the
event of device error, the setting of -EIO to dio->result clashes with
value used to track partial write that activates the fall back path.

It was also pointed out that it is impossible to use dio->result to track
partial write and at the same time to track error returned from device
driver.  Because direct_io_work can only determines whether it is a partial
write at the end of io submission and in mid stream of those io submission,
a return code could be coming back from the driver.  Thus messing up all
the subsequent logic.

Proposed fix is to separating out error code returned by the IO completion
path from partial IO submit tracking.  A new variable is added to dio
structure specifically to track io error returned in the completion path.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Zach Brown <zach.brown@oracle.com>
Acked-by: Suparna Bhattacharya <suparna@in.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Phillip Susi
0e6b3e5e97 [PATCH] udf: fix uid/gid options and add uid/gid=ignore and forget options
As Pekka Enberg pointed out, with the if still following the else, you can
still get a null uid written to the disk if you specify a default uid= without
uid=forget.  In other words, if the desktop user is uid=1000 and the mount
option uid=1000 is given ( which is done on ubuntu automatically and probably
other distributions that use hal ), then if any other user besides uid 1000
owns a file then a 0 will be written to the media as the owning uid instead.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Oliver Neukum
11b0b5abb2 [PATCH] use kzalloc and kcalloc in core fs code
Signed-off-by: Oliver Neukum <oliver@neukum.name>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:00 -08:00
Peter Staubach
57070d012c [PATCH] compat_sys_nfsservctl(): handle errors correctly
Correct some error handling on the compat version of the nfsservctl()
system.  It was detecting errors while copying in the arguments from user
space, but then attempting to use the arguments anyway.  This didn't seem
so good.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:59 -08:00
Rob Landley
76c67de460 [PATCH] Ext2 flags shouldn't report "nogrpid"
If I mount ext2 "rw", I want it to say "rw", not "rw,nogrpid".

I caught this writing an automated regression test script for the busybox
mount command.  The symptom is
  /dev/loop0 on /images/ext2.dir type ext2 (rw,nogrpid)
instead of:
  /dev/loop0 on /images/ext2.dir type ext2 (rw)

The behavior was introduced by git commit
8fc2751beb.

Signed-off-by: Rob Landley <rob@landley.net>
Cc: Mark Bellon <mbellon@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:58 -08:00
NeilBrown
7e53cac41d [PATCH] Honour AOP_TRUNCATE_PAGE returns in page_symlink
As prepare_write, commit_write and readpage are allowed to return
AOP_TRUNCATE_PAGE, page_symlink should respond to them.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:58 -08:00
Benoit Boissinot
d5ee4ea833 [PATCH] indirect_print_item() warning fix
fs/reiserfs/item_ops.c: In function 'indirect_print_item':
fs/reiserfs/item_ops.c:278: warning: 'num' may be used uninitialized in this function

(akpm: this is probably just gcc being dumb)

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.fr>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:57 -08:00
Adrian Bunk
3cdc409c16 [PATCH] reiserfs/xattr_acl.c:reiserfs_get_acl(): make size an int
The Coverity checker wasn't happy seeing a size_t compared with -ENODATA
and -ENOSYS.

Since the only place where size is set is through the result of
reiserfs_xattr_get() which is an int, we could simply make size an int.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Cc: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:57 -08:00
Kirk True
c1f5a19446 [PATCH] ext3: Fix debug logging-only compilation error
When EXT3FS_DEBUG is #define-d, the compile breaks due to #include file
issues.

Signed-off-by: Kirk True <kernel@kirkandsheila.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Kirill Korotaev
2ab1346085 [PATCH] Reduce sched latency in shrink_dcache_sb()
This patch reduces scheduling latency in shrink_dcache_sb() noticed during
remounting of big partitions with many cached dentries.  The same latency
fix was applied to select_parent() long ago.

Signed-off-by: Denis Lunev <den@sw.ru>
Signed-off-by: Pavel Emelianov <xemul@sw.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
OGAWA Hirofumi
4ffc844425 [PATCH] Move cond_resched() after iput() in sync_sb_inodes()
In here, I think the following order is more cache-friendly.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
OGAWA Hirofumi
d25b9a1ff0 [PATCH] freeze_bdev() cleanup
freeze_bdev() uses a fsync_super() without sync_blockdev().  This patch
makes __fsync_super() and shares it.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Denis Vlasenko
11b8448751 [PATCH] fix messages in fs/minix
Believe it or not, but in fs/minix/*, the oldest filesystem in the kernel,
something still can be fixed:

	printk("new_inode: bit already set");

"\n" is missing!

While at it, I also removed periods from the end of error messages and made
capitalization uniform.  Also s/i-node/inode/, s/printk (/printk(/

Signed-ff-by: Denis Vlasenko <vda@ilport.com.ua>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Davide Libenzi
f348d70a32 [PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications
Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense.  The same thing was discussed and conceptually
agreed quite some time ago:

http://lkml.org/lkml/2003/7/12/116

Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it.  As far
as the existing POLLHUP handling, the patch leaves it as is.  The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files.  The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.

There is "a stupid program" to test POLLRDHUP delivery here:

 http://www.xmailserver.org/pollrdhup-test.c

It tests poll(2), but since the delivery is same epoll(2) will work equally.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:56 -08:00
Kirk True
58bf6a2db2 [PATCH] smbfs: Fix debug logging-only compilation error
When SMBFS_DEBUG_VERBOSE is #define-d, the compile breaks:

fs/smbfs/inode.c:217: error: aggregate value used where an integer was expected

This is a simple matter of using the .tv_sec attribute of struct time_spec.

Signed-off-by: Kirk True <kernel@kirkandsheila.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:55 -08:00
Eric Van Hensbergen
67543e508d [PATCH] 9p: fix name consistency problems
There were a number of conflicting naming schemes used in the v9fs project.
The directory was fs/9p, but MAINTAINERS and Documentation referred to
v9fs.  The module name itself was 9p2000, and the file system type was 9P.
This patch attempts to clean that up, changing all references to 9p in
order to match the directory name.  We'll also start using 9p instead of
v9fs as our patch prefix.

There is also a minor consistency cleanup in the options changing the name
option to uname in order to more closely match the Plan 9 options.

Signed-off-by: Eric Van Hensbergevan <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Eric Van Hensbergen
42e8c509cf [PATCH] v9fs: update license boilerplate
Update license boilerplate to specify GPLv2 and remove the (at your option
clause).  This change was agreed to by all the copyright holders (approvals
can be found on v9fs-developer mailing list).

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Eugene Teo
c0291a05f8 [PATCH] v9fs: fix vfs_inode dereference before NULL check
__getname, which in turn will call kmem_cache_alloc, may return NULL.

Coverity bug #977

Signed-off-by: Eugene Teo <eugene.teo@eugeneteo.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Latchesar Ionkov
16cce6d27e [PATCH] v9fs: add extension field to Tcreate
Implement a new way of creating special files.  Instead of Tcreate+Twstat,
add one more field to Tcreate that contains special file description.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Latchesar Ionkov
5174fdab9f [PATCH] v9fs: print 9p messages
Print 9p messages.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Russ Cox
4a26c2429b [PATCH] v9fs: rename tids to tags to be consistent with Plan 9 documentation
The code talks about these things called tids, which I eventually figured
out are tags.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Russ Cox
27979bb2ff [PATCH] v9fs: consolidate trans_sock into trans_fd
Here is a new trans_fd.c that replaces the current trans_fd.c and
trans_sock.c.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:54 -08:00
Alexander Zarochentsev
5930860296 [PATCH] reiserfs: use balance_dirty_pages_ratelimited_nr in reiserfs_file_write()
Use the new balance_dirty_pages_ratelimited_nr in reiserfs "largeio" file
write.

Signed-off-by: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Vladimir V. Saveliev
cd02b966bf [PATCH] reiserfs: cleanups
Clean up several places where gcc issues warnings when -W is specified.
Thanks to Neil for finding that.

Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Nick Piggin
c32ccd87bf [PATCH] inotify: lock avoidance with parent watch status in dentry
Previous inotify work avoidance is good when inotify is completely unused,
but it breaks down if even a single watch is in place anywhere in the
system.  Robin Holt notices that udev is one such culprit - it slows down a
512-thread application on a 512 CPU system from 6 seconds to 22 minutes.

Solve this by adding a flag in the dentry that tells inotify whether or not
its parent inode has a watch on it.  Event queueing to parent will skip
taking locks if this flag is cleared.  Setting and clearing of this flag on
all child dentries versus event delivery: this is no in terms of race
cases, and that was shown to be equivalent to always performing the check.

The essential behaviour is that activity occuring _after_ a watch has been
added and _before_ it has been removed, will generate events.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:53 -08:00
Oleg Drokin
9a56c21392 [PATCH] Add lookup_instantiate_filp usage warning
I think it would be nice to put an usage warning in header of
lookup_instantiate_filp() to indicate it is unsafe to use it on anything
but regular files (even that is potentially unsafe, but there your ->open()
is usually in your hands anyway), so that others won't fall into the same
trap I did.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Oleg Drokin
b500531e6f [PATCH] Introduce FMODE_EXEC file flag
Introduce FMODE_EXEC file flag, to indicate that file is being opened for
execution.  This is useful for distributed filesystems to maintain
consistent behavior for returning ETXTBUSY when opening for write and
execution happens on different nodes.

akpm:

  Needed by Lustre at present.  I assume their objective to to work towards
  being able to install Lustre on an unmodified distro kernel, which seems
  sane.  It should have zero runtime cost.

  Trond and Chuck indicate that NFS4 can probably use this too, for the same
  thing.

  Steven says it's also on the GFS todo list.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Chuck Lever <cel@citi.umich.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Jeff Mahoney
619d5d8a2b [PATCH] reiserfs: reiserfs_file_write() will lose error code when a 0-length write occurs w/ O_SYNC
When an error occurs in reiserfs_file_write before any data is written, and
O_SYNC is set, the return code of generic_osync_write will overwrite the
error code, losing it.

This patch ensures that generic_osync_inode() doesn't run under an error
condition, losing the error.  This duplicates the logic from
generic_file_buffered_write().

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Alexander Zarochentsev <zam@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Alexander Zarochentsev
a44c94a7b8 [PATCH] reiserfs: handle trans_id overflow
Reiserfs does not handle transaction ID overflow correctly.  Transaction ID
== 0 causes reiserfs to crash.  The patch fixes all places where the
transaction ID is incremented.

Signed-off-by: Alexander Zarochentsev <zam@namesys.com>
Signed-off-by: Hans Reiser <reiser@namesys.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Alexander Zarochentzev
23f9e0f891 [PATCH] reiserfs: fix transaction overflowing
This patch fixes a bug in reiserfs truncate.  A transaction might overflow
when truncating long highly fragmented file.  The fix is to split
truncation into several transactions to avoid overflowing.

Signed-off-by: Vladimir V. Saveliev <vs@namesys.com>
Cc; Charles McColgan <cm@chuck.net>
Cc: Alexander Zarochentsev <zam@namesys.com>
Cc: Hans Reiser <reiser@namesys.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Adrian Bunk
bdfc326614 [PATCH] fs/inode.c: make iprune_mutex static
There's no reason for iprune_mutex being global.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Oleg Drokin
4af4c52f34 [PATCH] Missed error checking for intent's filp in open_namei().
It seems there is error check missing in open_namei for errors returned
through intent.open.file (from lookup_instantiate_filp).

If there is plain open performed, then such a check done inside
__path_lookup_intent_open called from path_lookup_open(), but when the open
is performed with O_CREAT flag set, then __path_lookup_intent_open is only
called with LOOKUP_PARENT set where no file opening can occur yet.

Later on lookup_hash is called where exact opening might take place and
intent.open.file may be filled.  If it is filled with error value of some
sort, then we get kernel attempting to dereference this error value as
address (and corresponding oops) in nameidata_to_filp() called from
filp_open().

While this is relatively simple to workaround in ->lookup() method by just
checking lookup_instantiate_filp() return value and returning error as
needed, this is not so easy in ->d_revalidate(), where we can only return
"yes, dentry is valid" or "no, dentry is invalid, perform full lookup
again", and just returning 0 on error would cause extra lookup (with
potential extra costly RPCs).

So in short, I believe that there should be no difference in error handling
for opening a file and creating a file in open_namei() and propose this
simple patch as a solution.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:51 -08:00
Andrew Morton
8d8c85117f [PATCH] jbd: convert kjournald to kthread API
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:50 -08:00
Andrew Morton
e3df18983e [PATCH] jbd: embed j_commit_timer in journal struct
The kjournald timer is currently on the kernel thread's stack and the journal
structure points at it.  Save a pointer hop by moving the timer into the
journal structure.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:50 -08:00
Al Viro
871751e25d [PATCH] slab: implement /proc/slab_allocators
Implement /proc/slab_allocators.   It produces output like:

idr_layer_cache: 80 idr_pre_get+0x33/0x4e
buffer_head: 2555 alloc_buffer_head+0x20/0x75
mm_struct: 9 mm_alloc+0x1e/0x42
mm_struct: 20 dup_mm+0x36/0x370
vm_area_struct: 384 dup_mm+0x18f/0x370
vm_area_struct: 151 do_mmap_pgoff+0x2e0/0x7c3
vm_area_struct: 1 split_vma+0x5a/0x10e
vm_area_struct: 11 do_brk+0x206/0x2e2
vm_area_struct: 2 copy_vma+0xda/0x142
vm_area_struct: 9 setup_arg_pages+0x99/0x214
fs_cache: 8 copy_fs_struct+0x21/0x133
fs_cache: 29 copy_process+0xf38/0x10e3
files_cache: 30 alloc_files+0x1b/0xcf
signal_cache: 81 copy_process+0xbaa/0x10e3
sighand_cache: 77 copy_process+0xe65/0x10e3
sighand_cache: 1 de_thread+0x4d/0x5f8
anon_vma: 241 anon_vma_prepare+0xd9/0xf3
size-2048: 1 add_sect_attrs+0x5f/0x145
size-2048: 2 journal_init_revoke+0x99/0x302
size-2048: 2 journal_init_revoke+0x137/0x302
size-2048: 2 journal_init_inode+0xf9/0x1c4

Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
DESC
slab-leaks3-locking-fix
EDESC
From: Andrew Morton <akpm@osdl.org>

Update for slab-remove-cachep-spinlock.patch

Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Alexander Nyberg <alexn@telia.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
David Howells
214fda1f6e [PATCH] Optimise d_find_alias()
The attached patch optimises d_find_alias() to only take the spinlock if
there's anything in the the inode's alias list.  If there isn't, it returns
NULL immediately.

With respect to the superblock sharing patch, this should reduce by one the
number of times the dcache_lock is taken by nfs_lookup() for ordinary
directory lookups.

Only in the case where there's already a dentry for particular directory inode
(such as might happen when another mountpoint is rooted at that dentry) will
the lock then be taken the extra time.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:22:49 -08:00
Mark Fasheh
ea8aa68d36 ocfs2: finally remove MLF* macros
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:29 -08:00
Mark Fasheh
b0697053f9 ocfs2: don't use MLF* in the file system
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:28 -08:00
Kurt Hackel
29004858a7 ocfs2: don't use MLF* in dlm/ files
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:27 -08:00
Mark Fasheh
70bacbdbfa ocfs2: don't use MLF* in cluster/ files
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:26 -08:00
Kurt Hackel
c03872f5f5 [PATCH] ocfs2: dlm recovery fixes
when starting lock mastery (excepting the recovery lock) wait on any nodes
needing recovery. fix one instance where lock resources were left attached
to the recovery list after recovery completed.  ensure that the node_down
code is run uniformly regardless of which node found the dead node first.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:25 -08:00
Kurt Hackel
9c6510a5bf [PATCH] ocfs2: fix hang in dlm lock resource mastery
fixes hangs in lock mastery related to refcounting on the mle structure

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:25 -08:00
Mark Fasheh
a74e1f0e8a ocfs2: use __attribute__ format
Use the "format" attribute on ocfs2_error() and ocfs2_abort() so that the
compiler will warn when we get calls to those functions wrong.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-03-24 14:58:24 -08:00
Eric Sesterhenn
c5d3237c24 BUG_ON() Conversion in fs/coda/
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:42:13 +01:00
Eric Sesterhenn
88bcd51262 BUG_ON() Conversion in fs/binfmt_elf_fdpic.c
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:38:48 +01:00
Uwe Zeisberger
c30fe7f731 fix typos "wich" -> "which"
Signed-off-by: Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:23:14 +01:00
Michael Owen
c690a72253 typo patch for fs/ufs/super.c
Quick and simple typo fix. neTXstep -> neXTstep

Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-24 18:21:44 +01:00
Glauber de Oliveira Costa
b5a7c4f583 [PATCH] ext3: Properly report backup block present in a group
In filesystems with the meta block group flag on, ext3_bg_num_gdb() fails
to report the correct number of blocks used to store the group descriptor
backups in a given group.  It happens because meta_bg follows a different
logic from the original ext3 backup placement in groups multiples of 3, 5
and 7.

Signed-off-by: Glauber de Oliveira Costa <glommer@br.ibm.com>
Cc: Andreas Dilger <adilger@clusterfs.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:30 -08:00
Kyle McMartin
82d821ddca [PATCH] Conditionalize compat_sys_newfstatat
If we don't want sys_newfstatat because __ARCH_WANT_STAT64 is defined, then
we certainly don't want compat_sys_newfstatat either.

Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:27 -08:00
Andrew Morton
18e79b40ed [PATCH] fsync: extract internal code
Pull the guts out of do_fsync() - we can use it elsewhere.

Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:27 -08:00
Andrew Morton
4741c9fd36 [PATCH] set_page_dirty() return value fixes
We need set_page_dirty() to return true if it actually transitioned the page
from a clean to dirty state.  This wasn't right in a couple of places.  Do a
kernel-wide audit, fix things up.

This leaves open the possibility of returning a negative errno from
set_page_dirty() sometime in the future.  But we don't do that at present.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:26 -08:00
Eric Dumazet
8a14342683 [PATCH] HOTPLUG_CPU: avoid hitting too many cachelines in recalc_bh_state()
Instead of using for_each_cpu(i), we can use for_each_online_cpu(i).

When a CPU goes offline (ie removed from online map), it might have a non
null bh_accounting.nr, so this patch adds a transfer of this counter to an
online CPU counter.

We already have a hotcpu_notifier, (function buffer_cpu_notify()), where we
can do this bh_accounting.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:26 -08:00
Coywolf Qi Hunt
38885bd4c2 [PATCH] sb_set_blocksize cleanup
sb_set_blocksize() cleanup: make sb_set_blocksize() use blksize_bits().

Signed-off-by: Coywolf Qi Hunt <qiyong@fc-cn.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Alex Tomas
09fe316a7b [PATCH] fast ext3_statfs
Under I/O load it may take up to a dozen seconds to read all group
descriptors.  This is what ext3_statfs() does.  At the same time, we already
maintain global numbers of free inodes/blocks.  Why don't we use them instead
of group reading and summing?

Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00
Pekka Enberg
5b3cf3e0f0 [PATCH] isofs: remove unused debugging macros
Remove unused debugging macros from isofs.  The referred debug functions do
not exist in the kernel.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:24 -08:00
Alexey Dobriyan
53b3531bbb [PATCH] s/;;/;/g
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:24 -08:00
Paul Jackson
b0196009d8 [PATCH] cpuset memory spread slab cache hooks
Change the kmem_cache_create calls for certain slab caches to support cpuset
memory spreading.

See the previous patches, cpuset_mem_spread, for an explanation of cpuset
memory spreading, and cpuset_mem_spread_slab_cache for the slab cache support
for memory spreading.

The slab caches marked for now are: dentry_cache, inode_cache, some xfs slab
caches, and buffer_head.  This list may change over time.  In particular,
other file system types that are used extensively on large NUMA systems may
want to allow for spreading their directory and inode slab cache entries.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Paul Jackson
fffb60f93c [PATCH] cpuset memory spread: slab cache format
Rewrap the overly long source code lines resulting from the previous
patch's addition of the slab cache flag SLAB_MEM_SPREAD.  This patch
contains only formatting changes, and no function change.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Paul Jackson
4b6a9316fa [PATCH] cpuset memory spread: slab cache filesystems
Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD
memory spreading.

If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's
in a cpuset with the 'memory_spread_slab' option enabled goes to allocate
from such a slab cache, the allocations are spread evenly over all the
memory nodes (task->mems_allowed) allowed to that task, instead of favoring
allocation on the node local to the current cpu.

The following inode and similar caches are marked SLAB_MEM_SPREAD:

    file                               cache
    ====                               =====
    fs/adfs/super.c                    adfs_inode_cache
    fs/affs/super.c                    affs_inode_cache
    fs/befs/linuxvfs.c                 befs_inode_cache
    fs/bfs/inode.c                     bfs_inode_cache
    fs/block_dev.c                     bdev_cache
    fs/cifs/cifsfs.c                   cifs_inode_cache
    fs/coda/inode.c                    coda_inode_cache
    fs/dquot.c                         dquot
    fs/efs/super.c                     efs_inode_cache
    fs/ext2/super.c                    ext2_inode_cache
    fs/ext2/xattr.c (fs/mbcache.c)     ext2_xattr
    fs/ext3/super.c                    ext3_inode_cache
    fs/ext3/xattr.c (fs/mbcache.c)     ext3_xattr
    fs/fat/cache.c                     fat_cache
    fs/fat/inode.c                     fat_inode_cache
    fs/freevxfs/vxfs_super.c           vxfs_inode
    fs/hpfs/super.c                    hpfs_inode_cache
    fs/isofs/inode.c                   isofs_inode_cache
    fs/jffs/inode-v23.c                jffs_fm
    fs/jffs2/super.c                   jffs2_i
    fs/jfs/super.c                     jfs_ip
    fs/minix/inode.c                   minix_inode_cache
    fs/ncpfs/inode.c                   ncp_inode_cache
    fs/nfs/direct.c                    nfs_direct_cache
    fs/nfs/inode.c                     nfs_inode_cache
    fs/ntfs/super.c                    ntfs_big_inode_cache_name
    fs/ntfs/super.c                    ntfs_inode_cache
    fs/ocfs2/dlm/dlmfs.c               dlmfs_inode_cache
    fs/ocfs2/super.c                   ocfs2_inode_cache
    fs/proc/inode.c                    proc_inode_cache
    fs/qnx4/inode.c                    qnx4_inode_cache
    fs/reiserfs/super.c                reiser_inode_cache
    fs/romfs/inode.c                   romfs_inode_cache
    fs/smbfs/inode.c                   smb_inode_cache
    fs/sysv/inode.c                    sysv_inode_cache
    fs/udf/super.c                     udf_inode_cache
    fs/ufs/super.c                     ufs_inode_cache
    net/socket.c                       sock_inode_cache
    net/sunrpc/rpc_pipe.c              rpc_inode_cache

The choice of which slab caches to so mark was quite simple.  I marked
those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache,
inode_cache, and buffer_head, which were marked in a previous patch.  Even
though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same
potentially large file system i/o related slab caches as we need for memory
spreading.

Given that the rule now becomes "wherever you would have used a
SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use
the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain.
Future file system writers will just copy one of the existing file system
slab cache setups and tend to get it right without thinking.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:23 -08:00
Adrian Bunk
c98d8cfbc6 [PATCH] fs/coda/: proper prototypes
Introduce a file fs/coda/coda_int.h with proper prototypes for some code.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Adrian Bunk
2c2212901f [PATCH] fs/ext2/: proper ext2_get_parent() prototype
Add a proper prototype for ext2_get_parent().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Adrian Bunk
29c6e48601 [PATCH] fs/9p/: possible cleanups
- mux.c: v9fs_poll_mux() was inline but not static resuling in needless
         object size bloat
- mux.c: remove all "inline"s: gcc should know best what to inline
- #if 0 the following unused global functions:
  - 9p.c: v9fs_v9fs_t_flush()
  - conv.c: v9fs_create_tauth()
  - mux.c: v9fs_mux_rpcnb()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:21 -08:00
Tobias Klauser
e8c96f8c29 [PATCH] fs: Use ARRAY_SIZE macro
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE.  Some trailing whitespaces are also deleted.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:19 -08:00
Horst Hummel
ef1597d527 [PATCH] s390: Remove old history/whitespave from partition code
Remove obsolete history and trailing whitespace.

Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:17 -08:00
Theodore Ts'o
9b04c997b1 [PATCH] vfs: MS_VERBOSE should be MS_SILENT
The meaning of MS_VERBOSE is backwards; if the bit is set, it really means,
"don't be verbose".  This is confusing and counter-intuitive.

In addition, there is also no way to set the MS_VERBOSE flag in the
mount(8) program in util-linux, but interesting, it does define options
which would do the right thing if MS_SILENT were defined, which
unfortunately we do not:

#ifdef MS_SILENT
  { "quiet",    0, 0, MS_SILENT    },   /* be quiet  */
  { "loud",     0, 1, MS_SILENT    },   /* print out messages. */
#endif

So the obvious fix is to deprecate the use of MS_VERBOSE and replace it
with MS_SILENT.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:15 -08:00
Trond Myklebust
1ebbe2b200 Merge branch 'linus' 2006-03-23 23:44:19 -05:00
Linus Torvalds
a1a051b187 Merge git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs-2.6:
  NTFS: 2.1.27 - Various bug fixes and cleanups.
  NTFS: Semaphore to mutex conversion.
  NTFS: Handle the recently introduced -ENAMETOOLONG return value from
  NTFS: Add a missing call to flush_dcache_mft_record_page() in
  NTFS: Fix a bug in fs/ntfs/inode.c::ntfs_read_locked_index_inode() where we
  NTFS: Improve comments on file attribute flags in fs/ntfs/layout.h.
  NTFS: Limit name length in fs/ntfs/unistr.c::ntfs_nlstoucs() to maximum
  NTFS: Remove all the make_bad_inode() calls.  This should only be called
  NTFS: Add support for sparse files which have a compression unit of 0.
  NTFS: Fix comparison of $MFT and $MFTMirr to not bail out when there are
  NTFS: Use buffer_migrate_page() for the ->migratepage function of all ntfs
  NTFS: Fix a buggette in an "should be impossible" case handling where we
  NTFS: Fix an (innocent) off-by-one error in the runlist code.
  NTFS: Fix two compiler warnings on Alpha.  Thanks to Andrew Morton for
2006-03-23 16:26:56 -08:00
Linus Torvalds
cec6062037 Merge branch 'blktrace' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'blktrace' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23
  [PATCH] relay: consolidate sendfile() and read() code
  [PATCH] relay: add sendfile() support
  [PATCH] relay: migrate from relayfs to a generic relay API
2006-03-23 16:24:24 -08:00
Linus Torvalds
debf798b1e Merge git://oss.sgi.com:8090/oss/git/xfs-2.6
* git://oss.sgi.com:8090/oss/git/xfs-2.6: (71 commits)
  [XFS] Sync up one/two other minor changes missed in previous merges.
  [XFS] Reenable the noikeep (delete inode cluster space) option by default.
  [XFS] Check that a page has dirty buffers before finding it acceptable for
  [XFS] Fixup naming inconsistencies found by Pekka Enberg and one from Jan
  [XFS] Explain the race closed by the addition of vn_iowait() to the start
  [XFS] Fixing the error caused by the conflict between DIO Write's
  [XFS] Fixing KDB's xrwtrc command, also added the current process id into
  [XFS] Fix compiler warning from xfs_file_compat_invis_ioctl prototype. 
  [XFS] remove bogus INT_GET for u8 variables in xfs_dir_leaf.c 
  [XFS] endianess annotations for xfs_da_node_hdr_t 
  [XFS] endianess annotations for xfs_da_node_entry_t 
  [XFS] store xfs_attr_inactive_list_t in native endian 
  [XFS] store xfs_attr_sf_sort in native endian 
  [XFS] endianess annotations for xfs_attr_shortform_t 
  [XFS] endianess annotations for xfs_attr_leaf_name_remote_t 
  [XFS] endianess annotations for xfs_attr_leaf_name_local_t 
  [XFS] endianess annotations for xfs_attr_leaf_entry_t 
  [XFS] endianess annotations for xfs_attr_leaf_hdr_t 
  [XFS] remove bogus INT_GET on u8 variables in xfs_dir2_block.c 
  [XFS] endianess annotations for xfs_da_blkinfo_t 
  ...
2006-03-23 15:28:51 -08:00
Jens Axboe
2056a782f8 [PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-23 20:00:26 +01:00
Jens Axboe
b86ff981a8 [PATCH] relay: migrate from relayfs to a generic relay API
Original patch from Paul Mundt, sysfs parts removed by me since they
were broken.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-03-23 19:56:55 +01:00
Anton Altaparmakov
92fe7b9ea8 Merge branch 'master' of /usr/src/ntfs-2.6/ 2006-03-23 17:06:08 +00:00
Anton Altaparmakov
e750d1c7cc NTFS: 2.1.27 - Various bug fixes and cleanups.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 17:04:12 +00:00
Ingo Molnar
4e5e529ad6 NTFS: Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:57:48 +00:00
Anton Altaparmakov
834ba600ce NTFS: Handle the recently introduced -ENAMETOOLONG return value from
fs/ntfs/unistr.c::ntfs_nlstoucs() in fs/ntfs/namei.c::ntfs_lookup().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:25:23 +00:00
Anton Altaparmakov
20fdcf1d54 NTFS: Add a missing call to flush_dcache_mft_record_page() in
fs/ntfs/inode.c::ntfs_write_inode().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:21:02 +00:00
Anton Altaparmakov
a778f21732 NTFS: Fix a bug in fs/ntfs/inode.c::ntfs_read_locked_index_inode() where we
forgot to update a temporary variable so loading index inodes which
      have an index allocation attribute failed.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:18:23 +00:00
Anton Altaparmakov
2c2c8c1c21 NTFS: Improve comments on file attribute flags in fs/ntfs/layout.h.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:09:40 +00:00
Anton Altaparmakov
d4faf636d6 NTFS: Limit name length in fs/ntfs/unistr.c::ntfs_nlstoucs() to maximum
allowed by NTFS, i.e. 255 Unicode characters, not including the
      terminating NULL (which is not stored on disk).

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 16:05:11 +00:00
Anton Altaparmakov
f95c4018fd NTFS: Remove all the make_bad_inode() calls. This should only be called
from read inode and new inode code paths.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 15:59:32 +00:00
Anton Altaparmakov
a0646a1f04 NTFS: Add support for sparse files which have a compression unit of 0.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 15:53:03 +00:00
Benjamin LaHaise
b0e6e96299 [PATCH] reduce size of bio mempools
The biovec default mempool limit of 256 entries results in over 3MB of RAM
being permanently pinned, even on systems with only 128MB of RAM.  Since
mempool tries to allocate from the system pool first, it makes sense to
reduce the size of the mempool fallbacks to a more reasonable limit of 1-5
entries -- enough for the system to be able to make progress even under
load.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Jens Axboe <axboe@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:18 -08:00
Andrew Morton
394e3902c5 [PATCH] more for_each_cpu() conversions
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all.  The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().

This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
few instances of this bug, if any.  But the patch converts lots of open-coded
test to use the preferred helper macros.

Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
Benjamin LaHaise
5a6b7951bf [PATCH] get_empty_filp tweaks, inline epoll_init_file()
Eliminate a handful of cache references by keeping current in a register
instead of reloading (helps x86) and avoiding the overhead of a function
call.  Inlining eventpoll_init_file() saves 24 bytes.  Also reorder file
initialization to make writes occur more sequentially.

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
Alexey Dobriyan
713729e8b9 [PATCH] fs/*/file.c: drop insane header dependencies
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
Domen Puncer
7a673c6b8f [PATCH] devpts: use lib/parser.c for parsing mount options
Item from "2.6 should fix" list.

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
Alexey Dobriyan
3257545e40 [PATCH] ufs: switch to inode_inc_count, inode_dec_count
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:16 -08:00
Alexey Dobriyan
a513b035ea [PATCH] ext2: switch to inode_inc_count, inode_dec_count
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:16 -08:00
Alexey Dobriyan
4e907c3d45 [PATCH] sysv: switch to inode_inc_count, inode_dec_count
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:15 -08:00
Alexey Dobriyan
78ec7b6917 [PATCH] minix: switch to inode_inc_link_count, inode_dec_link_count
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:15 -08:00
Alexey Dobriyan
a7ccf00718 [PATCH] fs/ufs/file.c: drop insane header dependencies
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:15 -08:00
Arjan van de Ven
6b9438e132 [PATCH] fat_lock is used as a mutex, convert it to using the new mutex primitive
The fat code uses the fat_lock always in a mutex way (taking and releasing
the lock in the same function), the patch below converts it into the new
mutex primitive.  Please consider this patch for the code.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:15 -08:00
Ingo Molnar
1e7933defd [PATCH] sem2mutex: UDF
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:14 -08:00
Ingo Molnar
8e3f90459b [PATCH] sem2mutex: NCPFS
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:14 -08:00
Arjan van de Ven
9746151861 [PATCH] convert ext3's truncate_sem to a mutex
ext3's truncate_sem is always released in the same function it's taken
and it otherwise is a mutex as well..

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:14 -08:00
Ingo Molnar
7bf6d78dd9 [PATCH] sem2mutex: HPFS
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:13 -08:00
Ingo Molnar
1d5599e397 [PATCH] sem2mutex: autofs4 wq_sem
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:13 -08:00
Ingo Molnar
1eb0d67007 [PATCH] sem2mutex: JFFS
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:13 -08:00
Ingo Molnar
0ac1759abc [PATCH] sem2mutex: fs/seq_file.c
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Ingo Molnar
7cf34c761d [PATCH] sem2mutex: fs/libfs.c
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Arjan van de Ven
2c68ee754c [PATCH] sem2mutex: jbd, j_checkpoint_mutex
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Ingo Molnar
f24075bd0c [PATCH] sem2mutex: iprune
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Arjan van de Ven
a11f3a0574 [PATCH] sem2mutex: vfs_rename_mutex
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Arjan van de Ven
144efe3e3e [PATCH] sem2mutex: eventpoll
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:12 -08:00
Ingo Molnar
d4f9af9dac [PATCH] sem2mutex: inotify
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Acked-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:11 -08:00
Ingo Molnar
d3be915fc5 [PATCH] sem2mutex: quota
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kara <jack@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:11 -08:00
Arjan van de Ven
c039e3134a [PATCH] sem2mutex: blockdev #2
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:11 -08:00
Ingo Molnar
4f7a07b887 [PATCH] convert fs/9p/ to mutexes, fix locking bugs
Convert fs/9p/mux.c from semaphore to mutex.

NOTE: fixed locking bugs in the process - the code was using semaphores
the other way around.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:10 -08:00
Jan Kara
6362e4d4ed [PATCH] Fix oops in invalidate_dquots()
When quota is being turned off we assumed that all the references to dquots
were already dropped.  That need not be true as inodes being deleted are
not on superblock's inodes list and hence we need not reach it when
removing quota references from inodes.  So invalidate_dquots() has to wait
for all the users of dquots (as quota is already marked as turned off, no
new references can be acquired and so this is bound to happen rather
early).  When we do this, we can also remove the iprune_sem locking as it
was protecting us against exactly the same problem when freeing inodes
icache memory.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:10 -08:00
Eric Dumazet
0c9e63fd38 [PATCH] Shrinks sizeof(files_struct) and better layout
1) Reduce the size of (struct fdtable) to exactly 64 bytes on 32bits
   platforms, lowering kmalloc() allocated space by 50%.

2) Reduce the size of (files_struct), using a special 32 bits (or
   64bits) embedded_fd_set, instead of a 1024 bits fd_set for the
   close_on_exec_init and open_fds_init fields.  This save some ram (248
   bytes per task) as most tasks dont open more than 32 files.  D-Cache
   footprint for such tasks is also reduced to the minimum.

3) Reduce size of allocated fdset.  Currently two full pages are
   allocated, that is 32768 bits on x86 for example, and way too much.  The
   minimum is now L1_CACHE_BYTES.

UP and SMP should benefit from this patch, because most tasks will touch
only one cache line when open()/close() stdin/stdout/stderr (0/1/2),
(next_fd, close_on_exec_init, open_fds_init, fd_array[0 ..  2] being in the
same cache line)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:09 -08:00
Andrew Morton
d8733c2956 [PATCH] ext3_readdir: use generic readahead
Linus points out that ext3_readdir's readahead only cuts in when
ext3_readdir() is operating at the very start of the directory.  So for large
directories we end up performing no readahead at all and we suck.

So take it all out and use the core VM's page_cache_readahead().  This means
that ext3 directory reads will use all of readahead's dynamic sizing goop.

Note that we're using the directory's filp->f_ra to hold the readahead state,
but readahead is actually being performed against the underlying blockdev's
address_space.  Fortunately the readahead code is all set up to handle this.

Tested with printk.  It works.  I was struggling to find a real workload which
actually cared.

(The patch also exports page_cache_readahead() to GPL modules)

Cc: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:09 -08:00
Neil Horman
5be0e95119 [PATCH] proc: fix duplicate line in /proc/devices
Fix a duplicate block device line printed after the "Block device" header
in /proc/devices.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:02 -08:00
Anton Altaparmakov
949763b2b8 NTFS: Fix comparison of $MFT and $MFTMirr to not bail out when there are
unused, invalid mft records which are the same in both $MFT and
      $MFTMirr.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 15:34:13 +00:00
Anton Altaparmakov
78264bd9c2 NTFS: Use buffer_migrate_page() for the ->migratepage function of all ntfs
address space operations.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 15:06:18 +00:00
Anton Altaparmakov
3ccc7384db NTFS: Fix a buggette in an "should be impossible" case handling where we
continued the attribute lookup loop instead of aborting it.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 15:03:11 +00:00
Anton Altaparmakov
67b1dfe77a NTFS: Fix an (innocent) off-by-one error in the runlist code.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2006-03-23 14:57:43 +00:00
Anton Altaparmakov
b4d8d1a93c Merge branch 'master' of /usr/src/ntfs-2.6/ 2006-03-23 14:50:51 +00:00
Linus Torvalds
8b4b6707ee Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  fixed path to moved file in include/linux/device.h
  Fix spelling in E1000_DISABLE_PACKET_SPLIT Kconfig description
  Documentation/dvb/get_dvb_firmware: fix firmware URL
  Documentation: Update to BUG-HUNTING
  Remove superfluous NOTIFY_COOKIE_LEN define
  add "tags" to .gitignore
  Fix "frist", "fisrt", typos
  fix rwlock usage example
  It's UTF-8
2006-03-22 10:58:05 -08:00
Christoph Lameter
b20a35035f [PATCH] page migration reorg
Centralize the page migration functions in anticipation of additional
tinkering.  Creates a new file mm/migrate.c

1. Extract buffer_migrate_page() from fs/buffer.c

2. Extract central migration code from vmscan.c

3. Extract some components from mempolicy.c

4. Export pageout() and remove_from_swap() from vmscan.c

5. Make it possible to configure NUMA systems without page migration
   and non-NUMA systems with page migration.

I had to so some #ifdeffing in mempolicy.c that may need a cleanup.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:06 -08:00
Chen, Kenneth W
bba1e9b211 [PATCH] convert hugetlbfs_counter to atomic
Implementation of hugetlbfs_counter() is functionally equivalent to
atomic_inc_return().  Use the simpler atomic form.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:04 -08:00
David Gibson
b45b5bd65f [PATCH] hugepage: Strict page reservation for hugepage inodes
These days, hugepages are demand-allocated at first fault time.  There's a
somewhat dubious (and racy) heuristic when making a new mmap() to check if
there are enough available hugepages to fully satisfy that mapping.

A particularly obvious case where the heuristic breaks down is where a
process maps its hugepages not as a single chunk, but as a bunch of
individually mmap()ed (or shmat()ed) blocks without touching and
instantiating the pages in between allocations.  In this case the size of
each block is compared against the total number of available hugepages.
It's thus easy for the process to become overcommitted, because each block
mapping will succeed, although the total number of hugepages required by
all blocks exceeds the number available.  In particular, this defeats such
a program which will detect a mapping failure and adjust its hugepage usage
downward accordingly.

The patch below addresses this problem, by strictly reserving a number of
physical hugepages for hugepage inodes which have been mapped, but not
instatiated.  MAP_SHARED mappings are thus "safe" - they will fail on
mmap(), not later with an OOM SIGKILL.  MAP_PRIVATE mappings can still
trigger an OOM.  (Actually SHARED mappings can technically still OOM, but
only if the sysadmin explicitly reduces the hugepage pool between mapping
and instantiation)

This patch appears to address the problem at hand - it allows DB2 to start
correctly, for instance, which previously suffered the failure described
above.

This patch causes no regressions on the libhugetblfs testsuite, and makes a
test (designed to catch this problem) pass which previously failed (ppc64,
POWER5).

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:03 -08:00
Nick Piggin
84097518d1 [PATCH] mm: nommu use compound pages
Now that compound page handling is properly fixed in the VM, move nommu
over to using compound pages rather than rolling their own refcounting.

nommu vm page refcounting is broken anyway, but there is no need to have
divergent code in the core VM now, nor when it gets fixed.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: David Howells <dhowells@redhat.com>

(Needs testing, please).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:54:01 -08:00
Christoph Lameter
ac2b898ca6 [PATCH] slab: Remove SLAB_NO_REAP option
SLAB_NO_REAP is documented as an option that will cause this slab not to be
reaped under memory pressure.  However, that is not what happens.  The only
thing that SLAB_NO_REAP controls at the moment is the reclaim of the unused
slab elements that were allocated in batch in cache_reap().  Cache_reap()
is run every few seconds independently of memory pressure.

Could we remove the whole thing?  Its only used by three slabs anyways and
I cannot find a reason for having this option.

There is an additional problem with SLAB_NO_REAP.  If set then the recovery
of objects from alien caches is switched off.  Objects not freed on the
same node where they were initially allocated will only be reused if a
certain amount of objects accumulates from one alien node (not very likely)
or if the cache is explicitly shrunk.  (Strangely __cache_shrink does not
check for SLAB_NO_REAP)

Getting rid of SLAB_NO_REAP fixes the problems with alien cache freeing.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:53:59 -08:00
Latchesar Ionkov
5e7a99ac45 [PATCH] v9fs: assign dentry ops to negative dentries
If a file is not found in v9fs_vfs_lookup, the function creates negative
dentry, but doesn't assign any dentry ops.  This leaves the negative entry
in the cache (there is no d_delete to mark it for removal).  If the file is
created outside of the mounted v9fs filesystem, the file shows up in the
directory with weird permissions.

This patch assigns the default v9fs dentry ops to the negative dentry.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-22 07:53:55 -08:00
Nathan Scott
4d74f423c7 Merge HEAD from ../linux-2.6 2006-03-22 15:31:14 +11:00
Nathan Scott
bb19fba193 [XFS] Sync up one/two other minor changes missed in previous merges.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 14:12:12 +11:00
Nathan Scott
e15f195cfb [XFS] Reenable the noikeep (delete inode cluster space) option by default.
SGI-PV: 951200
SGI-Modid: xfs-linux-melb:xfs-kern:25535a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:47:52 +11:00
David Chinner
2ddee844ee [XFS] Check that a page has dirty buffers before finding it acceptable for
rewrite clustering. This prevents writing excessive amounts of clean data
when doing random rewrites of a cached file.

SGI-PV: 951193
SGI-Modid: xfs-linux-melb:xfs-kern:25531a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:47:40 +11:00
Nathan Scott
3758dee9f6 [XFS] Fixup naming inconsistencies found by Pekka Enberg and one from Jan
Engelhardt.

SGI-PV: 947038
SGI-Modid: xfs-linux-melb:xfs-kern:25529a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:47:28 +11:00
David Chinner
38e2299a64 [XFS] Explain the race closed by the addition of vn_iowait() to the start
of xfs_itruncate_start().

SGI-PV: 947420
SGI-Modid: xfs-linux-melb:xfs-kern:25527a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:47:15 +11:00
Yingping Lu
9fa8046f50 [XFS] Fixing the error caused by the conflict between DIO Write's
conversion and concurrent truncate operations. Use vn_iowait to wait for
the completion of any pending DIOs. Since the truncate requires exclusive
IOLOCK, so this blocks any further DIO operations since DIO write also
needs exclusive IOBLOCK. This serves as a barrier and prevent any
potential starvation.

SGI-PV: 947420
SGI-Modid: xfs-linux-melb:xfs-kern:208088a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:44:35 +11:00
Yingping Lu
f1fdc848aa [XFS] Fixing KDB's xrwtrc command, also added the current process id into
the trace.

SGI-PV: 948300
SGI-Modid: xfs-linux-melb:xfs-kern:208069a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-22 12:44:15 +11:00
Alexey Dobriyan
4de151d8cd It's UTF-8
Fix some comments to "UTF-8".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-03-22 00:13:35 +01:00
Trond Myklebust
ac58c9059d Merge branch 'linus' 2006-03-21 12:08:21 -05:00
J. Bruce Fields
df6db302cb SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies
Add default selection of CRYPTO_CAST5 when selecting RPCSEC_GSS_SPKM3.

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:25:10 -05:00
J. Bruce Fields
5f12191bc0 LOCKD: Make nlmsvc_traverse_shares return void
The nlmsvc_traverse_shares return value is always zero, hence useless.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:24:25 -05:00
J. Bruce Fields
f3ee439f43 LOCKD: nlmsvc_traverse_blocks return is unused
Note that we never return non-zero.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:24:13 -05:00
J. Bruce Fields
096455a22a NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
Thanks to Frank Filz for pointing out that we list system.nfs4_acl extended
attribute even on filesystems where we don't actually support nfs4_acl.
This is inconsistent with the e.g. ext3 POSIX ACL behaviour, and seems to
annoy cp.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:23:42 -05:00
Trond Myklebust
7a1218a277 SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
Currently this will not happen if we exit before rpc_new_task() was called.
Also fix up rpc_run_task() to do the same (for consistency).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 18:11:10 -05:00
Greg Kroah-Hartman
b3229087c5 [PATCH] sysfs: fix a kobject leak in sysfs_add_link on the error path
As pointed out by Oliver Neukum.

Cc: Maneesh Soni <maneesh@in.ibm.com>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Greg Kroah-Hartman
832c57e9af [PATCH] sysfs: don't export dir symbols
These functions should only be used by the kobject core, and if any
driver tries to use them, bad things happen.  Unexport them to try to
prevent this from happening.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Michael Ellerman
dd308bc355 [PATCH] debugfs: Add debugfs_create_blob() helper for exporting binary data
I wanted to export a binary blob via debugfs, and although it was pretty easy
it seems like it'd be easier if there was a helper for it. It's a pity we need
the wrapper struct but I can't see a cleaner way to do it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Maneesh Soni
c516865cfb [PATCH] sysfs: fix problem with duplicate sysfs directories and files
The following patch checks for existing sysfs_dirent before
preparing new one while creating sysfs directories and files.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:59 -08:00
Eric Sesterhenn
58d49283b8 [PATCH] sysfs: kzalloc conversion
this converts fs/sysfs to kzalloc() usage.
compile tested with make allyesconfig

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:58 -08:00
Jes Sorensen
58383af629 [PATCH] kobj_map semaphore to mutex conversion
Convert the kobj_map code to use a mutex instead of a semaphore.  It
converts the single two users as well, genhd.c and char_dev.c.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:58 -08:00
Greg Kroah-Hartman
641e6f30a0 [PATCH] sysfs: sysfs_remove_dir() needs to invalidate the dentry
When calling sysfs_remove_dir() don't allow any further sysfs functions
to work for this kobject anymore.  This fixes a nasty USB cdc-acm oops
on disconnect.

Many thanks to Bob Copeland and Paul Fulghum for taking the time to
track this down.

Cc: Bob Copeland <email@bobcopeland.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-20 13:42:57 -08:00
Amy Griffis
73241ccca0 [PATCH] Collect more inode information during syscall processing.
This patch augments the collection of inode info during syscall
processing. It represents part of the functionality that was provided
by the auditfs patch included in RHEL4.

Specifically, it:

- Collects information for target inodes created or removed during
  syscalls.  Previous code only collects information for the target
  inode's parent.

- Adds the audit_inode() hook to syscalls that operate on a file
  descriptor (e.g. fchown), enabling audit to do inode filtering for
  these calls.

- Modifies filtering code to check audit context for either an inode #
  or a parent inode # matching a given rule.

- Modifies logging to provide inode # for both parent and child.

- Protect debug info from NULL audit_names.name.

[AV: folded a later typo fix from the same author]

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-03-20 14:08:53 -05:00
Amy Griffis
f38aa94224 [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
The audit hooks (to be added shortly) will want to see dentry->d_inode
too, not just the name.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-03-20 14:08:53 -05:00
Trond Myklebust
c42de9dd67 NFS: Fix a race in nfs_sync_inode()
Kudos to Neil Brown for spotting the problem:

"in nfs_sync_inode, there is effectively the sequence:

   nfs_wait_on_requests
   nfs_flush_inode
   nfs_commit_inode

 This seems a bit racy to me as if the only requests are on the
 ->commit list, and nfs_commit_inode is called separately after
 nfs_wait_on_requests completes, and before nfs_commit_inode start
 (say: by nfs_write_inode) then none of these function will return
 >0, yet there will be some pending request that aren't waited for."

The solution is to search for requests to wait upon, search for dirty
requests, and search for uncommitted requests while holding the
nfsi->req_lock

The patch also cleans up nfs_sync_inode(), getting rid of the redundant
FLUSH_WAIT flag. It turns out that we were always setting it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:51 -05:00
Trond Myklebust
7d46a49f51 NFS: Clean up nfs_flush_list()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:50 -05:00
Trond Myklebust
deb7d63826 NFS: Fix a race with PG_private and nfs_release_page()
We don't need to set PG_private for readahead pages, since they never get
unlocked while I/O is in progress. However there is a small race in
nfs_readpage_release() whereby the page may be unlocked, and have
PG_private set.

Fix is to have PG_private set only for the case of writes...

Also fix a bug in nfs_clear_page_writeback(): Don't attempt to clear the
radix_tree tag if we've already deleted the radix tree entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:50 -05:00
Trond Myklebust
1dd761e907 NFSv4: Ensure the callback daemon flushes signals
If the callback daemon is signalled, but is unable to exit because it still
has users, then we need to flush signals. If not, then svc_recv() can
never sleep, and so we hang.
If we flush signals, then we also have to be prepared to resend them when
we want the thread to exit.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:49 -05:00
Trond Myklebust
a9a801787a NFS, NLM: Allow blocking locks to respect signals
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:48 -05:00
Trond Myklebust
03f28e3a20 NFS: Make nfs_fhget() return appropriate error values
Currently it returns NULL, which usually gets interpreted as ENOMEM. In
fact it can mean a host of issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:48 -05:00
Trond Myklebust
01d0ae8bea NFSv4: Fix an oops in nfs4_fill_super
The mount statistics patches introduced a call to nfs_free_iostats that is
not only redundant, but actually causes an oops.

Also fix a memory leak due to the lack of a call to nfs_free_iostats on
unmount.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:48 -05:00
Trond Myklebust
d9f6eb75d4 lockd: blocks should hold a reference to the nlm_file
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:47 -05:00
Trond Myklebust
51581f3bf9 NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:47 -05:00
Trond Myklebust
3e4f6290ca NFSv4: Send the delegation stateid for SETATTR calls
In the case where we hold a delegation stateid, use that in for inside
SETATTR calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:46 -05:00
Trond Myklebust
f25bc34967 NFSv4: Ensure nfs_callback_down() calls svc_destroy()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:46 -05:00
Trond Myklebust
6041b79192 lockd: Fix a typo in nlmsvc_grant_release()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:45 -05:00
Trond Myklebust
d471662448 lockd: Add helper for *_RES callbacks
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:45 -05:00
Trond Myklebust
92737230dd NLM: Add nlmclnt_release_call
Add a helper function to simplify the freeing of NLM client requests.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:45 -05:00
Trond Myklebust
e4cd038a45 NLM: Fix nlmclnt_test to not copy private part of locks
The struct file_lock does not carry a properly initialised lock,
so don't copy it as if it were.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:44 -05:00
Trond Myklebust
3a649b8846 NLM: Simplify client locks
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:44 -05:00
Trond Myklebust
d72b7a6b26 NFS: O_DIRECT needs to use a completion
Now that we have aio writes, it is possible for dreq->outstanding to be
zero, but for the I/O not to have completed. Convert struct nfs_direct_req
to use a completion to signal when the I/O is done.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:43 -05:00
Trond Myklebust
6b45d858ed NFS: Clean up nfs_get_user_pages
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:43 -05:00
Chuck Lever
606bbba06b NFS: fix compiler warnings on 64-bit platforms
Introduced by NFS aio+dio patches.

Test plan:
Compile kernel with CONFIG_NFS enabled on 64-bit hardware.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:42 -05:00
Trond Myklebust
35576cba57 NLM: nlmclnt_cancel_callback should accept NLM_LCK_DENIED errors
NLM_LCK_DENIED is a valid error return for an NLM_CANCEL call by the
client.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:41 -05:00