Remove the global spinlock in favor of a per-mount one.
This patch is basically find & replace. The difficult part has already been
done by the previous patch.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is in preparation for removing the global spinlock in favor of a
per-mount one.
The only critical part is the interaction between fuse_dev_release() and
fuse_fill_super(): fuse_dev_release() must see the assignment to
file->private_data, otherwise it will leak the reference to fuse_conn.
This is ensured by the fput() operation, which will synchronize the assignment
with other CPU's that may do a final fput() soon after this.
Also redundant locking is removed from fuse_fill_super(), where exclusion is
already ensured by the BKL held for this function by the VFS.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I don't like duplicating the connected and list_empty tests in fuse_dev_readv,
but this seemed cleaner than adding the f_flags test to request_wait.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds asynchronous notification to FUSE - a FUSE server can request
O_ASYNC on a /dev/fuse file descriptor and receive SIGIO when there is input
available.
One subtlety - fuse_dev_fasync, which is called when O_ASYNC is requested,
does no locking, unlink the other methods. I think it's unnecessary, as the
fuse_conn.fasync list is manipulated only by fasync_helper and kill_fasync,
which provide their own locking. It would also be wrong to use the fuse_lock,
as it's a spin lock and fasync_helper can sleep. My one concern with this is
the fuse_conn going away underneath fuse_dev_fasync - sys_fcntl takes a
reference on the file struct, so this seems not to be a problem.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fuse_dev_poll() returned an error value instead of a poll mask. Luckily (or
unluckily) -ENODEV does contain the POLLERR bit.
There's also a race if filesystem is unmounted between fuse_get_conn() and
spin_lock(), in which case this event will be missed by poll().
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
During heavy parallel filesystem activity it was possible to Oops the kernel.
The reason is that read_cache_pages() could skip pages which have already been
inserted into the cache by another task. Occasionally this may result in zero
pages actually being sent, while fuse_send_readpages() relies on at least one
page being in the request.
So check this corner case and just free the request instead of trying to send
it.
Reported and tested by Konstantin Isakov.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The spufs file system creates files in a directory before instantiating the
directory itself, which causes a NULL pointer access in
inotify_d_instantiate since c32ccd87bf.
I'd like to keep this behavior since it means that the user will not have
access to files in the directory before I know that I succeed in creating
everything in it. This patch adds a simple check for the inode to keep
that working.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Everybody seems to be using /proc/vmcore as a method to access the kernel
crash dump. Hence probably it makes sense to enable CONFIG_PROC_VMCORE by
default if CONFIG_CRASH_DUMP is selected. This makes kdump configuration
further easier for a user.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The only record we have of the real-time age of a process, regardless of
execs it's done, is start_time. When a non-leader thread exec, the
original start_time of the process is lost. Things looking at the
real-time age of the process are fooled, for example the process accounting
record when the process finally dies. This change makes the oldest
start_time stick around with the process after a non-leader exec. This way
the association between PID and start_time is kept constant, which seems
correct to me.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As reported by Michael Kerrisk, POLLRDHUP handling was not consistent
between epoll and poll/select, since in epoll it was unmaskeable. This
patch brings uniformity in POLLRDHUP handling.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of /proc/vmcore data structures overflow with 32bit systems having
memory more than 4G. This patch fixes those.
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If SELECT_STACK_ALLOC is not a multiple of sizeof(long) then stack_fds[]
would be shorter than SELECT_STACK_ALLOC bytes and could overflow later in
the function. Fixed by simply rearranging the test later to work on
sizeof(stack_fds) Currently SELECT_STACK_ALLOC is 256 so this doesn't
happen, but it's nasty to have things like this hidden in the code. What
if later someone decides to change SELECT_STACK_ALLOC to 300?
Signed-off-by: Mitchell Blank Jr <mitch@sfgoth.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Handle a failing sget() in v9fs_get_sb().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The mnt_flags are propagated into do_loopback(), so that they can be stored
with the vfsmount
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ulrich suggested that the `flags' arg to sync_file_range() become unsigned.
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce GFP_NOWAIT, as an alias for GFP_ATOMIC & ~__GFP_HIGH.
This also changes XFS, which is the only in-tree user of this idiom that I
could find. The XFS piece is compile-tested only.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Acked-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fs/select.c: In function `core_sys_select':
fs/select.c:339: warning: assignment from incompatible pointer type
fs/select.c:376: warning: comparison of distinct pointer types lacks a cast
By using a void* we can remove lots of casts rather than adding more.
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- capitalize consistently
- end sentences in one way or another
- update comment text to match the implementation
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
Also corrects a few comments. Patch mainly from Ingo, changes by me.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
- Kill the local variables that cache ->nrbufs, they just take up space.
- Only set do_wakeup for a real pipe. This is a big win for direct splicing.
- Kill i_mutex lock around ->f_pos update, regular io paths don't do this
either.
Signed-off-by: Jens Axboe <axboe@suse.de>
Using find_get_page() is a lot faster than find_or_create_page(). This
gets splice a lot closer to sendfile() for fd -> socket transfers.
Signed-off-by: Jens Axboe <axboe@suse.de>
It's more efficient for sendfile() emulation. Basically we cache an
internal private pipe and just use that as the intermediate area for
pages. Direct splicing is not available from sys_splice(), it is only
meant to be used for sendfile() emulation.
Additional patch from Ingo Molnar to avoid the PIPE_BUFFERS loop at
exit for the normal fast path.
Signed-off-by: Jens Axboe <axboe@suse.de>
that have been unlinked, we may need to execute transactions during
reclaim. By the time the transaction has hit the disk, the linux inode and
xfs vnode may already have been freed so we can't reference them safely.
Use the known xfs inode state to determine if it is safe to reference the
vnode and linux inode during the unpin operation.
SGI-PV: 946321
SGI-Modid: xfs-linux-melb:xfs-kern:25687a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
millions of inodes cached and has sparse cluster population, removing
inodes from the cluster hash consumes excessive amounts of CPU time.
Reduce the CPU cost by making removal O(1) via use of a double linked list
for the hash chains.
SGI-PV: 951551
SGI-Modid: xfs-linux-melb:xfs-kern:25683a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
nonblock mode with the new IO path code (since 2.6.16).
SGI-PV: 951662
SGI-Modid: xfs-linux-melb:xfs-kern:25676a
Signed-off-by: Nathan Scott <nathans@sgi.com>
* 'upstream-linus' of git://oss.oracle.com/home/sourcebo/git/ocfs2:
[PATCH] CONFIGFS_FS must depend on SYSFS
[PATCH] Bogus NULL pointer check in fs/configfs/dir.c
ocfs2: Better I/O error handling in heartbeat
ocfs2: test and set teardown flag early in user_dlm_destroy_lock()
ocfs2: Handle the DLM_CANCELGRANT case in user_unlock_ast()
ocfs2: catch an invalid ast case in dlmfs
ocfs2: remove an overly aggressive BUG() in dlmfs
ocfs2: multi node truncate fix
Oleg Nesterov spotted two interesting bugs with the current de_thread
code. The simplest is a long standing double decrement of
__get_cpu_var(process_counts) in __unhash_process. Caused by
two processes exiting when only one was created.
The other is that since we no longer detach from the thread_group list
it is possible for do_each_thread when run under the tasklist_lock to
see the same task_struct twice. Once on the task list as a
thread_group_leader, and once on the thread list of another
thread.
The double appearance in do_each_thread can cause a double increment
of mm_core_waiters in zap_threads resulting in problems later on in
coredump_wait.
To remedy those two problems this patch takes the simple approach
of changing the old thread group leader into a child thread.
The only routine in release_task that cares is __unhash_process,
and it can be trivially seen that we handle cleaning up a
thread group leader properly.
Since de_thread doesn't change the pid of the exiting leader process
and instead shares it with the new leader process. I change
thread_group_leader to recognize group leadership based on the
group_leader field and not based on pids. This should also be
slightly cheaper then the existing thread_group_leader macro.
I performed a quick audit and I couldn't see any user of
thread_group_leader that cared about the difference.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes the a compile error with CONFIG_SYSFS=n
Configfs is creating, as a matter of policy, the /sys/kernel/config
mountpoint. This means it requires CONFIG_SYSFS.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We check the "group" pointer after we dereference it. This check is
bogus, as it cannot be NULL coming in.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
add optional input and output offsets to sys_splice(), for seekable file
descriptors:
asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
int fd_out, loff_t __user *off_out,
size_t len, unsigned int flags);
semantics are straightforward: f_pos will be updated with the offset
provided by user-space, before the splice transfer is about to begin.
Providing a NULL offset pointer means the existing f_pos will be used
(and updated in situ). Providing an offset for a pipe results in
-ESPIPE. Providing an invalid offset pointer results in -EFAULT.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <axboe@suse.de>
separate out the 'internal pipe object' abstraction, and make it
usable to splice. This cleans up and fixes several aspects of the
internal splice APIs and the pipe code:
- pipes: the allocation and freeing of pipe_inode_info is now more symmetric
and more streamlined with existing kernel practices.
- splice: small micro-optimization: less pointer dereferencing in splice
methods
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update XFS for the ->splice_read/->splice_write changes.
Signed-off-by: Jens Axboe <axboe@suse.de>
We don't want to call into the read-ahead logic unless we are at the
start of a page, _or_ we have multiple pages to read.
Signed-off-by: Jens Axboe <axboe@suse.de>
We can get to out: with a NULL page, which we probably
don't want to be calling page_cache_release() on.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
Propagate errors received in o2hb_bio_end_io() back to the heartbeat thread
so it can skip re-arming the timer.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Remove the code which attempted to catch it via dlmunlock() return status -
this never happens there.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Don't BUG() user_dlm_unblock_lock() on the absence of the USER_LOCK_BLOCKED
flag - this turns out to be a valid case. Make some of the related BUG()
statements print more useful information.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Fix ocfs2_truncate_file() so that it forces a truncate_inode_pages() on all
interested nodes in all cases of a truncate(), not just allocation change.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Originally from Nick Piggin, just adapted to the newer branch.
You can't check PageLRU without holding zone->lru_lock. The page
release code can get away with it only because the page refcount is 0 at
that point. Also, you can't reliably remove pages from the LRU unless
the refcount is 0. Ever.
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
Thanks to Andrew for the good explanation of why this is so. akpm writes:
If a page is under writeback and we remove it from pagecache, it's still
going to get written to disk. But the VFS no longer knows about that page,
nor that this page is about to modify disk blocks.
So there might be scenarios in which those
blocks-which-are-about-to-be-written-to get reused for something else.
When writeback completes, it'll scribble on those blocks.
This won't happen in ext2/ext3-style filesystems in normal mode because the
page has buffers and try_to_release_page() will fail.
But ext2 in nobh mode doesn't attach buffers at all - it just sticks the
page in a BIO, finds some new blocks, points the BIO at those blocks and
lets it rip.
While that write IO's in flight, someone could truncate the file. Truncate
won't block on the writeout because the page isn't in pagecache any more.
So truncate will the free the blocks from the file under the page's feet.
Then something else can reallocate those blocks. Then write data to them.
Now, the original write completes, corrupting the filesystem.
Signed-off-by: Jens Axboe <axboe@suse.de>
By cleaning up the writeback logic (killing write_one_page() and the manual
set_page_dirty()), we can get rid of ->stolen inside the pipe_buffer and
just keep it local in pipe_to_file().
This also adds dirty page balancing logic and O_SYNC handling.
Signed-off-by: Jens Axboe <axboe@suse.de>
Clear the entire range, and don't increment pidx or we keep filling
the same position again and again.
Thanks to KAMEZAWA Hiroyuki.
Signed-off-by: Jens Axboe <axboe@suse.de>
* git://oss.sgi.com:8090/oss/git/xfs-2.6:
[XFS] Provide XFS support for the splice syscall.
[XFS] Reenable write barriers by default.
[XFS] Make project quota enforcement return an error code consistent with
[XFS] Implement the silent parameter to fill_super, previously ignored.
[XFS] Cleanup comment to remove reference to obsoleted function
No one should be writing a PAGE_SIZE worth of data to a normal sysfs
file, so properly terminate the buffer.
Thanks to Al Viro for pointing out my supidity here.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (48 commits)
Documentation: fix minor kernel-doc warnings
BUG_ON() Conversion in drivers/net/
BUG_ON() Conversion in drivers/s390/net/lcs.c
BUG_ON() Conversion in mm/slab.c
BUG_ON() Conversion in mm/highmem.c
BUG_ON() Conversion in kernel/signal.c
BUG_ON() Conversion in kernel/signal.c
BUG_ON() Conversion in kernel/ptrace.c
BUG_ON() Conversion in ipc/shm.c
BUG_ON() Conversion in fs/freevxfs/
BUG_ON() Conversion in fs/udf/
BUG_ON() Conversion in fs/sysv/
BUG_ON() Conversion in fs/inode.c
BUG_ON() Conversion in fs/fcntl.c
BUG_ON() Conversion in fs/dquot.c
BUG_ON() Conversion in md/raid10.c
BUG_ON() Conversion in md/raid6main.c
BUG_ON() Conversion in md/raid5.c
Fix minor documentation typo
BFP->BPF in Documentation/networking/tuntap.txt
...
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch updates the comments to match the actual code.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
* master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] Fix typo in earlier cifs_unlink change and protect one
[CIFS] Incorrect signature sent on SMB Read
[CIFS] Fix unlink oops when indirectly called in rename error path
[CIFS] Fix two remaining coverity scan tool warnings.
[CIFS] Set correct lock type on new posix unlock call
[CIFS] Upate cifs change log
[CIFS] Fix slow oplock break response when mounts to different
[CIFS] Workaround various server bugs found in testing at connectathon
[CIFS] Allow fallback for setting file size to Procom SMB server when
[CIFS] Make POSIX CIFS Extensions SetFSInfo match exactly what we want
[CIFS] Move noisy debug message (triggerred by some older servers) from
[CIFS] Use correct pid on new cifs posix byte range lock call
[CIFS] Add posix (advisory) byte range locking support to cifs client
[CIFS] CIFS readdir perf optimizations part 1
[CIFS] Free small buffers earlier so we exceed the cifs
[CIFS] Fix large (ie over 64K for MaxCIFSBufSize) buffer case for wrapping
[CIFS] Convert remaining places in fs/cifs from
[CIFS] SessionSetup cleanup part 2
[CIFS] fix compile error (typo) and warning in cifssmb.c
[CIFS] Cleanup NTLMSSP session setup handling
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner, contains unlikely() and can better optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
this changes if() BUG(); constructs to BUG_ON() which is
cleaner and can better optimized away
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
extra path.
Since cifs_unlink can also be called from rename path and there
was one report of oops am making the extra check for null inode.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Fixes Samba bug 3621 and kernel.org bug 6147
For servers which require SMB/CIFS packet signing, we were sending the
wrong signature (all zeros) on SMB Read request. The new cifs routine
to do signatures across an iovec was not complete - and SMB Read, unlike
the new SMBWrite2, did not fall back to the older routine (ie use
SendReceive vs. the more efficient SendReceive2 ie used the older
cifs_sign_smb vs. the disabled cifs_sign_smb2) for calculating signatures.
This finishes up cifs_sign_smb2/cifs_calc_signature2 so that the callers
of SendReceive2 can get SMB/CIFS packet signatures.
Now that cifs_sign_smb2 is supported, we could start using it in
the write path but this smaller fix does not include the change
to use SMBWrite2 when signatures are required (which when enabled
will make more Writes more efficient and alloc less memory).
Currently Write2 is only used when signatures are not
required at the moment but after more testing we will enable
that as well).
Thanks to James Slepicka and Sam Flory for initial investigation.
Signed-off-by: Steve French <sfrench@us.ibm.com>
Commit 70674f95c0:
[PATCH] Optimize select/poll by putting small data sets on the stack
resulted in the poll stack being 4-byte aligned on 64-bit architectures,
causing misaligned accesses to elements in the array.
This patch fixes it by declaring the stack in terms of 'long' instead
of 'char'.
Force alignment of poll and select stacks to long to avoid unaligned
access on 64 bit architectures.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As announced, lookup_hash() can now become static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is very common to hash a dentry and then to call lookup. If we take fs
specific hash functions into account the full hash logic can get ugly.
Further full_name_hash as an inline function is almost 100 bytes on x86 so
having a non-inline choice in some cases can measurably decrease code size.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
proc_check_chroot() does the check in a very unintuitive way (keeping a
copy of the argument, then modifying the argument), and has uncommented
sideeffects.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sys_flock() currently has a race which can result in a double free in the
multi-thread case.
Thread 1 Thread 2
sys_flock(file, LOCK_EX)
sys_flock(file, LOCK_UN)
If Thread 2 removes the lock from inode->i_lock before Thread 1 tests for
list_empty(&lock->fl_link) at the end of sys_flock, then both threads will
end up calling locks_free_lock for the same lock.
Fix is to make flock_lock_file() do the same as posix_lock_file(), namely
to make a copy of the request, so that the caller can always free the lock.
This also has the side-effect of fixing up a reference problem in the
lockd handling of flock.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
IN_DELETE events are no longer generated for the removal of a file from a
watched directory.
This seems to be a result of clearing DCACHE_INOTIFY_PARENT_WATCHED in
d_delete() directly before calling fsnotify_nameremove().
Assuming the flag doesn't need to be cleared before dentry_iput(), this
should do the trick.
Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Acked-by: Robert Love <rml@novell.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since these names on old MSDOS is used as device, so, current fat driver
doesn't allow a user to create those names. But many OSes and even Windows
can create those names actually, now.
This patch removes the reserved name check.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:
- It's more flexible. Things which would require two or three syscalls with
fadvise() can be done in a single syscall.
- Using fadvise() in this manner is something not covered by POSIX.
The patch wires up the syscall for x86.
The sycall is implemented in the new fs/sync.c. The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.
Documentation for the syscall is in fs/sync.c.
A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.
The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."
Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested. This is trivial to fix: add a new flag bit, set
wbc->nonblocking. But I'm not sure that we want to expose implementation
details down to that level.
Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).
Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines. It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such
requests fail...
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make baby-simple the code for /proc/devices. Based on the proven design
for /proc/interrupts.
This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:
# dd if=/proc/devices bs=1
Character devices:
1 mem
27+0 records in
27+0 records out
This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.
[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty <joe.korty@ccur.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
__user annotations (hppfs)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Woe be unto he who builds their filesystems as modules.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
[ Obscure quote from the infamous geek bible? ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>