* 'for-linus' of git://linux-nfs.org/~bfields/linux: (100 commits)
SUNRPC: RPC program information is stored in unsigned integers
SUNRPC: Move exported symbol definitions after function declaration part 2
NLM: tear down RPC clients in nlm_shutdown_hosts
SUNRPC: spin svc_rqst initialization to its own function
nfsd: more careful input validation in nfsctl write methods
lockd: minor log message fix
knfsd: don't bother mapping putrootfh enoent to eperm
rdma: makefile
rdma: ONCRPC RDMA protocol marshalling
rdma: SVCRDMA sendto
rdma: SVCRDMA recvfrom
rdma: SVCRDMA Core Transport Services
rdma: SVCRDMA Transport Module
rdma: SVCRMDA Header File
svc: Add svc_xprt_names service to replace svc_sock_names
knfsd: Support adding transports by writing portlist file
svc: Add svc API that queries for a transport instance
svc: Add /proc/sys/sunrpc/transport files
svc: Add transport hdr size for defer/revisit
svc: Move the xprt independent code to the svc_xprt.c file
...
It's possible for a RPC to outlive the lockd daemon that created it, so
we need to make sure that all RPC's are killed when lockd is coming
down. When nlm_shutdown_hosts is called, kill off all RPC tasks
associated with the host. Since we need to wait until they have all gone
away, we might as well just shut down the RPC client altogether.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Neil Brown points out that we're checking buf[size-1] in a couple places
without first checking whether size is zero.
Actually, given the implementation of simple_transaction_get(), buf[-1]
is zero, so in both of these cases the subsequent check of the value of
buf[size-1] will catch this case.
But it seems fragile to depend on that, so add explicit checks for this
case.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: NeilBrown <neilb@suse.de>
Neither EPERM and ENOENT map to valid errors for PUTROOTFH according to
rfc 3530, and, if anything, ENOENT is likely to be slightly more
informative; so don't bother mapping ENOENT to EPERM. (Probably this
was originally done because one likely cause was that there is an fsid=0
export but that it isn't permitted to this particular client. Now that
we allow WRONGSEC returns, this is somewhat less likely.)
In the long term we should work to make this situation less likely,
perhaps by turning off nfsv4 service entirely in the absence of the
pseudofs root, or constructing a pseudofilesystem root ourselves in the
kernel as necessary.
Thanks to Benny Halevy <bhalevy@panasas.com> for pointing out this
problem.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Benny Halevy <bhalevy@panasas.com>
Create a transport independent version of the svc_sock_names function.
The toclose capability of the svc_sock_names service can be implemented
using the svc_xprt_find and svc_xprt_close services.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:
<transport_name><space><port number>
For example:
echo "tcp 2049" > /proc/fs/nfsd/portlist
This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.
Transports can also be removed as follows:
'-'<transport_name><space><port number>
For example:
echo "-tcp 2049" > /proc/fs/nfsd/portlist
Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".
Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.
Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.
The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.
This code races with module removal and transport addition.
Thanks to Simon Holm Thøgersen for a compile fix.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
Modify the various kernel RPC svcs to use the svc_create_xprt service.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Fix nlm_block leak for the case of supplied blocking lock info.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Document these checks a little better and inline, as suggested by Neil
Brown (note both functions have two callers). Remove an obviously bogus
check while we're there (checking whether unsigned value is negative).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Neil Brown <neilb@suse.de>
The server silently ignores attempts to set the uid and gid on create.
Based on the comment, this appears to have been done to prevent some
overly-clever IRIX client from causing itself problems.
Perhaps we should remove that hack completely. For now, at least, it
makes sense to allow root (when no_root_squash is set) to set uid and
gid.
While we're there, since nfsd_create and nfsd_create_v3 share the same
logic, pull that out into a separate function. And spell out the
individual modifications of ia_valid instead of doing them both at once
inside a conditional.
Thanks to Roger Willcocks <roger@filmlight.ltd.uk> for the bug report
and original patch on which this is based.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Without the patch, there is a leakage of nlmblock structure refcount
that holds a reference nlmfile structure, that holds a reference to
struct file, when async GETFL is used (-EINPROGRESS return from
file_ops->lock()), and also in some error cases.
Fix up a style nit while we're here.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This patch addresses a compatibility issue with a Linux NFS server and
AIX NFS client.
I have exported /export as fsid=0 with sec=krb5:krb5i
I have mount --bind /home onto /export/home
I have exported /export/home with sec=krb5i
The AIX client mounts / -o sec=krb5:krb5i onto /mnt
If I do an ls /mnt, the AIX client gets a permission error. Looking at
the network traceIwe see a READDIR looking for attributes
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID. The response gives a
NFS4ERR_WRONGSEC which the AIX client is not expecting.
Since the AIX client is only asking for an attribute that is an
attribute of the parent file system (pseudo root in my example), it
seems reasonable that there should not be an error.
In discussing this issue with Bruce Fields, I initially proposed
ignoring the error in nfsd4_encode_dirent_fattr() if all that was being
asked for was FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID, however,
Bruce suggested that we avoid calling cross_mnt() if only these
attributes are requested.
The following patch implements bypassing cross_mnt() if only
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID are called. Since there
is some complexity in the code in nfsd4_encode_fattr(), I didn't want to
duplicate code (and introduce a maintenance nightmare), so I added a
parameter to nfsd4_encode_fattr() that indicates whether it should
ignore cross mounts and simply fill in the attribute using the passed in
dentry as opposed to it's parent.
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The failure to return a stateowner from nfs4_preprocess_seqid_op() means
in the case where a lock request is of a type incompatible with an open
(due to, e.g., an application attempting a write lock on a file open for
read), means that fs/nfsd/nfs4xdr.c:ENCODE_SEQID_OP_TAIL() never bumps
the seqid as it should. The client, attempting to close the file
afterwards, then gets an (incorrect) bad sequence id error. Worse, this
prevents the open file from ever being closed, so we leak state.
Thanks to Benny Halevy and Trond Myklebust for analysis, and to Steven
Wilton for the report and extensive data-gathering.
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Steven Wilton <steven.wilton@team.eftel.com.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
In a number of places where we wish only to translate nlm_drop_reply to
rpc_drop_reply errors we instead return early with rpc_drop_reply,
skipping some important end-of-function cleanup.
This results in reference count leaks when lockd is doing posix locking
on GFS2.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
When the callback channel fails, we inform the client of that by
returning a cb_path_down error the next time it tries to renew its
lease.
If we wait most of a lease period before deciding that a callback has
failed and that the callback channel is down, then we decrease the
chances that the client will find out in time to do anything about it.
So, mark the channel down as soon as we recognize that an rpc has
failed. However, continue trying to recall delegations anyway, in hopes
it will come back up. This will prevent more delegations from being
given out, and ensure cb_path_down is returned to renew calls earlier,
while still making the best effort to deliver recalls of existing
delegations.
Also fix a couple comments and remove a dprink that doesn't seem likely
to be useful.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Declare this variable in the one function where it's used, and clean up
some minor style problems.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We generate a unique cl_confirm for every new client; so if we've
already checked that this cl_confirm agrees with the cl_confirm of
unconf, then we already know that it does not agree with the cl_confirm
of conf.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Again, the only way conf and unconf can have the same clientid is if
they were created in the "probable callback update" case of setclientid,
in which case we already know that the cl_verifier fields must agree.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If conf and unconf are both found in the lookup by cl_clientid, then
they share the same cl_clientid. We always create a unique new
cl_clientid field when creating a new client--the only exception is the
"probable callback update" case in setclientid, where we copy the old
cl_clientid from another clientid with the same name.
Therefore two clients with the same cl_client field also always share
the same cl_name field, and a couple of the checks here are redundant.
Thanks to Simon Holm Thøgersen for a compile fix.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
Using a counter instead of the nanoseconds value seems more likely to
produce a unique cl_confirm.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We're supposed to generate a different cl_confirm verifier for each new
client, so these to cl_confirm values should never be the same.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Most of these comments just summarize the code.
The matching of code to the cases described in the RFC may still be
useful, though; add specific section references to make that easier to
follow. Also update references to the outdated RFC 3010.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
While we're here, let's remove the redundant (and now wrong) pathname in
the comment, and the #ifdef __KERNEL__'s.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This header is used only in a few places in fs/nfsd, so there seems to
be little point to having it in include/. (Thanks to Robert Day for
pointing this out.)
Cc: Robert P. J. Day <rpjday@crashcourse.ca>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Newer server features such as nfsv4 and gss depend on proc to work, so a
failure to initialize the proc files they need should be treated as
fatal.
Thanks to Andrew Morton for style fix and compile fix in case where
CONFIG_NFSD_V4 is undefined.
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
I assume the reason failure of creation was ignored here was just to
continue support embedded systems that want nfsd but not proc.
However, in cases where proc is supported it would be clearer to fail
entirely than to come up with some features disabled.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The server depends on upcalls under /proc to support nfsv4 and gss.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
There's really nothing much the caller can do if cache unregistration
fails. And indeed, all any caller does in this case is print an error
and continue. So just return void and move the printk's inside
cache_unregister.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If the reply cache initialization fails due to a kmalloc failure,
currently we try to soldier on with a reduced (or nonexistant) reply
cache.
Better to just fail immediately: the failure is then much easier to
understand and debug, and it could save us complexity in some later
code. (But actually, it doesn't help currently because the cache is
also turned off in some odd failure cases; we should probably find a
better way to handle those failure cases some day.)
Fix some minor style problems while we're at it, and rename
nfsd_cache_init() to remove the need for a comment describing it.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Handle the failure case here with something closer to the standard
kernel style.
Doesn't really matter for now, but I'd like to add a few more failure
cases, and then this'll help.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We forgot to shut down the nfs4 state and idmapping code in this case.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The length "nbytes" passed into read_buf should never be negative, but
we check only for too-large values of "nbytes", not for too-small
values. Make nbytes unsigned, so it's clear that the former tests are
sufficient. (Despite this read_buf() currently correctly returns an xdr
error in the case of a negative length, thanks to an unsigned
comparison with size_of() and bounds-checking in kmalloc(). This seems
very fragile, though.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: path name lengths are unsigned on the wire, negative lengths
are not meaningful natively either.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: adjust the sign of the length argument of nfsd_lookup and
nfsd_lookup_dentry, for consistency with recent changes. NFSD version
4 callers already pass an unsigned file name length.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: file name lengths are unsigned on the wire, negative lengths
are not meaningful natively either.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
According to The Open Group's NLM specification, NLM callers are variable
length strings. XDR variable length strings use an unsigned 32 bit length.
And internally, negative string lengths are not meaningful for the Linux
NLM implementation.
Clean up: Make nlm_lock.len and nlm_reboot.len unsigned integers. This
makes the sign of NLM string lengths consistent with the sign of xdr_netobj
lengths.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-By: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Obviously at some point we thought "error" represented the length when
positive. This appears to be a long-standing typo.
Thanks to Prasad Potluri <pvp@us.ibm.com> for finding the problem and
proposing an earlier version of this patch.
Cc: Steve French <smfltc@us.ibm.com>
Cc: Prasad V Potluri <pvp@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Dereferenced pointer "dentry" without checking and assigned to inode
in the declaration.
(We could just delete the NULL checks that follow instead, as we never
get to the encode function in this particular case. But it takes a
little detective work to verify that fact, so it's probably safer to
leave the checks in place.)
Cc: Steve French <smfltc@us.ibm.com>
Signed-off-by: Prasad V Potluri <pvp@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The whole reason to move this callback-channel probe into a separate
thread was because (for now) we don't have an easy way to create the
rpc_client asynchronously. But I forgot to move the rpc_create() to the
spawned thread. Doh! Fix that.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Our callback code doesn't actually handle concurrent attempts to probe
the callback channel. Some rethinking of the locking may be required.
However, we can also just move the callback probing to this case. Since
this is the only time a client is "confirmed" (and since that can only
happen once in the lifetime of a client), this ensures we only probe
once.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Commit block was intended to have several copies of the header. But
due to a bug it never had them and actually, nobody checks that. So
just remove the useless loop.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The buffer head pointer passed to journal_wait_on_commit_record() could
be NULL if the previous journal_submit_commit_record() failed or journal
has already aborted.
Looking at the jbd2 debug messages, before the oops happened, the jbd2
is aborted due to trying to access the next log block beyond the end
of device. This might be caused by using a corrupted image.
We need to check the error returns from journal_submit_commit_record()
and avoid calling journal_wait_on_commit_record() in the failure case.
This addresses Kernel Bugzilla #9849
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c: In function 'hostfs_show_options':
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/hostfs/hostfs_kern.c:328: error: dereferencing pointer to incomplete type
We need to include mount.h to get vfsmount.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Reported-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert commit c6caeb7c45 ("proc: fix the
threaded /proc/self"), since Eric says "The patch really is wrong.
There is at least one corner case in procps that cares."
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Guillaume Chazarain" <guichaz@yahoo.fr>
Cc: "Pavel Emelyanov" <xemul@openvz.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: add __init and __exit marks to init and exit functions
dlm: eliminate astparam type casting
dlm: proper types for asts and basts
dlm: dlm/user.c input validation fixes
dlm: fix dlm_dir_lookup() handling of too long names
dlm: fix overflows when copying from ->m_extra to lvb
dlm: make find_rsb() fail gracefully when namelen is too large
dlm: receive_rcom_lock_args() overflow check
dlm: verify that places expecting rcom_lock have packet long enough
dlm: validate data in dlm_recover_directory()
dlm: missing length check in check_config()
dlm: use proper type for ->ls_recover_buf
dlm: do not byteswap rcom_config
dlm: do not byteswap rcom_lock
dlm: dlm_process_incoming_buffer() fixes
dlm: use proper C for dlm/requestqueue stuff (and fix alignment bug)
vmsplice_to_user() must always check the user pointer and length
with access_ok() before copying. Likewise, for the slow path of
copy_from_user_mmap_sem() we need to check that we may read from
the user region.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Wojciech Purczynski <cliph@research.coseinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Turn off quotas before filesystem is remounted read only. Otherwise quota
will try to write to read-only filesystem which does no good... We could
also just refuse to remount ro when quota is enabled but turning quota off
is consistent with what we do on umount.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some devices - notably dm and md - can change their behaviour in response
to BIO_RW_BARRIER requests. They might start out accepting such requests
but on reconfiguration, they find out that they cannot any more.
ext3 (and other filesystems) deal with this by always testing if
BIO_RW_BARRIER requests fail with EOPNOTSUPP, and retrying the write
requests without the barrier (probably after waiting for any pending writes
to complete).
However there is a bug in the handling for this for ext3.
When ext3 (jbd actually) decides to submit a BIO_RW_BARRIER request, it
sets the buffer_ordered flag on the buffer head. If the request completes
successfully, the flag STAYS SET.
Other code might then write the same buffer_head after the device has been
reconfigured to not accept barriers. This write will then fail, but the
"other code" is not ready to handle EOPNOTSUPP errors and the error will be
treated as fatal.
This can be seen without having to reconfigure a device at exactly the
wrong time by putting:
if (buffer_ordered(bh))
printk("OH DEAR, and ordered buffer\n");
in the while loop in "commit phase 5" of journal_commit_transaction.
If it ever prints the "OH DEAR ..." message (as it does sometimes for
me), then that request could (in different circumstances) have failed
with EOPNOTSUPP, but that isn't tested for.
My proposed fix is to clear the buffer_ordered flag after it has been
used, as in the following patch.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
do_mount() uses a whopping 616 bytes of stack on x86_64 in 2.6.24-mm1,
largely thanks to gcc inlining the various helper functions.
noinlining these can slim it down a lot; on my box this patch gets it down
to 168, which is mostly the struct nameidata nd; left on the stack.
These functions are called only as do_mount() helpers; none of them should
be in any path that would see a performance benefit from inlining...
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is an outdated comment in serial_core.c also fixed.
Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two possible races in handling of private_list in buffer cache.
1) When fsync_buffers_list() processes a private_list, it clears
b_assoc_mapping and moves buffer to its private list. Now
drop_buffers() comes, sees a buffer is on list so it calls
__remove_assoc_queue() which complains about b_assoc_mapping being
cleared (as it cannot propagate possible IO error). This race has been
actually observed in the wild.
2) When fsync_buffers_list() processes a private_list,
mark_buffer_dirty_inode() can be called on bh which is already on the
private list of fsync_buffers_list(). As buffer is on some list (note
that the check is performed without private_lock), it is not readded to
the mapping's private_list and after fsync_buffers_list() finishes, we
have a dirty buffer which should be on private_list but it isn't. This
race has not been reported, probably because most (but not all) callers
of mark_buffer_dirty_inode() hold i_mutex and thus are serialized with
fsync().
Fix these issues by not clearing b_assoc_map when fsync_buffers_list()
moves buffer to a dedicated list and by reinserting buffer in private_list
when it is found dirty after we have submitted buffer for IO. We also
change the tests whether a buffer is on a private list from
!list_empty(&bh->b_assoc_buffers) to bh->b_assoc_map so that they are
single word reads and hence lockless checks are safe.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Following the deprecation schedule the a.out ELF interpreter support
is removed now with this patch. a.out ELF interpreters were an transition
feature for moving a.out systems to ELF, but they're unlikely to be still
needed. Pure a.out systems will still work of course. This allows to
simplify the hairy ELF loader.
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to udf.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to reiserfs.
Use generic_show_options() and save the complete option string in
reiserfs_fill_super() and reiserfs_remount().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to ncpfs.
Small fix: add FS_BINARY_MOUNTDATA to the filesystem type flags, since
it can take binary data, as well as text (similarly to NFS).
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to isofs.
Use generic_show_options() and save the complete option string in
isofs_fill_super().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to hugetlbfs.
Use generic_show_options() and save the complete option string in
hugetlbfs_fill_super().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Ken Chen <kenchen@google.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to hpfs.
Use generic_show_options() and save the complete option string in
hpfs_fill_super() and hpfs_remount_fs().
Also add a small fix: hpfs_remount_fs() should return -EINVAL on
error, instead of 1, which is not an error value.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the "host path" option to /proc/mounts for UML hostfs filesystems.
The mount source (mnt_devname) should really be used for this, but not
easy to change now in a backward compatible way.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add noreservation option to /proc/mounts for ext2 filesystems.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to devpts.
Small cleanup: when parsing the "mode" option, mask with S_IALLUGO
instead of ~S_IFMT.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to befs.
Use generic_show_options() and save the complete option string in
befs_fill_super().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Sergey S. Kostyliov <rathamahata@php4.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to autofs.
Use generic_show_options() and save the complete option string in
autofs_fill_super().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add uid= and gid= options to /proc/mounts for autofs4 filesystems.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to afs.
Use generic_show_options() and save the complete option string in
afs_get_sb().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to affs.
Use generic_show_options() and save the complete option string in
affs_fill_super() and affs_remount().
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a .show_options super operation to adfs.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a new s_options field to struct super_block. Filesystems can save
mount options passed to them in mount or remount. It is automatically
freed when the superblock is destroyed.
A new helper function, generic_show_options() is introduced, which uses
this field to display the mount options in /proc/mounts.
Another helper function, save_mount_options() may be used by
filesystems to save the options in the super block.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per previous discussions about cleaning up ufs_fs.h, people just want
this straight up dropped from userspace export. The only remaining
consumer (silo) has been fixed a while ago to not rely on this header.
This allows use to move it completely from include/linux/ to fs/ufs/
seeing as how the only in-kernel consumer is fs/ufs/.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement dmode option for iso9660 filesystem to allow setting of access
rights for directories on the filesystem.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: "Ilya N. Golubev" <gin@mo.msk.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These exports (which aren't used and which are in fact dangerous to use
because they pretty much form a security hole to use) have been marked
_UNUSED since 2.6.24 with removal in 2.6.25. This patch is their final
departure from the Linux kernel tree.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/afs/security.c: In function 'afs_permission':
fs/afs/security.c:290: warning: 'access' may be used uninitialized in this function
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/hfsplus/unicode.c: In function 'hfsplus_hash_dentry':
fs/hfsplus/unicode.c:328: warning: 'dsize' may be used uninitialized in this function
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When adding directory entry to a directory, we have to properly increase
length of the last extent. Handle this similarly as extending regular files -
make extents always have size multiple of block size (it will be truncated
down to proper size in udf_clear_inode()).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Position in directory returned by readdir is offset of directory entry divided
by four (don't ask me why). Make this conversion only when reading f_pos from
userspace / writing it there and internally work in bytes. It makes things
more easily readable and also fixes a bug (we forgot to divide length of the
entry by 4 when advancing f_pos in udf_add_entry()).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix udf_clear_inode() to request asynchronous writeout in icache reclaim
path.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse generated:
fs/udf/namei.c:896:15: originally declared here
fs/udf/namei.c:1147:41: warning: incorrect type in argument 3 (different signedness)
fs/udf/namei.c:1147:41: expected int *offset
fs/udf/namei.c:1147:41: got unsigned int *<noident>
fs/udf/namei.c:1152:78: warning: incorrect type in argument 3 (different signedness)
fs/udf/namei.c:1152:78: expected int *offset
fs/udf/namei.c:1152:78: got unsigned int *<noident>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse generated:
fs/udf/inode.c:324:41: warning: incorrect type in argument 4 (different signedness)
fs/udf/inode.c:324:41: expected long *<noident>
fs/udf/inode.c:324:41: got unsigned long *<noident>
inode_getblk always set 4th argument to uint32_t value
3rd parameter of map_bh is sector_t (which is unsigned long or u64)
so convert phys value to sector_t
fs/udf/inode.c:1818:47: warning: incorrect type in argument 3 (different signedness)
fs/udf/inode.c:1818:47: expected int *<noident>
fs/udf/inode.c:1818:47: got unsigned int *<noident>
fs/udf/inode.c:1826:46: warning: incorrect type in argument 3 (different signedness)
fs/udf/inode.c:1826:46: expected int *<noident>
fs/udf/inode.c:1826:46: got unsigned int *<noident>
udf_get_filelongad and udf_get_shortad are called always for uint32_t
values (struct extent_position->offset), so it's safe to convert offset
parameter to uint32_t
gcc warned:
fs/udf/inode.c: In function 'udf_get_block':
fs/udf/inode.c:299: warning: 'phys' may be used uninitialized in this function
initialize it to 0 (if someday someone will break inode_getblk we will catch it immediately)
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Ben Fennema <bfennema@falcon.csc.calpoly.edu>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse generated:
fs/udf/dir.c:78:5: warning: symbol 'udf_readdir' was not declared. Should it be static?
there are 2 different prototypes of udf_readdir - remove them and move
code around to make it still compile
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Printing date and version of a driver makes sense if there's a maintainer
who's maintaining and using these, but printing ancient version information
only confuses users.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cache UDF_I(struct inode *) return values when there are
at least 2 uses in one function
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>