Commit Graph

762 Commits

Author SHA1 Message Date
Eric Van Hensbergen
93fa58cb83 [PATCH] v9fs: Documentation, Makefiles, Configuration
OVERVIEW

V9FS is a distributed file system for Linux which provides an
implementation of the Plan 9 resource sharing protocol 9P.  It can be
used to share all sorts of resources: static files, synthetic file servers
(such as /proc or /sys), devices, and application file servers (such as
FUSE).

BACKGROUND

Plan 9 (http://plan9.bell-labs.com/plan9) is a research operating
system and associated applications suite developed by the Computing
Science Research Center of AT&T Bell Laboratories (now a part of
Lucent Technologies), the same group that developed UNIX , C, and C++.
Plan 9 was initially released in 1993 to universities, and then made
generally available in 1995. Its core operating systems code laid the
foundation for the Inferno Operating System released as a product by
Lucent Bell-Labs in 1997. The Inferno venture was the only commercial
embodiment of Plan 9 and is currently maintained as a product by Vita
Nuova (http://www.vitanuova.com). After updated releases in 2000 and
2002, Plan 9 was open-sourced under the OSI approved Lucent Public
License in 2003.

The Plan 9 project was started by Ken Thompson and Rob Pike in 1985.
Their intent was to explore potential solutions to some of the
shortcomings of UNIX in the face of the widespread use of high-speed
networks to connect machines. In UNIX, networking was an afterthought
and UNIX clusters became little more than a network of stand-alone
systems. Plan 9 was designed from first principles as a seamless
distributed system with integrated secure network resource sharing.
Applications and services were architected in such a way as to allow
for implicit distribution across a cluster of systems. Configuring an
environment to use remote application components or services in place
of their local equivalent could be achieved with a few simple command
line instructions. For the most part, application implementations
operated independent of the location of their actual resources.

Commercial operating systems haven't changed much in the 20 years
since Plan 9 was conceived. Network and distributed systems support is
provided by a patchwork of middle-ware, with an endless number of
packages supplying pieces of the puzzle. Matters are complicated by
the use of different complicated protocols for individual services,
and separate implementations for kernel and application resources.
The V9FS project (http://v9fs.sourceforge.net) is an attempt to bring
Plan 9's unified approach to resource sharing to Linux and other
operating systems via support for the 9P2000 resource sharing
protocol.

V9FS HISTORY

V9FS was originally developed by Ron Minnich and Maya Gokhale at Los
Alamos National Labs (LANL) in 1997.  In November of 2001, Greg Watson
setup a SourceForge project as a public repository for the code which
supported the Linux 2.4 kernel.

About a year ago, I picked up the initial attempt Ron Minnich had
made to provide 2.6 support and got the code integrated into a 2.6.5
kernel.   I then went through a line-for-line re-write attempting to
clean-up the code while more closely following the Linux Kernel style
guidelines.  I co-authored a paper with Ron Minnich on the V9FS Linux
support including performance comparisons to NFSv3 using Bonnie and
PostMark - this paper appeared at the USENIX/FREENIX 2005
conference in April 2005:
( http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html ).

CALL FOR PARTICIPATION/REQUEST FOR COMMENTS

Our 2.6 kernel support is stabilizing and we'd like to begin pursuing
its integration into the official kernel tree.  We would appreciate any
review, comments, critiques, and additions from this community and are
actively seeking people to join our project and help us produce
something that would be acceptable and useful to the Linux community.

STATUS

The code is reasonably stable, although there are no doubt corner cases
our regression tests haven't discovered yet.  It is in regular use by several
of the developers and has been tested on x86 and PowerPC
(32-bit and 64-bit) in both small and large (LANL cluster) deployments.
Our current regression tests include fsx, bonnie, and postmark.

It was our intention to keep things as simple as possible for this
release -- trying to focus on correctness within the core of the
protocol support versus a rich set of features.  For example: a more
complete security model and cache layer are in the road map, but
excluded from this release.   Additionally, we have removed support for
mmap operations at Al Viro's request.

PERFORMANCE

Detailed performance numbers and analysis are included in the FREENIX
paper, but we show comparable performance to NFSv3 for large file
operations based on the Bonnie benchmark, and superior performance for
many small file operations based on the PostMark benchmark.   Somewhat
preliminary graphs (from the FREENIX paper) are available
(http://v9fs.sourceforge.net/perf/index.html).

RESOURCES

The source code is available in a few different forms:

tarballs: http://v9fs.sf.net
CVSweb: http://cvs.sourceforge.net/viewcvs.py/v9fs/linux-9p/
CVS: :pserver:anonymous@cvs.sourceforge.net:/cvsroot/v9fs/linux-9p
Git: rsync://v9fs.graverobber.org/v9fs (webgit: http://v9fs.graverobber.org)
9P: tcp!v9fs.graverobber.org!6564

The user-level server is available from either the Plan 9 distribution
or from http://v9fs.sf.net
Other support applications are still being developed, but preliminary
version can be downloaded from sourceforge.

Documentation on the protocol has historically been the Plan 9 Man
pages (http://plan9.bell-labs.com/sys/man/5/INDEX.html), but there is
an effort under way to write a more complete Internet-Draft style
specification (http://v9fs.sf.net/rfc).

There are a couple of mailing lists supporting v9fs, but the most used
is v9fs-developer@lists.sourceforge.net -- please direct/cc your
comments there so the other v9fs contibutors can participate in the
conversation.  There is also an IRC channel: irc://freenode.net/#v9fs

This part of the patch contains Documentation, Makefiles, and configuration
file changes.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:56 -07:00
Dipankar Sarma
b835996f62 [PATCH] files: lock-free fd look-up
With the use of RCU in files structure, the look-up of files using fds can now
be lock-free.  The lookup is protected by rcu_read_lock()/rcu_read_unlock().
This patch changes the readers to use lock-free lookup.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:55 -07:00
Dipankar Sarma
ab2af1f500 [PATCH] files: files struct with RCU
Patch to eliminate struct files_struct.file_lock spinlock on the reader side
and use rcu refcounting rcuref_xxx api for the f_count refcounter.  The
updates to the fdtable are done by allocating a new fdtable structure and
setting files->fdt to point to the new structure.  The fdtable structure is
protected by RCU thereby allowing lock-free lookup.  For fd arrays/sets that
are vmalloced, we use keventd to free them since RCU callbacks can't sleep.  A
global list of fdtable to be freed is not scalable, so we use a per-cpu list.
If keventd is already handling the current cpu's work, we use a timer to defer
queueing of that work.

Since the last publication, this patch has been re-written to avoid using
explicit memory barriers and use rcu_assign_pointer(), rcu_dereference()
premitives instead.  This required that the fd information is kept in a
separate structure (fdtable) and updated atomically.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:55 -07:00
Dipankar Sarma
badf16621c [PATCH] files: break up files struct
In order for the RCU to work, the file table array, sets and their sizes must
be updated atomically.  Instead of ensuring this through too many memory
barriers, we put the arrays and their sizes in a separate structure.  This
patch takes the first step of putting the file table elements in a separate
structure fdtable that is embedded withing files_struct.  It also changes all
the users to refer to the file table using files_fdtable() macro.  Subsequent
applciation of RCU becomes easier after this.

Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:55 -07:00
Benjamin LaHaise
ac0b1bc1ed [PATCH] aio: kiocb locking to serialise retry and cancel
Implement a per-kiocb lock to serialise retry operations and cancel.  This
is done using wait_on_bit_lock() on the KIF_LOCKED bit of kiocb->ki_flags.
Also, make the cancellation path lock the kiocb and subsequently release
all references to it if the cancel was successful.  This version includes a
fix for the deadlock with __aio_run_iocbs.

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:32 -07:00
Wendy Cheng
8f58202bf6 [PATCH] change io_cancel return code for no cancel case
Note that other than few exceptions, most of the current filesystem and/or
drivers do not have aio cancel specifically defined (kiob->ki_cancel field
is mostly NULL).  However, sys_io_cancel system call universally sets
return code to -EAGAIN.  This gives applications a wrong impression that
this call is implemented but just never works.  We have customer inquires
about this issue.

Changed by Benjamin LaHaise to EINVAL instead of ENOSYS

Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:32 -07:00
Andrew Stribblehill
fac92becda [PATCH] bfs: fix endianness, signedness; add trivial bugfix
* Makes BFS code endianness-clean.

* Fixes some signedness warnings.

* Fixes a problem in fs/bfs/inode.c:164 where inodes not synced to disk
  don't get fully marked as clean.  Here's how to reproduce it:

# mount -o loop -t bfs /bfs.img /mnt
# df -i /mnt
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/bfs.img                  48       1      47    3% /mnt
# df -k /mnt
Filesystem           1K-blocks      Used Available Use% Mounted on
/bfs.img                   512         5       508   1% /mnt
# cp 60k-archive.zip /mnt/mt.zip
# df -k /mnt
Filesystem           1K-blocks      Used Available Use% Mounted on
/bfs.img                   512        65       447  13% /mnt
# df -i /mnt
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/bfs.img                  48       2      46    5% /mnt
# rm /mnt/mt.zip
# echo $?
0

 [If the unlink happens before the buffers flush, the following happens:]

# df -i /mnt
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/bfs.img                  48       2      46    5% /mnt
# df -k /mnt
Filesystem           1K-blocks      Used Available Use% Mounted on
/bfs.img                   512        65       447  13% /mnt

 fs/bfs/bfs.h           |    1

Signed-off-by: Andrew Stribblehill <ads@wompom.org>
Cc: <tigran@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:32 -07:00
Alexander Krizhanovsky
f76baf9365 [PATCH] autofs: fix "busy inodes after umount..."
This patch for old autofs (version 3) cleans dentries which are not putted
after killing the automount daemon (it's analogue of recent patch for
autofs4).

Signed-off-by: Alexander Krizhanovsky <klx@yandex.ru>
Cc: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:31 -07:00
Stephen Smalley
e31e14ec35 [PATCH] remove the inode_post_link and inode_post_rename LSM hooks
This patch removes the inode_post_link and inode_post_rename LSM hooks as
they are unused (and likely useless).

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:28 -07:00
Stephen Smalley
a74574aafe [PATCH] Remove security_inode_post_create/mkdir/symlink/mknod hooks
This patch removes the inode_post_create/mkdir/mknod/symlink LSM hooks as
they are obsoleted by the new inode_init_security hook that enables atomic
inode security labeling.

If anyone sees any reason to retain these hooks, please speak now.  Also,
is anyone using the post_rename/link hooks; if not, those could also be
removed.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:28 -07:00
Stephen Smalley
ac50960afa [PATCH] ext3: Enable atomic inode security labeling
This patch modifies ext3 to call the inode_init_security LSM hook to obtain
the security attribute for a newly created inode and to set the resulting
attribute on the new inode as part of the same transaction.  This parallels
the existing processing for setting ACLs on newly created inodes.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:28 -07:00
Stephen Smalley
10f47e6a1b [PATCH] ext2: Enable atomic inode security labeling
This patch modifies ext2 to call the inode_init_security LSM hook to obtain
the security attribute for a newly created inode and to set the resulting
attribute on the new inode.  This parallels the existing processing for
setting ACLs on newly created inodes.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:27 -07:00
Mark Fasheh
fef266580e [PATCH] update filesystems for new delete_inode behavior
Update the file systems in fs/ implementing a delete_inode() callback to
call truncate_inode_pages().  One implementation note: In developing this
patch I put the calls to truncate_inode_pages() at the very top of those
filesystems delete_inode() callbacks in order to retain the previous
behavior.  I'm guessing that some of those could probably be optimized.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:27 -07:00
Mark Fasheh
e85b565233 [PATCH] move truncate_inode_pages() into ->delete_inode()
Allow file systems supporting ->delete_inode() to call
truncate_inode_pages() on their own.  OCFS2 wants this so it can query the
cluster before making a final decision on whether to wipe an inode from
disk or not.  In some corner cases an inode marked on the local node via
voting may not actually get orphaned.  A good example is node death before
the transaction moving the inode to the orphan dir commits to the journal.
Without this patch, the truncate_inode_pages() call in
generic_delete_inode() would discard valid data for such inodes.

During earlier discussion in the 2.6.13 merge plan thread, Christoph
Hellwig indicated that other file systems might also find this useful.

IMHO, the best solution would be to just allow ->drop_inode() to do the
cluster query but it seems that would require a substantial reworking of
that section of the code.  Assuming it is safe to call write_inode_now() in
ocfs2_delete_inode() for those inodes which won't actually get wiped, this
solution should get us by for now.

Trivial testing of this patch (and a related OCFS2 update) has shown this
to avoid the corruption I'm seeing.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 13:57:26 -07:00
viro@ZenIV.linux.org.uk
3f70353ea9 [PATCH] bogus cast in bio.c
<qualifier> void * is not the same as void <qualifier> *...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-09 10:31:58 -07:00
Anton Altaparmakov
ddcc959634 Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-09-09 08:19:10 -07:00
Nathan Scott
c9fc0d6a69 [XFS] Revert recent quota Makefile change, not in a fit state for merging.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-09 11:38:09 +10:00
Anton Altaparmakov
223176bc72 Merge branch 'master' of /usr/src/linux-2.6 2005-09-08 23:03:30 +01:00
Anton Altaparmakov
7d333d6c73 NTFS: 2.1.24 release and some minor final fixes.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 23:01:16 +01:00
Anton Altaparmakov
e604635c8b NTFS: Improve scalability by changing the driver global spin lock in
fs/ntfs/aops.c::ntfs_end_buffer_async_read() to a bit spin lock
      in the first buffer head of a page.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 22:13:02 +01:00
Anton Altaparmakov
a01ac532b5 NTFS: Fix page_has_buffers()/page_buffers() handling in fs/ntfs/aops.c.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 22:08:11 +01:00
Anton Altaparmakov
311120eca0 NTFS: Fixup handling of sparse, compressed, and encrypted attributes in
fs/ntfs/aops.c::ntfs_readpage().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 22:04:20 +01:00
Anton Altaparmakov
8273d5d4c2 NTFS: Fix fs/ntfs/aops.c::ntfs_{read,write}_block() to handle the case
where a concurrent truncate has truncated the runlist under our feet.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 22:00:33 +01:00
Anton Altaparmakov
54b02eb01c NTFS: Optimize fs/ntfs/aops.c::ntfs_write_block() by extending the page
lock protection over the buffer submission for i/o which allows the
      removal of the get_bh()/put_bh() pairs for each buffer.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:43:47 +01:00
Anton Altaparmakov
bd45fdd209 NTFS: Fixup handling of sparse, compressed, and encrypted attributes in
fs/ntfs/aops.c::ntfs_writepage().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:38:05 +01:00
Anton Altaparmakov
8dcdebafb8 NTFS: Make ntfs_write_block() not instantiate sparse blocks if they are zero.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:25:48 +01:00
Anton Altaparmakov
67bb103725 NTFS: Fixup handling of sparse, compressed, and encrypted attributes in
fs/ntfs/inode.c::ntfs_read_locked_{,attr_,index_}inode().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:19:45 +01:00
Anton Altaparmakov
1c7d469d47 NTFS: Truncate {a,c,m}time to the ntfs supported time granularity when
updating the times in the inode in ntfs_setattr().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:15:09 +01:00
Anton Altaparmakov
bbf1813fb8 NTFS: Fix cluster (de)allocators to work when the runlist is NULL and more
importantly to take a locked runlist rather than them locking it
      which leads to lock reversal.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:09:06 +01:00
Anton Altaparmakov
807c453de7 NTFS: Fix handling of sparse attributes in ntfs_attr_make_non_resident().
Also, add BUG() checks to ntfs_attr_make_non_resident() and
      ntfs_attr_set() to ensure that these functions are never called
      for compressed or encrypted attributes.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 21:01:17 +01:00
Anton Altaparmakov
2983d1bd1a NTFS: Fix several bugs in fs/ntfs/attrib.c.
- Fix a bug in ntfs_map_runlist_nolock() where we forgot to protect
  access to the allocated size in the ntfs inode with the size lock.
- Fix ntfs_attr_vcn_to_lcn_nolock() and ntfs_attr_find_vcn_nolock() to
  return LCN_ENOENT when there is no runlist and the allocated size is
  zero.
- Fix load_attribute_list() to handle the case of a NULL runlist.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:56:09 +01:00
Anton Altaparmakov
0aacceacf3 NTFS: Add fs/ntfs/attrib.[hc]::ntfs_resident_attr_value_resize().
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:40:32 +01:00
Anton Altaparmakov
f25dfb5e44 NTFS: Remove bogus setting of PageError in ntfs_read_compressed_block().
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:35:33 +01:00
Anton Altaparmakov
8e08ceaeac NTFS: Fix a bug in fs/ntfs/index.c::ntfs_index_lookup(). When the returned
index entry is in the index root, we forgot to set the @ir pointer in
      the index context.  Thanks for Yura Pakhuchiy for finding this bug.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:29:50 +01:00
Anton Altaparmakov
6e48321a40 NTFS: Add ntfs_rl_punch_nolock() which punches a caller specified hole into a runlist.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:26:34 +01:00
Anton Altaparmakov
3ffc5a4438 NTFS: Change ntfs_rl_truncate_nolock() to throw away the runlist if the new
length is zero.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 20:23:06 +01:00
Anton Altaparmakov
f94ad38e68 NTFS: Report unrepresentable inodes during ntfs_readdir() as KERN_WARNING
messages and include the inode number.  Thanks to Yura Pakhuchiy for
      pointing this out.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 17:04:11 +01:00
Anton Altaparmakov
2b0ada2b8e NTFS: Fix handling of valid but empty mapping pairs array in
fs/ntfs/runlist.c::ntfs_mapping_pairs_decompress().

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:52:31 +01:00
Anton Altaparmakov
8bb735216a NTFS: Remove two bogus BUG_ON()s from fs/ntfs/mft.c.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:48:28 +01:00
Anton Altaparmakov
84d6ebe63f NTFS: Fix two nasty runlist merging bugs that had gone unnoticed so far.
Thanks to Stefano Picerno for the bug report.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:46:55 +01:00
Anton Altaparmakov
9529d461d0 NTFS: Use ntfs_malloc_nofs_nofail() in runlist.c::ntfs_runlists_merge()
in the two critical regions.  This means we no longer need to
      panic() when the allocation fails as it now cannot fail.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:33:12 +01:00
Anton Altaparmakov
06d0e3cf3d NTFS: Allow highmem kmalloc() in ntfs_malloc_nofs() and add _nofail() version.
- Modify fs/ntfs/malloc.h::ntfs_malloc_nofs() to do the kmalloc() based
  allocations with __GFP_HIGHMEM, analogous to how the vmalloc() based
  allocations are done.
- Add fs/ntfs/malloc.h::ntfs_malloc_nofs_nofail() which is analogous to
  ntfs_malloc_nofs() but it performs allocations with __GFP_NOFAIL and
  hence cannot fail.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:28:25 +01:00
Anton Altaparmakov
e7a1033b94 NTFS: Support more clean journal ($LogFile) states.
- Support journals ($LogFile) which have been modified by chkdsk.  This
        means users can boot into Windows after we marked the volume dirty.
        The Windows boot will run chkdsk and then reboot.  The user can then
        immediately boot into Linux rather than having to do a full Windows
        boot first before rebooting into Linux and we will recognize such a
        journal and empty it as it is clean by definition.
      - Support journals ($LogFile) with only one restart page as well as
        journals with two different restart pages.  We sanity check both and
        either use the only sane one or the more recent one of the two in the
        case that both are valid.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
2005-09-08 16:12:28 +01:00
Nathan Scott
eccdfcd6f8 [XFS] Fix modular XFS builds (Makefile botch).
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08 15:38:52 +10:00
Nathan Scott
20ba02879b [XFS] Remove special Kconfig XFS menu, make XFS options "inline".
Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08 15:34:58 +10:00
Nathan Scott
f016bad6be [XFS] Cleanup some -Wundef flag warnings in the endian macros (thanks
Christoph).

SGI-PV: 942400
SGI-Modid: xfs-linux-melb:xfs-kern:23771a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-08 15:30:05 +10:00
Linus Torvalds
0481990b75 Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-for-linus-2.6 2005-09-07 17:31:27 -07:00
Linus Torvalds
87129d9626 Merge git://oss.sgi.com:8090/oss/git/xfs-2.6 2005-09-07 17:23:52 -07:00
Miklos Szeredi
0bb6fcc13a [PATCH] pivot_root() circular reference fix
Fix http://bugzilla.kernel.org/show_bug.cgi?id=4857

When pivot_root is called from an init script in an initramfs environment,
it causes a circular reference in the mount tree.

The cause of this is that pivot_root() is not prepared to handle pivoting
an unattached mount.  In an initramfs environment, rootfs is the root of
the namespace, and so it is not attached.

This patch fixes this and related problems, by returning -EINVAL if either
the current root or the new root is detached.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Cc: <bigfish@asmallpond.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:01 -07:00
Jan Kara
4407c2b6b2 [PATCH] Fix race in do_get_write_access()
attached patch should fix the following race:
     Proc 1                               Proc 2

     __flush_batch()
       ll_rw_block()
                                        do_get_write_access()
					   lock_buffer
                                             jh is only waiting for checkpoint
					     -> b_transaction == NULL ->
					     do nothing
                                           unlock_buffer
    test_set_buffer_locked()
    test_clear_buffer_dirty()
                                           __journal_file_buffer()
                                        change the data
    submit_bh()

and we have sent wrong data to disk...  We now clean the dirty buffer flag
under buffer lock in all cases and hence we know that whenever a buffer is
starting to be journaled we either finish the pending write-out before
attaching a buffer to a transaction or we won't write the buffer until the
transaction is going to be committed.

The test in jbd_unexpected_dirty_buffer() is redundant - remove it.
Furthermore we have to clear the buffer dirty bit under the buffer lock to
prevent races with buffer write-out (and hence prevent returning a buffer with
IO happening).

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:57 -07:00