Commit Graph

1517 Commits

Author SHA1 Message Date
Antonino A. Daplas
e07dea9876 [PATCH] console: Fix compile error
Fix following compile error (From Kernel Bugzilla Bug 5427):

include/linux/console_struct.h:53: error: field `vt_mode' has incomplete type

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:52 -08:00
Michal Januszewski
8fb6567e34 [PATCH] fbdev: fix the fb_find_nearest_mode() function
Currently the fb_find_nearest_mode() function finds a mode with screen
resolution closest to that described by the 'var' argument and with some
arbitrary refresh rate (eg.  in the following sequence of refresh rates: 70 60
53 85 75, 53 is selected).

This patch fixes the function so that it looks for the closest mode as far as
both resolution and refresh rate are concerned.  The function's first argument
is changed to fb_videomode so that the refresh rate can be specified by the
caller, as fb_var_screeninfo doesn't have any fields that could directly hold
this data.

Signed-off-by: Michal Januszewski <spock@gentoo.org>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:52 -08:00
Antonino A. Daplas
c465e05a03 [PATCH] fbcon/fbdev: Move softcursor out of fbdev to fbcon
According to Jon Smirl, filling in the field fb_cursor with soft_cursor for
drivers that do not support hardware cursors is redundant.  The soft_cursor
function is usable by all drivers because it is just a wrapper around
fb_imageblit.  And because soft_cursor is an fbcon-specific hook, the file is
moved to the console directory.

Thus, drivers that do not support hardware cursors can leave the fb_cursor
field blank.  For drivers that do, they can fill up this field with their own
version.

The end result is a smaller code size.  And if the framebuffer console is not
loaded, module/kernel size is also reduced because the soft_cursor module will
also not be loaded.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:50 -08:00
NeilBrown
0ba7536d5d [PATCH] knfsd: Fix some minor sign problems in nfsd/xdr
There are a couple of tests which could possibly be confused by extremely
large numbers appearing in 'xdr' packets.  I think the closest to an exploit
you could get would be writing random data from a free page into a file - i.e.
 leak data out of kernel space.

I'm fairly sure they cannot be used for remote compromise.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:48 -08:00
NeilBrown
70c3b76c28 [PATCH] knfsd: Allow run-time selection of NFS versions to export
Provide a file in the NFSD filesystem that allows setting and querying of
which version of NFS are being exported.  Changes are only allowed while no
server is running.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:48 -08:00
Matt Porter
6978bbc097 [PATCH] rapidio: message interface updates
Updates the RIO messaging interface to pass a device instance into the
event registeration and callbacks.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:47 -08:00
Matt Porter
fa78cc5179 [PATCH] rapidio: core updates
Addresses issues raised with the 2.6.12-rc6-mm1 RIO support.  Fix dma_mask
init, shrink some code, general cleanup.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Matt Porter
70a50ebd9a [PATCH] RapidIO support: core includes
Add RapidIO core include files.

The core code implements enumeration/discovery, management of
devices/resources, and interfaces for RIO drivers.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli
d217d5450f [PATCH] Kprobes: preempt_disable/enable() simplification
Reorganize the preempt_disable/enable calls to eliminate the extra preempt
depth.  Changes based on Paul McKenney's review suggestions for the kprobes
RCU changeset.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli
3516a46042 [PATCH] Kprobes: Use RCU for (un)register synchronization - base changes
Changes to the base kprobes infrastructure to use RCU for synchronization
during kprobe registration and unregistration.  These changes coupled with the
arch kprobe changes (next in series):

a. serialize registration and unregistration of kprobes.
b. enable lockless execution of handlers. Handlers can now run in parallel.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli
e65845235c [PATCH] Kprobes: Track kprobe on a per_cpu basis - base changes
Changes to the base kprobe infrastructure to track kprobe execution on a
per-cpu basis.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Corey Minyard
393d2cc354 [PATCH] ipmi: use refcount in message handler
This patch is rather large, but it really can't be done in smaller chunks
easily and I believe it is an important change.  This has been out and tested
for a while in the latest IPMI driver release.  There are no functional
changes, just changes as necessary to convert the locking over (and a few
minor style updates).

The IPMI driver uses read/write locks to ensure that things exist while they
are in use.  This is bad from a number of points of view.  This patch removes
the rwlocks and uses refcounts and RCU lists to manage what the locks did.

Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:43 -08:00
Miklos Szeredi
befc649c22 [PATCH] FUSE: pass file handle in setattr
This patch passes the file handle supplied in iattr to userspace, in case the
->setattr() was invoked from sys_ftruncate().  This solves the permission
checking (or lack thereof) in ftruncate() for the class of filesystems served
by an unprivileged userspace process.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi
fd72faac95 [PATCH] FUSE: atomic create+open
This patch adds an atomic create+open operation.  This does not yet work if
the file type changes between lookup and create+open, but solves the
permission checking problems for the separte create and open methods.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi
31d40d74b4 [PATCH] FUSE: add access call
Add a new access call, which will only be called if ->permission is invoked
from sys_access().  In all other cases permission checking is delayed until
the actual filesystem operation.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi
5b62073d50 [PATCH] FUSE: bump interface minor version
Though the following changes are all backward compatible (from the kernel's as
well as the library's POV) change the minor version, so interested
applications can detect new features.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Miklos Szeredi
cc4e69dee4 [PATCH] VFS: pass file pointer to filesystem from ftruncate()
This patch extends the iattr structure with a file pointer memeber, and adds
an ATTR_FILE validity flag for this member.

This is set if do_truncate() is invoked from ftruncate() or from
do_coredump().

The change is source and binary compatible.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Christoph Hellwig
481bed4542 [PATCH] consolidate sys_ptrace()
The sys_ptrace boilerplate code (everything outside the big switch
statement for the arch-specific requests) is shared by most architectures.
This patch moves it to kernel/ptrace.c and leaves the arch-specific code as
arch_ptrace.

Some architectures have a too different ptrace so we have to exclude them.
They continue to keep their implementations.  For sh64 I had to add a
sh64_ptrace wrapper because it does some initialization on the first call.
For um I removed an ifdefed SUBARCH_PTRACE_SPECIAL block, but
SUBARCH_PTRACE_SPECIAL isn't defined anywhere in the tree.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-By: David Howells <dhowells@redhat.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:42 -08:00
Tim Schmielau
8c65b4a604 [PATCH] fix remaining missing includes
Fix more include file problems that surfaced since I submitted the previous
fix-missing-includes.patch.  This should now allow not to include sched.h
from module.h, which is done by a followup patch.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:41 -08:00
Adrian Bunk
be586bab8b [PATCH] quota: small cleanups
- "extern inline" -> "static inline"

- every file should #include the headers containing the prototypes for
  it's global functions

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:39 -08:00
Zach Brown
d55b5fdaf4 [PATCH] aio: remove aio_max_nr accounting race
AIO was adding a new context's max requests to the global total before
testing if that resulting total was over the global limit.  This let
innocent tasks get their new limit tested along with a racing guilty task
that was crossing the limit.  This serializes the _nr accounting with a
spinlock It also switches to using unsigned long for the global totals.
Individual contexts are still limited to an unsigned int's worth of
requests by the syscall interface.

The problem and fix were verified with a simple program that spun creating
and destroying a context while holding on to another long lived context.
Before the patch a task creating a tiny context could get a spurious EAGAIN
if it raced with a task creating a very large context that overran the
limit.

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:38 -08:00
Ingo Molnar
28ef35845f [PATCH] small kernel_stat.h cleanup
cleanup: use for_each_cpu() instead of an open-coded NR_CPUS loop.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Hans Reiser
a43313668f [PATCH] reiser4: add radix_tree_lookup_slot()
Reiser4 uses radix trees to solve a trouble reiser4_readdir has serving nfs
requests.

Unfortunately, radix tree api lacks an operation suitable for modifying
existing entry.  This patch adds radix_tree_lookup_slot which returns pointer
to found item within the tree.  That location can be then updated.

Both Nick and Christoph Lameter have patches which need this as well.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Andrew Morton
7361f4d8ca [PATCH] readahead commentary
Add a few comments surrounding the generic readahead API.

Also convert some ulongs into pgoff_t: the identifier for PAGE_CACHE_SIZE
offsets into pagecache.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Badari Pulavarty
bf8f972d3a [PATCH] SHM_NORESERVE flags for shmget()
Add SHM_NORESERVE functionality similar to MAP_NORESERVE for shared memory
segments.

This is mainly to avoid abuse of OVERCOMMIT_ALWAYS and this flag is ignored
for OVERCOMMIT_NEVER.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:37 -08:00
Paul E. McKenney
665a7583f3 [PATCH] Remove hlist_for_each_rcu() API, convert existing use to hlist_for_each_entry_rcu
Remove the hlist_for_each_rcu() API, which is used only in one place, and
is trivially converted to hlist_for_each_entry_rcu(), making the code
shorter and more readable.  Any out-of-tree uses may be similarly
converted.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Matt Helsley
9f46080c41 [PATCH] Process Events Connector
This patch adds a connector that reports fork, exec, id change, and exit
events for all processes to userspace.  It replaces the fork_advisor patch
that ELSA is currently using.  Applications that may find these events
useful include accounting/auditing (e.g.  ELSA), system activity monitoring
(e.g.  top), security, and resource management (e.g.  CKRM).

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:35 -08:00
Paul Mundt
055a251214 [PATCH] superhyway: multiple block support and VCR rework
This extends the API somewhat to allow for platform-specific VCR reading and
writing.  Some platforms (like SH4-202) implement the VCR in a split VCRL and
VCRH, but end up being in reverse order or have other quirks that need to be
dealt with, so we add a set of superhyway_ops per-bus to accomodate this.

We also have to extend the per-device resources somewhat, as some devices now
conveniently split control and data blocks.  So we allow a platform to
register its set of SuperHyway devices via superhyway_add_devices() with the
control block always ordered as the first resource (as this is the one that
userspace cares about).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:28 -08:00
Pekka J Enberg
2109a2d1b1 [PATCH] mm: rename kmem_cache_s to kmem_cache
This patch renames struct kmem_cache_s to kmem_cache so we can start using
it instead of kmem_cache_t typedef.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:24 -08:00
Thomas Gleixner
61ecfa8777 [MTD] includes: Clean up trailing white spaces
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-07 14:32:58 +01:00
Thomas Gleixner
03ead8427d [LIB] reed_solomon: Clean up trailing white spaces 2005-11-07 14:25:38 +01:00
Thomas Gleixner
182ec4eee3 [JFFS2] Clean up trailing white spaces
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-07 14:18:56 +01:00
Jeff Garzik
193515d51c [libata] eliminate use of drivers/scsi/scsi.h compatibility header/defines 2005-11-07 00:59:37 -05:00
Jeff Garzik
b78612b796 Merge branch 'master' 2005-11-06 23:01:34 -05:00
Linus Torvalds
0b154bb7d0 Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild 2005-11-06 16:59:14 -08:00
Linus Torvalds
6adfd34e85 Merge master.kernel.org:/home/rmk/linux-2.6-drvmodel 2005-11-06 16:58:38 -08:00
Nicolas Pitre
fb0258730a [MTD] Don't let gcc inline functions marked __xipram
If they get inlined into non __xipram functions we're screwed.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 23:12:57 +01:00
Kyungmin Park
83a368380e [MTD] OneNAND: Enhanced support for DDP (Dual Densitiy Packages)
Add density mask for better support of DDP chips.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:59:48 +01:00
Kyungmin Park
a41371eb6d [MTD] OneNAND: Power Management (PM) support
Add suspend/resume

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:42:28 +01:00
Kyungmin Park
87590e26ff [MTD] OneNAND: Add missing files
Simple bad block table source and header files

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:39:23 +01:00
Ferenc Havasi
2bc9764c48 [JFFS2] Rename jffs2_summary_node to jffs2_raw_summary
Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:32:45 +01:00
Artem B. Bityutskiy
733802d974 [JFFS2] Debug code simplification, update TODO
Simplify the debugging code further.
Update the TODO list

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 22:20:33 +01:00
Vitaly Wool
962034f439 [MTD] NAND: Add suspend/resume functionality
The changes introduced allow to suspend/resume NAND flash.
A new state (FL_PM_SUSPENDED) is introduced, as well as
routines for mtd->suspend and mtd->resume to put the flash in
suspended state from software pov.

Signed-off-by: Vitaly Wool <vwool@ru.mvista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:43:45 +01:00
Ferenc Havasi
e631ddba58 [JFFS2] Add erase block summary support (mount time improvement)
The goal of summary is to speed up the mount time. Erase block summary (EBS)
stores summary information at the end of every (closed) erase block. It is
no longer necessary to scan all nodes separetly (and read all pages of them)
just read this "small" summary, where every information is stored which is
needed at mount time.

This summary information is stored in a JFFS2_FEATURE_RWCOMPAT_DELETE. During
the mount process if there is no summary info the orignal scan process will
be executed. EBS works with NAND and NOR flashes, too.

There is a user space tool called sumtool to generate this summary
information for a JFFS2 image.

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:29:48 +01:00
Kyungmin Park
d36d63d404 [PATCH] OneNAND: Fix bug in write verify
- Remove unused block, page parameters
- Add constant instead of runtime value

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:24:51 +01:00
Kyungmin Park
fcc31470c4 [PATCH] OneNAND: Update OMAP OneNAND mapping using device driver model
- Update OMAP OneNAND mapping file using device driver model
- Remove board specific macro and values.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:22:01 +01:00
Kyungmin Park
cdc001305d [PATCH] OneNAND: Simple Bad Block handling support
Based on NAND memory bad block table code

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:20:53 +01:00
Kyungmin Park
52b0eea73d [PATCH] OneNAND: Sync. Burst Read support
Add OneNAND Sync. Burst Read support
Tested with OMAP platform

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:19:37 +01:00
Kyungmin Park
cd5f6346bc [MTD] Add initial support for OneNAND flash chips
OneNAND is a new flash technology from Samsung with integrated SRAM
buffers and logic interface.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 21:17:24 +01:00
Nicolas Pitre
638d983840 {MTD] add support for Intel's "Sibley" flash
This updates the Primary Vendor-Specific Extended Query parsing to
version 1.4 in order to get the information about the Configurable
Programming Mode regions implemented in the Sibley flash, as well as
selecting the appropriate write command code.

This flash does not behave like traditional NOR flash when writing data.
While mtdblock should just work, further changes are needed for JFFS2 use.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 20:12:17 +01:00
brking@us.ibm.com
618ec46bda [SCSI] pci: PCI ids for new ipr adapters
Adds some new PCI IDs for new IPR adapters.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-11-06 13:11:42 -06:00
James Bottomley
b1081ea6f0 [SCSI] raid class update
- Update raid class to use nested classes for raid components (this will
allow us to move to a component control model now)
- Make the raid level an enumeration rather than and int.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-11-06 12:32:31 -06:00
Ferenc Havasi
2227c0ba4b [jffs2] Remove compressor lzo and lzari
Remove unused compressor code

Signed-off-by: Ferenc Havasi <havasi@inf.u-szeged.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:31:24 +01:00
Artem B. Bityutskiy
f302cd028c [JFFS2] Namespace clean up
Rename functions to a name matching the functionality.
Remove stall debug code

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 17:17:32 +01:00
Artem B. Bityutskiy
2b79adcca1 [JFFS2] Use f->target instead of f->dents for symlink target
JFFS2 uses f->dents to store the pointer to the symlink target string (in case
the inode is symlink). This is somewhat ugly to use the same field for
different reasons. Introduce distinct field f->target for this purpose.
Note, f->fragtree, f->dents, f->target may probably be put in a union.

Signed-off-by: Artem B. Bityutskiy <dedekind@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2005-11-06 16:25:55 +01:00
Russell King
2dd34b488a [PATCH] kbuild: permanently fix kernel configuration include mess
Include autoconf.h into every kernel compilation via the gcc command line
using -imacros.  This ensures that we have the kernel configuration
included from the start, rather than relying on each file having #include
<linux/config.h> as appropriate.  History has shown that this is something
which is difficult to get right.

Since we now include the kernel configuration automatically, make
configcheck becomes meaningless, so remove it.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-11-06 10:22:04 +01:00
Pantelis Antoniou
21c614a789 [SERIAL] Support Au1x00 8250 UARTs using the generic 8250 driver.
The offsets of the registers are in a different place, and
some parts cannot handle a full set of modem control signals.

Signed-off-by: Pantelis Antoniou <pantelis@embeddedalley.ocm>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-11-06 09:07:03 +00:00
Michael Chan
5b0c76ad94 [PATCH] bnx2: add 5708 support
Add 5708 copper and serdes basic support, including 2.5 Gbps support
on 5708 serdes. SPEED_2500 is also added to ethtool.h

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2005-11-05 21:00:02 -05:00
Arnaldo Carvalho de Melo
2d43f1128a Merge branch 'red' of 84.73.165.173:/home/tgr/repos/net-2.6 2005-11-05 22:30:29 -02:00
Stephen Hemminger
300ce174eb [NETEM]: Support time based reordering
Change netem to support packets getting reordered because of variations in
delay. Introduce a special case version of FIFO that queues packets in order
based on the netem delay.

Since netem is classful, those users that don't want jitter based reordering
can just insert a pfifo instead of the default.

This required changes to generic skbuff code to allow finer grain manipulation
of sk_buff_head.  Insertion into the middle and reverse walk.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 20:56:41 -02:00
Russell King
37c12e7497 [DRIVER MODEL] Improved dynamically allocated platform_device interface
Re-jig the simple platform device support to allow private data
to be attached to a platform device, as well as allowing the
parent device to be set.

Example usage:

	pdev = platform_device_alloc("mydev", id);
	if (pdev) {
		err = platform_device_add_resources(pdev, &resources,
						    ARRAY_SIZE(resources));
		if (err == 0)
			err = platform_device_add_data(pdev, &platform_data,
						       sizeof(platform_data));
		if (err == 0)
			err = platform_device_add(pdev);
	} else {
		err = -ENOMEM;
	}
	if (err)
		platform_device_put(pdev);

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-11-05 21:19:33 +00:00
Thomas Graf
bdc450a0bb [PKT_SCHED]: (G)RED: Introduce hard dropping
Introduces a new flag TC_RED_HARDDROP which specifies that if ECN
marking is enabled packets should still be dropped once the
average queue length exceeds the maximum threshold.

This _may_ help to avoid global synchronisation during small
bursts of peers advertising but not caring about ECN. Use this
option very carefully, it does more harm than good if
(qth_max - qth_min) does not cover at least two average burst
cycles.

The difference to the current behaviour, in which we'd run into
the hard queue limit, is that due to the low pass filter of RED
short bursts are less likely to cause a global synchronisation.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:29 +01:00
Thomas Graf
b38c7eef7e [PKT_SCHED]: GRED: Support ECN marking
Adds a new u8 flags in a unused padding area of the netlink
message. Adds ECN marking support to be used instead of dropping
packets immediately.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:29 +01:00
Thomas Graf
1e4dfaf9b9 [PKT_SCHED]: GRED: Cleanup and remove unnecessary code
Removes unnecessary includes, initializers, and simplifies
the code a bit.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-05 22:02:28 +01:00
Matt Porter
f896424cbc [PATCH] phy address mask support for generic phy layer
Adds a phy_mask field to struct mii_bus and uses it.  This field
indicates each phy address to be ignored when probing the mdio bus.

This support is needed for the fs_enet and ibm_emac drivers to be
converted to the generic phy layer among other drivers. Many systems
lock up on probing certain phy addresses or probing doesn't return
0xffff when nothing is found at the address. A new driver I'm
working on also makes use of this mask.

Signed-off-by: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-11-05 14:40:55 -05:00
Jeff Garzik
6037d6bbdf [libata] ATAPI pad allocation fixes/cleanup
Use ata_pad_{alloc,free} in two drivers, to factor out common code.

Add ata_pad_{alloc,free} to two other drivers, which needed the padding
but had not been updated.
2005-11-04 22:08:00 -05:00
Jeff Garzik
c2cc87ca95 Merge branch 'master' 2005-11-04 21:39:31 -05:00
Calin A. Culianu
7015faa7df [PATCH] nvidiafb: Geforce 7800 series support added
This adds support for the Nvidia Geforce 7800 series of cards to the
nvidiafb framebuffer driver.  All it does is add the PCI device id for
the 7800, 7800 GTX, 7800 GO, and 7800 GTX GO cards to the module device
table for the nvidiafb.ko driver, so that nvidiafb.ko will actually work
on these cards.

I also added the relevant PCI device ids to linux/pci_ids.h

I tested it on my 7800 GTX here and it works like a charm.  I now can
get framebuffer support on this card! Woo hoo!! Nothing like 200x75 text
mode to make your eyes BLEED.  ;)

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-04 18:01:34 -08:00
Trond Myklebust
d530838bfa NFSv4: Fix problem with OPEN_DOWNGRADE
RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny
 bits specified must be exactly equal to the union of the share_access and
 share_deny bits specified for some subset of the OPENs in effect for
 current openowner on the current file.

 Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that
 it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to
 OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file
 with O_WRONLY access mode.

 Fix the problem by replacing nfs4_find_state() with a modified version of
 nfs_find_open_context().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:38 -05:00
Linus Torvalds
912cbe3c5b Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-11-04 10:39:28 -08:00
Linus Torvalds
0f3278d14f Merge git://oss.sgi.com:8090/oss/git/xfs-2.6 2005-11-03 16:25:58 -08:00
Dmitry Torokhov
5f94548982 Input: do not register statically allocated devices
Do not register statically allocated input devices to prevent
OOPS when attaching input interfaces since it requires class
device to be properly initialized.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2005-11-02 22:51:46 -05:00
Nathan Scott
de69e5f44e [XFS] Add a mechanism for XFS to use the generic quota sync method.
This is now used to issue a delayed allocation flush before reporting
quota, which allows the used space quota report to match reality.

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 13:53:34 +11:00
Nathan Scott
a2f8e178ad [XFS] Add the project quota type into the XFS quota header.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 13:51:23 +11:00
Nathan Scott
436d7d3060 [XFS] Update XFS quota header license to match the SGI boilerplate.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 13:50:05 +11:00
Stephen Hemminger
c2da8acaf4 [ETHERNET]: Add ether stuff to docbook
Fix up etherdevice docbook comments and make them (and other networking stuff)
get dragged into the kernel-api. Delete the old 8390 stuff, it really isn't
interesting anymore.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 22:08:52 -02:00
Stephen Hemminger
2407534f8b [ETHERNET]: Optimize is_broadcast_ether_addr
Optimize the match for broadcast address by using bit operations instead
of comparison. This saves a number of conditional branches, and generates
smaller code.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-11-02 21:54:07 -02:00
Linus Torvalds
ec1890c5df Merge git://brick.kernel.dk/data/git/linux-2.6-block 2005-11-02 08:06:02 -08:00
Linus Torvalds
f98e85691b Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2005-11-01 21:33:06 -08:00
Linus Torvalds
ec33b30910 Merge master.kernel.org:/home/rmk/linux-2.6-serial 2005-11-01 21:32:46 -08:00
Jens Axboe
a362357b6c [BLOCK] Unify the seperate read/write io stat fields into arrays
Instead of having ->read_sectors and ->write_sectors, combine the two
into ->sectors[2] and similar for the other fields. This saves a branch
several places in the io path, since we don't have to care for what the
actual io direction is. On my x86-64 box, that's 200 bytes less text in
just the core (not counting the various drivers).

Signed-off-by: Jens Axboe <axboe@suse.de>
2005-11-01 09:26:16 +01:00
Harald Welte
6b7d31fcdd [NETFILTER]: Add "revision" support to arp_tables and ip6_tables
Like ip_tables already has it for some time, this adds support for
having multiple revisions for each match/target.  We steal one byte from
the name in order to accomodate a 8 bit version number.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-31 16:36:08 -02:00
Linus Torvalds
4fd5f8267d Merge master.kernel.org:/home/rmk/linux-2.6-drvmodel
Manual #include fixups for clashes - there may be some unnecessary
2005-10-31 07:32:56 -08:00
Russell King
913ade51ec [SERIAL] Fix port numbering
The PORT_* macros must be uniquely numbered.  This fixes the
definitions.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-10-31 13:53:26 +00:00
Paul Mackerras
23fd07750a Merge ../linux-2.6 by hand 2005-10-31 13:37:12 +11:00
OGAWA Hirofumi
451cbaa1c3 [PATCH] fat: cleanup and optimization of checksum
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Tim Schmielau
4e57b68178 [PATCH] fix missing includes
I recently picked up my older work to remove unnecessary #includes of
sched.h, starting from a patch by Dave Jones to not include sched.h
from module.h. This reduces the number of indirect includes of sched.h
by ~300. Another ~400 pointless direct includes can be removed after
this disentangling (patch to follow later).
However, quite a few indirect includes need to be fixed up for this.

In order to feed the patches through -mm with as little disturbance as
possible, I've split out the fixes I accumulated up to now (complete for
i386 and x86_64, more archs to follow later) and post them before the real
patch.  This way this large part of the patch is kept simple with only
adding #includes, and all hunks are independent of each other.  So if any
hunk rejects or gets in the way of other patches, just drop it.  My scripts
will pick it up again in the next round.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:32 -08:00
Oleg Nesterov
621d31219d [PATCH] cleanup the usage of SEND_SIG_xxx constants
This patch simplifies some checks for magic siginfo values.  It should not
change the behaviour in any way.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:31 -08:00
Paul Jackson
4098f9918e [PATCH] sched: hardcode non-smp set_cpus_allowed
Simplify the UP (1 CPU) implementatin of set_cpus_allowed.

The one CPU is hardcoded to be cpu 0 - so just test for that bit, and avoid
having to pick up the cpu_online_map.

Also, unexport cpu_online_map: it was only needed for set_cpus_allowed().

Signed-off-by: Paul Jackson <pj@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:28 -08:00
Paul E. McKenney
a241ec65ae [PATCH] RCU torture-testing kernel module
This patch is a rewrite of the one submitted on October 1st, using modules
(http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2).

This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an
intense torture test of the RCU infratructure.  This is needed due to the
continued changes to the RCU infrastructure to accommodate dynamic ticks,
CPU hotplug, realtime, and so on.  Most of the code is in a separate file
that is compiled only if the CONFIG variable is set.  Documentation on how
to run the test and interpret the output is also included.

This code has been tested on i386 and ppc64, and an earlier version of the
code has received extensive testing on a number of architectures as part of
the PREEMPT_RT patchset.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:27 -08:00
Nikita Danilov
c0398ee6c2 [PATCH] include/linux/kernel.h:BUILD_BUG_ON(): fix a comment
Fix comment describing BUILD_BUG_ON: BUG_ON is not an assertion
(unfortunately).

Signed-off-by: Nikita Danilov <nikita@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:26 -08:00
Pavel Roskin
52303e8b5f [PATCH] modules: fix sparse warning for every MODULE_PARM
sparse complains about every MODULE_PARM used in a module: warning: symbol
'__parm_foo' was not declared.  Should it be static?

The fix is to split declaration and initialization.  While MODULE_PARM is
obsolete, it's not something sparse should report.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:26 -08:00
Miklos Szeredi
6ea05db06f [PATCH] fuse: remove unused define
Setting ctime is implicit in all setattr cases, so the FATTR_CTIME
definition is unnecessary.

It is used by neither the kernel nor by userspace.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:24 -08:00
David Howells
29db919063 [PATCH] Keys: Add LSM hooks for key management [try #3]
The attached patch adds LSM hooks for key management facilities. The notable
changes are:

 (1) The key struct now supports a security pointer for the use of security
     modules. This will permit key labelling and restrictions on which
     programs may access a key.

 (2) Security modules get a chance to note (or abort) the allocation of a key.

 (3) The key permission checking can now be enhanced by the security modules;
     the permissions check consults LSM if all other checks bear out.

 (4) The key permissions checking functions now return an error code rather
     than a boolean value.

 (5) An extra permission has been added to govern the modification of
     attributes (UID, GID, permissions).

Note that there isn't an LSM hook specifically for each keyctl() operation,
but rather the permissions hook allows control of individual operations based
on the permission request bits.

Key management access control through LSM is enabled by automatically if both
CONFIG_KEYS and CONFIG_SECURITY are enabled.

This should be applied on top of the patch ensubjected:

	[PATCH] Keys: Possessor permissions should be additive

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Chris Wright <chrisw@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:23 -08:00
Paul Jackson
68860ec10b [PATCH] cpusets: automatic numa mempolicy rebinding
This patch automatically updates a tasks NUMA mempolicy when its cpuset
memory placement changes.  It does so within the context of the task,
without any need to support low level external mempolicy manipulation.

If a system is not using cpusets, or if running on a system with just the
root (all-encompassing) cpuset, then this remap is a no-op.  Only when a
task is moved between cpusets, or a cpusets memory placement is changed
does the following apply.  Otherwise, the main routine below,
rebind_policy() is not even called.

When mixing cpusets, scheduler affinity, and NUMA mempolicies, the
essential role of cpusets is to place jobs (several related tasks) on a set
of CPUs and Memory Nodes, the essential role of sched_setaffinity is to
manage a jobs processor placement within its allowed cpuset, and the
essential role of NUMA mempolicy (mbind, set_mempolicy) is to manage a jobs
memory placement within its allowed cpuset.

However, CPU affinity and NUMA memory placement are managed within the
kernel using absolute system wide numbering, not cpuset relative numbering.

This is ok until a job is migrated to a different cpuset, or what's the
same, a jobs cpuset is moved to different CPUs and Memory Nodes.

Then the CPU affinity and NUMA memory placement of the tasks in the job
need to be updated, to preserve their cpuset-relative position.  This can
be done for CPU affinity using sched_setaffinity() from user code, as one
task can modify anothers CPU affinity.  This cannot be done from an
external task for NUMA memory placement, as that can only be modified in
the context of the task using it.

However, it easy enough to remap a tasks NUMA mempolicy automatically when
a task is migrated, using the existing cpuset mechanism to trigger a
refresh of a tasks memory placement after its cpuset has changed.  All that
is needed is the old and new nodemask, and notice to the task that it needs
to rebind its mempolicy.  The tasks mems_allowed has the old mask, the
tasks cpuset has the new mask, and the existing
cpuset_update_current_mems_allowed() mechanism provides the notice.  The
bitmap/cpumask/nodemask remap operators provide the cpuset relative
calculations.

This patch leaves open a couple of issues:

 1) Updating vma and shmfs/tmpfs/hugetlbfs memory policies:

    These mempolicies may reference nodes outside of those allowed to
    the current task by its cpuset.  Tasks are migrated as part of jobs,
    which reside on what might be several cpusets in a subtree.  When such
    a job is migrated, all NUMA memory policy references to nodes within
    that cpuset subtree should be translated, and references to any nodes
    outside that subtree should be left untouched.  A future patch will
    provide the cpuset mechanism needed to mark such subtrees.  With that
    patch, we will be able to correctly migrate these other memory policies
    across a job migration.

 2) Updating cpuset, affinity and memory policies in user space:

    This is harder.  Any placement state stored in user space using
    system-wide numbering will be invalidated across a migration.  More
    work will be required to provide user code with a migration-safe means
    to manage its cpuset relative placement, while preserving the current
    API's that pass system wide numbers, not cpuset relative numbers across
    the kernel-user boundary.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:22 -08:00
Paul Jackson
fb5eeeee44 [PATCH] cpusets: bitmap and mask remap operators
In the forthcoming task migration support, a key calculation will be
mapping cpu and node numbers from the old set to the new set while
preserving cpuset-relative offset.

For example, if a task and its pages on nodes 8-11 are being migrated to
nodes 24-27, then pages on node 9 (the 2nd node in the old set) should be
moved to node 25 (the 2nd node in the new set.)

As with other bitmap operations, the proper way to code this is to provide
the underlying calculation in lib/bitmap.c, and then to provide the usual
cpumask and nodemask wrappers.

This patch provides that.  These operations are termed 'remap' operations.
Both remapping a single bit and a set of bits is supported.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Paul Jackson
053199edf5 [PATCH] cpusets: dual semaphore locking overhaul
Overhaul cpuset locking.  Replace single semaphore with two semaphores.

The suggestion to use two locks was made by Roman Zippel.

Both locks are global.  Code that wants to modify cpusets must first
acquire the exclusive manage_sem, which allows them read-only access to
cpusets, and holds off other would-be modifiers.  Before making actual
changes, the second semaphore, callback_sem must be acquired as well.  Code
that needs only to query cpusets must acquire callback_sem, which is also a
global exclusive lock.

The earlier problems with double tripping are avoided, because it is
allowed for holders of manage_sem to nest the second callback_sem lock, and
only callback_sem is needed by code called from within __alloc_pages(),
where the double tripping had been possible.

This is not quite the same as a normal read/write semaphore, because
obtaining read-only access with intent to change must hold off other such
attempts, while allowing read-only access w/o such intention.  Changing
cpusets involves several related checks and changes, which must be done
while allowing read-only queries (to avoid the double trip), but while
ensuring nothing changes (holding off other would be modifiers.)

This overhaul of cpuset locking also makes careful use of task_lock() to
guard access to the task->cpuset pointer, closing a couple of race
conditions noticed while reading this code (thanks, Roman).  I've never
seen these races fail in any use or test.

See further the comments in the code.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Andrew Morton
15d2bace5e [PATCH] add_timer() of a pending timer is illegal
In the recent timer rework we lost the check for an add_timer() of an
already-pending timer.  That check was useful for networking, so put it back.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:21 -08:00
Christoph Hellwig
dfb7dac3af [PATCH] unify sys_ptrace prototype
Make sure we always return, as all syscalls should.  Also move the common
prototype to <linux/syscalls.h>

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:20 -08:00
Oleg Nesterov
19a4fcb531 [PATCH] kill sigqueue->lock
This lock is used in sigqueue_free(), but it is always equal to
current->sighand->siglock, so we don't need to keep it in the struct
sigqueue.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:19 -08:00
Eric Dumazet
2f51201662 [PATCH] reduce sizeof(struct file)
Now that RCU applied on 'struct file' seems stable, we can place f_rcuhead
in a memory location that is not anymore used at call_rcu(&f->f_rcuhead,
file_free_rcu) time, to reduce the size of this critical kernel object.

The trick I used is to move f_rcuhead and f_list in an union called f_u

The callers are changed so that f_rcuhead becomes f_u.fu_rcuhead and f_list
becomes f_u.f_list

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:19 -08:00
Andrew Morton
dfc4f94d2f [PATCH] remove timer debug field
Remove timer_list.magic and associated debugging code.

I originally added this when a spinlock was added to timer_list - this meant
that an all-zeroes timer became illegal and init_timer() was required.

That spinlock isn't even there any more, although timer.base must now be
initialised.

I'll keep this debugging code in -mm.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:18 -08:00
john stultz
1bb34a4127 [PATCH] NTP shift_right cleanup
Create a macro shift_right() that avoids the numerous ugly conditionals in the
NTP code that look like:

        if(a < 0)
                b = -(-a >> shift);
        else
                b = a >> shift;

Replacing it with:

        b = shift_right(a, shift);

This should have zero effect on the logic, however it should probably have
a bit of testing just to be sure.

Also replace open-coded min/max with the macros.

Signed-off-by : John Stultz <johnstul@us.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:18 -08:00
Alan Stern
61e1a9ea4b [PATCH] Add kthread_stop_sem()
Enhance the kthread API by adding kthread_stop_sem, for use in stopping
threads that spend their idle time waiting on a semaphore.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Oleg Nesterov
a8db2db1e6 [PATCH] introduce setup_timer() helper
Every user of init_timer() also needs to initialize ->function and ->data
fields.  This patch adds a simple setup_timer() helper for that.

The schedule_timeout() is patched as an example of usage.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Jan Kara
aaa4059bc2 [PATCH] ext3: Fix unmapped buffers in transaction's lists
Fix the problem (BUG 4964) with unmapped buffers in transaction's
t_sync_data list.  The problem is we need to call filesystem's own
invalidatepage() from block_write_full_page().

block_write_full_page() must call filesystem's invalidatepage().  Otherwise
following nasty race can happen:

   proc 1                                        proc 2
   ------                                        ------
- write some new data to 'offset'
  => bh gets to the transactions data list
                                              - starts truncate
                                                => i_size set to new size
- mpage_writepages()
  - ext3_ordered_writepage() to 'offset'
    - block_write_full_page()
      - page->index > end_index+1
        - block_invalidatepage()
          - discard_buffer()
            - clear_buffer_mapped()

- commit triggers and finds unmapped buffer - BOOM!

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Shaohua Li
eb9289eb20 [PATCH] introduce .valid callback for pm_ops
Add pm_ops.valid callback, so only the available pm states show in
/sys/power/state.  And this also makes an earlier states error report at
enter_state before we do actual suspend/resume.

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Acked-by: Pavel Machek<pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:15 -08:00
Rafael J. Wysocki
2c1b4a5ca4 [PATCH] swsusp: rework memory freeing on resume
The following patch makes swsusp use the PG_nosave and PG_nosave_free flags to
mark pages that should be freed in case of an error during resume.

This allows us to simplify the code and to use swsusp_free() in all of the
swsusp's resume error paths, which makes them actually work.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:14 -08:00
Rafael J. Wysocki
25761b6eb7 [PATCH] swsusp: move snapshot functionality to separate file
The following patch moves the functionality of swsusp related to creating and
handling the snapshot of memory to a separate file, snapshot.c

This should enable us to untangle the code in the future and eventually to
implement some parts of swsusp.c in the user space.

The patch does not change the code.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:14 -08:00
Ashok Raj
ad74557a49 [PATCH] introduce get_cpu_sysdev() to retrieve a sysfs entry for a cpu.
Some modules creating sysfs entries under /sys/devices/system/cpu/cpuX/
need to know the parent sysfs entry to make devices under them.  This will
just return the sysfs entry for a given cpu.

sysfs entries showing under each cpu sysfs can be easily created if such
entries can be created by registering a sysfs driver for cpuclass.  The
issue is when the entry is created the CPU may not be online, hence we
would need to defer the creation until the online notification comes.

Current users: cache entries for Intel CPU's and cpufreq subsystem.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Zwane Mwaikambo <zwane@holomorphy.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:14 -08:00
Ingo Molnar
bda98685b8 [PATCH] x86: inline spin_unlock if !CONFIG_DEBUG_SPINLOCK and !CONFIG_PREEMPT
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:13 -08:00
James Morris
d381d8a9a0 [PATCH] SELinux: canonicalize getxattr()
This patch allows SELinux to canonicalize the value returned from
getxattr() via the security_inode_getsecurity() hook, which is called after
the fs level getxattr() function.

The purpose of this is to allow the in-core security context for an inode
to override the on-disk value.  This could happen in cases such as
upgrading a system to a different labeling form (e.g.  standard SELinux to
MLS) without needing to do a full relabel of the filesystem.

In such cases, we want getxattr() to return the canonical security context
that the kernel is using rather than what is stored on disk.

The implementation hooks into the inode_getsecurity(), adding another
parameter to indicate the result of the preceding fs-level getxattr() call,
so that SELinux knows whether to compare a value obtained from disk with
the kernel value.

We also now allow getxattr() to work for mountpoint labeled filesystems
(i.e.  mount with option context=foo_t), as we are able to return the
kernel value to the user.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:11 -08:00
Brian Gerst
0d078f6f96 [PATCH] CONFIG_IA32
Add CONFIG_X86_32 for i386.  This allows selecting options that only apply
to 32-bit systems.

(X86 && !X86_64) becomes X86_32
(X86 ||  X86_64) becomes X86

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:10 -08:00
Jeff Garzik
054ee8fd39 Merge branch 'upstream' 2005-10-30 04:50:22 -05:00
Jeff Garzik
a7dac447bb [libata] change ata_qc_complete() to take error mask as second arg
The second argument to ata_qc_complete() was being used for two
purposes: communicate the ATA Status register to the completion
function, and indicate an error.  On legacy PCI IDE hardware, the latter
is often implicit in the former.  On more modern hardware, the driver
often completely emulated a Status register value, passing ATA_ERR as an
indication that something went wrong.

Now that previous code changes have eliminated the need to use drv_stat
arg to communicate the ATA Status register value, we can convert it to a
mask of possible error classes.

This will lead to more flexible error handling in the future.
2005-10-30 04:44:42 -05:00
Jeff Garzik
f0612bbc41 Merge branch 'upstream' 2005-10-30 01:58:18 -05:00
Jeff Garzik
81cfb8864c Merge branch 'master' 2005-10-30 01:56:31 -05:00
Linus Torvalds
9f75e1eff3 Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6 2005-10-29 21:48:06 -07:00
Dave Hansen
3947be1969 [PATCH] memory hotplug: sysfs and add/remove functions
This adds generic memory add/remove and supporting functions for memory
hotplug into a new file as well as a memory hotplug kernel config option.

Individual architecture patches will follow.

For now, disable memory hotplug when swsusp is enabled.  There's a lot of
churn there right now.  We'll fix it up properly once it calms down.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:44 -07:00
Dave Hansen
bdc8cb9845 [PATCH] memory hotplug locking: zone span seqlock
See the "fixup bad_range()" patch for more information, but this actually
creates a the lock to protect things making assumptions about a zone's size
staying constant at runtime.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:44 -07:00
Dave Hansen
208d54e551 [PATCH] memory hotplug locking: node_size_lock
pgdat->node_size_lock is basically only neeeded in one place in the normal
code: show_mem(), which is the arch-specific sysrq-m printing function.

Strictly speaking, the architectures not doing memory hotplug do no need this
locking in show_mem().  However, they are all included for completeness.  This
should also make any future consolidation of all of the implementations a
little more straightforward.

This lock is also held in the sparsemem code during a memory removal, as
sections are invalidated.  This is the place there pfn_valid() is made false
for a memory area that's being removed.  The lock is only required when doing
pfn_valid() operations on memory which the user does not already have a
reference on the page, such as in show_mem().

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:44 -07:00
Dave Hansen
4ca644d970 [PATCH] memory hotplug prep: __section_nr helper
A little helper that we use in the hotplug code.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:44 -07:00
Hugh Dickins
b8072f099b [PATCH] mm: update comments to pte lock
Updated several references to page_table_lock in common code comments.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Hugh Dickins
f412ac08c9 [PATCH] mm: fix rss and mmlist locking
A couple of oddities were guarded by page_table_lock, no longer properly
guarded when that is split.

The mm_counters of file_rss and anon_rss: make those an atomic_t, or an
atomic64_t if the architecture supports it, in such a case.  Definitions by
courtesy of Christoph Lameter: who spent considerable effort on more scalable
ways of counting, but found insufficient benefit in practice.

And adding an mm with swap to the mmlist for swapoff: the list is well-
guarded by its own lock, but the list_empty check now has to be repeated
inside it.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Hugh Dickins
4c21e2f244 [PATCH] mm: split page table lock
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.

This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock.  (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)

In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.

Splitting the lock is not quite for free: another cacheline access.  Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS.  But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.

There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Hugh Dickins
deceb6cd17 [PATCH] mm: follow_page with inner ptlock
Final step in pushing down common core's page_table_lock.  follow_page no
longer wants caller to hold page_table_lock, uses pte_offset_map_lock itself;
and so no page_table_lock is taken in get_user_pages itself.

But get_user_pages (and get_futex_key) do then need follow_page to pin the
page for them: take Daniel's suggestion of bitflags to follow_page.

Need one for WRITE, another for TOUCH (it was the accessed flag before:
vanished along with check_user_page_readable, but surely get_numa_maps is
wrong to mark every page it finds as accessed), another for GET.

And another, ANON to dispose of untouched_anonymous_page: it seems silly for
that to descend a second time, let follow_page observe if there was no page
table and return ZERO_PAGE if so.  Fix minor bug in that: check VM_LOCKED -
make_pages_present ought to make readonly anonymous present.

Give get_numa_maps a cond_resched while we're there.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
c34d1b4d16 [PATCH] mm: kill check_user_page_readable
check_user_page_readable is a problematic variant of follow_page.  It's used
only by oprofile's i386 and arm backtrace code, at interrupt time, to
establish whether a userspace stackframe is currently readable.

This is problematic, because we want to push the page_table_lock down inside
follow_page, and later split it; whereas oprofile is doing a spin_trylock on
it (in the i386 case, forgotten in the arm case), and needs that to pin
perhaps two pages spanned by the stackframe (which might be covered by
different locks when we split).

I think oprofile is going about this in the wrong way: it doesn't need to know
the area is readable (neither i386 nor arm uses read protection of user
pages), it doesn't need to pin the memory, it should simply
__copy_from_user_inatomic, and see if that succeeds or not.  Sorry, but I've
not got around to devising the sparse __user annotations for this.

Then we can eliminate check_user_page_readable, and return to a single
follow_page without the __follow_page variants.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
c0718806cf [PATCH] mm: rmap with inner ptlock
rmap's page_check_address descend without page_table_lock.  First just
pte_offset_map in case there's no pte present worth locking for, then take
page_table_lock for the full check, and pass ptl back to caller in the same
style as pte_offset_map_lock.  __xip_unmap, page_referenced_one and
try_to_unmap_one use pte_unmap_unlock.  try_to_unmap_cluster also.

page_check_address reformatted to avoid progressive indentation.  No use is
made of its one error code, return NULL when it fails.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
508034a32b [PATCH] mm: unmap_vmas with inner ptlock
Remove the page_table_lock from around the calls to unmap_vmas, and replace
the pte_offset_map in zap_pte_range by pte_offset_map_lock: all callers are
now safe to descend without page_table_lock.

Don't attempt fancy locking for hugepages, just take page_table_lock in
unmap_hugepage_range.  Which makes zap_hugepage_range, and the hugetlb test in
zap_page_range, redundant: unmap_vmas calls unmap_hugepage_range anyway.  Nor
does unmap_vmas have much use for its mm arg now.

The tlb_start_vma and tlb_end_vma in unmap_page_range are now called without
page_table_lock: if they're implemented at all, they typically come down to
flush_cache_range (usually done outside page_table_lock) and flush_tlb_range
(which we already audited for the mprotect case).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:41 -07:00
Hugh Dickins
c74df32c72 [PATCH] mm: ptd_alloc take ptlock
Second step in pushing down the page_table_lock.  Remove the temporary
bridging hack from __pud_alloc, __pmd_alloc, __pte_alloc: expect callers not
to hold page_table_lock, whether it's on init_mm or a user mm; take
page_table_lock internally to check if a racing task already allocated.

Convert their callers from common code.  But avoid coming back to change them
again later: instead of moving the spin_lock(&mm->page_table_lock) down,
switch over to new macros pte_alloc_map_lock and pte_unmap_unlock, which
encapsulate the mapping+locking and unlocking+unmapping together, and in the
end may use alternatives to the mm page_table_lock itself.

These callers all hold mmap_sem (some exclusively, some not), so at no level
can a page table be whipped away from beneath them; and pte_alloc uses the
"atomic" pmd_present to test whether it needs to allocate.  It appears that on
all arches we can safely descend without page_table_lock.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins
1bb3630e89 [PATCH] mm: ptd_alloc inline and out
It seems odd to me that, whereas pud_alloc and pmd_alloc test inline, only
calling out-of-line __pud_alloc __pmd_alloc if allocation needed,
pte_alloc_map and pte_alloc_kernel are entirely out-of-line.  Though it does
add a little to kernel size, change them to macros testing inline, calling
__pte_alloc or __pte_alloc_kernel to allocate out-of-line.  Mark none of them
as fastcalls, leave that to CONFIG_REGPARM or not.

It also seems more natural for the out-of-line functions to leave the offset
calculation and map to the inline, which has to do it anyway for the common
case.  At least mremap move wants __pte_alloc without _map.

Macros rather than inline functions, certainly to avoid the header file issues
which arise from CONFIG_HIGHPTE needing kmap_types.h, but also in case any
architectures I haven't built would have other such problems.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins
872fec16d9 [PATCH] mm: init_mm without ptlock
First step in pushing down the page_table_lock.  init_mm.page_table_lock has
been used throughout the architectures (usually for ioremap): not to serialize
kernel address space allocation (that's usually vmlist_lock), but because
pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.

Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
and drop it when allocating a new one, to check lest a racing task already
did.  Similarly no page_table_lock in vmalloc's map_vm_area.

Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
user mms, which are converted only by a later patch, for now they have to lock
differently according to whether or not it's init_mm.

If sources get muddled, there's a danger that an arch source taking
init_mm.page_table_lock will be mixed with common source also taking it (or
neither take it).  So break the rules and make another change, which should
break the build for such a mismatch: remove the redundant mm arg from
pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
took page_table_lock for no good reason.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:40 -07:00
Hugh Dickins
46dea3d092 [PATCH] mm: ia64 use expand_upwards
ia64 has expand_backing_store function for growing its Register Backing Store
vma upwards.  But more complete code for this purpose is found in the
CONFIG_STACK_GROWSUP part of mm/mmap.c.  Uglify its #ifdefs further to provide
expand_upwards for ia64 as well as expand_stack for parisc.

The Register Backing Store vma should be marked VM_ACCOUNT.  Implement the
intention of growing it only a page at a time, instead of passing an address
outside of the vma to handle_mm_fault, with unknown consequences.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
f449952bc8 [PATCH] mm: mm_struct hiwaters moved
Slight and timid rearrangement of mm_struct: hiwater_rss and hiwater_vm were
tacked on the end, but it seems better to keep them near _file_rss, _anon_rss
and total_vm, in the same cacheline on those arches verified.

There are likely to be more profitable rearrangements, but less obvious (is it
good or bad that saved_auxv[AT_VECTOR_SIZE] isolates cpu_vm_mask and context
from many others?), needing serious instrumentation.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
365e9c87a9 [PATCH] mm: update_hiwaters just in time
update_mem_hiwater has attracted various criticisms, in particular from those
concerned with mm scalability.  Originally it was called whenever rss or
total_vm got raised.  Then many of those callsites were replaced by a timer
tick call from account_system_time.  Now Frank van Maarseveen reports that to
be found inadequate.  How about this?  Works for Frank.

Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
by 1): those are hot paths.  Do the opposite, update only when about to lower
rss (usually by many), or just before final accounting in do_exit.  Handle
mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
that whoever collects these hiwater statistics do the work of taking the
maximum with rss or total_vm.

And there has been no collector of these hiwater statistics in the tree.  The
new convention needs an example, so match Frank's usage by adding a VmPeak
line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
(High-Water-Mark or High-Water-Memory).

There was a particular anomaly during mremap move, that hiwater_vm might be
captured too high.  A fleeting such anomaly remains, but it's quickly
corrected now, whereas before it would stick.

What locking?  None: if the app is racy then these statistics will be racy,
it's not worth any overhead to make them exact.  But whenever it suits,
hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
page_table_lock (for now) or with preemption disabled (later on): without
going to any trouble, minimize the time between reading current values and
updating, to minimize those occasions when a racing thread bumps a count up
and back down in between.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Nick Piggin
b5810039a5 [PATCH] core remove PageReserved
Remove PageReserved() calls from core code by tightening VM_RESERVED
handling in mm/ to cover PageReserved functionality.

PageReserved special casing is removed from get_page and put_page.

All setting and clearing of PageReserved is retained, and it is now flagged
in the page_alloc checks to help ensure we don't introduce any refcount
based freeing of Reserved pages.

MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
deprecated.  We never completely handled it correctly anyway, and is be
reintroduced in future if required (Hugh has a proof of concept).

Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
be trivially removed.

Last real user of PageReserved is swsusp, which uses PageReserved to
determine whether a struct page points to valid memory or not.  This still
needs to be addressed (a generic page_is_ram() should work).

A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
thus mapcounted and count towards shared rss).  These writes to the struct
page could cause excessive cacheline bouncing on big systems.  There are a
number of ways this could be addressed if it is an issue.

Signed-off-by: Nick Piggin <npiggin@suse.de>

Refcount bug fix for filemap_xip.c

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:39 -07:00
Hugh Dickins
4294621f41 [PATCH] mm: rss = file_rss + anon_rss
I was lazy when we added anon_rss, and chose to change as few places as
possible.  So currently each anonymous page has to be counted twice, in rss
and in anon_rss.  Which won't be so good if those are atomic counts in some
configurations.

Change that around: keep file_rss and anon_rss separately, and add them
together (with get_mm_rss macro) when the total is needed - reading two
atomics is much cheaper than updating two atomics.  And update anon_rss
upfront, typically in memory.c, not tucked away in page_add_anon_rmap.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
Hugh Dickins
a8fb5618da [PATCH] mm: unlink_file_vma, remove_vma
Divide remove_vm_struct into two parts: first anon_vma_unlink plus
unlink_file_vma, to unlink the vma from the list and tree by which rmap or
vmtruncate might find it; then remove_vma to close, fput and free.

The intention here is to do the anon_vma_unlink and unlink_file_vma earlier,
in free_pgtables before freeing any page tables: so we can be sure that any
page tables traversed by rmap and vmtruncate are stable (and other, ordinary
cases are stabilized by holding mmap_sem).

This will be crucial to traversing pgd,pud,pmd without page_table_lock.  But
testing the split-out patch showed that lifting the page_table_lock is
symbiotically necessary to make this change - the lock ordering is wrong to
move those unlinks into free_pgtables while it's under ptlock.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Hugh Dickins
ab50b8ed81 [PATCH] mm: vm_stat_account unshackled
The original vm_stat_account has fallen into disuse, with only one user, and
only one user of vm_stat_unaccount.  It's easier to keep track if we convert
them all to __vm_stat_account, then free it from its __shackles.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:37 -07:00
Andi Kleen
dfcd3c0dc4 [PATCH] Convert mempolicies to nodemask_t
The NUMA policy code predated nodemask_t so it used open coded bitmaps.
Convert everything to nodemask_t.  Big patch, but shouldn't have any actual
behaviour changes (except I removed one unnecessary check against
node_online_map and one unnecessary BUG_ON)

Signed-off-by: "Andi Kleen" <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Rik Van Riel
eb92f4ef32 [PATCH] add sem_is_read/write_locked()
Add sem_is_read/write_locked functions to the read/write semaphores, along the
same lines of the *_is_locked spinlock functions.  The swap token tuning patch
uses sem_is_read_locked; sem_is_write_locked is added for completeness.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Christoph Lameter
930fc45a49 [PATCH] vmalloc_node
This patch adds

vmalloc_node(size, node)	-> Allocate necessary memory on the specified node

and

get_vm_area_node(size, flags, node)

and the other functions that it depends on.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:35 -07:00
Jeff Garzik
0169e284f6 [libata] remove ata_chk_err(), ->check_err() hook.
We now depend on ->tf_read() to provide us with the contents
of the Error shadow register.
2005-10-29 21:25:10 -04:00
Herbert Xu
d32311fed7 [PATCH] Introduce sg_set_buf
sg_init_one is a nice tool for the block layer.  However, users
of struct scatterlist in other subsystems don't usually need the
DMA attributes.  For them it's a waste of time and space to
initialise the whole struct scatterlist structure.

Therefore this patch adds a new function sg_set_buf to initialise
a scatterlist without zeroing the DMA attributes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-30 11:14:39 +11:00
Jeff Garzik
b0c4e148bd Merge branch 'master' 2005-10-29 17:49:12 -04:00
Russell King
bbbf508d64 [DRIVER MODEL] Add missing platform_device.h header.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-10-29 22:17:58 +01:00
Linus Torvalds
e9d52234e3 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus 2005-10-29 12:19:15 -07:00
Pete Popov
26a940e217 Cleaned up AMD Au1200 IDE driver:
- converted to platform bus
- removed pci dependencies
- removed virt_to_phys/phys_to_virt calls
    
System now can root off of a disk.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

diff --git a/Documentation/mips/AU1xxx_IDE.README b/Documentation/mips/AU1xxx_IDE.README
new file mode 100644
2005-10-29 19:32:20 +01:00
Pete Popov
bdf21b18b4 Philips PNX8550 support: MIPS32-like core with 2 Trimedias on it.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2005-10-29 19:31:54 +01:00
Linus Torvalds
62d3af1b5f Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2005-10-29 11:25:16 -07:00