Commit Graph

11763 Commits

Author SHA1 Message Date
Helge Deller
3db5db4fcd [PATCH] use cycle_t instead of u64 in struct time_interpolator
The 32bit and 64bit PARISC Linux kernels suffers from the problem, that the
gettimeofday() call sometimes returns non-monotonic times.

The easiest way to fix this, is to drop the PARISC-specific implementation
and switch over to the generic TIME_INTERPOLATION framework.

But in order to make it even compile on 32bit PARISC, the patch below which
touches the generic Linux code, is mandatory.

More information and the full patch with the parisc-specific changes is included in this thread: http://lists.parisc-linux.org/pipermail/parisc-linux/2006-December/031003.html

As far as I could see, this patch does not change anything for the existing
architectures which use this framework (IA64 and SPARC64), since "cycles_t"
is defined there as unsigned 64bit-integer anyway (which then makes this
patch a no-change for them).

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:31 -08:00
Andrew Morton
fc0ecff698 [PATCH] remove invalidate_inode_pages()
Convert all calls to invalidate_inode_pages() into open-coded calls to
invalidate_mapping_pages().

Leave the invalidate_inode_pages() wrapper in place for now, marked as
deprecated.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:31 -08:00
Anton Altaparmakov
54bc485522 [PATCH] Export invalidate_mapping_pages() to modules
It makes no sense to me to export invalidate_inode_pages() and not
invalidate_mapping_pages() and I actually need invalidate_mapping_pages()
because of its range specification ability...

akpm: also remove the export of invalidate_inode_pages() by making it an
inlined wrapper.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:30 -08:00
Jiri Slaby
224299d444 [PATCH] Char: moxa, devids cleanup
Move them to pci_ids.h

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:30 -08:00
Theodore Ts'o
34f5a39899 [PATCH] Add TAINT_USER and ability to set taint flags from userspace
Allow taint flags to be set from userspace by writing to
/proc/sys/kernel/tainted, and add a new taint flag, TAINT_USER, to be used
when userspace has potentially done something dangerous that might
compromise the kernel.  This will allow support personnel to ask further
questions about what may have caused the user taint flag to have been set.

For example, they might examine the logs of the realtime JVM to see if the
Java program has used the really silly, stupid, dangerous, and
completely-non-portable direct access to physical memory feature which MUST
be implemented according to the Real-Time Specification for Java (RTSJ).
Sigh.  What were those silly people at Sun thinking?

[akpm@osdl.org: build fix]
[bunk@stusta.de: cleanup]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:29 -08:00
David Brownell
cbcdc1debd [PATCH] PNP: export pnp_bus_type
The PNP framework doesn't export "pnp_bus_type", which is an unfortunate
exception to the policy followed by pretty much every other bus.  I noticed
this when I had to find a device in order to provide its platform_data.

Note that per advice from Arjan, the "export" scope has been been minimized to
avoid the hundred-plus bytes needed to support access from modules.  In this
case, the symbol is only needed by statically linked kernel code that lives
outside the drivers/pnp directory.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:28 -08:00
Mathieu Desnoyers
23c887522e [PATCH] Relay: add CPU hotplug support
Mathieu originally needed to add this for tracing Xen, but it's something
that's needed for any application that can be tracing while cpus are added.

unplug isn't supported by this patch.  The thought was that at minumum a new
buffer needs to be added when a cpu comes up, but it wasn't worth the effort
to remove buffers on cpu down since they'd be freed soon anyway when the
channel was closed.

[zanussi@us.ibm.com: avoid lock_cpu_hotplug deadlock]
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Tom Zanussi <zanussi@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:28 -08:00
Robert P. J. Day
c376222960 [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().
Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
corresponding "kmem_cache_zalloc()" call.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Roland McGrath <roland@redhat.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:27 -08:00
Adrian Bunk
1b135431ab [PATCH] drivers/char/vc_screen.c: proper prototypes
Add proper prototypes for two functions in drivers/char/vc_screen.c

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:27 -08:00
Mike Frysinger
57a87bb072 [PATCH] scrub non-__GLIBC__ checks in linux/socket.h and linux/stat.h
Userspace should be worrying about userspace, so having the socket.h
and stat.h pollute the namespace in the non-glibc case is wrong and
pretty much prevents any other libc from utilizing these headers
sanely unless they set up the __GLIBC__ define themselves (which
sucks)

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:26 -08:00
Tilman Schmidt
4564f9e5fd [PATCH] consolidate line discipline number definitions
The line discipline numbers N_* are currently defined for each architecture
individually, but (except for a seeming mistake) identically, in
asm/termios.h.  There is no obvious reason why these numbers should be
architecture specific, nor any apparent relationship with the termios
structure.  The total number of these, NR_LDISCS, is defined in linux/tty.h
anyway.  So I propose the following patch which moves the definitions of
the individual line disciplines to linux/tty.h too.

Three of these numbers (N_MASC, N_PROFIBUS_FDL, and N_SMSBLOCK) are unused
in the current kernel, but the patch still keeps the complete set in case
there are plans to use them yet.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:26 -08:00
Jason Baron
068135e635 [PATCH] lockdep: add graph depth information to /proc/lockdep
Generate locking graph information into /proc/lockdep, for lock hierarchy
documentation and visualization purposes.

sample output:

 c089fd5c OPS:     138 FD:   14 BD:    1 --..: &tty->termios_mutex
  -> [c07a3430] tty_ldisc_lock
  -> [c07a37f0] &port_lock_key
  -> [c07afdc0] &rq->rq_lock_key#2

The lock classes listed are all the first-hop lock dependencies that
lockdep has seen so far.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:26 -08:00
Eric Dumazet
37756ced1f [PATCH] avoid one conditional branch in touch_atime()
I added IS_NOATIME(inode) macro definition in include/linux/fs.h, true if
the inode superblock is marked readonly or noatime.

This new macro is then used in touch_atime() instead of separatly testing
MS_RDONLY and MS_NOATIME

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Eric Dumazet
4ba4d4c0c5 [PATCH] struct vfsmount: keep mnt_count & mnt_expiry_mark away from mnt_flags
I noticed cache misses in touch_atime() that can be avoided if we keep
mnt_count & mnt_expiry_mark in a different cache line than mnt_flags
(mostly read)

mnt_count & mnt_expiry_mark are modified each time a file is opened/closed
in a file system.

touch_atime() is called each time a file is read, and generally needs to
read mnt_flags.

Other fields of struct vfsmount are mostly read so I chose to move
mnt_count & mnt_expiry_mark at the end of struct vfsmount.  And adding a
comment so that nobody tries to re-arrange fields to fill the holes :)

On 64bits platforms, the new offsetof(mnt_count) is 0xC0
On 32bits platforms, it is 0x60, so I didnot add a
____cacheline_aligned_in_smp because it would have a too big impact on the
size of this object (in particular if CONFIG_X86_L1_CACHE_SHIFT=7)

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Andrew Morton
780a065668 [PATCH] count_vm_events-warning-fix
- Prevent things like this:

	block/ll_rw_blk.c: In function 'submit_bio':
	block/ll_rw_blk.c:3222: warning: unused variable 'count'

  inlines are very, very preferable to macros.

- remove unused get_cpu_vm_events() macro

Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:25 -08:00
Adrian Bunk
7131b6d167 [PATCH] remove include/linux/byteorder/pdp_endian.h
include/linux/byteorder/pdp_endian.h is completely unused, and the comment in
the file itself states that it's both untested and only a proof-of-concept.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:24 -08:00
Eric W. Biederman
8b6312f4dc [PATCH] vt: refactor console SAK processing
This does several things.
- It moves looking up of the current foreground console into process
  context where we can safely take the semaphore that protects this
  operation.
- It uses the new flavor of work queue processing.
- This generates a factor of do_SAK, __do_SAK that runs immediately.
- This calls __do_SAK with the console semaphore held ensuring nothing
  else happens to the console while we process the SAK operation.
- With the console SAK processing moved into process context this
  patch removes the xchg operations that I used to attempt to attomically
  update struct pid, because of the strange locking used in the SAK processing.
  With SAK using the normal console semaphore nothing special is needed.

Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:24 -08:00
Miguel Ojeda Sandonis
70e840499a [PATCH] drivers: add LCD support
Add support for auxiliary displays, the ks0108 LCD controller, the
cfag12864b LCD and adds a framebuffer device: cfag12864bfb.

- Add a "auxdisplay/" folder in "drivers/" for auxiliary display
  drivers.

- Add support for the ks0108 LCD Controller as a device driver.  (uses
  parport interface)

- Add support for the cfag12864b LCD as a device driver.  (uses ks0108
  LCD Controller driver)

- Add a framebuffer device called cfag12864bfb.  (uses cfag12864b LCD
  driver)

- Add the usual Documentation, includes, Makefiles, Kconfigs,
  MAINTAINERS, CREDITS...

- Miguel Ojeda will maintain all the stuff above.

[rdunlap@xenotime.net: workqueue fixups]
[akpm@osdl.org: kconfig fix]
Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com>
Cc: Greg KH <greg@kroah.com>
Acked-by: Paulo Marques <pmarques@grupopie.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:24 -08:00
Jeff Dike
6e6d74cfac [PATCH] uml: x86_64 ptrace fixes
This patch fixes some missing ptrace bits on x86_64.  PTRACE_ARCH_PRCTL is
hooked up and implemented.  This required generalizing arch_prctl_skas
slightly to take a task_struct to modify.  Previously, it always operated on
current.

Reading and writing the debug registers is also enabled by un-ifdefing the
code that implements that.  It turns out that x86_64 is identical to i386, so
the same code can be used.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:24 -08:00
Jeff Dike
f355559cf7 [PATCH] uml: x86_64 thread fixes
x86_64 needs some TLS fixes.  What was missing was remembering the child
thread id during clone and stuffing it into the child during each context
switch.

The %fs value is stored separately in the thread structure since the host
controls what effect it has on the actual register file.  The host also needs
to store it in its own thread struct, so we need the value kept outside the
register file.

arch_prctl_skas was fixed to call PTRACE_ARCH_PRCTL appropriately.  There is
some saving and restoring of registers in the ARCH_SET_* cases so that the
correct set of registers are changed on the host and restored to the process
when it runs again.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:24 -08:00
Robert P. J. Day
f688144b82 [PATCH] uml: fix apparent "CONFIG_64_BIT" typo.
Fix apparent typo, where CONFIG_64_BIT should read CONFIG_64BIT.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:23 -08:00
Jiri Kosina
047c7c4232 [PATCH] CRIS: turn local_save_flags() + local_irq_disable() into local_irq_save() in headers
Various headers for CRIS architecture contain local_irq_disable() after
local_save_flags().  Turn it into local_irq_save().

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Mikael Starvik <starvik@axis.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:20 -08:00
Mike Frysinger
36dbf95868 [PATCH] m68k: don't include asm-m68k/page.h in asm-m68k/user.h
We don't actually use anything from asm-m68k/page.h in asm-m68k/user.h, so
don't bother including it

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:20 -08:00
Al Viro
38135614dd [PATCH] m68k: work around binutils tokenizer change
Recent as(1) doesn't think that .  terminates a macro name, so getuser.l is
_not_ treated as invoking getuser with .l as the first argument.
arch/m68k/math-emu relies on old behaviour, so it gets a lot of undefined
macros with more or less current binutils.

Note that this behaviour remains in all recent versions and is unrelated to
another binutils problems we used to have for a while (having (%a0)+ parsed
as two arguments).  This one is there to stay; it's an intentional and
documented change.

.irp <identifier> <words>
[text]
.endr
expands to a copy of text per each word, with \<identifier> replaced with
corresponding word.  Again, what happens depends on whether gas_ident.x
is treated as one or as two tokens; in the former case we'll get old_gas
incremented once, in the latter - twice.  The rest is obvious.

Unlike .macro argument list _anything_ is explicitly allowed after
.irp <identifier>; here we are on very safe ground.  And yes, it does
work with all gas variants I've got here (including vanilla 2.15, 2.16,
2.16.1 and 2.17, plus debian and FC binutils).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:20 -08:00
Hirokazu Takata
fabb626ad6 [PATCH] m32r: cosmetic updates and trivial fixes
Cosmetic updates and trivial fixes of m32r arch-dependent files.
- Remove RCS ID strings and trailing white lines
- Other misc. cosmetic updates

Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:20 -08:00
Aneesh Kumar K.V
65fdc8544f [PATCH] Alpha: increase PERCPU_ENOUGH_ROOM
Module loading on Alpha was failing with error "Could not allocate 8 bytes
percpu data".

Looking at dmesg we have the below error "No per-cpu room for modules."

Increase the PERCPU_ENOUGH_ROOM in a similar way as x86_64

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Cc: <Jay.Estabrook@hp.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:19 -08:00
Ken Chen
767193253b [PATCH] simplify shmem_aops.set_page_dirty() method
shmem backed file does not have page writeback, nor it participates in
backing device's dirty or writeback accounting.  So using generic
__set_page_dirty_nobuffers() for its .set_page_dirty aops method is a bit
overkill.  It unnecessarily prolongs shm unmap latency.

For example, on a densely populated large shm segment (sevearl GBs), the
unmapping operation becomes painfully long.  Because at unmap, kernel
transfers dirty bit in PTE into page struct and to the radix tree tag.  The
operation of tagging the radix tree is particularly expensive because it
has to traverse the tree from the root to the leaf node on every dirty
page.  What's bothering is that radix tree tag is used for page write back.
 However, shmem is memory backed and there is no page write back for such
file system.  And in the end, we spend all that time tagging radix tree and
none of that fancy tagging will be used.  So let's simplify it by introduce
a new aops __set_page_dirty_no_writeback and this will speed up shm unmap.

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:19 -08:00
Andy Whitcroft
bd8029b660 [PATCH] zoneid: fix up calculations for ZONEID_PGSHIFT
Currently if we have a non-zero ZONES_SHIFT we assume we are able to rely
on that as the bottom edge of the ZONEID, if not then we use the
NODES_PGOFF as the right end of either NODES _or_ SECTION.  This latter is
more luck than judgement and would be incorrect if we reordered the
SECTION,NODE,ZONE options in the fields space.

Really what we want is the lower of the right hand end of the two fields we
are using (either NODE,ZONE or SECTION,ZONE).  Codify that explicitly.  As
always allow for there being no bits in either of the fields, such as might
be valid in a non-numa machine with only a zone NORMAL.

I have checked that the compiler is still able to constant fold all of this
away correctly.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:19 -08:00
Christoph Lameter
4b51d66989 [PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM
Make ZONE_DMA optional in core code.

- ifdef all code for ZONE_DMA and related definitions following the example
  for ZONE_DMA32 and ZONE_HIGHMEM.

- Without ZONE_DMA, ZONE_HIGHMEM and ZONE_DMA32 we get to a ZONES_SHIFT of
  0.

- Modify the VM statistics to work correctly without a DMA zone.

- Modify slab to not create DMA slabs if there is no ZONE_DMA.

[akpm@osdl.org: cleanup]
[jdike@addtoit.com: build fix]
[apw@shadowen.org: Simplify calculation of the number of bits we need for ZONES_SHIFT]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
05a0416be2 [PATCH] Drop __get_zone_counts()
Values are readily available via ZVC per node and global sums.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
9195481d2f [PATCH] Drop nr_free_pages_pgdat()
Function is unnecessary now.  We can use the summing features of the ZVCs to
get the values we need.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
9617729941 [PATCH] Drop free_pages()
nr_free_pages is now a simple access to a global variable.  Make it a macro
instead of a function.

The nr_free_pages now requires vmstat.h to be included.  There is one
occurrence in power management where we need to add the include.  Directly
refrer to global_page_state() there to clarify why the #include was added.

[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:18 -08:00
Christoph Lameter
51ed449127 [PATCH] Reorder ZVCs according to cacheline
The global and per zone counter sums are in arrays of longs.  Reorder the ZVCs
so that the most frequently used ZVCs are put into the same cacheline.  That
way calculations of the global, node and per zone vm state touches only a
single cacheline.  This is mostly important for 64 bit systems were one 128
byte cacheline takes only 8 longs.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:17 -08:00
Christoph Lameter
d23ad42324 [PATCH] Use ZVC for free_pages
This is again simplifies some of the VM counter calculations through the use
of the ZVC consolidated counters.

[michal.k.k.piotrowski@gmail.com: build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:17 -08:00
Christoph Lameter
c878538598 [PATCH] Use ZVC for inactive and active counts
The determination of the dirty ratio to determine writeback behavior is
currently based on the number of total pages on the system.

However, not all pages in the system may be dirtied.  Thus the ratio is always
too low and can never reach 100%.  The ratio may be particularly skewed if
large hugepage allocations, slab allocations or device driver buffers make
large sections of memory not available anymore.  In that case we may get into
a situation in which f.e.  the background writeback ratio of 40% cannot be
reached anymore which leads to undesired writeback behavior.

This patchset fixes that issue by determining the ratio based on the actual
pages that may potentially be dirty.  These are the pages on the active and
the inactive list plus free pages.

The problem with those counts has so far been that it is expensive to
calculate these because counts from multiple nodes and multiple zones will
have to be summed up.  This patchset makes these counters ZVC counters.  This
means that a current sum per zone, per node and for the whole system is always
available via global variables and not expensive anymore to calculate.

The patchset results in some other good side effects:

- Removal of the various functions that sum up free, active and inactive
  page counts

- Cleanup of the functions that display information via the proc filesystem.

This patch:

The use of a ZVC for nr_inactive and nr_active allows a simplification of some
counter operations.  More ZVC functionality is used for sums etc in the
following patches.

[akpm@osdl.org: UP build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:17 -08:00
Randy Dunlap
f05b6284ee [PATCH] typeof __page_to_pfn with SPARSEMEM=y
With CONFIG_SPARSEMEM=y:

mm/rmap.c:579: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'int'

Make __page_to_pfn() return unsigned long.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:17 -08:00
Robert P. J. Day
e10a4437cb [PATCH] Remove final references to deprecated "MAP_ANON" page protection flag
Remove the last vestiges of the long-deprecated "MAP_ANON" page protection
flag: use "MAP_ANONYMOUS" instead.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-11 10:51:17 -08:00
Jeff Garzik
66efc5a7e3 libata: kill ATA_ENABLE_PATA
The ATA_ENABLE_PATA define was never meant to be permanent, and in
recent kernels, it's already been unconditionally enabled.  Remove.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:40 -05:00
Alan
49554c1956 ata: Add defines for the iordy bits
IORDY and IORDY enable/disable flags.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:39 -05:00
Akira Iguchi
836250069f libata: add another IRQ calls (core and headers)
This patch is against the libata core and headers.

Two IRQ calls are added in ata_port_operations.
- irq_on() is used to enable interrupts.
- irq_ack() is used to acknowledge a device interrupt.

In most drivers, ata_irq_on() and ata_irq_ack() are used for
irq_on and irq_ack respectively.

In some drivers (ex: ahci, sata_sil24) which cannot use them
as is, ata_dummy_irq_on() and ata_dummy_irq_ack() are used.

Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp>
Signed-off-by: Akira Iguchi <akira2.iguchi@toshiba.co.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:38 -05:00
Andrew Morton
7f25377043 git-libata-all: forward declare struct device
In file included from drivers/infiniband/hw/ipath/ipath_diag.c:44:
include/linux/io.h:35: warning: 'struct device' declared inside parameter list
include/linux/io.h:35: warning: its scope is only this definition or declaration

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:38 -05:00
Tejun Heo
0d5ff56677 libata: convert to iomap
Convert libata core layer and LLDs to use iomap.

* managed iomap is used.  Pointer to pcim_iomap_table() is cached at
  host->iomap and used through out LLDs.  This basically replaces
  host->mmio_base.

* if possible, pcim_iomap_regions() is used

Most iomap operation conversions are taken from Jeff Garzik
<jgarzik@pobox.com>'s iomap branch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:38 -05:00
Tejun Heo
d24bbbf251 devres: implement pcim_iomap_regions()
Implement pcim_iomap_regions().  This function takes mask of BARs to
request and iomap.  No BAR should have length of zero.  BARs are
iomapped using pcim_iomap_table().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:37 -05:00
Tejun Heo
b878ca5d37 libata: remove unused functions
Now that all LLDs are converted to use devres, default stop callbacks
are unused.  Remove them.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:37 -05:00
Tejun Heo
f0d36efdc6 libata: update libata core layer to use devres
Update libata core layer to use devres.

* ata_device_add() acquires all resources in managed mode.

* ata_host is allocated as devres associated with ata_host_release.

* Port attached status is handled as devres associated with
  ata_host_attach_release().

* Initialization failure and host removal is handedl by releasing
  devres group.

* Except for ata_scsi_release() removal, LLD interface remains the
  same.  Some functions use hacky is_managed test to support both
  managed and unmanaged devices.  These will go away once all LLDs are
  updated to use devres.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:37 -05:00
Tejun Heo
0529c159db libata: implement ata_host_detach()
Implement ata_host_detach() which calls ata_port_detach() for each
port in the host and export it.  ata_port_detach() is now internal and
thus un-exported.  ata_host_detach() will be used as the 'deregister
from libata layer' function after devres conversion.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:37 -05:00
Tejun Heo
9ac7849e35 devres: device resource management
Implement device resource management, in short, devres.  A device
driver can allocate arbirary size of devres data which is associated
with a release function.  On driver detach, release function is
invoked on the devres data, then, devres data is freed.

devreses are typed by associated release functions.  Some devreses are
better represented by single instance of the type while others need
multiple instances sharing the same release function.  Both usages are
supported.

devreses can be grouped using devres group such that a device driver
can easily release acquired resources halfway through initialization
or selectively release resources (e.g. resources for port 1 out of 4
ports).

This patch adds devres core including documentation and the following
managed interfaces.

* alloc/free	: devm_kzalloc(), devm_kzfree()
* IO region	: devm_request_region(), devm_release_region()
* IRQ		: devm_request_irq(), devm_free_irq()
* DMA		: dmam_alloc_coherent(), dmam_free_coherent(),
		  dmam_declare_coherent_memory(), dmam_pool_create(),
		  dmam_pool_destroy()
* PCI		: pcim_enable_device(), pcim_pin_device(), pci_is_managed()
* iomap		: devm_ioport_map(), devm_ioport_unmap(), devm_ioremap(),
		  devm_ioremap_nocache(), devm_iounmap(), pcim_iomap_table(),
		  pcim_iomap(), pcim_iounmap()

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:36 -05:00
Tejun Heo
726f0785b6 libata: kill qc->nsect and cursect
libata used two separate sets of variables to record request size and
current offset for ATA and ATAPI.  This is confusing and fragile.
This patch replaces qc->nsect/cursect with qc->nbytes/curbytes and
kills them.  Also, ata_pio_sector() is updated to use bytes for
qc->cursg_ofs instead of sectors.  The field used to be used in bytes
for ATAPI and in sectors for ATA.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:31 -05:00
Tejun Heo
553c4aa630 libata: handle pci_enable_device() failure while resuming
Handle pci_enable_device() failure while resuming.  This patch kills
the "ignoring return value of 'pci_enable_device'" warning message and
propagates __must_check through ata_pci_device_do_resume().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:30 -05:00
Tejun Heo
a0cf733b33 libata: straighten out ATA_ID_* constants
* Kill _OFS suffixes in ATA_ID_{SERNO|FW_REV|PROD}_OFS for consistency
  with other ATA_ID_* constants.

* Kill ATA_SERNO_LEN

* Add and use ATA_ID_SERNO_LEN, ATA_ID_FW_REV_LEN and ATA_ID_PROD_LEN.
  This change also makes ata_device_blacklisted() use proper length
  for fwrev.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-02-09 17:39:30 -05:00