Commit Graph

19871 Commits

Author SHA1 Message Date
Grant C. Likely
b928917516 [PATCH] powerpc: Add ML403 defconfig
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:36:02 +11:00
Grant C. Likely
909aeca664 [PATCH] powerpc: Add support for Xilinx ML403 reference design
Includes fix for Xilinx silicon errata 213

Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:36:01 +11:00
Grant C. Likely
b58b5aa51c [PATCH] powerpc: Add xparameters file for Xilinx ML403 reference design
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:36:01 +11:00
Grant C. Likely
72646c7f69 [PATCH] powerpc: Add Virtex-4 FX to cpu table
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:36:00 +11:00
Grant C. Likely
5eb446cb72 [PATCH] powerpc: Add ML300 defconfig
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:35:59 +11:00
Grant C. Likely
e27db622b8 [PATCH] powerpc: Migrate ML300 reference design to the platform bus
Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:35:59 +11:00
Grant C. Likely
1a42e53d17 [PATCH] powerpc: Migrate Xilinx Vertex support from the OCP bus to the platfom bus.
This patch only deals with the serial port definitions as there is no
support for any other xilinx IP cores in the kernel tree at the moment.

Board specific configuration moved out of virtex.[ch] and into the
xparameters.h wrapper.

This also prepares for the transition to the flattened device tree model.
When the bootloader provides a device tree generated from an xparameters.h
files, the kernel will no longer need xparameters/*.  The platform bus will
get populated with data from the device tree, and the device drivers will
be automatically connected to the devices.  Only the bootloader (or
ppcboot) will need xparameters directly.

Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:35:58 +11:00
Grant C. Likely
562e7370a4 [PATCH] powerpc: Make Virtex-II Pro support generic for all Virtex devices
The PPC405 hard core is used in both the Virtex-II Pro and Virtex 4 FX
FPGAs.  This patch cleans up the Virtex naming convention to reflect more
than just the Virtex-II Pro.

Rename files virtex-ii_pro.[ch] to virtex.[ch]
Rename config value VIRTEX_II_PRO to XILINX_VIRTEX

Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:35:57 +11:00
Grant C. Likely
b4367e7451 [PATCH] powerpc: Move xparameters.h into xilinx virtex device specific path
xparameters should not be needed by anything but virtex platform code.
Move it from include/asm-ppc/ to platforms/4xx/xparameters/

This is preparing for work to remove xparameters from the dependancy tree
for most c files.  xparam changes should not cause a recompile of the world.
Instead, drivers should get device info from the platform bus (populated
by the boot code)

Signed-off-by: Grant C. Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 22:35:35 +11:00
Nathan Lynch
7d4d61544a [PATCH] powerpc: avoid timer interrupt replay effect when onlining cpu
When a cpu is hotplug-onlined, if we don't set per_cpu(last_jiffy) to
something sane, timer_interrupt will execute its while loop for every
tick missed since the cpu was last online (or since the system was
booted, if we're adding a new cpu).  This can cause weird hangs, ssh
sessions dropping, and we can even go xmon if we take a global IPI at
the wrong time.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:54 +11:00
Michael Neuling
4dc4325693 [PATCH] powerpc: hypervisor check in pseries_kexec_cpu_down
We call unregister_vpa but we don't check to see if the hypervisor
supports this.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Anton Blanchard <anton@samba.org>
--
 arch/powerpc/platforms/pseries/setup.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
Becky Bruce
7d4b95ae8e [PATCH] documentation/powerpc: add bus-frequency property to SOC node
Updated SOC node definition in documentation to include bus-frequency
property. Also extended mdio example to match specification.

Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
Signed-off-by: Kumar Gala <galak@gate.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
Michael Ellerman
f9b4045d6b [PATCH] powerpc: Don't use toc in decrementer_iSeries_masked
Since 404849bbd2 we've been using
LOAD_REG_ADDRBASE, which uses the toc pointer, in decrementer_iSeries_masked.

This can explode if we take the decrementer interrupt while we're in a module,
because the toc pointer in r2 will be the module's toc pointer.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
David Gibson
09f5dc44ae [PATCH] powerpc: Cleanup, consolidating icache dirtying logic
The code to mark a page as icache dirty (so that it will later be
icache-dcache flushed when we try to execute from it) is duplicated in
three places: flush_dcache_page() does this marking and nothing else,
but clear_user_page() and copy_user_page() duplicate it, since those
functions make the page icache dirty themselves.

This patch makes those other functions call flush_dcache_page()
instead, so the logic's all in one place.  This will make life less
confusing if we ever need to tweak the details of the the lazy icache
flush mechanism.

 arch/powerpc/mm/mem.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
Jesper Juhl
95eff20feb [PATCH] Don't check pointer for NULL before passing it to kfree [arch/powerpc/kernel/rtas_flash.c]
Checking a pointer for NULL before passing it to kfree is pointless, kfree
does its own NULL checking of input.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:53 +11:00
Olaf Hering
4009d98022 [PATCH] powerpc: fix compile warning in udbg_init_maple_realmode
arch/powerpc/kernel/udbg_16550.c: In function `udbg_init_maple_realmode':
arch/powerpc/kernel/udbg_16550.c:162: warning: assignment from incompatible pointer type

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:51:52 +11:00
Olaf Hering
d60dcd9450 [PATCH] powerpc: add refcounting to setup_peg2 and of_get_pci_address
setup_peg2 must do some refcounting.
of_get_pci_address may need to drop the node

	Pegasos l2cr : L2 cache was not active, activating
	PCI bus 0 controlled by pci at 80000000
	Badness in kref_get at /home/olaf/kernel/olh/ppc64/linux-2.6.16-rc2-olh/lib/kref.c:32
	Call Trace:
	[C037BD00] [C0007934] show_stack+0x5c/0x184 (unreliable)
	[C037BD30] [C000E068] program_check_exception+0x184/0x584
	[C037BD90] [C000F5F0] ret_from_except_full+0x0/0x4c
	--- Exception: 700 at kref_get+0xc/0x24
	    LR = of_node_get+0x24/0x3c
	[C037BE50] [C004FD94] __pte_alloc_kernel+0x64/0x80 (unreliable)
	[C037BE70] [C000CA18] of_get_parent+0x34/0x58
	[C037BE90] [C0009B18] of_get_address+0x24/0x174
	[C037BED0] [C000A108] of_address_to_resource+0x24/0x68
	[C037BF00] [C038B128] chrp_find_bridges+0x114/0x470
	[C037BF90] [C038AE48] chrp_setup_arch+0x1fc/0x32c
	[C037BFB0] [C03849B0] setup_arch+0x144/0x188
	[C037BFD0] [C037C45C] start_kernel+0x34/0x1a8
	[C037BFF0] [000037A0] 0x37a0
	Badness in kref_get at /home/olaf/kernel/olh/ppc64/linux-2.6.16-rc2-olh/lib/kref.c:32
	Call Trace:
	[C037BC90] [C0007934] show_stack+0x5c/0x184 (unreliable)
	[C037BCC0] [C000E068] program_check_exception+0x184/0x584
	[C037BD20] [C000F5F0] ret_from_except_full+0x0/0x4c
	--- Exception: 700 at kref_get+0xc/0x24
	    LR = of_node_get+0x24/0x3c
	[C037BDE0] [00000000] 0x0 (unreliable)
	[C037BE00] [C000CA18] of_get_parent+0x34/0x58
	[C037BE20] [C0009CE8] of_translate_address+0x2c/0x2fc
	[C037BEA0] [C0009FE8] __of_address_to_resource+0x30/0xc4
	[C037BED0] [C000A130] of_address_to_resource+0x4c/0x68
	[C037BF00] [C038B128] chrp_find_bridges+0x114/0x470
	[C037BF90] [C038AE48] chrp_setup_arch+0x1fc/0x32c
	[C037BFB0] [C03849B0] setup_arch+0x144/0x188
	[C037BFD0] [C037C45C] start_kernel+0x34/0x1a8
	[C037BFF0] [000037A0] 0x37a0
	PCI bus 0 controlled by pci at c0000000
	Top of RAM: 0x10000000, Total RAM: 0x10000000

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:45 +11:00
Olaf Hering
090db7c86d [PATCH] powerpc: remove pointer/integer confusion in of_find_node_by_name
remove pointer/integer confusion

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:44 +11:00
Olaf Hering
0347880492 [PATCH] powerpc: restore clock speed in /proc/cpuinfo
Use generic_calibrate_decr to restore missing clock: speed in /proc/cpuinfo

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:44 +11:00
Olaf Hering
d8a8188ded [PATCH] powerpc: remove pointer/integer confusion in generic_calibrate_decr
remove pointer/integer confusion

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:44 +11:00
Michael Ellerman
b68239ee74 [PATCH] powerpc: Don't overwrite flat device tree with kdump kernel
It's possible for prom_init to allocate the flat device tree inside the
kdump crash kernel region. If this happens, when we load the kdump kernel we
overwrite the flattened device tree, which is bad.

We could make prom_init try and avoid allocating inside the crash kernel
region, but then we run into issues if the crash kernel region uses all the
space inside the RMO. The easiest solution is to move the flat device tree
once we're running in the kernel.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:44 +11:00
Dave C Boutcher
b4fd884a03 [PATCH] powerpc: remove useless call to touch_softlockup_watchdog
It turns out that we can't stop the watchdog from
triggering here.  If we touch the timer (which just uses the current jiffie
value) before we enable interrupts, it does nothing because jiffies
are not mass-updated until after we enable interrupts.  If we touch the
timer after we enable interrupts, its too late because the softlockup
watchdog will already have triggered.  The touch_softlockup_watchdog
call removed below does nothing.

Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:44 +11:00
Dave C Boutcher
82a4df7462 [PATCH] powerpc: prod all processors after ibm,suspend-me
We need to prod everyone here since this is the only CPU that is
guaranteed to be running after the ibm,suspend-me RTAS call returns.

Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:43 +11:00
Dave C Boutcher
c4cb8ecca6 [PATCH] powerpc: return correct rtas status from ibm,suspend-me
Correctly return the status from the RTAS call.  rtas_call expects
to return the status as a return value.

Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:32:43 +11:00
Michael Ellerman
31a7f67e58 [PATCH] powerpc: Fix !SMP build of rtas.c
arch/powerpc/kernel/rtas.c is getting hvcall.h via spinlock.h, but when we're
building for UP we don't include spinlock.h.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Jake Moilanen
a958a26486 [PATCH] powerpc: IOMMU SG paranoia
This addresses two items, which are unlikely to be hit if we
trust drivers.

The first is moving a memory barrier below where the vmerged SG count
is passed back, but before the list is set to end.  If those
instructions were reordered, there could be an issue in iommu_unmap_sg().

The second is making sure we terminate the list on the failure case of
iommu_map_sg().  If a driver does not look at the failure return code,
it could pass a ill-formed SG list to iommu_unmap_sg().

Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Michael Ellerman
cdc3ee8f20 [PATCH] powerpc: Refuse to boot a kdump kernel via OF
You can't boot a kdump kernel via OF, not reliably anyway, the kernel being at
32 MB conflicts with the zImage wrapper etc. and it blows up.

It's trivial to check in prom_init though, and this is early enough that we can
actually drop back to OF where a reset-all will get you going again, which is
kinda nice. I think this should go in for 2.6.16.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Michael Ellerman
8c20fafa85 [PATCH] powerpc: Make sure we don't create empty lmb regions
To prevent problems later in boot, make sure we don't create zero-size lmb
regions.

I've checked all the callers, and at the moment no one should ever hit this.
All callers use a constant size, or they check the computed size before they
call us.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Michael Ellerman
fa93895329 [PATCH] powerpc: Don't allocate zero bytes in finish_device_tree()
In prom.c we run finish_node() on allnodes twice. The first time we just
calculate how much memory we'll need, the second time we do the actual work.

If the calculation stage determines that we need 0 bytes, then we should skip
the lmb allocation. Although an alloc of zero will work, it has been seen to
lead to a BUG_ON() in reserve_bootmem() on at least one machine.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:38 +11:00
Marcelo Tosatti
3ea4807de7 [PATCH] powerpc/8xx: last two 8MB D-TLB entries are incorrectly set
The last two 8MB TLB entries are being incorrectly set by initial_mmu on 8xx.

The first entry is written with the same virtual/physical address, which
renders it invalid:

BDI>rms 792 0x00001e00
BDI>rms 824 1
BDI>rds 824
SPR  824 : 0xc08000c0  -1065353024
BDI>rds 825
SPR  825 : 0xc0800de0  -1065349664
BDI>rds 826
SPR  826 : 0x00000000            0

And the second entry, in addition, does not have its TLB index set
correctly.

Signed-off-by: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:28:37 +11:00
Geoff Levand
aee9f26542 [PATCH] powerpc: Fix spufs initialization sequence.
This is a small fix to get the spufs init sequence right.

init_spu_base() in spu_base.c should be called (via
module_init(init_spu_base)) before spufs_init() (via
module_init(spufs_init)) in spufs/inode.c gets called.

Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 21:27:50 +11:00
Paul Mackerras
e2f5a3c1be powerpc/64: Fix bug in setting floating-point exception mode
When loading up the FPU, we were using a 'ld' (load doubleword)
instruction to get the FP exception mode from the thread_struct,
but it's only an int field.  This changes the ld to lwz (load
word and zero-extend).

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-07 13:55:30 +11:00
Paul Mackerras
6cb6524d90 Merge ../linux-2.6 2006-02-07 10:43:36 +11:00
Greg KH
410c05427a [PATCH] USB: Fix GPL markings on usb core functions.
I thought we had fixed up all non-gpl USB drivers, and was wrong to do
this.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 14:20:14 -08:00
Linus Torvalds
7a21ef6fe9 mm/slab.c (non-NUMA): Fix compile warning and clean up code
The non-NUMA case would do an unmatched "free_alien_cache()" on an alien
pointer that had never been allocated.

It might not matter from a code generation standpoint (since in the
non-NUMA case, the code doesn't actually _do_ anything), but it not only
results in a compiler warning, it's really really ugly too.

Fix the compiler warning by just having a matching dummy allocation.
That also avoids an unnecessary #ifdef in the code.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:26:38 -08:00
Linus Torvalds
c265c46bbb Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2006-02-05 11:10:54 -08:00
Linus Torvalds
98bd0c07b6 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2006-02-05 11:10:29 -08:00
Robb, Sam
5e375bc7d5 [PATCH] kconfig: detect if -lintl is needed when linking conf,mconf
On a system where libintl.h is present, but the NLS functionality is
supplied by a separate library instead of the system C library, an attempt
to "make config" or "make menuconfig" will fail with link errors, ex:

  scripts/kconfig/mconf.o:mconf.c:(.text+0xf63): undefined reference to
    `_libintl_gettext'

This patch attempts to correct the problem by detecting whether or not NLS
support requires linking with libintl.

Signed-off-by: Samuel J Robb <sam.robb@timesys.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:54 -08:00
Adrian Bunk
4be68a783d [PATCH] i386: HIGHMEM64G must depend on X86_CMPXCHG64
Due to the usage of set_64bit in include/asm-i386/pgtable-3level.h,
HIGHMEM64G must depend on X86_CMPXCHG64.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:54 -08:00
Takashi Iwai
911b0ad25d [PATCH] Fix "value computed is not used" compile warnings with gcc-4.1
Fix gcc4.1 compile warnings "value computed is not used" with
set_current_state() and set_task_state() on i386/SMP and x86-64.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:54 -08:00
Chuck Ebbert
b53e8f68e0 [PATCH] i386: print kernel version in register dumps
Show first field of kernel version in register dumps like x86_64 does.

Changes output from e.g.:
	(2.6.16-rc1)
to:
	(2.6.16-rc1 #12)

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Chuck Ebbert
fe38d8553c [PATCH] i386 cpu hotplug: don't access freed memory
i386 CPU init code accesses freed init memory when booting a newly-started
processor after CPU hotplug.  The cpu_devs array is searched to find the
vendor and it contains pointers to freed data.

Fix that by:

        1. Zeroing entries for freed vendor data after bootup.
        2. Changing Transmeta, NSC and UMC to all __init[data].
        3. Printing a warning (once only) and setting this_cpu
           to a safe default when the vendor is not found.

This does not change behavior for AMD systems.  They were broken already
but no error was reported.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Ulrich Drepper
170aa3d026 [PATCH] namei.c: unlock missing in error case
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Trond Myklebust
f55eab822b [PATCH] VFS: Ensure LOOKUP_CONTINUE flag is preserved by link_path_walk()
When walking a path, the LOOKUP_CONTINUE flag is used by some filesystems
(for instance NFS) in order to determine whether or not it is looking up
the last component of the path.  It this is the case, it may have to look
at the intent information in order to perform various tasks such as atomic
open.

A problem currently occurs when link_path_walk() hits a symlink.  In this
case LOOKUP_CONTINUE may be cleared prematurely when we hit the end of the
path passed by __vfs_follow_link() (i.e.  the end of the symlink path)
rather than when we hit the end of the path passed by the user.

The solution is to have link_path_walk() clear LOOKUP_CONTINUE if and only
if that flag was unset when we entered the function.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Ravikiran G Thirumalai
4484ebf12b [PATCH] NUMA slab locking fixes: fix cpu down and up locking
This fixes locking and bugs in cpu_down and cpu_up paths of the NUMA slab
allocator.  Sonny Rao <sonny@burdell.org> reported problems sometime back on
POWER5 boxes, when the last cpu on the nodes were being offlined.  We could
not reproduce the same on x86_64 because the cpumask (node_to_cpumask) was not
being updated on cpu down.  Since that issue is now fixed, we can reproduce
Sonny's problems on x86_64 NUMA, and here is the fix.

The problem earlier was on CPU_DOWN, if it was the last cpu on the node to go
down, the array_caches (shared, alien) and the kmem_list3 of the node were
being freed (kfree) with the kmem_list3 lock held.  If the l3 or the
array_caches were to come from the same cache being cleared, we hit on
badness.

This patch cleans up the locking in cpu_up and cpu_down path.  We cannot
really free l3 on cpu down because, there is no node offlining yet and even
though a cpu is not yet up, node local memory can be allocated for it.  So l3s
are usually allocated at keme_cache_create and destroyed at
kmem_cache_destroy.  Hence, we don't need cachep->spinlock protection to get
to the cachep->nodelist[nodeid] either.

Patch survived onlining and offlining on a 4 core 2 node Tyan box with a 4
dbench process running all the time.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Ravikiran G Thirumalai
ca3b9b9173 [PATCH] NUMA slab locking fixes: irq disabling from cahep->spinlock to l3 lock
Earlier, we had to disable on chip interrupts while taking the
cachep->spinlock because, at cache_grow, on every addition of a slab to a slab
cache, we incremented colour_next which was protected by the cachep->spinlock,
and cache_grow could occur at interrupt context.  Since, now we protect the
per-node colour_next with the node's list_lock, we do not need to disable on
chip interrupts while taking the per-cache spinlock, but we just need to
disable interrupts when taking the per-node kmem_list3 list_lock.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Ravikiran G Thirumalai
2e1217cf96 [PATCH] NUMA slab locking fixes: move color_next to l3
colour_next is used as an index to add a colouring offset to a new slab in the
cache (colour_off * colour_next).  Now with the NUMA aware slab allocator, it
makes sense to colour slabs added on the same node sequentially with
colour_next.

This patch moves the colouring index "colour_next" per-node by placing it on
kmem_list3 rather than kmem_cache.

This also helps simplify locking for CPU up and down paths.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Christoph Lameter
64b4a954b0 [PATCH] hugetlb: add comment explaining reasons for Bus Errors
I just spent some time researching a Bus Error.  Turns out that the huge
page fault handler can return VM_FAULT_SIGBUS for various conditions where
no huge page is available.

Add a note explaining the reasoning in the source.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Andrew Morton
fe1dcbc4f3 [PATCH] jbd: fix transaction batching
Ben points out that:

  When writing files out using O_SYNC, jbd's 1 jiffy delay results in a
  significant drop in throughput as the disk sits idle.  The patch below
  results in a 4-5x performance improvement (from 6.5MB/s to ~24-30MB/s on my
  IDE test box) when writing out files using O_SYNC.

So optimise the batching code by omitting it entirely if the process which is
doing a sync write is the same as the one which did the most recent sync
write.  If that's true, we're unlikely to get any other processes joining the
transaction.

(Has been in -mm for ages - it took me a long time to get on to performance
testing it)

Numbers, on write-cache-disabled IDE:

/usr/bin/time -p synctest -n 10 -uf -t 1 -p 1 dir-name

Unpatched:
	40 seconds
Patched:
	35 seconds
Batching disabled:
	35 seconds

This is the problematic single-process-doing-fsync case.  With multiple
fsyncing processes the numbers are AFACIT unaltered by the patch.

Aside: performance testing and instrumentation shows that the transaction
batching almost doesn't help (testing with synctest -n 1 -uf -t 100 -p 10
dir-name on non-writeback-caching IDE).  This is because by the time one
process is running a synchronous commit, a bunch of other processes already
have a transaction handle open, so they're all going to batch into the same
transaction anyway.

The batching seems to offer maybe 5-10% speedup with this workload, but I'm
pretty sure it was more important than that when it was first developed 4-odd
years ago...

Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:53 -08:00
Andrew Morton
bc5e483da6 [PATCH] reiserfs_get_acl() build fix
With CONFIG_REISERFS_FS_XATTR=y, CONFIG_REISERFS_FS_POSIX_ACL=n:

fs/reiserfs/xattr.c: In function `reiserfs_check_acl':
fs/reiserfs/xattr.c:1330: called object is not a function

Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-05 11:06:52 -08:00