The powerpc numa code unconditionally onlines all nodes from 0 to the
highest node id found, regardless of whether cpus or memory are
present in the nodes. This wastes 8K per node and complicates some
cpu and memory hotplug situations, such as adding a resource that
doesn't map to one of the nodes discovered at boot.
Set nodes online as resources are scanned. Fall back to node 0 only
when we're sure this isn't a NUMA machine.
Instead of defaulting to node 0 for cases of hot-adding a resource
which doesn't belong to any initialized node, assign it to the first
online node.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Code to handle Power4's invalid node id (0xffff) is duplicated for cpu
and memory. Better to handle this case in one place --
of_node_to_nid. Overall behavior should be unchanged.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since we effectively treat the domain ids given to us by firmare as
logical node ids, make this explicit (basically s/numa_domain/nid/).
No functional changes, only variable and function names are modified.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
map_cpu_to_node does not need to be inline, it is never called in a
hot path.
map_cpu_to_node, numa_setup_cpu, and find_cpu_node can be marked
__cpuinit, as they are never used after boot if CONFIG_HOTPLUG_CPU=n.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add debug statement for map_cpu_to_node; it's useful for cpu hotplug.
Clarify debug statement about not finding the numa reference points
property.
Don't print a meaningless associativity depth (-1) on non-numa systems.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
At boot, the numa code is assigning boot_cpuid to node 0
unconditionally. Basically, numa_setup_cpu is being stupid about it,
but this is the minimal fix -- just call numa_setup_cpu(boot_cpuid)
later, after all nodes have been set online.
Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
My patch (d7a5b2ffa1) to always panic if
lmb_alloc() fails is broken because it checks alloc < 0, but should be
checking alloc == 0.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The spidernet drivers uses request_firmware() and thus needs to select
FW_LOADER.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
While adding USB support to an MV64360 based board this week, I
discovered that all MV64x60 boards in the kernel have platform_notify
functions marked with __init. This causes an oops if a device is added
after boot.
The patch below removes the __init markers. I do not have all these
boards to test on, but the change seems very unlikely to break anything
else.
Signed-off-by: Adrian Cox <adrian@humboldt.co.uk>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Dump a stream of rawbytes with a new 'dr' command.
Produces less output and it is simpler to feed the output to scripts.
Also, dr has no dumpsize limits.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Typical use for of_find_node_by_name and of_find_node_by_type is to
iterate over all nodes of a given type/name. Add a helper macro to
do that (in spirit of the list_for_each* macros).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This converts arch/ppc to kzalloc usage.
Crosscompile tested with allyesconfig.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
At present, the powerpc pmd_bad() and pud_bad() macros return false
unless the given pmd or pud is zero. This patch makes these tests
more thorough, checking if the given pmd or pud looks like a plausible
pte page or pmd page pointer respectively. This can result in helpful
error messages when messing with the pagetable code.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc-merge:
powerpc: update defconfigs
[PATCH] powerpc: properly configure DDR/P5IOC children devs
[PATCH] powerpc: remove duplicate EXPORT_SYMBOLS
[PATCH] powerpc: RTC memory corruption
[PATCH] powerpc: enable NAP only on cpus who support it to avoid memory corruption
[PATCH] powerpc: Clarify wording for CRASH_DUMP Kconfig option
[PATCH] powerpc/64: enable CONFIG_BLK_DEV_SL82C105
[PATCH] powerpc: correct cacheflush loop in zImage
powerpc: Fix problem with time going backwards
powerpc: Disallow lparcfg being a module
The dynamic add path for PCI Host Bridges can fail to configure children
adapters under P5IOC controllers. It fails to properly fixup bus/device
resources, and it fails to properly enable EEH. Both of these steps
need to occur before any children devices are enabled in
pci_bus_add_devices().
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
remove warnings when building a 64bit kernel.
smp_call_function triggers also with 32bit kernel.
WARNING: vmlinux: duplicate symbol 'smp_call_function' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:164:EXPORT_SYMBOL(smp_call_function);
arch/powerpc/kernel/smp.c:300:EXPORT_SYMBOL(smp_call_function);
WARNING: vmlinux: duplicate symbol 'ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:113:EXPORT_SYMBOL(ioremap);
arch/powerpc/mm/pgtable_64.c:321:EXPORT_SYMBOL(ioremap);
WARNING: vmlinux: duplicate symbol '__ioremap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:117:EXPORT_SYMBOL(__ioremap);
arch/powerpc/mm/pgtable_64.c:322:EXPORT_SYMBOL(__ioremap);
WARNING: vmlinux: duplicate symbol 'iounmap' previous definition was in vmlinux
arch/powerpc/kernel/ppc_ksyms.c:118:EXPORT_SYMBOL(iounmap);
arch/powerpc/mm/pgtable_64.c:323:EXPORT_SYMBOL(iounmap);
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We should be memset'ing the data we are pointing to, not the pointer
itself. This is in an error path so we probably don't hit it much.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes incorrect setting of powersave_nap to 1 on all
PowerMacs, potentially causing memory corruption on some models. This
bug was introuced by me during the 32/64 bits merge.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The wording of the CRASH_DUMP Kconfig option is not very clear. It gives you a
kernel that can be used _as_ the kdump kernel, not a kernel that can boot into
a kdump kernel.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Enable the onboard IDE driver for p610, p615 and p630.
They have the CD connected to this card. All other RS/6000 systems with this
controller have no connectors and dont need this option.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Correct the loop for cacheflush. No idea where I copied the code from,
but the original does not work correct. Maybe the flush is not needed.
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The recent changes to keep gettimeofday in sync with xtime had the side
effect that it was occasionally possible for the time reported by
gettimeofday to go back by a microsecond. There were two reasons:
(1) when we recalculated the offsets used by gettimeofday every 2^31
timebase ticks, we lost an accumulated fractional microsecond, and
(2) because the update is done some time after the notional start of
jiffy, if ntp is slowing the clock, it is possible to see time go backwards
when the timebase factor gets reduced.
This fixes it by (a) slowing the gettimeofday clock by about 1us in
2^31 timebase ticks (a factor of less than 1 in 3.7 million), and (b)
adjusting the timebase offsets in the rare case that the gettimeofday
result could possibly go backwards (i.e. when ntp is slowing the clock
and the timer interrupt is late). In this case the adjustment will
reduce to zero eventually because of (a).
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes not one, but _two_, silly (but admittedly hard to hit) bugs
in the ext2 filesystem "readdir()" function. It also cleans up the code
to avoid the unnecessary goto mess.
The bugs were related to re-valiating the f_pos value after somebody had
either done an "lseek()" on the directory to an invalid offset, or when
the offset had become invalid due to a file being unlinked in the
directory. The code would not only set the f_version too eagerly, it
would also not update f_pos appropriately for when the offset fixup took
place.
When that happened, we'd occasionally subsequently fail the readdir()
even when we shouldn't (no real harm done, but an ugly printk, and
obviously you would end up not necessarily seeing all entries).
Thanks to Masoud Sharbiani <masouds@google.com> who noticed the problem
and had a test-case for it, and also fixed up a thinko in the first
version of this patch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Masoud Sharbiani <masouds@google.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
arch/arm/kernel/setup.c declares mem_fclk_21285 when
this is already declared in include/asm-arm/system.h
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
arch/arm/kernel/compat.c exports two functions,
convert_to_tag_list and squash_mem_tags which
are not defined in any header files, and not
used outside arch/arm/kernel.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The enable_hlt and disable_hlt should be declared in
include/asm/setup.h. This fixes sparse errors from
arch/arm/kernel/process.c
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Fix the following warnings from sparse:
arch/arm/kernel/process.c:86:6: warning: symbol 'default_idle' was not declared. Should it be static?
arch/arm/kernel/process.c:378:5: warning: symbol 'dump_fpu' was not declared. Should it be static?
Include <linux/elfcore.h> for dump_fpu() decleration, and
make default_idle() static as it is not used outside the file.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The Coverity checker spotted the following bug in dup_namespace():
<-- snip -->
if (!new_ns->root) {
up_write(&namespace_sem);
kfree(new_ns);
goto out;
}
...
out:
return new_ns;
<-- snip -->
Callers expect a non-NULL result to not be freed.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Albrecht Dreß
Add DMA resources to s3c2410 spi platform devices - dma_(alloc|free)_coherent should now work as expected.
Signed-off-by: Albrecht Dreß <albrecht.dress@lios-tech.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Pavel Machek
Enable frontlight during collie bootup, so that display is actually
readable in anything other than bright sunlight.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It seems that setting scheduling policy and priorities is also the kind of
thing that might be performed in apps that also use the NUMA API, so it
would seem consistent to use CAP_SYS_NICE for NUMA also.
So use CAP_SYS_NICE for controlling migration permissions.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update the documentation for page migration.
- Fix up bits and pieces in cpusets.txt
- Rework text in vm/page-migration to be clearer and reflect the final
version of page migration in 2.6.16. Mention Andi Kleen's numactl
package that contains user space tools for page migration via
libnuma. Add reference to numa_maps and to the manpage in numactl.
- Add todo list for outstanding issues
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
page migration currently simply retries a couple of times if try_to_unmap()
fails without inspecting the return code.
However, SWAP_FAIL indicates that the page is in a vma that has the
VM_LOCKED flag set (if ignore_refs ==1). We can check for that return code
and avoid retrying the migration.
migrate_page_remove_references() now needs to return a reason why the
failure occured. So switch migrate_page_remove_references to use -Exx
style error messages.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It seems this patch got dropped (it was in addition to the `s390:
improve response code handling in chsc_enable_facility()' patch).
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Affects only XFS (i.e. DIO_OWN_LOCKING case) - currently it is
not possible to get i_mutex locking correct when using DIO_OWN
direct I/O locking in a filesystem due to indeterminism in the
possible return code/lock/unlock combinations. This can cause
a direct read to attempt a double i_mutex unlock inside XFS.
We're now ensuring __blockdev_direct_IO always exits with the
inode i_mutex (still) held for a direct reader.
Tested with the three different locking modes (via direct block
device access, ext3 and XFS) - both reading and writing; cannot
find any regressions resulting from this change, and it clearly
fixes the mutex_unlock warning originally reported here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=114189068126253&w=2
Signed-off-by: Nathan Scott <nathans@sgi.com>
Acked-by: Christoph Hellwig <hch@lst.de>
This fixes a race where lsn could be cleared before taking the lock
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lapic_shutdown() re-enables interrupts which is un-desirable for panic
case, so use local_irq_save() and local_irq_restore() to keep the irqs
disabled for kexec on panic case, and close a possible race window while
kdump shutdown as shown in this stack trace
-- BUG: spinlock lockup on CPU#1, bash/4396, c52781a0
[<c01c1870>] _raw_spin_lock+0xb7/0xd2
[<c029e148>] _spin_lock+0x6/0x8
[<c011b33f>] scheduler_tick+0xe7/0x328
[<c0128a7c>] update_process_times+0x51/0x5d
[<c0114592>] smp_apic_timer_interrupt+0x4f/0x58
[<c01141ff>] lapic_shutdown+0x76/0x7e
[<c0104d7c>] apic_timer_interrupt+0x1c/0x30
[<c01141ff>] lapic_shutdown+0x76/0x7e
[<c0116659>] machine_crash_shutdown+0x83/0xaa
[<c013cc36>] crash_kexec+0xc1/0xe3
[<c029e148>] _spin_lock+0x6/0x8
[<c013cc22>] crash_kexec+0xad/0xe3
[<c0215280>] __handle_sysrq+0x84/0xfd
[<c018d937>] write_sysrq_trigger+0x2c/0x35
[<c015e47b>] vfs_write+0xa2/0x13b
[<c015ea73>] sys_write+0x3b/0x64
[<c0103c69>] syscall_call+0x7/0xb
Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit c33d4568ac.
Andrew Clayton and Hugh Dickins report that it's broken for them and
causes strange page table and slab corruption, and spontaneous reboots.
Let's get it right next time.
Cc: Andrew Clayton <andrew@rootshell.co.uk>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Disable the EDAC sysfs code. The sysfs interface that EDAC presents to
user space needs more thought, and is likely to change substantially.
Therefore disable it for now so users don't start depending on it in its
current form.
- Disable the default behavior of calling panic() when an uncorrectible
error is detected (since for now, there is no sysfs interface that allows
the user to configure this behavior).
Signed-off-by: David S. Peterson <dsp@llnl.gov>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In theory, NLM specs assure us that the server will only reply LCK_GRANTED or
LCK_DENIED_GRACE_PERIOD to our NLM_UNLOCK request.
In practice, we should not assume this to be the case, and the code will
currently Oops if we do.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>