A lot of Opteron BIOS just pass 10 in all SLIT entries (10 is the
normalized unit). This is actually worse than the default heuristic
because it leads to pci_distance not knowing the difference between
local and remote nodes anymore. This messes up some NUMA
heuristics in generic code.
In this case it's better to fall back to the default heuristic
which just does nodea == nodeb ? 10 : 20.
This patch does some basic sanity checking on the SLIT and only accepts
the SLIT when it passes.
Invariants enforced are:
- Node to itself shall be 10
- Any other distance shouldn't be 10
- Distances smaller than 10 are illegal
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The separation of the rex64 prefix (on fxsave/fxrstor) by way of using
a semicolon resulted in the prefix not always taking effect (because
when extended registers are needed for addressing, another rex prefix
would have been generated by the compiler), thus (depending on the
build) resulting in eventually getting 32-bit saves and/or restores.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some people need it now on 64bit so reuse the i386 code for
x86-64. This will be also useful for future bug workarounds.
It is a bit simplified there because there is no need
to do it very early on x86-64. This means it doesn't need
early ioremap et.al. We run it as a core initcall right now.
I hope it's not needed for early setup.
I added a general CONFIG_DMI symbol in case IA64 or someone
else wants to reuse the code later too.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This was a backup file that somehow made it into the official
tree. Never used for anything. Remove.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use single instruction for find largest set bit on x86_64.
[Updated by Jan Beulich to fix wrong asm constraints in original
patch -AK]
Cc: jbeulich@novell.com
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The introduction of call_softirq switching to the interrupt stack several
releases earlier resulted in a problem with the code in show_trace, which
assumes that it can pick the previous stack pointer from the end of the
interrupt stack.
Cc: Andi Kleen <ak@muc.de>
Cc: Arjan van de Ven <arjanv@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Be more careful with TF handling to fix some copy protection codes in wine
patch originally for i386 by Linus, then ported to x86_64 by Andi Kleen
see: [PATCH] x86_64: Some fixes for single step handling
commit: be61bff789
But it was never applied to the ia32 emulation code which breaks some
copy-protection schemes under wine when running on x86_64.
Signed-off-by: Peter Beutner <p.beutner@gmx.net>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As discussed, the flags register on x86-64 is saved and restored by the
assembly code which sets up struct pt_regs, so we do not need to save
and restore it in the inline assembler which already informs gcc that
we're clobbering the flags. This patch has been sanity booted and works
okay here.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
So why are we calling smp_send_stop from machine_halt?
We don't.
Looking more closely at the bug report the problem here
is that halt -p is called which triggers not a halt but
an attempt to power off.
machine_power_off calls machine_shutdown which calls smp_send_stop.
If pm_power_off is set we should never make it out machine_power_off
to the call of do_exit. So pm_power_off must not be set in this case.
When pm_power_off is not set we expect machine_power_off to devolve
into machine_halt.
So how do we fix this?
Playing too much with smp_send_stop is dangerous because it
must also be safe to be called from panic.
It looks like the obviously correct fix is to only call
machine_shutdown when pm_power_off is defined. Doing
that will make Andi's assumption about not scheduling
true and generally simplify what must be supported.
This turns machine_power_off into a noop like machine_halt
when pm_power_off is not defined.
If the expected behavior is that sys_reboot(LINUX_REBOOT_CMD_POWER_OFF)
becomes sys_reboot(LINUX_REBOOT_CMD_HALT) if pm_power_off is NULL
this is not quite a comprehensive fix as we pass a different parameter
to the reboot notifier and we set system_state to a different value
before calling device_shutdown().
Unfortunately any fix more comprehensive I can think of is not
obviously correct. The core problem is that there is no architecture
independent way to detect if machine_power will become a noop, without
calling it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed that some lowlevel send_IPI_mask helpers had a hotplug/preempt
race whereupon the cpu_online_map was read before disabling preemption;
...
cpumask_t mask = cpu_online_map;
int cpu = get_cpu();
cpu_clear(cpu, mask);
...
But then i realised that there is no need for these lowlevel functions to
be going through all this trouble when all the callers are already made
hotplug/preempt safe.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is one CPU here whose MCE bank count is 6. This patch increases
x86_64's MCE bank count.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following is probably a good idea given that the atomic_set() isn't
a barrier here either.
Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This
- switches the INT3 handler to run on an IST stack (to cope with
breakpoints set by a kernel debugger on places where the kernel's
%gs base hasn't been set up, yet); the IST stack used is shared with
the INT1 handler's
[AK: this also allows setting a kprobe on the interrupt/exception entry
points]
- allows nesting of INT1/INT3 handlers so that one can, with a kernel
debugger, debug (at least) the user-mode portions of the INT1/INT3
handling; the nesting isn't actively enabled here since a kernel-
debugger-free kernel doesn't need it
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Print bits for RDTSCP, SVM, CR8-LEGACY.
Also now print power flags on i386 like x86-64 always did.
This will add a new line in the 386 cpuinfo, but that shouldn't
be an issue - did that in the past too and I haven't heard
of any breakage.
I shrunk some of the fields in the i386 cpuinfo_x86 to chars
to make up for the new int "x86_power" field. Overall it's
smaller than before.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
They previously tried to figure this out on their own.
Suggested by Venkatesh.
Cc: venkatesh.pallipadi@intel.com
Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Define it for i386 too.
This is a synthetic flag that signifies that the CPU's TSC runs
at a constant P state invariant frequency.
Fix up the logic on x86-64/i386 to set it on all known CPUs.
Use the AMD defined bit to set it on future AMD CPUs.
Cc: venkatesh.pallipadi@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Was only used by the floppy driver to work around some ancient
hardware bug that should never occur on any 64bit system.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Most users don't need it so no need to waste memory.
This means an user has to specify the appropiate number of
hotplug CPUs on the command line with additional_cpus=...
or fix their BIOS to follow the convention in
Documentation/x86-64/cpu-hotplug-spec
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adjust page fault protection error check before considering it to be
a vmalloc synchronization candidate.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make sure no iret can fault without attached recovery code.
Cannot happen in the normal case, but might be useful
with kernel debuggers
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Since a double fault always implies that kernel data structures are
corrupt, this fault should neither be handed to user mode handling,
nor should the handler allow resuming the faulting code stream (since
architecturally this isn't a fault, but an abort).
Note that this slightly depends on the previously submitted patch
adjusting the prototype of notify_die() (a compiler warning will result
without that other patch).
AK: Removed obsolete CONFIG_CHECKING code, added comments
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adjusts things so that handlers of the die() notifier will have
sufficient information about the trap currently being handled. It also
adjusts the notify_die() prototype to (again) match that of i386.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Other than apparently commonly assumed, the bound instruction does not
require the corresponding IDT entry to have DPL 3.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As a follow-up to the introduction of CONFIG_UNWIND_INFO, this
separates the generation of frame unwind information for x86-64 from
that of full debug information.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Based on the documentation recently posted by Richard Brunner.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Frame unwind information was still incorrect for ia32_ptregs_common
(sorry, my fault), and could be improved for some of the other entry
points.
Signed-Off-By: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Run all rio files through indent -kr -i8 -bri0 -l255, as requested by Alan.
rioboot.c and rioinit.c were skipped due to worrisome lindent warnings.
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arch: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
net: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
fs: Use <linux/capability.h> where capable() is used.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Move capable() from sched.h to capability.h;
- Use <linux/capability.h> where capable() is used
(in include/, block/, ipc/, kernel/, a few drivers/,
mm/, security/, & sound/;
many more drivers/ to go)
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Uninline capable(). Saves 2K of kernel text on a generic .config, and 1K on a
tiny config. In addition it makes the use of capable more consistent between
CONFIG_SECURITY and !CONFIG_SECURITY
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Andi Kleen's x86_64 patch to use DMI, and my ia64 to use DMI, there is
now a new CONFIG_DMI option which takes the place of CONFIG_X86 to denote
the availability of the DMI functions. Make the IPMI driver use CONFIG_DMI
instead.
Tested on ia64 2.6.15 kernel plus the previous patch, on a Dell PowerEdge
7250 Itanium2 server, and it now autodetects the IPMI KCS driver as
expected.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sound/oss/i810_audio.c:3431: warning: label `out_pio' defined but not used
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a window where a probe gets removed right after the probe is hit
on some different cpu. In this case probe handlers can't find a matching
probe instance related to break address. In this case we need to read the
original instruction at break address to see if that is not a break/int3
instruction and recover safely.
Previous code had a bug where we were not checking for the above race in
case of reentrant probes and the below patch fixes this race.
Tested on IA64, Powerpc, x86_64.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When a kprobes modules is written in such a way that probes are inserted on
itself, then unload of that moudle was not possible due to reference
couning on the same module.
The below patch makes a check and incrementes the module refcount only if
it is not a self probed module.
We need to allow modules to probe themself for kprobes performance
measurements
This patch has been tested on several x86_64, ppc64 and IA64 architectures.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sometimes we call do_journal_end() with t_refcount == 0. If quota is
turned on and we happen to have some inode with preallocation bad things
happen as we try to use the current handle for quota operations. Checks
for t_refcount in journal_begin() fail and we Oops. We raise t_refcount to
make those checks happy. We should not cause any bad as all the needed
quota blocks should be already attached to the transaction (they were
attached to the transaction when we allocated those preallocation blocks).
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove unnecessary and incorrectly implemented page alignment of register
base address before calling ioremap()
Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Get rid of bogus extern attribute that causes sparse warning.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>