Commit Graph

156 Commits

Author SHA1 Message Date
Alexander Nyberg
4f339ecb30 [PATCH] kdump: Save trap information for later analysis
If we are faulting in kernel it is quite possible this will lead to a
panic.  Save trap number, cr2 (in case of page fault) and error_code in the
current thread (these fields already exist for signal delivery but are not
used here).

This helps later kdump crash analyzing from user-space (a script has been
submitted to dig this info out in gdb).

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: <fastboot@lists.osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:55 -07:00
Alexander Nyberg
6e274d1443 [PATCH] kdump: Use real pt_regs from exception
Makes kexec_crashdump() take a pt_regs * as an argument.  This allows to
get exact register state at the point of the crash.  If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.

This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.

die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:54 -07:00
Vivek Goyal
2030eae52b [PATCH] Retrieve elfcorehdr address from command line
This patch adds support for retrieving the address of elf core header if one
is passed in command line.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:53 -07:00
Vivek Goyal
92aa63a5a1 [PATCH] kdump: Retrieve saved max pfn
This patch retrieves the max_pfn being used by previous kernel and stores it
in a safe location (saved_max_pfn) before it is overwritten due to user
defined memory map.  This pfn is used to make sure that user does not try to
read the physical memory beyond saved_max_pfn.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
a3ea8ac846 [PATCH] Kexec: Kexec on panic fix with nmi watchdog enabled
o Problem: Kexec on panic hangs if first kernel is booted with nmi_watchdog
  command line parameter. This problem occurs because kexec crash shutdown
  code replaces the NMI callback handler. This handler saves the cpu register
  states and halts the cpu. If system is booted with nmi_watchdog parameter,
  then crashing cpu also runs this nmi handler and halts itself.

o This patch fixes the problem by keeping a track of crashing cpu and not
  executing the new nmi handler on crashing cpu.

o There is a dependence on smp_processor_id() function which might return
  insane value for cpu, if cpu field of thread_info is corrupted.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
4d55476c3f [PATCH] kdump: NMI handler segment selector, stack pointer fix
CPU does not save ss and esp on stack if execution was already in kernel mode
at the time of NMI occurrence.  This leads to saving of erractic values for ss
and esp.  This patch fixes the issue.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:52 -07:00
Vivek Goyal
625f1c8219 [PATCH] Kdump: Export crash notes section address through sysfs
o Following patch exports kexec global variable "crash_notes" to user space
  through sysfs as kernel attribute in /sys/kernel.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:51 -07:00
Eric W. Biederman
1bc3b91aee [PATCH] crashdump: x86 crashkernel option
This is the x86 implementation of the crashkernel option.  It reserves a
window of memory very early in the bootup process, so we never use it for
anything but the kernel to switch to when the running kernel panics.

In addition to reserving this memory a resource structure is registered so
looking at /proc/iomem it is clear what happened to that memory.

ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
63d30298ef [PATCH] kexec: x86 shutdown APICs during crash_shutdown
In the case of a crash/panic an architecture specific function
machine_crash_shutdown is called.  This patch adds to the x86 machine_crash
function the standard kernel code for shutting down apics.

Every line of code added to that function increases the risk that we will call
code after a kernel panic that is not safe.

This patch should not make it to the stable kernel without a being reviewed a
lot more.  It is unclear how much a hardned kernel can take when it comes to
misconfigured apics.  So since a normal kernel has problems this patch does a
clean shutdown.

It is my expectation this patch will be dropped from future generations of the
kexec work.  But for the moment it is a crutch to keep from breaking
everything.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:50 -07:00
Eric W. Biederman
2c818b45a2 [PATCH] kexec: x86: snapshot registers during crash shutdown
After the kernel panics if we wish to generate an entire machine core file it
is very nice to know the register state at the time the machine crashed.

After long discussion it was realized that if you are going to be saving the
information anyway it is reasonable to store the information in a format that
it will be used and recognized in so the register state is stored in the
standard ELF note format.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
c4ac4263a0 [PATCH] crashdump: x86: add NMI handler to capture other CPUs
One of the dangers when switching from one kernel to another is what happens
to all of the other cpus that were running in the crashed kernel.  In an
attempt to avoid that problem this patch adds a nmi handler and attempts to
shoot down the other cpus by sending them non maskable interrupts.

The code then waits for 1 second or until all known cpus have stopped running
and then jumps from the running kernel that has crashed to the kernel in
reserved memory.

The kernel spin loop is used for the delay as that should behave continue to
be safe even in after a crash.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
5033cba087 [PATCH] kexec: x86 kexec core
This is the i386 implementation of kexec.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Eric W. Biederman
dd2a13054f [PATCH] kexec: x86: factor out apic shutdown code
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.

By switching to the bootstrap cpu by default on reboot I can delete/simplify
some motherboard fixups well.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:49 -07:00
Vivek Goyal
8a9190853c [PATCH] kexec: reserve Bootmem fix for booting nondefault location kernel
This patch fixes a problem with reserving memory during boot up of a kernel
built for non-default location.  Currently boot memory allocator reserves
the memory required by kernel image, boot allocaotor bitmap etc.  It
assumes that kernel is loaded at 1MB (HIGH_MEMORY hard coded to 1024*1024).
 But kernel can be built for non-default locatoin, hence existing
hardcoding will lead to reserving unnecessary memory.  This patch fixes it.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman
3d345e3fc9 [PATCH] kexec: x86: add CONFIG_PYSICAL_START
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory.  To accomplish this
the two kernels need to built to run at different physical addresses.

This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel
so we can do just that.  You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED

bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB.  If you need a kernel
loaded at a different address as well you need to boot a vmlinux.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:48 -07:00
Eric W. Biederman
ad0d75ebac [PATCH] kexec: x86: vmlinux: fix physical addresses
The vmlinux on i386 does not report the correct physical address of
the kernel.  Instead in the physical address field it currently
reports the virtual address of the kernel.

This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.

This is potentially a help for crash dump analysis tools.

This definitiely allows bootloaders that load vmlinux as a standard
ELF executable.  Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman
650927ef8a [PATCH] kexec: x86: resture apic virtual wire mode on shutdown
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode.  This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.

This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision.  A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it.  That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:47 -07:00
Eric W. Biederman
cee5dab485 [PATCH] kexec: x86: i8259 shutdown: disable interrupts
From: Eric W. Biederman <ebiederm@xmission.com>

This patch disables interrupt generation from the legacy pic on reboot.  Now
that there is a sys_device class it should not be called while drivers are
still using interrupts.

There is a report about this breaking ACPI power off on some systems.
http://bugme.osdl.org/show_bug.cgi?id=4041
However the final comment seems to exonerate this code.  So until
I get more information I believe that was a false positive.

Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Eric W. Biederman
9635b47d91 [PATCH] kexec: x86: local apic fix
From: "Maciej W. Rozycki" <macro@linux-mips.org>

Fix a kexec problem whcih causes local APIC detection failure.

The problem is detect_init_APIC() is called early, before the command line
have been processed.  Therefore "lapic" (and "nolapic") have not been seen,
yet.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:46 -07:00
Li Shaohua
5a72e04df5 [PATCH] suspend/resume SMP support
Using CPU hotplug to support suspend/resume SMP.  Both S3 and S4 use
disable/enable_nonboot_cpus API.  The S4 part is based on Pavel's original S4
SMP patch.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:32 -07:00
Li Shaohua
e1367daf3e [PATCH] cpu state clean after hot remove
Clean CPU states in order to reuse smp boot code for CPU hotplug.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Li Shaohua
0bb3184df5 [PATCH] init call cleanup
Trival patch for CPU hotplug.  In CPU identify part, only did cleaup for intel
CPUs.  Need do for other CPUs if they support S3 SMP.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:30 -07:00
Li Shaohua
d720803a93 [PATCH] sibling map initializing rework
Make sibling map init per-cpu.  Hotplug CPU may change the map at runtime.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Li Shaohua
6fe940d6c3 [PATCH] sep initializing rework
Make SEP init per-cpu, so it is hotplug safe.

Signed-off-by: Li Shaohua<shaohua.li@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Zwane Mwaikambo
f370513640 [PATCH] i386 CPU hotplug
(The i386 CPU hotplug patch provides infrastructure for some work which Pavel
is doing as well as for ACPI S3 (suspend-to-RAM) work which Li Shaohua
<shaohua.li@intel.com> is doing)

The following provides i386 architecture support for safely unregistering and
registering processors during runtime, updated for the current -mm tree.  In
order to avoid dumping cpu hotplug code into kernel/irq/* i dropped the
cpu_online check in do_IRQ() by modifying fixup_irqs().  The difference being
that on cpu offline, fixup_irqs() is called before we clear the cpu from
cpu_online_map and a long delay in order to ensure that we never have any
queued external interrupts on the APICs.  There are additional changes to s390
and ppc64 to account for this change.

1) Add CONFIG_HOTPLUG_CPU
2) disable local APIC timer on dead cpus.
3) Disable preempt around irq balancing to prevent CPUs going down.
4) Print irq stats for all possible cpus.
5) Debugging check for interrupts on offline cpus.
6) Hacky fixup_irqs() to redirect irqs when cpus go off/online.
7) play_dead() for offline cpus to spin inside.
8) Handle offline cpus set in flush_tlb_others().
9) Grab lock earlier in smp_call_function() to prevent CPUs going down.
10) Implement __cpu_disable() and __cpu_die().
11) Enable local interrupts in cpu_enable() after fixup_irqs()
12) Don't fiddle with NMI on dead cpu, but leave intact on other cpus.
13) Program IRQ affinity whilst cpu is still in cpu_online_map on offline.

Signed-off-by: Zwane Mwaikambo <zwane@linuxpower.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:29 -07:00
Shaohua Li
d92de65cab [PATCH] variable overflow after hundreds round of hotplug CPU
I'm doing the cpu hotplug stress test and found a variable ('ready') is
overflow after several hundreds rounds of cpu hotplug.  Here is a fix.

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Shaohua Li
a13db56624 [PATCH] CPU hotplug: fix hpet sectioning
With hpet enabled, cpu hotplug uses some routines marked with __init.

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin
1249c5138f [PATCH] dmi: spring cleanup
Whitespace and CodingStyle cleanup. No functionality changes.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin
b625883f24 [PATCH] dmi: remove central blacklist
Since last dmi quirk looks useless (it just prints 404 compliant url) we can
finally remove central dmi blacklist.

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin
0f8133a8db [PATCH] dmi: move ACPI sleep quirk
This patch moves ACPI sleep quirk out of dmi_scan.c

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:28 -07:00
Andrey Panin
aea00143a8 [PATCH] dmi: move ACPI boot quirk
This patch moves ACPI boot quirks out of dmi_scan.c

Signed-off-by: Andrey Panin <pazke@donpac.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:27 -07:00
Dmitry Torokhov
e70c9d5e61 [PATCH] I8K: use standard DMI interface
I8K: Change to use stock dmi infrastructure instead of homegrown
     parsing code. The driver now requires box's DMI data to match
     list of supported models so driver can be safely compiled-in
     by default without fear of it poking into random SMM BIOS
     code. DMI checks can be ignored with i8k.ignore_dmi option.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-25 16:24:24 -07:00
Prasanna S Panchamukhi
417c8da651 [PATCH] kprobes: Temporary disarming of reentrant probe for i386
This patch includes i386 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:24 -07:00
Hien Nguyen
0aa55e4d7d [PATCH] kprobes: moves lock-unlock to non-arch kprobe_flush_task
This patch moves the lock/unlock of the arch specific kprobe_flush_task()
to the non-arch specific kprobe_flusk_task().

Signed-off-by: Hien Nguyen <hien@us.ibm.com>
Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Rusty Lynch
7e1048b11c [PATCH] Move kprobe [dis]arming into arch specific code
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code is assuming that arming and disarming is a just done by a simple write
of some magic value to an address.  This is problematic for ia64 where our
instructions look more like structures, and we can not insert break points
by just doing something like:

*p->addr = BREAKPOINT_INSTRUCTION;

The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
functions:

     * void arch_arm_kprobe(struct kprobe *p)
     * void arch_disarm_kprobe(struct kprobe *p)

and then adds the new functions for each of the architectures that already
implement kprobes (spar64/ppc64/i386/x86_64).

I thought arch_[dis]arm_kprobe was the most descriptive of what was really
happening, but each of the architectures already had a disarm_kprobe()
function that was really a "disarm and do some other clean-up items as
needed when you stumble across a recursive kprobe." So...  I took the
liberty of changing the code that was calling disarm_kprobe() to call
arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
with the recursive kprobe case.

So far this patch as been tested on i386, x86_64, and ppc64, but still
needs to be tested in sparc64.

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Hien Nguyen
b94cce926b [PATCH] kprobes: function-return probes
This patch adds function-return probes to kprobes for the i386
architecture.  This enables you to establish a handler to be run when a
function returns.

1. API

Two new functions are added to kprobes:

	int register_kretprobe(struct kretprobe *rp);
	void unregister_kretprobe(struct kretprobe *rp);

2. Registration and unregistration

2.1 Register

  To register a function-return probe, the user populates the following
  fields in a kretprobe object and calls register_kretprobe() with the
  kretprobe address as an argument:

  kp.addr - the function's address

  handler - this function is run after the ret instruction executes, but
  before control returns to the return address in the caller.

  maxactive - The maximum number of instances of the probed function that
  can be active concurrently.  For example, if the function is non-
  recursive and is called with a spinlock or mutex held, maxactive = 1
  should be enough.  If the function is non-recursive and can never
  relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should
  be enough.  maxactive is used to determine how many kretprobe_instance
  objects to allocate for this particular probed function.  If maxactive <=
  0, it is set to a default value (if CONFIG_PREEMPT maxactive=max(10, 2 *
  NR_CPUS) else maxactive=NR_CPUS)

  For example:

    struct kretprobe rp;
    rp.kp.addr = /* entrypoint address */
    rp.handler = /*return probe handler */
    rp.maxactive = /* e.g., 1 or NR_CPUS or 0, see the above explanation */
    register_kretprobe(&rp);

  The following field may also be of interest:

  nmissed - Initialized to zero when the function-return probe is
  registered, and incremented every time the probed function is entered but
  there is no kretprobe_instance object available for establishing the
  function-return probe (i.e., because maxactive was set too low).

2.2 Unregister

  To unregiter a function-return probe, the user calls
  unregister_kretprobe() with the same kretprobe object as registered
  previously.  If a probed function is running when the return probe is
  unregistered, the function will return as expected, but the handler won't
  be run.

3. Limitations

3.1 This patch supports only the i386 architecture, but patches for
    x86_64 and ppc64 are anticipated soon.

3.2 Return probes operates by replacing the return address in the stack
    (or in a known register, such as the lr register for ppc).  This may
    cause __builtin_return_address(0), when invoked from the return-probed
    function, to return the address of the return-probes trampoline.

3.3 This implementation uses the "Multiprobes at an address" feature in
    2.6.12-rc3-mm3.

3.4 Due to a limitation in multi-probes, you cannot currently establish
    a return probe and a jprobe on the same function.  A patch to remove
    this limitation is being tested.

This feature is required by SystemTap (http://sourceware.org/systemtap),
and reflects ideas contributed by several SystemTap developers, including
Will Cohen and Ananth Mavinakayanahalli.

Signed-off-by: Hien Nguyen <hien@us.ibm.com>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@laposte.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Vincent Hanquez
717b594a41 [PATCH] xen: x86: Use more usermode macro
Use the user_mode macro where it's possible.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez
fa1e1bdf78 [PATCH] xen: x86: Rename usermode macro
Rename user_mode to user_mode_vm and add a user_mode macro similar to the
x86-64 one.

This is useful for Xen because the linux xen kernel does not runs on the same
priviledge that a vanilla linux kernel, and with this we just need to redefine
user_mode().

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:14 -07:00
Vincent Hanquez
1cc6f12e03 [PATCH] xen: x86: Use new macro for debugreg
Make use of the 2 new macro set_debugreg and get_debugreg.

Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:13 -07:00
Andrew Morton
c92c6ffdb1 [PATCH] mtrr size-and-base debugging
Consolidate the mtrr sanity checking, add a dump_stack().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:12 -07:00
Andrew Morton
a3a255e744 [PATCH] x86: cpu_khz type fix
x86_64's cpu_khz is unsigned int and there is no reason why x86 needs to use
unsigned long.

So make cpu_khz unsigned int on x86 as well.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Alexey Dobriyan
129f69465b [PATCH] Remove i386_ksyms.c, almost.
* EXPORT_SYMBOL's moved to other files
* #include <linux/config.h>, <linux/module.h> where needed
* #include's in i386_ksyms.c cleaned up
* After copy-paste, redundant due to Makefiles rules preprocessor directives
  removed:

	#ifdef CONFIG_FOO
	EXPORT_SYMBOL(foo);
	#endif

	obj-$(CONFIG_FOO) += foo.o

* Tiny reformat to fit in 80 columns

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:11 -07:00
Natalie Protasevich
c434b7a6ae [PATCH] x86: avoid wasting IRQs for PCI devices
I have submitted the patch for x86_64, this is submission for i386.

The patch changes the way IRQs are handed out to PCI devices.  Currently,
each I/O APIC pin gets associated with an IRQ, no matter if the pin is used
or not.  This imposes severe limitation on systems that have designs that
employ many I/O APICs, only utilizing couple lines of each, such as P64H2
chipset.  It is used in ES7000, and currently, there is no way to boot the
system with more that 9 I/O APICs.

The simple change below allows to boot a system with say 64 (or more) I/O
APICs, each providing 1 slot, which otherwise impossible because of the IRQ
gaps created for unused lines on each I/O APIC.  It does not resolve the
problem with number of devices that exceeds number of possible IRQs, but
eases up a tension for IRQs on any large system with potentually large
number of devices.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:10 -07:00
Jan Beulich
7fbb4f6e68 [PATCH] adjust i386 watchdog tick calculation
Get the i386 watchdog tick calculation into a state where it can also be used
on CPUs with frequencies beyond 4GHz, and it consolidates the calculation into
a single place (for potential furture adjustments).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Natalie Protasevich
ca05fea6db [PATCH] Do not enforce unique IO_APIC_ID check for xAPIC systems (i386)
This patch is per Andi's request to remove NO_IOAPIC_CHECK from genapic and
use heuristics to prevent unique I/O APIC ID check for systems that don't
need it.  The patch disables unique I/O APIC ID check for Xeon-based and
other platforms that don't use serial APIC bus for interrupt delivery.
Andi stated that AMD systems don't need unique IO_APIC_IDs either.

Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Roland McGrath
7c1def1652 [PATCH] i386: never block forced SIGSEGV
This problem was first noticed on PPC and has already been fixed there.
But the exact same issue applies to other platforms in the same way.  The
signal blocking for sa_mask and the handled signal takes place after the
handler setup.  When the stack is bogus, the handler setup forces a
SIGSEGV.  But then this will be blocked, and returning to user mode will
fault again and iterate.  This patch fixes the problem by checking whether
signal handler setup failed, and not doing the signal-blocking if so.  This
copies what was done in the ppc code.  I think all architectures' signal
handler setup code follows this pattern and needs the change.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:09 -07:00
Venkatesh Pallipadi
8a9e1b0f56 [PATCH] Platform SMIs and their interferance with tsc based delay calibration
Issue:
Current tsc based delay_calibration can result in significant errors in
loops_per_jiffy count when the platform events like SMIs
(System Management Interrupts that are non-maskable) are present. This could
lead to potential kernel panic(). This issue is becoming more visible with 2.6
kernel (as default HZ is 1000) and on platforms with higher SMI handling
latencies. During the boot time, SMIs are mostly used by BIOS (for things
like legacy keyboard emulation).

Description:
The psuedocode for current delay calibration with tsc based delay looks like
(0) Estimate a value for loops_per_jiffy
(1) While (loops_per_jiffy estimate is accurate enough)
(2)   wait for jiffy transition (jiffy1)
(3)   Note down current tsc (tsc1)
(4)   loop until tsc becomes tsc1 + loops_per_jiffy
(5)   check whether jiffy changed since jiffy1 or not and refine
loops_per_jiffy estimate

Consider the following cases
Case 1:
If SMIs happen between (2) and (3) above, we can end up with a
loops_per_jiffy value that is too low. This results in shorted delays and
kernel can panic () during boot (Mostly at IOAPIC timer initialization
timer_irq_works() as we don't have enough timer interrupts in a specified
interval).

Case 2:
If SMIs happen between (3) and (4) above, then we can end up with a
loops_per_jiffy value that is too high. And with current i386 code, too
high lpj value (greater than 17M) can result in a overflow in
delay.c:__const_udelay() again resulting in shorter delay and panic().

Solution:
The patch below makes the calibration routine aware of asynchronous events
like SMIs. We increase the delay calibration time and also identify any
significant errors (greater than 12.5%) in the calibration and notify it to
user.

Patch below changes both i386 and x86-64 architectures to use this
new and improved calibrate_delay_direct() routine.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:08 -07:00
Andy Whitcroft
05b79bdcb4 [PATCH] sparsemem memory model for i386
Provide the architecture specific implementation for SPARSEMEM for i386 SMP
and NUMA systems.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin Bligh <mbligh@aracnet.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:05 -07:00
Martin Hicks
753ee72896 [PATCH] VM: early zone reclaim
This is the core of the (much simplified) early reclaim.  The goal of this
patch is to reclaim some easily-freed pages from a zone before falling back
onto another zone.

One of the major uses of this is NUMA machines.  With the default allocator
behavior the allocator would look for memory in another zone, which might be
off-node, before trying to reclaim from the current zone.

This adds a zone tuneable to enable early zone reclaim.  It is selected on a
per-zone basis and is turned on/off via syscall.

Adding some extra throttling on the reclaim was also required (patch
4/4).  Without the machine would grind to a crawl when doing a "make -j"
kernel build.  Even with this patch the System Time is higher on
average, but it seems tolerable.  Here are some numbers for kernbench
runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

			wall  user   sys   %cpu  ctx sw.  sleeps
			----  ----   ---   ----   ------  ------
No patch		1009  1384   847   258   298170   504402
w/patch, no reclaim     880   1376   667   288   254064   396745
w/patch & reclaim       1079  1385   926   252   291625   548873

These numbers are the average of 2 runs of 3 "make -j" runs done right
after system boot.  Run-to-run variability for "make -j" is huge, so
these numbers aren't terribly useful except to seee that with reclaim
the benchmark still finishes in a reasonable amount of time.

I also looked at the NUMA hit/miss stats for the "make -j" runs and the
reclaim doesn't make any difference when the machine is thrashing away.

Doing a "make -j8" on a single node that is filled with page cache pages
takes 700 seconds with reclaim turned on and 735 seconds without reclaim
(due to remote memory accesses).

The simple zone_reclaim syscall program is at
http://www.bork.org/~mort/sgi/zone_reclaim.c

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Ingo Molnar
39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
gregkh@suse.de
8874b414ff [PATCH] class: convert arch/* to use the new class api instead of class_simple
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:09 -07:00
Thomas Hood
92c6dc59b7 [PATCH] apm.c: ignore_normal_resume is set a bit too late
This patch causes the ignore_normal_resume flag to be set slightly earlier,
before there is a chance that the apm driver will receive the normal resume
event from the BIOS.  (Addresses Debian bug #310865)

Signed-off-by: Thomas Hood <jdthood@yahoo.co.uk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-14 07:19:35 -07:00
Keith Owens
5754c9b649 [PATCH] Stop arch/i386/kernel/vsyscall-note.o being rebuilt every time
arch/i386/kernel/vsyscall-note.o is not listed as a target so its .cmd file
is neither considered as a target nor is it read on the next build.  This
causes vsyscall-note.o to be rebuilt every time that you run make, which
causes vmlinux to be rebuilt every time.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 16:21:13 -07:00
Dave Jones
f94ea640a2 [CPUFREQ] Typos.
cpfureq developers cant spel.

Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:52 -07:00
Dave Jones
6778bae0f2 [CPUFREQ] longhaul - adjust transition latency.
From patch by: Ken Staton <ken_staton@agilent.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
1174631418 [CPUFREQ] Longhaul: Magic timer frobbing.
As mandated by the spec, disable timer around transitions.

From code by : Ken Staton <ken_staton@agilent.com
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
3be6a48f3c [CPUFREQ] longhaul - disable PCI mastering around transition.
The spec states that we have to do this, which is *horrid*.

Based on code from: Ken Staton <ken_staton@agilent.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:51 -07:00
Dave Jones
065b807ca1 [CPUFREQ] dual-core powernow-k8
With the release of the dual-core AMD Opterons last week,
it's high time that cpufreq supported them.  The attached
patch applies cleanly to 2.6.12-rc3 and updates powernow-k8
to support the latest Athlon 64 and Opteron processors.

Update the driver to version 1.40.0 and provide support
for dual-core processors.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:46 -07:00
Dave Jones
c5d28fb297 [CPUFREQ] Recalibrate cpu_khz [2/2]
Some cpufreq drivers (at that time, only powernow-k7) need to recalibrate the
cpu_khz at runtime.

Signed-off-by: Bruno Ducrot <ducrot@poupinou.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:46 -07:00
Dave Jones
91350ed49b [CPUFREQ] Recalibrate cpu_khz [1/2]
We have to recalibrate cpu_khz in order to use the current FID instead the max
FID since some BIOS do not put the processor at maximum frequency at POST. 
Also, some BIOS will change the processor frequency at our back after cpu_khz
was calibrate.  Finally, this will fix a long standing bug when we do
something like this:

# rmmod powernow-k7
# modprobe powernow-k7

Signed-off-by: Bruno Ducrot <ducrot@poupinou.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:45 -07:00
Dave Jones
bf6fc9fd2d [CPUFREQ] AMD Elan SC520 cpufreq driver.
From: Sean Young <sean@mess.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:45 -07:00
Dave Jones
6f4095af6d [CPUFREQ] speedstep-smi: it works on at least one P4M
The speedstep-smi driver actually works on >=1 notebook with a
Pentium 4-M CPU where all other cpufreq drivers fail. Therefore,
allow speedstep-smi on P4Ms again, but warn users of likely failure

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:44 -07:00
Dave Jones
8282864a96 [CPUFREQ] speedstep-centrino: Pentium 4 - M (HT) support
The Pentium 4 - Ms (HT) with CPUID 0xF34 and 0xF41 seem to support
centrino-like enhanced speedstep; however, no "table" support is possible.
Therefore, put NULL entries into speedstep-centrino.c

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:43 -07:00
Dave Jones
7eb53d8823 [CPUFREQ] powernow-k7: don't print khz element of FSB.
Signed-off-by: Dave Jones <davej@redhat.com>
2005-05-31 19:03:42 -07:00
Alexander Nyberg
adaa765d76 [PATCH] acpi build fix: x86 setup.c
This is a neverending story

linux/acpi.h contains empty declarations for acpi_boot_init() &
acpi_boot_table_init() but they are nested inside #ifdef CONFIG_ACPI.

So we'll have to #ifdef in arch/i386/kernel/setup.c: setup_arch()

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-31 14:54:17 -07:00
Siddha, Suresh B
49f384b82b [PATCH] x86: fix smp_num_siblings on buggy BIOSes
This fixes 'smp_num_siblings' value on the systems with a buggy bios,
which sets number of siblings to '2' even when HT is disabled.  (more
details are at http://bugzilla.kernel.org/show_bug.cgi?id=4359)

I am planning to do more cleanup in this area (like moving smp_num_siblings
to per cpuinfo) shortly.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-28 11:14:00 -07:00
Adrian Bunk
70ffc71c5c [PATCH] arch/i386/kernel/cpu/intel_cacheinfo.c: section fix
num_cache_leaves is used in __devexit cache_remove_dev() and can therefore
not be __devinit.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-28 11:14:00 -07:00
Andi Kleen
2df9fa3664 [PATCH] x86_64: i386/x86-64: Export cpu_core_map
Needed for the powernow k8 driver for dual core support.

Signed-off-by: Andi Kleen <ak@suse.de>
Cc: <mark.langsdorf@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:21 -07:00
Andi Kleen
b41e29398a [PATCH] x86_64: 386/x86-64 Further AMD dual core fixes
- Remove duplicated ifdef
- Make core_id match what Intel uses
- Initialize phys_proc_id correctly for non DC case
- Handle non power of two core numbers.

Fixes for both i386 and x86-64

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:48:20 -07:00
Andi Kleen
a158608bf4 [PATCH] x86_64/i386: fix defaults for physical/core id in /proc/cpuinfo
Last round hopefully of cpu_core_id changes hopefully fow now:

- Always initialize cpu_core_id for all CPUs, even when no dual core setup
  is detected.  This prevents funny /proc/cpuinfo output

- Do the same with phys_proc_id[] even when no HyperThreading - dito.

- Use the CPU APIC-ID from CPUID 1 instead of the linux virtual CPU number
  to identify the core for AMD dual core setups.

Patch for i386/x86-64.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-17 07:59:13 -07:00
Linus Torvalds
22490eb80c Fix acpi_find_rsdp() - acpi_scan_rsdp takes length, not end
Noticed by Jakub Jermar <jermar@itbs.cz>
2005-05-06 15:39:23 -07:00
maximilian attems
a27e951f1e [PATCH] cyrix: eliminate bad section references
Fix cyrix section references:
 convert __initdata to __devinitdata.

Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000379
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000399
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b3
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b9
R_386_32          .init.data
Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003bf
R_386_32          .init.data

Signed-of-by: maximilian attems <janitor@sternwelten.at>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:47 -07:00
Prasanna S Panchamukhi
0b9e2cac8a [PATCH] Kprobes: Incorrect handling of probes on ret/lret instruction
Kprobes could not handle the insertion of a probe on the ret/lret
instruction and used to oops after single stepping since kprobes was
modifying eip/rip incorrectly.  Adjustment of eip/rip is not required after
single stepping in case of ret/lret instruction, because eip/rip points to
the correct location after execution of the ret/lret instruction.  This
patch fixes the above problem.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:39 -07:00
Paolo 'Blaisorblade' Giarrusso
0c28130b5c [PATCH] x86_64: make string func definition work as intended
In include/asm-x86_64/string.h there are such comments:

/* Use C out of line version for memcmp */
#define memcmp __builtin_memcmp
int memcmp(const void * cs,const void * ct,size_t count);

This would mean that if the compiler does not decide to use __builtin_memcmp,
it emits a call to memcmp to be satisfied by the C out-of-line version in
lib/string.c.  What happens is that after preprocessing, in lib/string.i you
may find the definition of "__builtin_strcmp".

Actually, by accident, in the object you will find the definition of strcmp
and such (maybe a trick intended to redirect calls to __builtin_memcmp to the
default memcmp when the definition is not expanded); however, this particular
case is not a documented feature as far as I can see.

Also, the EXPORT_SYMBOL does not work, so it's duplicated in the arch.

I simply added some #undef to lib/string.c and removed the (now duplicated)
exports in x86-64 and UML/x86_64 subarchs (the second ones are introduced by
another patch I just posted for -mm).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:33 -07:00
Alexander Nyberg
f48d9663f1 [PATCH] x86 stack initialisation fix
The recent change fix-crash-in-entrys-restore_all.patch

 	childregs->esp = esp;

 	p->thread.esp = (unsigned long) childregs;
-	p->thread.esp0 = (unsigned long) (childregs+1);
+	p->thread.esp0 = (unsigned long) (childregs+1) - 8;

 	p->thread.eip = (unsigned long) ret_from_fork;

introduces an inconsistency between esp and esp0 before the task is run the
first time.  esp0 is no longer the actual start of the stack, but 8 bytes
off.

This shows itself clearly in a scenario when a ptracer that is set to also
ptrace eventual children traces program1 which then clones thread1.  Now
the ptracer wants to modify the registers of thread1.  The x86 ptrace
implementation bases it's knowledge about saved user-space registers upon
p->thread.esp0.  But this will be a few bytes off causing certain writes to
the kernel stack to overwrite a saved kernel function address making the
kernel when actually running thread1 jump out into user-space.  Very
spectacular.

The testcase I've used is:
/* start with strace -f ./a.out */
#include <pthread.h>
#include <stdio.h>

void *do_thread(void *p)
{
	for (;;);
}

int main()
{
	pthread_t one;
	pthread_create(&one, NULL, &do_thread, NULL);
	for (;;);
	return 0;
}

So, my solution is to instead of just adjusting esp0 that creates an
inconsitent state I adjust where the user-space registers are saved with -8
bytes.  This gives us the wanted extra bytes on the start of the stack and
esp0 is now correct.  This solves the issues I saw from the original
testcase from Mateusz Berezecki and has survived testing here.  I think
this should go into -mm a round or two first however as there might be some
cruft around depending on pt_regs lying on the start of the stack.  That
however would have broken with the first change too!

It's actually a 2-line diff but I had to move the comment of why the -8 bytes
are there a few lines up. Thanks to Zwane for helping me with this.

Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05 16:36:30 -07:00
David Woodhouse
27b030d58c Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2005-05-03 08:14:09 +01:00
Adrian Bunk
408b664a7d [PATCH] make lots of things static
Another large rollup of various patches from Adrian which make things static
where they were needlessly exported.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:29 -07:00
Jesper Juhl
7ed20e1ad5 [PATCH] convert that currently tests _NSIG directly to use valid_signal()
Convert most of the current code that uses _NSIG directly to instead use
valid_signal().  This avoids gcc -W warnings and off-by-one errors.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:14 -07:00
Jesper Juhl
e49332bd12 [PATCH] misc verify_area cleanups
There were still a few comments left refering to verify_area, and two
functions, verify_area_skas & verify_area_tt that just wrap corresponding
access_ok_skas & access_ok_tt functions, just like verify_area does for
access_ok - deprecate those.

There was also a few places that still used verify_area in commented-out
code, fix those up to use access_ok.

After applying this one there should not be anything left but finally
removing verify_area completely, which will happen after a kernel release
or two.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:08 -07:00
Matt Mackall
d59745ce3e [PATCH] clean up kernel messages
Arrange for all kernel printks to be no-ops.  Only available if
CONFIG_EMBEDDED.

This patch saves about 375k on my laptop config and nearly 100k on minimal
configs.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:02 -07:00
Paolo 'Blaisorblade' Giarrusso
5e7b83ffc6 [PATCH] uml: fix syscall table by including $(SUBARCH)'s one, for i386
Split the i386 entry.S files into entry.S and syscall_table.S which is
included in the previous one (so actually there is no difference between them)
and use the syscall_table.S in the UML build, instead of tracking by hand the
syscall table changes (which is inherently error-prone).

We must only insert the right #defines to inject the changes we need from the
i386 syscall table (for instance some different function names); also, we
don't implement some i386 syscalls, as ioperm(), nor some TLS-related ones
(yet to provide).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:55 -07:00
Pavel Pisa
ad6714230f [PATCH] Linux 2.6.x VM86 interrupt emulation fixes
Patch solves VM86 interrupt emulation deadlock on SMP systems.  The VM86
interrupt emulation has been heavily tested and works well on UP systems
after last update, but it seems to deadlock when we have used it on SMP/HT
boxes now.

It seems, that disable_irq() cannot be called from interrupts, because it
waits until disabled interrupt handler finishes
(/kernel/irq/manage.c:synchronize_irq():while(IRQ_INPROGRESS);).  This
blocks one CPU after another.  Solved by use disable_irq_nosync.

There is the second problem.  If IRQ source is fast, it is possible, that
interrupt is sometimes processed and re-enabled by the second CPU, before
it is disabled by the first one, but negative IRQ disable depths are not
allowed.  The spinlocking and disabling IRQs over call to
disable_irq_nosync/enable_irq is the only solution found reliable till now.

Signed-off-by: Michal Sojka <sojkam1@control.felk.cvut.cz>
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:52 -07:00
Zwane Mwaikambo
3c3b73b6f5 [PATCH] cpuid x87 bit on AMD falsely marked as PNI
http://bugme.osdl.org/show_bug.cgi?id=4426

vendor_id       : AuthenticAMD
cpu family      : 6
model           : 10
model name      : AMD Athlon(tm) XP
stepping        : 0
cpu MHz         : 2204.807
<snipped>
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips        : 4358.14

We're marking bit 0 of extended function 0x80000001 cpuid as PNI support on
AMD processors, when it actually denotes x87 FPU present.  Patch for i386
and x86_64 below.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:51 -07:00
john stultz
35492df5ae [PATCH] i386: fix hpet for systems that don't support legacy replacement
Currently the i386 HPET code assumes the entire HPET implementation from
the spec is present.  This breaks on boxes that do not implement the
optional legacy timer replacement functionality portion of the spec.

This patch, which is very similar to my x86-64 patch for the same issue,
fixes the problem allowing i386 systems that cannot use the HPET for the
timer interrupt and RTC to still use the HPET as a time source.  I've
tested this patch on a system systems without HPET, with HPET but without
legacy timer replacement, as well as HPET with legacy timer replacement.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:50 -07:00
Lee Revell
a6954ba2e8 [PATCH] Enable write combining for server works LE rev > 6
Enable write combining for server works LE rev > 6 per
http://www.ussg.iu.edu/hypermail/linux/kernel/0104.3/1007.html

Signed-Off-By: Lee Revell <rlrevell@joe-job.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Stas Sergeev
48c88211a6 [PATCH] x86: entry.S trap return fixes
do_debug() and do_int3() return void.

This patch fixes the CONFIG_KPROBES variant of do_int3() to return void too
and adjusts entry.S accordingly.

Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Jaya Kumar
a2f7c35415 [PATCH] x86 reboot: Add reboot fixup for gx1/cs5530a
This patch by Jaya Kumar introduces a generic infrastructure to deal with
x86 chipsets with nonstandard reset sequences, and adds support for the
Geode gx1/cs5530a chipset.

Signed-off-by: Jaya Kumar <jayalk@intworks.biz>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:49 -07:00
Jack F Vogel
67701ae976 [PATCH] check nmi watchdog is broken
A bug against an xSeries system showed up recently noting that the
check_nmi_watchdog() test was failing.

I have been investigating it and discovered in both i386 and x86_64 the
recent change to the routine to use the cpu_callin_map has uncovered a
problem.  Prior to that change, on an SMP box, the test was trivally
passing because all cpu's were found to not yet be online, but now with the
callin_map they are discovered, it goes on to test the counter and they
have not yet begun to increment, so it announces a CPU is stuck and bails
out.

On all the systems I have access to test, the announcement of failure is
also bougs...  by the time you can login and check /proc/interrupts, the
NMI count is happily incrementing on all CPUs.  Its just that the test is
being done too early.

I have tried moving the call to the test around a bit, and it was always
too early.  I finally hit on this proposed solution, it delays the routine
via a late_initcall(), seems like the right solution to me.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:48 -07:00
H. J. Lu
fd51f666fa [PATCH] i386/x86_64 segment register access update
The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

        movl (%eax),%ds
        movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

        mov (%eax),%ds
        mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

        movw (%eax),%ds
        movw %ds,(%eax)

without the 0x66 prefix. I am enclosing patches for 2.4 and 2.6 kernels
here. The resulting kernel binaries should be unchanged as before, with
old and new assemblers, if gcc never generates memory access for

               unsigned gsindex;
               asm volatile("movl %%gs,%0" : "=g" (gsindex));

If gcc does generate memory access for the code above, the upper bits
in gsindex are undefined and the new assembler doesn't allow it.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:58:48 -07:00
Linus Torvalds
a879cbbb34 x86: make traps on 'iret' be debuggable in user space
This makes a trap on the 'iret' that returns us to user space
cause a nice clean SIGSEGV, instead of just a hard (and silent)
exit.

That way a debugger can actually try to see what happened, and
we also properly notify everybody who might be interested about
us being gone.

This loses the error code, but tells the debugger what happened
with ILL_BADSTK in the siginfo.
2005-04-29 09:38:44 -07:00
2fd6f58ba6 [AUDIT] Don't allow ptrace to fool auditing, log arch of audited syscalls.
We were calling ptrace_notify() after auditing the syscall and arguments,
but the debugger could have _changed_ them before the syscall was actually
invoked. Reorder the calls to fix that.

While we're touching ever call to audit_syscall_entry(), we also make it
take an extra argument: the architecture of the syscall which was made,
because some architectures allow more than one type of syscall.

Also add an explicit success/failure flag to audit_syscall_exit(), for
the benefit of architectures which return that in a condition register
rather than only returning a single register.

Change type of syscall return value to 'long' not 'int'.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-04-29 16:08:28 +01:00
James Bottomley
1cff94c6fe [PATCH] fix subarch breakage in amd dual core updates
The patch to arch/i386/kernel/cpu/amd.c relies on the variable
cpu_core_id which is defined in i386/kernel/smpboot.c.  This means it is
only present if CONFIG_X86_SMP is defined, not CONFIG_SMP (alternative
SMP harnesses won't have it, which is why it breaks voyager).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-21 16:20:35 -07:00
Chris Wedgwood
15e8869943 [PATCH] x86: fix acpi compile without CONFIG_ACPI_BUS
The recent acpi boot patch breaks for me: acpi_fadt needs CONFIG_ACPI_BUS.

Signed-off-By: Chris Wedgwood <cw@f00f.org>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-18 08:01:30 -07:00
maximilian attems
c41f5eb3b8 [PATCH] efi: eliminate bad section references
Randy please double check especially this one.
there may be a better solution.

Fix efi section references:
 remove __initdata for struct efi efi_phys 
 and struct efi_memory_map memmap

Error: ./arch/i386/kernel/efi.o .text refers to 000000d3 R_386_32
.init.data
Error: ./arch/i386/kernel/efi.o .text refers to 000000ff R_386_32
.init.data

efi_memmap_walk (which is not __init nor static) 
accesses both efi_phys and memmap.

Signed-off-by: maximilian attems <janitor@sternwelten.at>
Acked-by: Randy Dunlap <rddunlap@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:53 -07:00
Pavel Machek
438510f6f0 [PATCH] pm_message_t: more fixes in common and i386
I thought I'm done with fixing u32 vs.  pm_message_t ...  unfortunately
that turned out not to be the case as Russel King pointed out.  Here are
fixes for Documentation and common code (mainly system devices).

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:24 -07:00
Siddha, Suresh B
d31ddaa172 [PATCH] x86, x86_64: dual core proc-cpuinfo and sibling-map fix
- broken sibling_map setup in x86_64

- grouping all the core and HT related cpuinfo fields.
  We are reasonably sure that adding new cpuinfo fields after "siblings" field,
  will not cause any app failure. Thats because today's /proc/cpuinfo
  format is completely different on x86, x86_64 and we haven't heard of any
  x86 app breakage because of this issue. Grouping these fields will 
  result in more or less common format on all architectures (ia64, x86 and 
  x86_64) and will cause less confusion.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:20 -07:00
Andi Kleen
635186447d [PATCH] x86_64: Final support for AMD dual core
Clean up the code greatly.  Now uses the infrastructure from the Intel dual
core patch Should fix a final bug noticed by Tyan of not detecting the nodes
correctly in some corner cases.

Patch for x86-64 and i386

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:16 -07:00
Andi Kleen
3dd9d51484 [PATCH] x86_64: add support for Intel dual-core detection and displaying
Appended patch adds the support for Intel dual-core detection and displaying
the core related information in /proc/cpuinfo.  

It adds two new fields "core id" and "cpu cores" to x86 /proc/cpuinfo and the
"core id" field for x86_64("cpu cores" field is already present in x86_64).

Number of processor cores in a die is detected using cpuid(4) and this is
documented in IA-32 Intel Architecture Software Developer's Manual (vol 2a)
(http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)

This patch also adds cpu_core_map similar to cpu_sibling_map.

Slightly hacked by AK.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:15 -07:00
Siddha, Suresh B
cf94b62f70 [PATCH] x86_64-always-use-cpuid-80000008-to-figure-out-mtrr fix
We need to use the size_and_mask in set_mtrr_var_ranges(which is called
while programming MTRR's for AP's

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:11 -07:00
Andi Kleen
1f2c958ad5 [PATCH] x86_64: Always use CPUID 80000008 to figure out MTRR address space size
It doesn't make sense to only do this only for AMD K8.

This would support future CPUs with extended address spaces properly.

For i386 and x86-64

Cc: <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16 15:25:10 -07:00