Minor cleanup.
Move things into their include files, remove obsolete includes, fix
indentation, remove obsolete special cases etc.
I also added the per cpu section to asm-generic/sections.h and fixed
init/main.c to use it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
No need to print kernel addresses there and clarify what the APIC-ID is.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Does not change any semantics because numa_add_cpu checks for CPU 0 anyways.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Various code needs this information now before the actual SMP bootup. Instead
of computing it on the fly while booting the other CPUs set it up now while
initial MPtable/MADT parsing.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the x86_64 cpu hotplug changes went in it added a check in
default_do_nmi() which kills NMI delivery on any CPU but the BSP.
The NMI watchdog is brought up quite some time before the online bit is set
in num_online_cpus so this won't work very well. The nmi watchdogs on cpus
that are not BSP will never be reprogrammed and no NMIs.
Why was this check added? How does an offlined cpu receive an NMI?
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Cc: <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
turn many #if $undefined_string into #ifdef $undefined_string to fix some
warnings after -Wno-def was added to global CFLAGS
Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes boot up lockups on some machines where CPU apic ids don't start with
0
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
i386 machine_power_off was disabling the local apic
and all of it's users wanted to be on the boot cpu.
So call machine_shutdown which places us on the boot
cpu and disables the apics. This keeps us in sync
and reduces the number of cases we need to worry about in
the power management code.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is not safe to call set_cpus_allowed() in interrupt
context and disabling the apics is complicated code.
So unconditionally skip machine_shutdown in machine_emergency_reboot
on x86_64.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We only want to shutdown the apics if reboot_force
is not specified. Be we are doing this both
in machine_shutdown which is called unconditionally
and if (!reboot_force). So simply call machine_shutdown
if (!reboot_force). It looks like something
went weird with merging some of the kexec patches for
x86_64, and caused this.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with. Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add inotify syscall entries to x86-64.
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add missing fsnotify_open() hook to sys32_open().
Add fsnotify_open() hook to sys32_open() on x86-64.
Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A malicious 32bit app can have an elf section at 0xffffe000. During
exec of this app, we will have a memory leak as insert_vm_struct() is
not checking for return value in syscall32_setup_pages() and thus not
freeing the vma allocated for the vsyscall page.
Check the return value and free the vma incase of failure.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the second time this has happened: inserting a new section requires
that we adjust the arithmetic which is used to calculate the vsyscall page's
offset.
Cc: Christoph Lameter <christoph@lameter.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create a new top-level menu named "Networking" thus moving
net related options and protocol selection way from the drivers
menu and up on the top-level where they belong.
To implement this all architectures has to source "net/Kconfig" before
drivers/*/Kconfig in their Kconfig file. This change has been
implemented for all architectures.
Device drivers for ordinary NIC's are still to be found
in the Device Drivers section, but Bluetooth, IrDA and ax25
are located with their corresponding menu entries under the new
networking menu item.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.
If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated. In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.
The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There has been some discuss about solving the SMP MTRR suspend/resume
breakage, but I didn't find a patch for it. This is an intent for it. The
basic idea is moving mtrr initializing into cpu_identify for all APs (so it
works for cpu hotplug). For BP, restore_processor_state is responsible for
restoring MTRR.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Implementation:
===============
The encrypt/decrypt code is based on an x86 implementation I did a while
ago which I never published. This unpublished implementation does
include an assembler based key schedule and precomputed tables. For
simplicity and best acceptance, however, I took Gladman's in-kernel code
for table generation and key schedule for the kernel port of my
assembler code and modified this code to produce the key schedule as
required by my assembler implementation. File locations and Kconfig are
kept similar to the i586 AES assembler implementation.
It may seem a little bit strange to use 32 bit I/O and registers in the
assembler implementation but this gives the best code size. My
implementation takes one instruction more per round compared to
Gladman's x86 assembler but it doesn't require any stack for local
variables or saved registers and it is less serialized than Gladman's
code.
Note that all comparisons to Gladman's code were done after my code was
implemented. I did only use FIPS PUB 197 for the implementation so my
implementation is independent work.
If anybody has a better assembler solution for x86_64 I'll be pleased to
have my code replaced with the better solution.
Testing:
========
The implementation passes the in-kernel crypto testing module and I'm
running it without any problems on my laptop where it is mainly used for
dm-crypt.
Microbenchmark:
===============
The microbenchmark was done in userspace with similar compile flags as
used during kernel compile.
Encrypt/decrypt is about 35% faster than the generic C implementation.
As the generic C as well as my assembler implementation are both table
I don't really expect that there is much room for further
improvements though I'll be glad to be corrected here.
The key schedule is about 5% slower than the generic C implementation.
This is due to the fact that some more work has to be done in the key
schedule routine to fit the schedule to the assembler implementation.
Code Size:
==========
Encrypt and decrypt are together about 2.1 Kbytes smaller than the
generic C implementation which is important with regard to L1 cache
usage. The key schedule routine is about 100 bytes larger than the
generic C implementation.
Data Size:
==========
There's no difference in data size requirements between the assembler
implementation and the generic C implementation.
License:
========
Gladmans's code is dual BSD/GPL whereas my assembler code is GPLv2 only
(I'm not going to change the license for my code). So I had to change
the module license for the x86_64 aes module from 'Dual BSD/GPL' to
'GPL' to reflect the most restrictive license within the module.
Signed-off-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.
Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that we have access to the whole MCFG table, let's properly use it
for all pci device accesses (as that's what it is there for, some boxes
don't put all the busses into one entry.)
If, for some reason, the table is incorrect, we fallback to the "old
style" of mmconfig accesses, namely, we just assume the first entry in
the table is the one for us, and blindly use it.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is the first step in properly handling the MCFG PCI table.
It defines the structures properly, and saves off the table so that the
pci mmconfig code can access it. It moves the parsing of the table a
little later in the boot process, but still before the information is
needed.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The following patch contains the x86_64 specific changes for the new
return probe design. Changes include:
* Removing the architecture specific functions for querying a return probe
instance off a stack address
* Complete rework onf arch_prepare_kretprobe() and trampoline_probe_handler()
* Removing trampoline_post_handler()
* Adding arch_init() so that now we handle registering the return probe
trampoline instead of kernel/kprobes.c doing it
NOTE:
Note that with this new design, the dependency on calculating a pointer to
the task off the stack pointer no longer exist (resolving the problem of
interruption stacks as pointed out in the original feedback to this port.)
Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that PPC64 has no-execute support, here is a second try to fix the
single step out of line during kprobe execution. Kprobes on x86_64 already
solved this problem by allocating an executable page and using it as the
scratch area for stepping out of line. Reuse that.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I believe at least for seccomp it's worth to turn off the tsc, not just for
HT but for the L2 cache too. So it's up to you, either you turn it off
completely (which isn't very nice IMHO) or I recommend to apply this below
patch.
This has been tested successfully on x86-64 against current cogito
repository (i686 compiles so I didn't bother testing ;). People selling
the cpu through cpushare may appreciate this bit for a peace of mind.
There's no way to get any timing info anymore with this applied
(gettimeofday is forbidden of course). The seccomp environment is
completely deterministic so it can't be allowed to get timing info, it has
to be deterministic so in the future I can enable a computing mode that
does a parallel computing for each task with server side transparent
checkpointing and verification that the output is the same from all the 2/3
seller computers for each task, without the buyer even noticing (for now
the verification is left to the buyer client side and there's no
checkpointing, since that would require more kernel changes to track the
dirty bits but it'll be easy to extend once the basic mode is finished).
Eliminating a cold-cache read of the cr4 global variable will save one
cacheline during the tlb flush while making the code per-cpu-safe at the
same time. Thanks to Mikael Pettersson for noticing the tlb flush wasn't
per-cpu-safe.
The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
it'll be transparent to the switch_to code since the IPI won't make any
change to the cr4 contents from the point of view of the interrupted code
and since it's now all per-cpu stuff, it will not race. So no need to
disable irqs in switch_to slow path.
Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1. Establish a simple API for process freezing defined in linux/include/sched.h:
frozen(process) Check for frozen process
freezing(process) Check if a process is being frozen
freeze(process) Tell a process to freeze (go to refrigerator)
thaw_process(process) Restart process
frozen_process(process) Process is frozen now
2. Remove all references to PF_FREEZE and PF_FROZEN from all
kernel sources except sched.h
3. Fix numerous locations where try_to_freeze is manually done by a driver
4. Remove the argument that is no longer necessary from two function calls.
5. Some whitespace cleanup
6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
cleared before setting PF_FROZEN, recalc_sigpending does not check
PF_FROZEN).
This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove some of the unnecessary differences between arch/i386 and
arch/x86_64. This patch fixes more whitespace issues, some miscellaneous
typos, a wrong URL and a factually incorrect statement about the current
boot sector code.
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Put function prototypes for memset() and memcpy() ahead of where
there are used, to kill sparse warnings:
arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: undefined identifier 'memset'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:11: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/misc.c:151:2: warning: undefined identifier 'memcpy'
arch/x86_64/boot/compressed/../../../../lib/inflate.c:317:3: warning: call with no type!
arch/x86_64/boot/compressed/../../../../lib/inflate.c:601:17: warning: call with no type!
arch/x86_64/boot/compressed/misc.c:151:9: warning: call with no type!
Signed-off-by: randy_dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o Following patch provides purely cosmetic changes and corrects CodingStyle
guide lines related certain issues like below in kexec related files
o braces for one line "if" statements, "for" loops,
o more than 80 column wide lines,
o No space after "while", "for" and "switch" key words
o Changes:
o take-2: Removed the extra tab before "case" key words.
o take-3: Put operator at the end of line and space before "*/"
Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Makes kexec_crashdump() take a pt_regs * as an argument. This allows to
get exact register state at the point of the crash. If we come from direct
panic assertion NULL will be passed and the current registers saved before
crashdump.
This hooks into two places:
die(): check the conditions under which we will panic when calling
do_exit and go there directly with the pt_regs that caused the fatal
fault.
die_nmi(): If we receive an NMI lockup while in the kernel use the
pt_regs and go directly to crash_kexec(). We're probably nested up badly
at this point so this might be the only chance to escape with proper
information.
Signed-off-by: Alexander Nyberg <alexn@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o Following patch exports kexec global variable "crash_notes" to user space
through sysfs as kernel attribute in /sys/kernel.
Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the x86_64 implementation of the crashkernel option. It reserves
a window of memory very early in the bootup process, so we never use
it for anything but the kernel to switch to when the running
kernel panics.
In addition to reserving this memory a resource structure is registered
so looking at /proc/iomem it is clear what happened to that memory.
ISSUES:
Is it possible to implement this in a architecture generic way?
What should be done with architectures that always use an iommu and
thus don't report their RAM memory resources in /proc/iomem?
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the x86_64 implementation of machine kexec. 32bit compatibility
support has been implemented, and machine_kexec has been enhanced to not care
about the changing internal kernel paget table structures.
From: Alexander Nyberg <alexn@dsv.su.se>
build fix
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Factor out the apic and smp shutdown code from machine_restart so it can be
called by in the kexec reboot path as well.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For one kernel to report a crash another kernel has created we need
to have 2 kernels loaded simultaneously in memory. To accomplish this
the two kernels need to built to run at different physical addresses.
This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
so we can do just that. You need to know what you are doing and
the ramifications are before changing this value, and most users
won't care so I have made it depend on CONFIG_EMBEDDED
bzImage kernels will work and run at a different address when compiled
with this option but they will still load at 1MB. If you need a kernel
loaded at a different address as well you need to boot a vmlinux.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The vmlinux on x86_64 does not report the correct physical address of
the kernel. Instead in the physical address field it currently
reports the virtual address of the kernel.
This is patch is a bug fix that corrects vmlinux to report the
proper physical addresses.
This is potentially a help for crash dump analysis tools.
This definitiely allows bootloaders that load vmlinux as a standard
ELF executable. Bootloaders directly loading vmlinux become of
practical importance when we consider the kexec on panic case.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When coming out of apic mode attempt to set the appropriate
apic back into virtual wire mode. This improves on previous versions
of this patch by by never setting bot the local apic and the ioapic
into veritual wire mode.
This code looks at data from the mptable to see if an ioapic has
an ExtInt input to make this decision. A future improvement
is to figure out which apic or ioapic was in virtual wire mode
at boot time and to remember it. That is potentially a more accurate
method, of selecting which apic to place in virutal wire mode.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Eric W. Biederman <ebiederm@xmission.com
The following patch simply adds a shutdown method to the x86_64 i8259 code.
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Eric W. Biederman <ebiederm@xmission.com>
It is ok to reserve resources > 4G on x86_64 struct resource is 64bit now :)
Signed-off-by: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL
preemption options into kernel/Kconfig.preempt. This, besides reducing
source-code, also enables more centralized tweaking of preemption related
options.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2.6.12-rc6-mm1 has a few remaining synchronize_kernel()s, some (but not
all) in comments. This patch changes these synchronize_kernel() calls (and
comments) to synchronize_rcu() or synchronize_sched() as follows:
- arch/x86_64/kernel/mce.c mce_read(): change to synchronize_sched() to
handle races with machine-check exceptions (synchronize_rcu() would not cut
it given RCU implementations intended for hardcore realtime use.
- drivers/input/serio/i8042.c i8042_stop(): change to synchronize_sched() to
handle races with i8042_interrupt() interrupt handler. Again,
synchronize_rcu() would not cut it given RCU implementations intended for
hardcore realtime use.
- include/*/kdebug.h comments: change to synchronize_sched() to handle races
with NMIs. As before, synchronize_rcu() would not cut it...
- include/linux/list.h comment: change to synchronize_rcu(), since this
comment is for list_del_rcu().
- security/keys/key.c unregister_key_type(): change to synchronize_rcu(),
since this is interacting with RCU read side.
- security/keys/process_keys.c install_session_keyring(): change to
synchronize_rcu(), since this is interacting with RCU read side.
Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>