The KVM configuration is no longer visible in the latest git tree. It looks
like it is selected by HAVE_SETUP_PER_CPU_AREA. I've moved HAVE_KVM to
under CONFIG_X86.
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
the cpa API is page aligned - warn about any weird alignments.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
extern should not appear in C files. Also, the definitions
do not match the prototype currently, not sure what way you
want to go with this, I've switched the prototype to return
int, but I can see going to the void return as well.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When the GART table is unmapped from the kernel direct mappings
during early bootup, make sure we have no leftover cachelines in it.
Note: the clflush done by set_memory_np() was not enough, because
clflush does not work on unmapped pages.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The EFI-runtime mapping code changed a larger memory area than it
should have, due to a pages/bytes parameter mixup.
noticed by Andi Kleen.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
dump_pagetable() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jiri Kosina reported the following deadlock scenario with
show_unhandled_signals enabled:
[ 68.379022] gnome-settings-[2941] trap int3 ip:3d2c840f34
sp:7fff36f5d100 error:0<3>BUG: sleeping function called from invalid
context at kernel/rwsem.c:21
[ 68.379039] in_atomic():1, irqs_disabled():0
[ 68.379044] no locks held by gnome-settings-/2941.
[ 68.379050] Pid: 2941, comm: gnome-settings- Not tainted 2.6.25-rc1 #30
[ 68.379054]
[ 68.379056] Call Trace:
[ 68.379061] <#DB> [<ffffffff81064883>] ? __debug_show_held_locks+0x13/0x30
[ 68.379109] [<ffffffff81036765>] __might_sleep+0xe5/0x110
[ 68.379123] [<ffffffff812f2240>] down_read+0x20/0x70
[ 68.379137] [<ffffffff8109cdca>] print_vma_addr+0x3a/0x110
[ 68.379152] [<ffffffff8100f435>] do_trap+0xf5/0x170
[ 68.379168] [<ffffffff8100f52b>] do_int3+0x7b/0xe0
[ 68.379180] [<ffffffff812f4a6f>] int3+0x9f/0xd0
[ 68.379203] <<EOE>>
[ 68.379229] in libglib-2.0.so.0.1505.0[3d2c800000+dc000]
and tracked it down to:
commit 03252919b7
Author: Andi Kleen <ak@suse.de>
Date: Wed Jan 30 13:33:18 2008 +0100
x86: print which shared library/executable faulted in segfault etc. messages
the problem is that we call down_read() from an atomic context.
Solve this by returning from print_vma_addr() if the preempt count is
elevated. Update preempt_conditional_sti / preempt_conditional_cli to
unconditionally lift the preempt count even on !CONFIG_PREEMPT.
Reported-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/x86/kernel/i8253.c:98:27: warning: symbol 'pit_clockevent' was not declared. Should it be static?
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch enhances EFI runtime code memory mapping as following:
- Move __supported_pte_mask & _PAGE_NX checking before invoking
runtime_code_page_mkexec(). This makes it possible for compiler to
eliminate runtime_code_page_mkexec() on machine without NX support.
- Use set_memory_x/nx in early_mapping_set_exec(). This eliminates the
duplicated implementation.
This patch has been tested on Intel x86_64 platform with EFI64/32
firmware.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andi Kleen pointed out that the cache attribute logic is reverse in
efi_enter_virtual_mode(). This problem alone is harmless as we do not
(yet) do cache attribute conflict resolution. (This bug was not present
in the original EFI submission - I introduced it while fixing up rejects.)
While reviewing this code I noticed a second, worse problem: the use of
uninitialized md->virt_addr.
Fix both problems.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ mingo@elte.hu: while gbpages cannot be enabled on mainline currently,
keep the code uptodate and this fix is easy enough. ]
Use correct page sizes and masks for GB pages in try_preserve_large_page()
This prevents a boot hang on a GB capable system with CONFIG_DIRECT_GBPAGES
enabled.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Unpin the Xen-provided pagetable once we've finished with it, so it
doesn't cause stray references which cause later swapper_pg_dir
pagetable updates to fail.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Tested-by: Jody Belka <knew-linux@pimb.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At the early stages of boot, before the kernel pagetable has been
fully initialized, a Xen kernel will still be running off the
Xen-provided pagetables rather than swapper_pg_dir[]. Therefore,
readback cr3 to determine the base of the pagetable rather than
assuming swapper_pg_dir[].
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Tested-by: Jody Belka <knew-linux@pimb.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When reboot_32.c and reboot_64.c were unified (commit 4d022e35fd...),
the machine_ops code was broken, leading to xen pvops kernels failing
to properly halt/poweroff/reboot etc. This fixes that up.
Signed-off-by: Jody Belka <knew-linux@pimb.org>
Cc: Miguel Boton <mboton@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The makefile magic for installing the 32-bit vdso images on disk had a
little error. A single-line change would fix that bug, but this does a
little more to reduce the error-prone duplication of this bit of
makefile variable magic.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pageattr-test.c contains a noisy debug printk that people reported.
The condition under which it prints (randomly tapping into a mem_map[]
hole and not being able to c_p_a() there) is valid behavior and not
interesting to report.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we may not have a pci_dev for the device we need to access, we can't
use pci_read_config_word. But raw_pci_read is an internal implementation
detail; it's better to use the architected pci_bus_read_config_word
interface. Using PCI_DEVFN instead of a mysterious constant helps
reassure everyone that we really do intend to access device 8.
[ Thanks to Grant Grundler for pointing out to me that this is exactly
what the write immediately above this is doing -- enabling device 8 to
respond to config space cycles.
- Matthew
Grant also says:
"Can you also add a comment which points at the Intel
documentation?
The 'Intel E7320 Memory Controller Hub (MCH) Datasheet' at
http://download.intel.com/design/chipsets/datashts/30300702.pdf
Page 69 documents register F4h (DEVPRES1).
And I just doubled checked that the 0xf4 register value is
restored later in the quirk (obvious when you look at the code
but not from the patch"
so here it is.
- Linus ]
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want to allow different implementations of pci_raw_ops for standard
and extended config space on x86. Rather than clutter generic code with
knowledge of this, we make pci_raw_ops private to x86 and use it to
implement the new raw interface -- raw_pci_read() and raw_pci_write().
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.
Always using legacy configuration mechanism for the legacy config space
and extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows to get rid of mmconf fallback code.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now, we check only the first 4k page for static required protections.
This does not take overlapping regions into account. So we might end up
setting the wrong permissions/protections for other parts of this large page.
This can be optimized further, but correctness is the important part.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now, that the page pool is in place we can enable DEBUG_PAGEALLOC on
64bit.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Switch the split page code to use the page pool. We do this
unconditionally to avoid different behaviour with and without
DEBUG_PAGEALLOC enabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup
hardcoded reliance on PSE pages, and the unrobustness of the runtime
splitup of large pages. The splitup ended in recursive calls to
alloc_pages() when a page for a pte split was requested.
Avoid the recursion with a preallocated page pool, which is used to
split up large mappings and gets refilled in the return path of
kernel_map_pages after the split has been done. The size of the page
pool is adjusted to the available memory.
This part just implements the page pool and the initialization w/o
using it yet.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some suspend and hibernation files in arch/x86/power there are
comments referring to arch/x86-64 and arch/i386 . Update them to
reflect the current code layout.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the hibernation-specific code from arch/x86/power/suspend_64.c
to a separate file (hibernate_64.c) and the CPU-handling code to
cpu_64.c (in line with the corresponding 32-bit code).
Simplify arch/x86/power/Makefile .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Rename cpu.c, suspend.c and swsusp.S in arch/x86/power to cpu_32.c,
hibernate_32.c and hibernate_asm_32.S, respectively, and update the
purpose and copyright information in these files.
Update the Makefile in arch/x86/power to reflect the above changes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move arch/x86/kernel/suspend_64.c to arch/x86/power .
Move arch/x86/kernel/suspend_asm_64.S to arch/x86/power
as hibernate_asm_64.S .
Update purpose and copyright information in
arch/x86/power/suspend_64.c and
arch/x86/power/hibernate_asm_64.S .
Update the Makefiles in arch/x86, arch/x86/kernel and
arch/x86/power to reflect the above changes.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In arch/x86/boot/printf.c gets rid of unused tail of digits: const char
*digits = "0123456789abcdefghijklmnopqrstuvwxyz"; (we are using 0-9a-f
only)
Uses smaller/faster lowercasing (by ORing with 0x20)
if we know that we work on numbers/digits. Makes
strtoul smaller, and also we are getting rid of
static const char small_digits[] = "0123456789abcdefx";
static const char large_digits[] = "0123456789ABCDEFX";
since this works equally well:
static const char digits[16] = "0123456789ABCDEF";
Size savings:
$ size vmlinux.org vmlinux
text data bss dec hex filename
877320 112252 90112 1079684 107984 vmlinux.org
877048 112252 90112 1079412 107874 vmlinux
It may be also a tiny bit faster because code has less
branches now, but I doubt it is measurable.
[ hugh@veritas.com: uppercase pointers fix ]
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some important parts of f6df72e71e got
dropped along the way, reintroduce them.
Only affects paravirt guests.
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled
kernel are in PAE format.
early_ioremap is updated to use the standard page table accessors.
Clear any mappings beyond max_low_pfn from the boot page tables in
native_pagetable_setup_start because the initial mappings can extend
beyond the range of physical memory and into the vmalloc area.
Derived from patches by Eric Biederman and H. Peter Anvin.
[ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ]
Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Mika Penttilä <mika.penttila@kolumbus.fi>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Other than the defconfigs, remove the entry in compiler-gcc4.h,
Kconfig.debug and feature-removal-schedule.txt.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use a common irq_return entry point for all the iret places, which
need the paravirt INTERRUPT return wrapper.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Adjust the definition of lookup_address to take an unsigned long
level argument. Adjust callers in xen/mmu.c that pass in a
dummy variable.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use xen_khz to denote xen_specific clock speed. Avoid shadowing
cpu_khz.
arch/x86/xen/time.c:220:6: warning: symbol 'cpu_khz' shadows an earlier one
include/asm/tsc.h:17:21: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Each AMD Geode MFGPT timer interrupt output is paired with another
timer; esentially the interrupt goes if either timer fires. This
is okay, but the handlers need to be aware of this. Make sure in
the timer tick handler that our timer really did expire.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We *really* don't want to be reading MFGPTx_SETUP and writing back those
values. What we want to be doing is clearing CMP1 and CMP2 unconditionally;
otherwise, we have races where CMP1 and/or CMP2 fire after we've read
MFGPTx_SETUP. They can also fire between when we've written ~CNTEN to
the register, and when the new register values get copied to the timer's
version of the register. By clearing both fields, we're okay.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There isn't much value to always detecting the MFGPT timers on
Geode platforms; detection is only needed when something wants
to use the timers. Move the detection code so that it gets
called the first time a timer is allocated.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We need to be called from elsewhere, and this gets some #ifdefs out
of the .c file.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Drop F_AVAIL and the 'flags' field, replacing with an 'avail' bit. This
looks more understandable to me.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We had planned to use the 'owner' field for allowing re-allocation of
MFGPTs; however, doing it by module owner name isn't flexible enough. So,
drop this for now. If it turns out that we need timers in modules, we'll
need to come up with a scheme that matches the write-once fields of the
MFGPTx_SETUP register, and drops ponies from the sky.
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The GEODE MFGPT code assumed that 32kHz was 32000 Hz while the boards
run on a 32.768 kHz digital watch crystal. In practise, it will not
change the timer's frequency as the skew was only 2.4%, but it
should provide more accurate intervals.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- uninline timer functions; the compiler knows better than we do
whether or not to inline these.
- mfgpt_start_timer() had an unused 'clock' argument, drop it.
From both Jordan and myself.
Signed-off-by: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
Kbuild: Fix deb-pkg target to work with kernel versions ending with -<text-without-digit>
ide: introduce HAVE_IDE
kbuild: silence CHK/UPD messages according to $(quiet)
scsi: fix makefile for aic7(3*x)
kbuild/modpost: Use warn() for announcing section mismatches
Add binoffset to gitignore
kbuild/modpost: improve warnings if symbol is unknown