Commit Graph

3874 Commits

Author SHA1 Message Date
Vegard Nossum
b2a6a58ca6 x86: fix BUG: unable to handle kernel paging request (numaq_tsc_disable)
This section mismatch:

>> Seems to be a section mismatch; init_intel() is __cpuinit while
>> numaq_tsc_disable() is __init. Seems to be introduced in:
>>
>> commit 64898a8bad
>> Author: Yinghai Lu <yhlu.kernel@gmail.com>
>> Date:   Sat Jul 19 18:01:16 2008 -0700
>>
>>    x86: extend and use x86_quirks to clean up NUMAQ code
>
> Oops, I am wrong about numaq_tsc_disable() being __init. Still, I
> believe that Yinghai might be able to say what's really wrong :-)

Would lead to this crash:

  BUG: unable to handle kernel paging request at c08a45f0
  IP: [<c08a45f0>] numaq_tsc_disable+0x0/0x40

Fixed by the patch below.

Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 12:52:01 +02:00
Jeremy Fitzhardinge
7946612de2 x86: export pv_lock_ops non-GPL
None of the spinlock API is exported GPL, so there's no reason for
pv_lock_ops to be.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: drago01 <drago01@gmail.com>
2008-08-21 12:42:23 +02:00
Marcin Slusarz
7701e8c59b x86, mmiotrace: silence section mismatch warning - leave_uniprocessor
WARNING: vmlinux.o(.text+0x180af): Section mismatch in reference from the function leave_uniprocessor() to the function .cpuinit.text:cpu_up()
The function leave_uniprocessor() references
the function __cpuinit cpu_up().
This is often because leave_uniprocessor lacks a __cpuinit
annotation or the annotation of cpu_up is wrong.

leave_uniprocessor calls cpu_up only when CONFIG_HOTPLUG_CPU is set,
so it can be safely annotated as __ref

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Pekka Paalanen <pq@iki.fi>
2008-08-21 12:35:16 +02:00
Arjan van de Ven
bde78a79a6 x86: use WARN() in arch/x86/kernel
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.
This also allowed the folding of some if()'s into the WARN()

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 10:01:52 +02:00
Arjan van de Ven
0c072bb452 x86: use WARN() in arch/x86/mm/ioremap.c
Use WARN() instead of a printk+WARN_ON() pair; this way the message
becomes part of the warning section for better reporting/collection.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 09:59:27 +02:00
David Howells
85d577979a werror: fix pci calgary
Fix an integer comparison always false warning in the PCI Calgary 64 driver.

A u8 is being compared to something that's 512 by default, resulting in the
following warning:

arch/x86/kernel/pci-calgary_64.c:1285: warning: comparison is always false due to limited range of data type

This was introduced by patch b34e90b8f0.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-21 09:55:47 +02:00
Andi Kleen
80a8c9fffa x86: fix oprofile + hibernation badness
Vegard Nossum reported oprofile + hibernation problems:

> Now some warnings:
>
> ------------[ cut here ]------------
> WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 s
> mp_call_function_mask+0x194/0x1a0()

The usual problem: the suspend function when interrupts are
already disabled calls smp_call_function which is not allowed with
interrupt off. But at this point all the other CPUs should be already
down anyways, so it should be enough to just drop that.

This patch should fix that problem at least by fixing cpu hotplug&
suspend support.

[ mingo@elte.hu: fixed 5 coding style errors. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 16:18:31 +02:00
Cliff Wickman
99dd871330 x86, SGI UV: hardcode the TLB flush interrupt system vector
The UV TLB shootdown mechanism needs a system interrupt vector.

Its vector had been hardcoded as 200, but needs to moved to the reserved
system vector range so that it does not collide with some device vector.

This is still temporary until dynamic system IRQ allocation is provided.
But it will be needed when real UV hardware becomes available and runs 2.6.27.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:36:03 +02:00
Venki Pallipadi
80c5e73d60 x86: fix Xorg startup/shutdown slowdown with PAT
Rene Herman reported significant Xorg startup/shutdown slowdown due
to PAT. It turns out that the memtype list has thousands of entries.

Add cached_entry to list add routine, in order to speed up the
lookup for sequential reserve_memtype calls.

Reported-by: Rene Herman <rene.herman@keyaccess.nl>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 12:08:37 +02:00
Samuel Sieb
c6744955d0 x86: fix "kernel won't boot on a Cyrix MediaGXm (Geode)"
Cyrix MediaGXm/Cx5530 Unicorn Revision 1.19.3B has stopped
booting starting at v2.6.22.

The reason is this commit:

> commit f25f64ed5b
> Author: Juergen Beisert <juergen@kreuzholzen.de>
> Date:   Sun Jul 22 11:12:38 2007 +0200
>
>     x86: Replace NSC/Cyrix specific chipset access macros by inlined functions.

this commit activated a macro which was dormant before due to (buggy)
macro side-effects.

I've looked through various datasheets and found that the GXm and GXLV
Geode processors don't have an incrementor.

Remove the incrementor setup entirely.  As the incrementor value
differs according to clock speed and we would hope that the BIOS
configures it correctly, it is probably the right solution.

Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20 11:31:00 +02:00
Linus Torvalds
ddd13dc606 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: add acpi_find_root_bridge_handle
  PCI: acpi_pcihp: run _OSC on a root bridge
  x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
  x86/PCI: allow scanning of 255 PCI busses
  x86, pci: detect end_bus_number according to acpi/e820 reserved, v2
  pci: debug extra pci bus resources
  pci: debug extra pci resources range
2008-08-19 13:55:47 -07:00
Linus Torvalds
f607e3a03c Revert "[CPUFREQ][2/2] preregister support for powernow-k8"
This reverts commit 34ae7f35a2, which has
been reported to cause a number of problems.  During suspend and resume,
it apparently causes a crash in a CPU hotplug notifier to happen,
although the exact details are sketchy because of the inability to get
good traces during the suspend sequence.

See buzilla entries

	http://bugzilla.kernel.org/show_bug.cgi?id=11296
	http://bugzilla.kernel.org/show_bug.cgi?id=11339

for more examples and details.

[ Mark: "Revert the patch for now.  I'm still looking into getting a
  reliable reproduction and I do not have a fix at this time." ]

Requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@inux-foundation.org>
2008-08-19 13:34:59 -07:00
H. Peter Anvin
f1c5d30e1d x86: use X86_FEATURE_NOPL in alternatives
Use X86_FEATURE_NOPL to determine if it is safe to use P6 NOPs in
alternatives.  Also, replace table and loop with simple if statement.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:18 -07:00
H. Peter Anvin
7e00df5818 x86: add NOPL as a synthetic CPU feature bit
The long noops ("NOPL") are supposed to be detected by family >= 6.
Unfortunately, several non-Intel x86 implementations, both hardware
and software, don't obey this dictum.  Instead, probe for NOPL
directly by executing a NOPL instruction and see if we get #UD.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:17 -07:00
H. Peter Anvin
e2fe16d912 x86: boot: stub out unimplemented CPU feature words
The CPU feature detection code in the boot code is somewhat minimal,
and doesn't include all possible CPUID words.  In particular, it
doesn't contain the code for CPU feature words 2 (Transmeta),
3 (Linux-specific), 5 (VIA), or 7 (scattered).  Zero them out, so we
can still set those bits as known at compile time; in particular, this
allows creating a Linux-specific NOPL flag and have it required (and
therefore resolvable at compile time) in 64-bit mode.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-18 18:22:17 -07:00
Jiri Kosina
8a7c5ef3ba x86 iommu: remove unneeded parenthesis
The parenthesis in __iommu_queue_command() are not needed when assigning
into 'target' variable.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-19 02:16:24 +02:00
Linus Torvalds
a7f5aaf36d Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix build warnings in real mode code
  x86, calgary: fix section mismatch warning - get_tce_space_from_tar
  x86: silence section mismatch warning - get_local_pda
  x86, percpu: silence section mismatch warnings related to EARLY_PER_CPU variables
  x86: fix i486 suspend to disk CR4 oops
  x86: mpparse.c: fix section mismatch warning
  x86: mmconf: fix section mismatch warning
  x86: fix MP_processor_info section mismatch warning
  x86, tsc: fix section mismatch warning
  x86: correct register constraints for 64-bit atomic operations
2008-08-18 12:10:14 -07:00
Jesse Barnes
ce6754235b Merge branch 'pci-for-jesse' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into x86-merge
Conflicts:

	drivers/pci/probe.c
2008-08-18 09:54:13 -07:00
Thomas Petazzoni
8d02c2110b x86: configuration options to compile out x86 CPU support code
This patch adds some configuration options that allow to compile out
CPU vendor-specific code in x86 kernels (in arch/x86/kernel/cpu). The
new configuration options are only visible when CONFIG_EMBEDDED is
selected, as they are mostly interesting for space savings reasons.

An example of size saving, on x86 with only Intel CPU support:

   text	   data	    bss	    dec	    hex	filename
1125479	 118760	 212992	1457231	 163c4f	vmlinux.old
1121355	 116536	 212992	1450883	 162383	vmlinux
  -4124   -2224       0   -6348   -18CC +/-

However, I'm not exactly sure that the Kconfig wording is correct with
regard to !64BIT / 64BIT.

[ mingo@elte.hu: convert macro to inline ]

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:48 +02:00
Thomas Petazzoni
774400a3ba x86: move cmpxchg fallbacks to a generic place
arch/x86/kernel/cpu/intel.c defines a few fallback functions
(cmpxchg_*()) that are used when the CPU doesn't support cmpxchg
and/or cmpxchg64 natively. However, while defined in an Intel-specific
file, these functions are also used for CPUs from other vendors when
they don't support cmpxchg and/or cmpxchg64. This breaks the
compilation when support for Intel CPUs is disabled.

This patch moves these functions to a new
arch/x86/kernel/cpu/cmpxchg.c file, unconditionally compiled when
X86_32 is enabled.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:47 +02:00
Thomas Petazzoni
8bfcb3960f x86: make movsl_mask definition non-CPU specific
movsl_mask is currently defined in arch/x86/kernel/cpu/intel.c, which
contains code specific to Intel CPUs. However, movsl_mask is used in
the non-CPU specific code in arch/x86/lib/usercopy_32.c, which breaks
the compilation when support for Intel CPUs is compiled out.

This patch solves this problem by moving movsl_mask's definition close
to its users in arch/x86/lib/usercopy_32.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 16:05:45 +02:00
Andi Kleen
1b72691ce3 x86: fix build warnings in real mode code
This recent patch

commit c3965bd151
Author: Paul Jackson <pj@sgi.com>
Date:   Wed May 14 08:15:34 2008 -0700

    x86 boot: proper use of ARRAY_SIZE instead of repeated E820MAX constant

caused these new warnings during a normal build:

In file included from linux-2.6/arch/x86/boot/memory.c:17:
linux-2.6/include/linux/log2.h: In function '__ilog2_u32':
linux-2.6/include/linux/log2.h:34: warning: implicit declaration of function 'fls'
linux-2.6/include/linux/log2.h: In function '__ilog2_u64':
linux-2.6/include/linux/log2.h:42: warning: implicit declaration of function 'fls64'
linux-2.6/include/linux/log2.h: In function '__roundup_pow_of_two ':
linux-2.6/include/linux/log2.h:63: warning: implicit declaration of function 'fls_long'

I tried to fix them in log2.h, but it's difficult because the real mode
environment is completely different from a normal kernel environment. Instead
define an own ARRAY_SIZE macro in boot.h, similar to the other private
macros there.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:20:14 +02:00
Marcin Slusarz
f71066624d x86, calgary: fix section mismatch warning - get_tce_space_from_tar
WARNING: vmlinux.o(.text+0x27032): Section mismatch in reference from the function get_tce_space_from_tar() to the function .init.text:calgary_bus_has_devices()
The function get_tce_space_from_tar() references
the function __init calgary_bus_has_devices().
This is often because get_tce_space_from_tar lacks a __init
annotation or the annotation of calgary_bus_has_devices is wrong.

get_tce_space_from_tar is called only from __init function (calgary_init)
and calls __init function (calgary_bus_has_devices).
So annotate it properly.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Chandru Siddalingappa <chandru@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:10:57 +02:00
Marcin Slusarz
d19fbfdfe6 x86: silence section mismatch warning - get_local_pda
Take out part of get_local_pda referencing __init function (free_bootmem)
to new (static) function marked as __ref. It's safe to do because free_bootmem
is called before __init sections are dropped.

WARNING: vmlinux.o(.cpuinit.text+0x3cd7): Section mismatch in reference from the function get_local_pda() to the function .init.text:free_bootmem()
The function __cpuinit get_local_pda() references
a function __init free_bootmem().
If free_bootmem is only used by get_local_pda then
annotate free_bootmem with a matching annotation.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 09:10:56 +02:00
David Fries
e532c06f2a x86: fix i486 suspend to disk CR4 oops
arch/x86/power/cpu_32.c __save_processor_state calls read_cr4()
only a i486 CPU doesn't have the CR4 register.  Trying to read it
produces an invalid opcode oops during suspend to disk.

Use the safe rc4 reading op instead. If the value to be written is
zero the write is skipped.

arch/x86/power/hibernate_asm_32.S
done: swapped the use of %eax and %ecx to use jecxz for
the zero test and jump over store to %cr4.
restore_image: s/%ecx/%eax/ to be consistent with done:

In addition to __save_processor_state, acpi_save_state_mem,
efi_call_phys_prelog, and efi_call_phys_epilog had checks added
(acpi restore was in assembly and already had a check for
non-zero).  There were other reads and writes of CR4, but MCE and
virtualization shouldn't be executed on a i486 anyway.

Signed-off-by: David Fries <david@fries.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 08:50:19 +02:00
Marcin Slusarz
39e00fe20a x86: mpparse.c: fix section mismatch warning
WARNING: vmlinux.o(.text+0x118f7): Section mismatch in reference from the function construct_ioapic_table() to the function .init.text:MP_bus_info()
The function construct_ioapic_table() references
the function __init MP_bus_info().
This is often because construct_ioapic_table lacks a __init
annotation or the annotation of MP_bus_info is wrong.

construct_ioapic_table is called only from construct_default_ISA_mptable which is __init

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:49:26 +02:00
Marcin Slusarz
c72a5efec1 x86: mmconf: fix section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi()
The function __cpuinit init_amd() references
a function __init check_enable_amd_mmconf_dmi().
If check_enable_amd_mmconf_dmi is only used by init_amd then
annotate check_enable_amd_mmconf_dmi with a matching annotation.

check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:49:06 +02:00
Marcin Slusarz
67d0c9ebdc x86: fix MP_processor_info section mismatch warning
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1fe7): Section mismatch in reference from the function MP_processor_info() to the variable .init.data:x86_quirks
The function __cpuinit MP_processor_info() references
a variable __initdata x86_quirks.
If x86_quirks is only used by MP_processor_info then
annotate x86_quirks with a matching annotation.

MP_processor_info uses x86_quirks which is __init and is used only from
smp_read_mpc and construct_default_ISA_mptable which are __init

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:48:40 +02:00
Marcin Slusarz
d554d9a429 x86, tsc: fix section mismatch warning
WARNING: vmlinux.o(.text+0x7950): Section mismatch in reference from the function native_calibrate_tsc() to the function .init.text:tsc_read_refs()
The function native_calibrate_tsc() references
the function __init tsc_read_refs().
This is often because native_calibrate_tsc lacks a __init
annotation or the annotation of tsc_read_refs is wrong.

tsc_read_refs is called from native_calibrate_tsc which is not __init
and native_calibrate_tsc cannot be marked __init

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-18 07:48:07 +02:00
Linus Torvalds
0473b79929 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: add MAP_STACK mmap flag
  x86: fix section mismatch warning - spp_getpage()
  x86: change init_gdt to update the gdt via write_gdt, rather than a direct write.
  x86-64: fix overlap of modules and fixmap areas
  x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
  x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set
  x86: fix spin_is_contended()
  x86, nmi: clean UP NMI watchdog failure message
  x86, NMI: fix watchdog failure message
  x86: fix /proc/meminfo DirectMap
  x86: fix readb() et al compile error with gcc-3.2.3
  arch/x86/Kconfig: clean up, experimental adjustement
  x86: invalidate caches before going into suspend
  x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds
  x86, AMD IOMMU: initialize dma_ops after sysfs registration
  x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
  x86, AMD IOMMU: initialize device table properly
  x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
  x86: silence mmconfig printk
  x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs
  ...
2008-08-16 17:14:07 -07:00
Linus Torvalds
40a3426640 Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6:
  cpuidle: Make ladder governor honor latency requirements fully
  cpuidle: Menu governor fix wrong usage of measured_us
  cpuidle: Do not use poll_idle unless user asks for it
  x86: Fix ioremap off by one BUG
2008-08-15 12:47:16 -07:00
Marcin Slusarz
8d6ea9674c x86: fix section mismatch warning - spp_getpage()
WARNING: vmlinux.o(.text+0x17a3e): Section mismatch in reference from the function set_pte_vaddr_pud() to the function .init.text:spp_getpage()
The function set_pte_vaddr_pud() references
the function __init spp_getpage().
This is often because set_pte_vaddr_pud lacks a __init
annotation or the annotation of spp_getpage is wrong.

spp_getpage is called from __init (__init_extra_mapping) and
non __init (set_pte_vaddr_pud) functions, so it can't be __init.
Unfortunately it calls alloc_bootmem_pages which is __init,
but does it only when bootmem allocator is available (after_bootmem == 0).

So annotate it accordingly.

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
2008-08-15 19:16:06 +02:00
Alex Nixon
fc0091b3c8 x86: change init_gdt to update the gdt via write_gdt, rather than a direct write.
By writing directly, a memory access violation can occur whilst
hotplugging a CPU if the entry was previously marked read-only.

Signed-off-by: Alex Nixon <alex.nixon@citrix.com>
Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 19:16:05 +02:00
Andi Kleen
e213e87785 x86: Fix ioremap off by one BUG
Jean Delvare's machine triggered this BUG

acpi_os_map_memory phys ffff0000 size 65535
------------[ cut here ]------------
kernel BUG at arch/x86/mm/pat.c:233!

with ACPI in the backtrace.

Adding some debugging output showed that ACPI calls

acpi_os_map_memory phys ffff0000 size 65535

And ioremap/PAT does this check in 32bit, so addr+size wraps and the BUG
in reserve_memtype() triggers incorrectly.

        BUG_ON(start >= end); /* end is exclusive */

But reserve_memtype already uses u64:

int reserve_memtype(u64 start, u64 end,

so the 32bit truncation must happen in the caller. Presumably in ioremap
when it passes this information to reserve_memtype().

This patch does this computation in 64bit.

http://bugzilla.kernel.org/show_bug.cgi?id=11346

Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-08-15 18:18:38 +02:00
Seth Heasley
89499759dc x86/PCI: irq and pci_ids patch for Intel Ibex Peak PCHs
This patch adds the Intel Ibex Peak (PCH) LPC and SMBus Controller DeviceIDs.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-15 09:15:47 -07:00
Ingo Molnar
529d0e402e Merge branch 'x86/geode' into x86/urgent 2008-08-15 17:53:07 +02:00
Huang Ying
3122c33119 kexec jump: fix for ftrace
Ftrace depends on some processor state that we destroyed during kexec and
restored by restore_processor_state().  So save_processor_state() and
restore_processor_state() are moved into machine_kexec() and ftrace is
restored after restore_processor_state().

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:43 -07:00
Huang Ying
fb45daa69d kexec jump: check code size in control page
Kexec/Kexec-jump require code size in control page is less than
PAGE_SIZE/2.  This patch add link-time checking for this.

ASSERT() of ld link script is used as the link-time checking mechanism.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Huang Ying
163f6876f5 kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because control
page is used for not only code on some platform.  For example in kexec
jump, it is used for data and stack too.

[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Jan Beulich
66d4bdf22b x86-64: fix overlap of modules and fixmap areas
Plus add a build time check so this doesn't go unnoticed again.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:31:50 +02:00
Alexey Dobriyan
c9c3dddd8f x86_64: remove empty lines from stack traces/oopses
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: ak@suse.de
Cc: akpm@osdl.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:26:12 +02:00
Jens Rottmann
0d5cdc97e2 x86, geode-mfgpt: check IRQ before using MFGPT as clocksource
Adds a simple IRQ autodetection to the AMD Geode MFGPT driver, and more
importantly, adds some checks, if IRQs can actually be received on the
chosen line.  This fixes cases where MFGPT is selected as clocksource
though not producing any ticks, so the kernel simply starves during
boot.

Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Cc: Andres Salomon <dilinger@debian.org>
Cc: linux-geode@bombadil.infradead.org
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 17:12:32 +02:00
Marcin Slusarz
9744f5a328 x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set
fix:

  arch/x86/kernel/acpi/sleep.c:24: warning: 'temp_stack' defined but not used

[ Sven Wegener <sven.wegener@stealer.net>: fix build bug ]

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 16:41:21 +02:00
Ingo Molnar
1a10390708 Merge branch 'linus' into x86/cpu 2008-08-15 16:16:15 +02:00
Ingo Molnar
8bb851900f x86, nmi: clean UP NMI watchdog failure message
clean up the failure message - and redirect people to bugzilla
instead of lkml.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 15:35:31 +02:00
Aristeu Rozanski
1563666844 x86, NMI: fix watchdog failure message
> it just won't work at boot time - the second logic unit will be stuck:
>
> Booting processor 1/2 APIC 0x1
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU1: Thermal monitoring enabled (TM1)
>               Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
> Brought up 2 CPUs
> testing NMI watchdog ... <4>WARNING: CPU#1: NMI appears to be stuck (0->0)!

while at it... - fix that newline

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: jvillalo@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 15:34:59 +02:00
Hugh Dickins
a06de63000 x86: fix /proc/meminfo DirectMap
Do we actually want these DirectMap lines in the x86 /proc/meminfo?
I can see they're interesting to CPA developers and TLB optimizers,
but they don't fit its usual "where has all my memory gone?" usage.
If they are to stay, here are some fixes.

1. On x86_32 without PAE, they're not 2M but 4M pages: no need to
   mess with the internal enum, but show the right name to users.

2. Many machines can never show anything but 0 for DirectMap1G,
   so suppress that line unless direct_gbpages are really enabled.

3. The unit in /proc/meminfo is kB not number of pages: HugePages
   messed that up, but they're an example to regret not to follow.

4. Once we use kB, it's easy to see that 1GB has gone missing (which
   explains why CONFIG_CPA_DEBUG=y soon wraps DirectMap2M negative):
   because head_64.S's level2_ident_pgt entries were not counted.
   My fix is not ideal, but works for more and for less than 1G,
   and avoids interfering with early bootup pagetable contortions.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 15:27:55 +02:00
Pavel Machek
04b69447f7 arch/x86/Kconfig: clean up, experimental adjustement
Adjust experimental tags in Kconfig, update config to notice that
i386/x86_64 is now single architecture.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:06:54 +02:00
Mark Langsdorf
394a15051c x86: invalidate caches before going into suspend
When a CPU core is shut down, all of its caches need to be flushed
to prevent stale data from causing errors if the core is resumed.
Current Linux suspend code performs an assignment after the flush,
which can add dirty data back to the cache.  On some AMD platforms,
additional speculative reads have caused crashes on resume because
of this dirty data.

Relocate the cache flush to be the very last thing done before
halting.  Tie into an assembly line so the compile will not
reorder it.  Add some documentation explaining what is going
on and why we're doing this.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Acked-by: Mark Borden <mark.borden@amd.com>
Acked-by: Michael Hohmuth <michael.hohmuth@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 14:04:30 +02:00
Aristeu Rozanski
dcc9841668 x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds
Currently, setup_p4_watchdog() use CCCR_OVF_PMI1 to enable the counter
overflow interrupts to the second logical core. But this bit doesn't work
on Pentium 4 Ds (model 4, stepping 4) and this patch avoids its use on
these processors. Tested on 4 different machines that have this
specific model with success.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: jvillalovos@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:58:33 +02:00
Ingo Molnar
975439fe73 Merge branch 'x86/amd-iommu' into x86/urgent 2008-08-15 13:57:32 +02:00
Joerg Roedel
129d6aba44 x86, AMD IOMMU: initialize dma_ops after sysfs registration
If sysfs registration fails all memory used by IOMMU is freed. This
happens after dma_ops initialization and the functions will access the
freed memory then.

Fix this by initializing dma_ops after the sysfs registration.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:56 +02:00
Joerg Roedel
8a456695c5 x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:56 +02:00
Joerg Roedel
9f5f5fb35d x86, AMD IOMMU: initialize device table properly
This patch adds device table initializations which forbids memory accesses
for devices per default and disables all page faults.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:54 +02:00
Joerg Roedel
519c31bacf x86, AMD IOMMU: use status bit instead of memory write-back for completion wait
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:56:46 +02:00
Dave Jones
ef31023743 x86: silence mmconfig printk
There's so much broken mmconfig hardware/bios'es out there,
that classing this as an error seems a little extreme.
Lower its priority to KERN_INFO so that it isn't so noisy
when booting with 'quiet'

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:52:39 +02:00
Darrick J. Wong
967060d00d x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs
msr_open tests for someone trying to open a device for a nonexistent CPU.
However, the function always returns 0, not ret like it should, hence
userspace can BUG the kernel trivially.  This bug was introduced by the
cdev lock_kernel pushdown patch last May.

The BUG can be reproduced with these commands:

# mknod fubar c 202 8 <-- pick a number less than NR_CPUS that is not
                          the number of an online CPU
# cat fubar

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-15 13:38:30 +02:00
Ingo Molnar
881b374705 Merge branch 'x86/apic' into x86/core 2008-08-14 15:13:47 +02:00
Ingo Molnar
c83d12806b Merge branches 'x86/prototypes', 'x86/x2apic' and 'x86/debug' into x86/core 2008-08-14 14:58:22 +02:00
Ingo Molnar
51ca3c6791 Merge branch 'linus' into x86/core
Conflicts:
	arch/x86/kernel/genapic_64.c
	include/asm-x86/kvm_host.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 14:58:01 +02:00
Thomas Gleixner
a6825f1c1f x86: hpet: workaround SB700 BIOS
AMD SB700 based systems with spread spectrum enabled use a SMM based
HPET emulation to provide proper frequency setting. The SMM code is
initialized with the first HPET register access and takes some time to
complete. During this time the config register reads 0xffffffff. We
check for max. 1000 loops whether the config register reads a non
0xffffffff value to make sure that HPET is up and running before we go
further. A counting loop is safe, as the HPET access takes thousands
of CPU cycles. On non SB700 based machines this check is only done
once and has no side effects.

Based on a quirk patch from: crane cai <crane.cai@amd.com>

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-08-14 13:23:45 +02:00
Ingo Molnar
8d7ccaa545 Merge commit 'v2.6.27-rc3' into x86/prototypes
Conflicts:

	include/asm-x86/dma-mapping.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 12:19:59 +02:00
Yinghai Lu
a58f03b075 x86: check bigsmp in smp_sanity_check instead of cpu_up
clear bits for cpu nr > 8.

This allows us to boot the full range of possible CPUs that the
supported APIC model will allow. Previously we'd hang or boot up
with less than 8 CPUs.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 11:35:53 +02:00
Yinghai Lu
858f774733 x86: don't call e820_regiter_active_regions if out of range on node
so we don't get warning on 32bit system with 64g RAM or more

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 11:35:52 +02:00
Max Krasnyansky
23b49c19f6 x86: resurrect proper handling of maxcpus= kernel option (v2)
For some reason we had two parsers registered for maxcpus=. One in init/main.c
and another in arch/x86/smpboot.c. So I nuked the one in arch/x86.

Also 64-bit kernels used to handle maxcpus= as documented in
Documentation/cpu-hotplug.txt. CPUs with 'id > maxcpus' are initialized
but not booted. 32-bit version for some reason ignored them even though
all the infrastructure for booting them later is there.

In the current mainline both 64 and 32 bit versions are broken.
This patch restores the correct behaviour. I've tested x86_64 version on
4- and 8- way Core2 and 2-way Opteron based machines. Various config
combinations SMP, !SMP, CPU_HOTPLUG, !CPU_HOTPLUG.
Booted with maxcpus=1 and maxcpus=4, etc. Everything is working as expected.

So far we've received two reports from different people confirming that 32-bit
version also works fine, both on dual core laptops and 16way server machines.

[v2: This version fixes visws breakage pointed out by Ingo.]

Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Cc: lizf@cn.fujitsu.com
Cc: jeff.chua.linux@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 11:18:08 +02:00
Ingo Molnar
3167761965 Merge branch 'x86/fpu' into x86/urgent 2008-08-14 11:18:08 +02:00
Suresh Siddha
f65bc214e0 x86, xsave: use BUG_ON() instead of BUILD_BUG_ON()
All these structure sizes are runtime determined. So use a runtime
bug check.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 10:56:07 +02:00
Suresh Siddha
ed40595805 x86, xsave: clear the user buffer before doing fxsave/xsave
fxsave/xsave instructions will not touch all the bytes in the
fxsave/xsave frame. Clear the user buffer before doing fxsave/xsave
directly to user buffer during the sigcontext setup.

This is essentially needed in the context of xsave(for example,
some of the fields in the xsave header are not touched by the xsave
and defined as must be zero).

This will also present uniform and clean context to the user (from
which user can safely do fxrstor/xrstor).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 10:56:06 +02:00
Suresh Siddha
ee2b92a820 x86, xsave: remove the redundant access_ok() in setup_rt_frame()
save_i387_xstate() is already doing the required access_ok(). Remove
the redundant access_ok() before it.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 10:56:06 +02:00
Ingo Molnar
d4439087d3 Merge commit 'v2.6.27-rc3' into x86/xsave
Conflicts:

	arch/x86/kernel/genapic_64.c
	include/asm-x86/kvm_host.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 10:55:26 +02:00
H. Peter Anvin
c2dcfde827 x86: cleanup for setup code crashes during IST probe
Clean up the code for crashes during SpeedStep probing on older
machines.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-14 00:13:52 +02:00
Arjan van de Ven
875e40b975 x86: use WARN() in arch/x86/mm/pageattr.c
Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes
part of the warning section for better reporting/collection.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: akpm@linux-foundation.org
Cc: arjan@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 19:05:39 +02:00
John Keller
a726c6009e x86: allow MMCONFIG above 4GB on x86_64
SGI UV will have MMCFG base addresses that are greater than 4GB (32 bits).

v2: Use CONFIG_RESOURCES_64BIT instead of CONFIG_X86_64.
v3: Create a flag, that is set by platform specific code,
    to disable the > 4GB check.

Signed-off-by: John Keller <jpk@sgi.com>
Cc: jpk@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 17:48:13 +02:00
Marcin Slusarz
6b3560229d x86: fix 2 section mismatch warnings - find_and_reserve_crashkernel
WARNING: vmlinux.o(.text+0xcd1f): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:find_e820_area()
The function find_and_reserve_crashkernel() references
the function __init find_e820_area().
This is often because find_and_reserve_crashkernel lacks a __init
annotation or the annotation of find_e820_area is wrong.

WARNING: vmlinux.o(.text+0xcd38): Section mismatch in reference from the function find_and_reserve_crashkernel() to the function .init.text:reserve_bootmem_generic()
The function find_and_reserve_crashkernel() references
the function __init reserve_bootmem_generic().
This is often because find_and_reserve_crashkernel lacks a __init
annotation or the annotation of reserve_bootmem_generic is wrong.

find_and_reserve_crashkernel is called from __init function (reserve_crashkernel)
and calls 2 __init functions (find_e820_area, reserve_bootmem_generic),
so mark it __init

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 17:48:12 +02:00
Marcin Slusarz
c9d08f0860 x86: fix 2 section mismatch warnings - map_high()
WARNING: vmlinux.o(.text+0x14cf8): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_uc()
The function map_high() references
the function __init init_extra_mapping_uc().
This is often because map_high lacks a __init
annotation or the annotation of init_extra_mapping_uc is wrong.

WARNING: vmlinux.o(.text+0x14d05): Section mismatch in reference from the function map_high() to the function .init.text:init_extra_mapping_wb()
The function map_high() references
the function __init init_extra_mapping_wb().
This is often because map_high lacks a __init
annotation or the annotation of init_extra_mapping_wb is wrong.

map_high is called only from __init functions (map_*_high)
and calls 2 __init_functions (init_extra_mapping_*)

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 13:09:49 +02:00
Ingo Molnar
a12e61df4f Merge commit 'v2.6.27-rc3' into x86/urgent 2008-08-13 13:08:47 +02:00
Joerg Roedel
7b27718bdb x86: fix setup code crashes on my old 486 box
yesterday I tried to reactivate my old 486 box and wanted to install a
current Linux with latest kernel on it. But it turned out that the
latest kernel does not boot because the machine crashes early in the
setup code.

After some debugging it turned out that the problem is the query_ist()
function. If this interrupt with that function is called the machine
simply locks up. It looks like a BIOS bug. Looking for a workaround for
this problem I wrote the attached patch. It checks for the CPUID
instruction and if it is not implemented it does not call the speedstep
BIOS function. As far as I know speedstep should be available since some
Pentium earliest.

Alan Cox observed that it's available since the Pentium II, so cpuid
levels 4 and 5 can be excluded altogether.

H. Peter Anvin cleaned up the code some more:

> Right in concept, but I dislike the implementation (duplication of the
> CPU detect code we already have).  Could you try this patch and see if
> it works for you?

which, with a small modification to fix a build error with it the
resulting kernel boots on my machine.

Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13 11:59:18 +02:00
Linus Torvalds
1c89ac5501 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  fix spinlock recursion in hvc_console
  stop_machine: remove unused variable
  modules: extend initcall_debug functionality to the module loader
  export virtio_rng.h
  lguest: use get_user_pages_fast() instead of get_user_pages()
  mm: Make generic weak get_user_pages_fast and EXPORT_GPL it
  lguest: don't set MAC address for guest unless specified
2008-08-12 08:40:19 -07:00
Rusty Russell
912985dce4 mm: Make generic weak get_user_pages_fast and EXPORT_GPL it
Out of line get_user_pages_fast fallback implementation, make it a weak
symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST.

Export the symbol to modules so lguest can use it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-08-12 17:52:53 +10:00
Linus Torvalds
7019b1b500 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix 2.6.27rc1 cannot boot more than 8CPUs
  x86: make "apic" an early_param() on 32-bit, NULL check
  EFI, x86: fix function prototype
  x86, pci-calgary: fix function declaration
  x86: work around gcc 3.4.x bug
  x86: make "apic" an early_param() on 32-bit
  x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk
  x86_64: restore the proper NR_IRQS define so larger systems work.
  x86: Restore proper vector locking during cpu hotplug
  x86: Fix broken VMI in 2.6.27-rc..
  x86: fdiv bug detection fix
2008-08-11 16:44:35 -07:00
Andi Kleen
9dd1e9eb5c x86/PCI: allow scanning of 255 PCI busses
Fix an old off by one error in the legacy PCI bus check. 0xff
is a valid bus.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-11 15:23:50 -07:00
Yinghai Lu
b74548e76a x86: fix 2.6.27rc1 cannot boot more than 8CPUs
Jeff Chua reported that booting a !bigsmp kernel on a 16-way box
hangs silently.

this is a long-standing issue, smp start AP cpu could check the
apic id >=8 etc before trying to start it.

achieve this by moving the def_to_bigsmp check later and skip the
apicid id > 8

[ mingo@elte.hu: clean up the message that is printed. ]

Reported-by: "Jeff Chua" <jeff.chua.linux@gmail.com>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

 arch/x86/kernel/setup.c   |    6 ------
 arch/x86/kernel/smpboot.c |   10 ++++++++++
 2 files changed, 10 insertions(+), 6 deletions(-)
2008-08-11 22:42:59 +02:00
Andreas Herrmann
b55793f752 x86: cpu_init(): fix memory leak when using CPU hotplug
Exception stacks are allocated each time a CPU is set online.
But the allocated space is never freed. Thus with one CPU hotplug
offline/online cycle there is a memory leak of 24K (6 pages) for
a CPU.

Fix is to allocate exception stacks only once -- when the CPU is
set online for the first time.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 21:39:12 +02:00
Andreas Herrmann
49800efcb1 x86: pda_init(): fix memory leak when using CPU hotplug
pda->irqstackptr is allocated whenever a CPU is set online.
But it is never freed. This results in a memory leak of 16K
for each CPU offline/online cycle.

Fix is to allocate pda->irqstackptr only once.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 21:39:11 +02:00
Rene Herman
48d97cb65e x86: make "apic" an early_param() on 32-bit, NULL check
Cyrill Gorcunov observed:

> you turned it into early_param so now it's NULL injecting vulnerabled.
> Could you please add checking for NULL str param?

fix that.

Also, change the name of 'str' into 'arg', to make it more apparent
that this is an optional argument that can be NULL, not a string
parameter that is empty when unset.

Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 19:40:38 +02:00
Randy Dunlap
9b0094f7f2 x86, pci-calgary: fix function declaration
Fix function declaration:

 linux-next-20080807/arch/x86/kernel/pci-calgary_64.c:1353:36: warning: non-ANSI function declaration of function 'get_tce_space_from_tar'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 18:48:45 +02:00
Jeremy Fitzhardinge
cf3e505012 x86: work around gcc 3.4.x bug
Simon Horman reported that gcc-3.4.x crashes when compiling
pgd_prepopulate_pmd() when PREALLOCATED_PMDS == 0 and CONFIG_DEBUG_INFO
is enabled.

Adding an extra check for PREALLOCATED_PMDS == 0 [which is compiled out
by gcc] seems to avoid the problem.

Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 18:44:02 +02:00
Rene Herman
fb6bef8002 x86: make "apic" an early_param() on 32-bit
On 32-bit, "apic" is a __setup() param meaning it is parsed rather
late in the game. Make it an early_param() for apic_printk() use
by arch/x86/kernel/mpparse.c.

On 64-bit, it already is an early_param().

Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 18:36:04 +02:00
Rene Herman
eeb0d7d113 x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk
commit 11a62a0560 turns some formerly
nopped debugging printks in arch/x86/kernel/mppparse.c into regular
ones. The one at the top of smp_scan_config() in particular also
prints on !CONFIG_SMP/CONFIG_X86_LOCAL_APIC kernels and UP machines
without anything resembling MP tables which makes their lowly UP
owners wonder...

Turn the former Dprintk()s into apic_printk()s instead meaning that
their printing is dependent on passing the apic=verbose (or =debug)
command line param.

On 32-bit, "apic" is a __setup() param which isn't early enough
for this code and therefore needs a followup changing it into an
early_param(). On 64-bit, it already is.

Signed-off-by: Rene Herman <rene.herman@gmail.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 18:36:03 +02:00
Cyrill Gorcunov
2ae111cdd8 x86: apic interrupts - move assignments to irqinit_32.c, v2
64bit mode APIC interrupt handlers are set within irqinit_64.c.
Lets do tha same for 32bit mode which would help in furter code merging.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 16:43:09 +02:00
Ingo Molnar
8067794bec Merge branch 'linus' into x86/x2apic
Conflicts:

	arch/x86/kernel/genapic_64.c

Manual merge:

	arch/x86/kernel/genx2apic_uv_x.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11 11:19:20 +02:00
Eric W. Biederman
d388e5fdc4 x86: Restore proper vector locking during cpu hotplug
Having cpu_online_map change during assign_irq_vector can result
in some really nasty and weird things happening.  The one that
bit me last time was accessing non existent per cpu memory for non
existent cpus.

This locking was removed in a sloppy x86_64 and x86_32 merge patch.

Guys can we please try and avoid subtly breaking x86 when we are
merging files together?

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-11 10:37:34 +02:00
Alok Kataria
31343d8a50 x86: Fix broken VMI in 2.6.27-rc..
The lowmem mapping table created by VMI need not depend on max_low_pfn
at all.  Instead we now create an extra large mapping which covers all
possible lowmem instead of the physical ram that is actually available.

This allows the vmi initialization to be done before max_low_pfn could
be computed. We also move the vmi_init code very early in the boot process
so that nobody accidentally breaks the fixmap dependancy.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-08 15:22:02 -07:00
Mark Langsdorf
34ae7f35a2 [CPUFREQ][2/2] preregister support for powernow-k8
This patch provides support for the _PSD ACPI object in the Powernow-k8
driver.  Although it looks like an invasive patch, most of it is
simply the consequence of turning the static acpi_performance_data
structure into a pointer.

AMD has tested it on several machines over the past few days without issue.

[trivial checkpatch warnings fixed up by davej]
[X86_POWERNOW_K8_ACPI=n buildfix from Randy Dunlap]

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Tested-by: Frank Arnold <frank.arnold@amd.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-08-08 16:00:49 -04:00
Mark Langsdorf
23431b495f [CPUFREQ][1/2] whitespace fix for powernow-k8
Trivial whitespace fix for powernow-k8.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-08-08 16:00:49 -04:00
Dave Jones
460f5ef283 [CPUFREQ] Fix warning in elanfreq
arch/x86/kernel/cpu/cpufreq/elanfreq.c:47:26: warning: symbol 'elan_multiplier' was not declared. Should it be static?

Yes, yes it should.

Signed-off-by: Dave Jones <davej@redhat.com>
2008-08-08 16:00:48 -04:00
Dave Jones
ec983f7060 [CPUFREQ] Remove EXPERIMENTAL annotation from VIA C7 powersaver kconfig.
This has been pretty solid, and doesn't see much change at all.

Noticed by Harald Welte.

Signed-off-by: Dave Jones <davej@redhat.com>
2008-08-08 16:00:48 -04:00
Linus Torvalds
84ff7a0012 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: s390: Fix kvm on IBM System z10
  KVM: Advertise synchronized mmu support to userspace
  KVM: Synchronize guest physical memory map to host virtual memory map
  KVM: Allow browsing memslots with mmu_lock
  KVM: Allow reading aliases with mmu_lock
2008-08-01 12:48:16 -07:00
Linus Torvalds
57b1494d2b Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  generic, x86: fix add iommu_num_pages helper function
  x86: remove stray <6> in BogoMIPS printk
  x86: move dma32_reserve_bootmem() after reserve_crashkernel()
2008-08-01 10:28:17 -07:00
Krzysztof Helt
e0d22d03c0 x86: fdiv bug detection fix
The fdiv detection code writes s32 integer into
the boot_cpu_data.fdiv_bug.
However, the boot_cpu_data.fdiv_bug is only char (s8)
field so the detection overwrites already set fields for
other bugs, e.g. the f00f bug field.

Use local s32 variable to receive result.

This is a partial fix to Bugzilla #9928  - fixes wrong
information about the f00f bug (tested) and probably
for coma bug (I have no cpu to test this).

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-31 23:56:27 +02:00
Hiroshi Shimamoto
7ab6af7ab6 x86_32: use apic_ops at print_local_APIC()
Use apic_icr_read at print_local_APIC() in io_apic_32.c

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
2008-07-31 23:54:50 +02:00
Yinghai Lu
a677f58a8c x86: print per_cpu data address
to make sure per_cpu data on correct node.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-31 23:20:32 +02:00
H. Peter Anvin
6152e4b1c9 x86, xsave: keep the XSAVE feature mask as an u64
The XSAVE feature mask is a 64-bit number; keep it that way, in order
to avoid the mistake done with rdmsr/wrmsr.  Use the xsetbv() function
provided in the previous patch.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:50:35 +02:00
Suresh Siddha
42deec6f2c x86, xsave: update xsave header bits during ptrace fpregs set
FP/SSE bits may be zero in the xsave header(representing the init state).
Update these bits during the ptrace fpregs set operation, to indicate the
non-init state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:28 +02:00
Suresh Siddha
c37b5efea4 x86, xsave: save/restore the extended state context in sigframe
On cpu's supporting xsave/xrstor, fpstate pointer in the sigcontext, will
include the extended state information along with fpstate information. Presence
of extended state information is indicated by the presence
of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2
at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE).

Extended feature bit mask that is saved in the memory layout is represented
by the fpstate.sw_reserved.xstate_bv

For RT signal frames, UC_FP_XSTATE in the uc_flags also indicate the
presence of extended state information in the sigcontext's fpstate
pointer.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:27 +02:00
Suresh Siddha
ab5137015f x86, xsave: reorganization of signal save/restore fpstate code layout
move 64bit routines that saves/restores fpstate in/from user stack from
signal_64.c to xsave.c

restore_i387_xstate() now handles the condition when user passes
NULL fpstate.

Other misc changes for prepartion of xsave/xrstor sigcontext support.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:26 +02:00
Suresh Siddha
3c1c7f1014 x86, xsave: dynamically allocate sigframes fpstate instead of static allocation
dynamically allocate fpstate on the stack, instead of static allocation
in the current sigframe layout on the user stack. This will allow the
fpstate structure to grow in the future, which includes extended state
information supporting xsave/xrstor.

signal handlers will be able to access the fpstate pointer from the
sigcontext structure asusual, with no change. For the non RT sigframe's
(which are supported only for 32bit apps), current static fpstate layout
in the sigframe will be unused(so that we don't change the extramask[]
offset in the sigframe and thus prevent breaking app's which modify
extramask[]).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:25 +02:00
Suresh Siddha
b359e8a434 x86, xsave: context switch support using xsave/xrstor
Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch
when available.

Introduces TS_XSAVE flag, which determine the need to use xsave/xrstor
instructions during context switch instead of the legacy fxsave/fxrstor
instructions. Thread-synchronous status word is already in L1 cache during
this code patch and thus minimizes the performance penality compared to
(cpu_has_xsave) checks.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:24 +02:00
Suresh Siddha
dc1e35c6e9 x86, xsave: enable xsave/xrstor on cpus with xsave support
Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have
the xsave support. For now, features that OS supports/enabled are
FP and SSE.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:24 +02:00
Suresh Siddha
a648bf4632 x86, xsave: xsave cpuid feature bits
Add xsave CPU feature bits.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:49:23 +02:00
Ingo Molnar
15dd859cac Merge commit 'v2.6.27-rc1' into x86/core
Conflicts:

	include/asm-x86/dma-mapping.h
	include/asm-x86/namei.h
	include/asm-x86/uaccess.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30 19:33:48 +02:00
Ingo Molnar
b2d9d33412 Merge branch 'x86/fpu' into x86/core 2008-07-30 19:32:39 +02:00
Vitaly Mayatskikh
afd962a9e8 x86: wrong register was used in align macro
New ALIGN_DESTINATION macro has sad typo: r8d register was used instead
of ecx in fixup section. This can be considered as a regression.

Register ecx was also wrongly loaded with value in r8d in
copy_user_nocache routine.

Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-30 10:10:39 -07:00
Jack Steiner
0d39741a27 GRU Driver: export is_uv_system(), zap_page_range() & follow_page()
Exports needed by the GRU driver.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-30 09:41:48 -07:00
FUJITA Tomonori
8978b74253 generic, x86: fix add iommu_num_pages helper function
This IOMMU helper function doesn't work for some architectures:

  http://marc.info/?l=linux-kernel&m=121699304403202&w=2

It also breaks POWER and SPARC builds:

  http://marc.info/?l=linux-kernel&m=121730388001890&w=2

Currently, only x86 IOMMUs use this so let's move it to x86 for
now.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29 12:12:48 +02:00
Ingo Molnar
35780c8ea7 Merge commit 'v2.6.27-rc1' into x86/urgent 2008-07-29 12:10:50 +02:00
Avi Kivity
ed84862433 KVM: Advertise synchronized mmu support to userspace
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29 12:34:02 +03:00
Andrea Arcangeli
e930bffe95 KVM: Synchronize guest physical memory map to host virtual memory map
Synchronize changes to host virtual addresses which are part of
a KVM memory slot to the KVM shadow mmu.  This allows pte operations
like swapping, page migration, and madvise() to transparently work
with KVM.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29 12:33:53 +03:00
Andrea Arcangeli
604b38ac03 KVM: Allow browsing memslots with mmu_lock
This allows reading memslots with only the mmu_lock hold for mmu
notifiers that runs in atomic context and with mmu_lock held.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29 12:33:50 +03:00
Andrea Arcangeli
a1708ce8a3 KVM: Allow reading aliases with mmu_lock
This allows the mmu notifier code to run unalias_gfn with only the
mmu_lock held.  Only alias writes need the mmu_lock held. Readers will
either take the slots_lock in read mode or the mmu_lock.

Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29 12:33:40 +03:00
Linus Torvalds
7874d35173 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: turn Waker into a thread, not a process
  lguest: Enlarge virtio rings
  lguest: Use GSO/IFF_VNET_HDR extensions on tun/tap
  lguest: Remove 'network: no dma buffer!' warning
  lguest: Adaptive timeout
  lguest: Tell Guest net not to notify us on every packet xmit
  lguest: net block unneeded receive queue update notifications
  lguest: wrap last_avail accesses.
  lguest: use cpu capability accessors
  lguest: virtio-rng support
  lguest: Support assigning a MAC address
  lguest: Don't leak /dev/zero fd
  lguest: fix verbose printing of device features.
  lguest: fix switcher_page leak on unload
  lguest: Guest int3 fix
  lguest: set max_pfn_mapped, growl loudly at Yinghai Lu
2008-07-28 18:16:26 -07:00
Linus Torvalds
1d9b9f6a53 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (21 commits)
  x86/PCI: use dev_printk when possible
  PCI: add D3 power state avoidance quirk
  PCI: fix bogus "'device' may be used uninitialized" warning in pci_slot
  PCI: add an option to allow ASPM enabled forcibly
  PCI: disable ASPM on pre-1.1 PCIe devices
  PCI: disable ASPM per ACPI FADT setting
  PCI MSI: Don't disable MSIs if the mask bit isn't supported
  PCI: handle 64-bit resources better on 32-bit machines
  PCI: rewrite PCI BAR reading code
  PCI: document pci_target_state
  PCI hotplug: fix typo in pcie hotplug output
  x86 gart: replace to_pages macro with iommu_num_pages
  x86, AMD IOMMU: replace to_pages macro with iommu_num_pages
  iommu: add iommu_num_pages helper function
  dma-coherent: add documentation to new interfaces
  Cris: convert to using generic dma-coherent mem allocator
  Sh: use generic per-device coherent dma allocator
  ARM: support generic per-device coherent dma mem
  Generic dma-coherent: fix DMA_MEMORY_EXCLUSIVE
  x86: use generic per-device dma coherent allocator
  ...
2008-07-28 18:14:24 -07:00
Linus Torvalds
9b79022ca9 Fix 'get_user_pages_fast()' with non-page-aligned start address
Alexey Dobriyan reported trouble with LTP with the new fast-gup code,
and Johannes Weiner debugged it to non-page-aligned addresses, where the
new get_user_pages_fast() code would do all the wrong things, including
just traversing past the end of the requested area due to 'addr' never
matching 'end' exactly.

This is not a pretty fix, and we may actually want to move the alignment
into generic code, leaving just the core code per-arch, but Alexey
verified that the vmsplice01 LTP test doesn't crash with this.

Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com>
Debugged-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-28 17:54:21 -07:00
Rusty Russell
5d006d8d09 lguest: set max_pfn_mapped, growl loudly at Yinghai Lu
6af61a7614 'x86: clean up max_pfn_mapped
usage - 32-bit' makes the following comment:

    XEN PV and lguest may need to assign max_pfn_mapped too.

But no CC.  Yinghai, wasting fellow developers' time is a VERY bad
habit.  If you do it again, I will hunt you down and try to extract
the three hours of my life I just lost :)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
2008-07-29 09:58:31 +10:00
Andrea Arcangeli
cddb8a5c14 mmu-notifiers: core
With KVM/GFP/XPMEM there isn't just the primary CPU MMU pointing to pages.
 There are secondary MMUs (with secondary sptes and secondary tlbs) too.
sptes in the kvm case are shadow pagetables, but when I say spte in
mmu-notifier context, I mean "secondary pte".  In GRU case there's no
actual secondary pte and there's only a secondary tlb because the GRU
secondary MMU has no knowledge about sptes and every secondary tlb miss
event in the MMU always generates a page fault that has to be resolved by
the CPU (this is not the case of KVM where the a secondary tlb miss will
walk sptes in hardware and it will refill the secondary tlb transparently
to software if the corresponding spte is present).  The same way
zap_page_range has to invalidate the pte before freeing the page, the spte
(and secondary tlb) must also be invalidated before any page is freed and
reused.

Currently we take a page_count pin on every page mapped by sptes, but that
means the pages can't be swapped whenever they're mapped by any spte
because they're part of the guest working set.  Furthermore a spte unmap
event can immediately lead to a page to be freed when the pin is released
(so requiring the same complex and relatively slow tlb_gather smp safe
logic we have in zap_page_range and that can be avoided completely if the
spte unmap event doesn't require an unpin of the page previously mapped in
the secondary MMU).

The mmu notifiers allow kvm/GRU/XPMEM to attach to the tsk->mm and know
when the VM is swapping or freeing or doing anything on the primary MMU so
that the secondary MMU code can drop sptes before the pages are freed,
avoiding all page pinning and allowing 100% reliable swapping of guest
physical address space.  Furthermore it avoids the code that teardown the
mappings of the secondary MMU, to implement a logic like tlb_gather in
zap_page_range that would require many IPI to flush other cpu tlbs, for
each fixed number of spte unmapped.

To make an example: if what happens on the primary MMU is a protection
downgrade (from writeable to wrprotect) the secondary MMU mappings will be
invalidated, and the next secondary-mmu-page-fault will call
get_user_pages and trigger a do_wp_page through get_user_pages if it
called get_user_pages with write=1, and it'll re-establishing an updated
spte or secondary-tlb-mapping on the copied page.  Or it will setup a
readonly spte or readonly tlb mapping if it's a guest-read, if it calls
get_user_pages with write=0.  This is just an example.

This allows to map any page pointed by any pte (and in turn visible in the
primary CPU MMU), into a secondary MMU (be it a pure tlb like GRU, or an
full MMU with both sptes and secondary-tlb like the shadow-pagetable layer
with kvm), or a remote DMA in software like XPMEM (hence needing of
schedule in XPMEM code to send the invalidate to the remote node, while no
need to schedule in kvm/gru as it's an immediate event like invalidating
primary-mmu pte).

At least for KVM without this patch it's impossible to swap guests
reliably.  And having this feature and removing the page pin allows
several other optimizations that simplify life considerably.

Dependencies:

1) mm_take_all_locks() to register the mmu notifier when the whole VM
   isn't doing anything with "mm".  This allows mmu notifier users to keep
   track if the VM is in the middle of the invalidate_range_begin/end
   critical section with an atomic counter incraese in range_begin and
   decreased in range_end.  No secondary MMU page fault is allowed to map
   any spte or secondary tlb reference, while the VM is in the middle of
   range_begin/end as any page returned by get_user_pages in that critical
   section could later immediately be freed without any further
   ->invalidate_page notification (invalidate_range_begin/end works on
   ranges and ->invalidate_page isn't called immediately before freeing
   the page).  To stop all page freeing and pagetable overwrites the
   mmap_sem must be taken in write mode and all other anon_vma/i_mmap
   locks must be taken too.

2) It'd be a waste to add branches in the VM if nobody could possibly
   run KVM/GRU/XPMEM on the kernel, so mmu notifiers will only enabled if
   CONFIG_KVM=m/y.  In the current kernel kvm won't yet take advantage of
   mmu notifiers, but this already allows to compile a KVM external module
   against a kernel with mmu notifiers enabled and from the next pull from
   kvm.git we'll start using them.  And GRU/XPMEM will also be able to
   continue the development by enabling KVM=m in their config, until they
   submit all GRU/XPMEM GPLv2 code to the mainline kernel.  Then they can
   also enable MMU_NOTIFIERS in the same way KVM does it (even if KVM=n).
   This guarantees nobody selects MMU_NOTIFIER=y if KVM and GRU and XPMEM
   are all =n.

The mmu_notifier_register call can fail because mm_take_all_locks may be
interrupted by a signal and return -EINTR.  Because mmu_notifier_reigster
is used when a driver startup, a failure can be gracefully handled.  Here
an example of the change applied to kvm to register the mmu notifiers.
Usually when a driver startups other allocations are required anyway and
-ENOMEM failure paths exists already.

 struct  kvm *kvm_arch_create_vm(void)
 {
        struct kvm *kvm = kzalloc(sizeof(struct kvm), GFP_KERNEL);
+       int err;

        if (!kvm)
                return ERR_PTR(-ENOMEM);

        INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);

+       kvm->arch.mmu_notifier.ops = &kvm_mmu_notifier_ops;
+       err = mmu_notifier_register(&kvm->arch.mmu_notifier, current->mm);
+       if (err) {
+               kfree(kvm);
+               return ERR_PTR(err);
+       }
+
        return kvm;
 }

mmu_notifier_unregister returns void and it's reliable.

The patch also adds a few needed but missing includes that would prevent
kernel to compile after these changes on non-x86 archs (x86 didn't need
them by luck).

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix mm/filemap_xip.c build]
[akpm@linux-foundation.org: fix mm/mmu_notifier.c build]
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-28 16:30:21 -07:00
Bjorn Helgaas
12c0b20fa4 x86/PCI: use dev_printk when possible
Convert printks to use dev_printk().

I converted DBG() to dev_dbg().  This DBG() is from arch/x86/pci/pci.h and
requires source-code modification to enable, so dev_dbg() seems roughly
equivalent.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-28 15:32:26 -07:00
Jesse Barnes
756f7bc668 Merge branch 'core/generic-dma-coherent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus 2008-07-28 15:15:46 -07:00
Ingo Molnar
cb28a1bbdb Merge branch 'linus' into core/generic-dma-coherent
Conflicts:

	arch/x86/Kconfig

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29 00:07:55 +02:00
Jesse Barnes
29111f579f Merge branch 'x86/iommu' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip into for-linus 2008-07-28 14:31:10 -07:00
Linus Torvalds
e56b3bc794 cpu masks: optimize and clean up cpumask_of_cpu()
Clean up and optimize cpumask_of_cpu(), by sharing all the zero words.

Instead of stupidly generating all possible i=0...NR_CPUS 2^i patterns
creating a huge array of constant bitmasks, realize that the zero words
can be shared.

In other words, on a 64-bit architecture, we only ever need 64 of these
arrays - with a different bit set in one single world (with enough zero
words around it so that we can create any bitmask by just offsetting in
that big array). And then we just put enough zeroes around it that we
can point every single cpumask to be one of those things.

So when we have 4k CPU's, instead of having 4k arrays (of 4k bits each,
with one bit set in each array - 2MB memory total), we have exactly 64
arrays instead, each 8k bits in size (64kB total).

And then we just point cpumask(n) to the right position (which we can
calculate dynamically). Once we have the right arrays, getting
"cpumask(n)" ends up being:

  static inline const cpumask_t *get_cpu_mask(unsigned int cpu)
  {
          const unsigned long *p = cpu_bit_bitmap[1 + cpu % BITS_PER_LONG];
          p -= cpu / BITS_PER_LONG;
          return (const cpumask_t *)p;
  }

This brings other advantages and simplifications as well:

 - we are not wasting memory that is just filled with a single bit in
   various different places

 - we don't need all those games to re-create the arrays in some dense
   format, because they're already going to be dense enough.

if we compile a kernel for up to 4k CPU's, "wasting" that 64kB of memory
is a non-issue (especially since by doing this "overlapping" trick we
probably get better cache behaviour anyway).

[ mingo@elte.hu:

  Converted Linus's mails into a commit. See:

     http://lkml.org/lkml/2008/7/27/156
     http://lkml.org/lkml/2008/7/28/320

  Also applied a family filter - which also has the side-effect of leaving
  out the bits where Linus calls me an idio... Oh, never mind ;-)
]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 22:20:41 +02:00
Ingo Molnar
414f746d23 Merge branch 'linus' into cpus4096 2008-07-28 21:14:43 +02:00
Ingo Molnar
6ce37a58e3 Merge branch 'x86/crashdump' into x86/urgent 2008-07-28 17:19:02 +02:00
Ingo Molnar
239bd83104 x86: L3 cache index disable for 2.6.26, fix #2
fix !PCI build failure:

 arch/x86/kernel/cpu/intel_cacheinfo.c: In function 'get_k8_northbridge':
 arch/x86/kernel/cpu/intel_cacheinfo.c:675: error: implicit declaration of function 'pci_match_id'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:49:50 +02:00
Ingo Molnar
b7d0b67845 Merge branch 'linus' into x86/cpu
Conflicts:

	arch/x86/kernel/cpu/intel_cacheinfo.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:26:31 +02:00
Ingo Molnar
cdcf772ed1 x86 l3 cache index disable for 2 6 26 fix
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:22:07 +02:00
Mark Langsdorf
a24e8d36f5 x86: L3 cache index disable for 2.6.26
On Monday 21 July 2008, Ingo Molnar wrote:
> > applied to tip/x86/cpu, thanks Mark.
> >
> > I've done some coding style fixes for the new functions you've
> > introduced, see that commit below.
>
> -tip testing found the following build failure:
>
>  arch/x86/kernel/built-in.o: In function `show_cache_disable':
>  intel_cacheinfo.c:(.text+0xbbf2): undefined reference to `k8_northbridges'
>  arch/x86/kernel/built-in.o: In function `store_cache_disable':
>  intel_cacheinfo.c:(.text+0xbd91): undefined reference to `k8_northbridges'
>
> please send a delta fix patch against the tip/x86/cpu branch:
>
>   http://people.redhat.com/mingo/tip.git/README
>
> which has your patch plus the cleanup applied.

delta fix patch follows.  It removes the dependency on k8_northbridges.

-Mark Langsdorf
Operating System Research Center
AMD

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:22:06 +02:00
Ingo Molnar
7a4983bb5f x86: L3 cache index disable for 2.6.26, cleanups
No change in functionality.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:17:47 +02:00
Mark Langsdorf
8cb22bcb1f x86: L3 cache index disable for 2.6.26
New versions of AMD processors have support to disable parts
of their L3 caches if too many MCEs are generated by the
L3 cache.

This patch provides a /sysfs interface under the cache
hierarchy to display which caches indices are disabled
(if any) and to monitoring applications to disable a
cache index.

This patch does not set an automatic policy to disable
the L3 cache.  Policy decisions would need to be made
by a RAS handler.  This patch merely makes it easier to
see what indices are currently disabled.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28 16:17:43 +02:00
Linus Torvalds
fb4284b2b7 Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: fix cpu hotplug on 32bit
2008-07-27 16:46:51 -07:00
Thomas Gleixner
583323b9d2 x86: fix cpu hotplug on 32bit
commit 3e9704739d ("x86: boot secondary
cpus through initial_code") causes the kernel to crash when a CPU is
brought online after the read only sections have been write
protected. The write to initial_code in do_boot_cpu() fails.

Move inital_code to .cpuinit.data section.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
2008-07-27 21:43:11 +02:00
Sheng Yang
5fdbcb9dd1 KVM: VMX: Fix undefined beaviour of EPT after reload kvm-intel.ko
As well as move set base/mask ptes to vmx_init().

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Sheng Yang
5ec5726a16 KVM: VMX: Fix bypass_guest_pf enabling when disable EPT in module parameter
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Marcelo Tosatti
c93cd3a588 KVM: task switch: translate guest segment limit to virt-extension byte granular field
If 'g' is one then limit is 4kb granular.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Avi Kivity
577bdc4966 KVM: Avoid instruction emulation when event delivery is pending
When an event (such as an interrupt) is injected, and the stack is
shadowed (and therefore write protected), the guest will exit.  The
current code will see that the stack is shadowed and emulate a few
instructions, each time postponing the injection.  Eventually the
injection may succeed, but at that time the guest may be unwilling
to accept the interrupt (for example, the TPR may have changed).

This occurs every once in a while during a Windows 2008 boot.

Fix by unshadowing the fault address if the fault was due to an event
injection.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:10 +03:00
Marcelo Tosatti
34198bf842 KVM: task switch: use seg regs provided by subarch instead of reading from GDT
There is no guarantee that the old TSS descriptor in the GDT contains
the proper base address. This is the case for Windows installation's
reboot-via-triplefault.

Use guest registers instead. Also translate the address properly.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Marcelo Tosatti
98899aa0e0 KVM: task switch: segment base is linear address
The segment base is always a linear address, so translate before
accessing guest memory.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Joerg Roedel
5f4cb662a0 KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Yinghai Lu
d25ae38b7e x86: add apic probe for genapic 64bit - fix
intr_remapping_enabled get assigned later, so need to check that
in setup_apic_routing

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-27 06:28:16 +02:00
Linus Torvalds
fb3b806144 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization
  x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels
  x86, RDC321x: remove gpio.h complications
  x86, RDC321x: add to mach-default
  crashdump: fix undefined reference to `elfcorehdr_addr'
  flag parameters: fix compile error of sys_epoll_create1
2008-07-26 13:25:05 -07:00
Johannes Weiner
8dad322f54 x86: use generic show_mem()
Remove arch-specific show_mem() in favor of the generic version.

This also removes the following redundant information display:

	- pages in swapcache, printed by show_swap_cache_info()
	- dirty pages, writeback pages, mapped pages, slab pages,
	  pagetable pages, printed by show_free_areas()

where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:10 -07:00