Commit Graph

5701 Commits

Author SHA1 Message Date
Joerg Roedel
c156e347d6 AMD IOMMU: add domain init function for IOMMU API
Impact: add a generic function for allocation protection domains

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:55 +01:00
Joerg Roedel
6d98cd8043 AMD IOMMU: add domain cleanup helper function
Impact: add a function to remove all devices from a domain

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:55 +01:00
Joerg Roedel
e275a2a0fc AMD IOMMU: add device notifier callback
Impact: inform IOMMU about state change of a device in the driver core

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:55 +01:00
Joerg Roedel
355bf553ed AMD IOMMU: add device detach helper functions
Impact: add helper functions to detach a device from a domain

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:55 +01:00
Joerg Roedel
f1179dc005 AMD IOMMU: rename set_device_domain function
Impact: rename set_device_domain() to attach_device()

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
863c74ebd0 AMD IOMMU: add device reference counting for protection domains
Impact: know how many devices are assigned to a domain

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
5b28df6f43 AMD IOMMU: add checks for dma_ops domain to dma_ops functions
Impact: detect when a driver uses a device assigned otherwise

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
9fdb19d64c AMD IOMMU: add protection domain flags
Imapct: add a new struct member to 'struct protection_domain'

When using protection domains for dma_ops and KVM its better to know for
which subsystem it was allocated. Add a flags member to struct
protection domain for that purpose.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
43f4960983 AMD IOMMU: add iommu_flush_domain function
Impact: add a function to flush a domain id on every IOMMU

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
9e919012e3 AMD IOMMU: don't remove protection domain from iommu_pd_list
Impact: save unneeded logic to add and remove domains to the list

The removal of a protection domain from the iommu_pd_list is not
necessary. Another benefit is that we save complexity because we don't
have to readd it later when the device no longer uses the domain.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
237b6f3329 AMD IOMMU: move invalidation command building to a separate function
Impact: refactoring of iommu_queue_inv_iommu_pages

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
8d201968e1 AMD IOMMU: refactor completion wait handling into separate functions
Impact: split one function into three

The separate functions are required synchronize commands across all
hardware IOMMUs in the system.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:54 +01:00
Joerg Roedel
a2acfb7579 AMD IOMMU: add domain id free function
Impact: add code to release a domain id

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:53 +01:00
Joerg Roedel
86db2e5d47 AMD IOMMU: make dma_ops_free_pagetable generic
Impact: change code to free pagetables from protection domains

The dma_ops_free_pagetable function can only free pagetables from
dma_ops domains. Change that to free pagetables of pure protection
domains.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:53 +01:00
Joerg Roedel
38e817febe AMD IOMMU: rename iommu_map to iommu_map_page
Impact: function rename

The iommu_map function maps only one page. Make this clear in the
function name.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:53 +01:00
Joerg Roedel
19de40a847 KVM: change KVM to use IOMMU API
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:11:07 +01:00
Joerg Roedel
1aaf118352 select IOMMU_API when DMAR and/or AMD_IOMMU is selected
These two IOMMUs can implement the current version of this API. So
select the API if one or both of these IOMMU drivers is selected.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:10:09 +01:00
Joerg Roedel
c4fa386428 KVM: rename vtd.c to iommu.c
Impact: file renamed

The code in the vtd.c file can be reused for other IOMMUs as well. So
rename it to make it clear that it handle more than VT-d.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-01-03 14:10:09 +01:00
Markus Trippelsdorf
37dd3cb415 x86: remove debug printks (io_apic.c)
Impact: reduce printk noise

The message "alloc irq_2_pin on cpu 0 node 0" is printed way too often.

% dmesg|grep irq_2_pin|wc -l
20

Get rid of the debug printks.

Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 23:19:36 +01:00
Ingo Molnar
47dabdc7fc Merge branches 'x86/cleanups', 'x86/fpu' and 'x86/urgent' into x86/core 2009-01-02 22:41:52 +01:00
Ingo Molnar
923a789b49 Merge branch 'linus' into x86/cleanups
Conflicts:
	arch/x86/kernel/reboot.c
2009-01-02 22:41:36 +01:00
Roland Dreier
79ff56ebd3 swiotlb: add missing __init annotations
Impact: cleanup, reduce kernel size a bit

The current kernel build warns:

    WARNING: vmlinux.o(.text+0x11458): Section mismatch in reference from the function swiotlb_alloc_boot() to the function .init.text:__alloc_bootmem_low()
    The function swiotlb_alloc_boot() references
    the function __init __alloc_bootmem_low().
    This is often because swiotlb_alloc_boot lacks a __init
    annotation or the annotation of __alloc_bootmem_low is wrong.

    WARNING: vmlinux.o(.text+0x1011f2): Section mismatch in reference from the function swiotlb_late_init_with_default_size() to the function .init.text:__alloc_bootmem_low()
    The function swiotlb_late_init_with_default_size() references
    the function __init __alloc_bootmem_low().
    This is often because swiotlb_late_init_with_default_size lacks a __init
    annotation or the annotation of __alloc_bootmem_low is wrong.

and indeed the functions calling __alloc_bootmem_low() can be marked
__init as well.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 21:52:39 +01:00
Linus Torvalds
b840d79631 Merge branch 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
  x86: export vector_used_by_percpu_irq
  x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
  sched: nominate preferred wakeup cpu, fix
  x86: fix lguest used_vectors breakage, -v2
  x86: fix warning in arch/x86/kernel/io_apic.c
  sched: fix warning in kernel/sched.c
  sched: move test_sd_parent() to an SMP section of sched.h
  sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
  sched: activate active load balancing in new idle cpus
  sched: bias task wakeups to preferred semi-idle packages
  sched: nominate preferred wakeup cpu
  sched: favour lower logical cpu number for sched_mc balance
  sched: framework for sched_mc/smt_power_savings=N
  sched: convert BALANCE_FOR_xx_POWER to inline functions
  x86: use possible_cpus=NUM to extend the possible cpus allowed
  x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
  x86: update io_apic.c to the new cpumask code
  x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
  x86: xen: use smp_call_function_many()
  x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
  ...

Fixed up trivial conflict in kernel/time/tick-sched.c manually
2009-01-02 11:44:09 -08:00
Linus Torvalds
597b0d2162 Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (140 commits)
  KVM: MMU: handle large host sptes on invlpg/resync
  KVM: Add locking to virtual i8259 interrupt controller
  KVM: MMU: Don't treat a global pte as such if cr4.pge is cleared
  MAINTAINERS: Maintainership changes for kvm/ia64
  KVM: ia64: Fix kvm_arch_vcpu_ioctl_[gs]et_regs()
  KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI
  KVM: VMX: Fix pending NMI-vs.-IRQ race for user space irqchip
  KVM: fix handling of ACK from shared guest IRQ
  KVM: MMU: check for present pdptr shadow page in walk_shadow
  KVM: Consolidate userspace memory capability reporting into common code
  KVM: Advertise the bug in memory region destruction as fixed
  KVM: use cpumask_var_t for cpus_hardware_enabled
  KVM: use modern cpumask primitives, no cpumask_t on stack
  KVM: Extract core of kvm_flush_remote_tlbs/kvm_reload_remote_mmus
  KVM: set owner of cpu and vm file operations
  anon_inodes: use fops->owner for module refcount
  x86: KVM guest: kvm_get_tsc_khz: return khz, not lpj
  KVM: MMU: prepopulate the shadow on invlpg
  KVM: MMU: skip global pgtables on sync due to cr3 switch
  KVM: MMU: collapse remote TLB flushes on root sync
  ...
2009-01-02 11:41:11 -08:00
Ingo Brueckl
e8e3232627 Fix compiler warning in arch/x86/mm/init_32.c
Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-02 10:27:32 -08:00
Jaswinder Singh Rajput
103ceffb95 x86: mpparse.c fix style problems
Impact: cleanup, fix style problems, more readable

Fixes style problems:

 WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
 WARNING: Use #include <linux/acpi.h> instead of <asm/acpi.h>
 WARNING: suspect code indent for conditional statements (8, 17)
 WARNING: space prohibited between function name and open parenthesis '('

total: 0 errors, 5 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 19:24:28 +01:00
Jaswinder Singh Rajput
dceb4521c8 x86: nmi.c fix style problems
Impact: cleanup, fix style problems

Fixes style problems:

 WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
 WARNING: Use #include <linux/nmi.h> instead of <asm/nmi.h>

total: 0 errors, 2 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 19:03:38 +01:00
Jaswinder Singh Rajput
423a54058f x86: ldt.c fix style problems
Impact: cleanup

Fixes style problems:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 ERROR: space required before the open parenthesis '('

total: 1 errors, 1 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 17:46:24 +01:00
Jaswinder Singh Rajput
f634fa9411 x86: cpuid.c fix style problems
Impact: cleanup

Fixes style problems:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 ERROR: "foo * bar" should be "foo *bar"
 ERROR: trailing whitespace
 WARNING: usage of NR_CPUS is often wrong - consider using cpu_possible(), num_possible_cpus(), for_each_possible_cpu(), etc

total: 2 errors, 2 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 17:46:23 +01:00
Cliff Wickman
46814dded1 x86, UV: remove erroneous BAU initialization
Impact: fix crash on x86/UV

UV is the SGI "UltraViolet" machine, which is x86_64 based.
BAU is the "Broadcast Assist Unit", used for TLB shootdown in UV.

This patch removes the allocation and initialization of an unused table.

This table is left over from a development test mode.  It is unused in
the present code.

And it was incorrectly initialized: 8 entries allocated but 17 initialized,
causing slab corruption.

This patch should go into 2.6.27 and 2.6.28 as well as the current tree.

Diffed against 2.6.28 (linux-next, 12/30/08)

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 17:26:25 +01:00
Ravikiran G Thirumalai
26799a6311 x86: fix incorrect __read_mostly on _boot_cpu_pda
The pda rework (commit 3461b0af02)
to remove static boot cpu pdas introduced a performance bug.

_boot_cpu_pda is the actual pda used by the boot cpu and is definitely
not "__read_mostly" and ended up polluting the read mostly section with
writes.  This bug caused regression of about 8-10% on certain syscall
intensive workloads.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Acked-by: Mike Travis <travis@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 17:16:29 +01:00
Ingo Brueckl
a9067d5376 x86: convert permanent_kmaps_init() from macro to inline
Impact: cleanup

This compiler warning:

  arch/x86/mm/init_32.c:515: warning: unused variable 'pgd_base'

triggers because permanent_kmaps_init() is a CPP macro in the
!CONFIG_HIGHMEM case, that does not tell the compiler that the
'pgd_base' parameter is used.

Convert permanent_kmaps_init() (and set_highmem_pages_init()) to
C inline functions - which gives the parameter a proper type and
which gets rid of the compiler warning as well.

Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 15:41:03 +01:00
Cyrill Gorcunov
c64d8996bd x86: early_printk - use sizeof instead of hardcoded number
Impact: cleanup

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-02 10:27:46 +01:00
Linus Torvalds
b58602a4ba Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (34 commits)
  nfsd race fixes: jfs
  nfsd race fixes: reiserfs
  nfsd race fixes: ext4
  nfsd race fixes: ext3
  nfsd race fixes: ext2
  nfsd/create race fixes, infrastructure
  filesystem notification: create fs/notify to contain all fs notification
  fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization
  kill ->dir_notify()
  filp_cachep can be static in fs/file_table.c
  fix f_count description in Documentation/filesystems/files.txt
  make INIT_FS use the __RW_LOCK_UNLOCKED initialization
  take init_fs to saner place
  kill vfs_permission
  pass a struct path * to may_open
  kill walk_init_root
  remove incorrect comment in inode_permission
  expand some comments (d_path / seq_path)
  correct wrong function name of d_put in kernel document and source comment
  fix switch_names() breakage in short-to-short case
  ...
2008-12-31 15:57:56 -08:00
Al Viro
18d8fda7c3 take init_fs to saner place
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:42 -05:00
Linus Torvalds
db200df0b3 Merge branch 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sparseirq: move __weak symbols into separate compilation unit
  sparseirq: work around __weak alias bug
  sparseirq: fix hang with !SPARSE_IRQ
  sparseirq: set lock_class for legacy irq when sparse_irq is selected
  sparseirq: work around compiler optimizing away __weak functions
  sparseirq: fix desc->lock init
  sparseirq: do not printk when migrating IRQ descriptors
  sparseirq: remove duplicated arch_early_irq_init()
  irq: simplify for_each_irq_desc() usage
  proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c
  irq: for_each_irq_desc() move to irqnr.h
  hrtimer: remove #include <linux/irq.h>
2008-12-31 09:00:59 -08:00
Marcelo Tosatti
8791723920 KVM: MMU: handle large host sptes on invlpg/resync
The invlpg and sync walkers lack knowledge of large host sptes,
descending to non-existant pagetable level.

Stop at directory level in such case.

Fixes SMP Windows XP with hugepages.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:49 +02:00
Avi Kivity
3f353858c9 KVM: Add locking to virtual i8259 interrupt controller
While most accesses to the i8259 are with the kvm mutex taken, the call
to kvm_pic_read_irq() is not.  We can't easily take the kvm mutex there
since the function is called with interrupts disabled.

Fix by adding a spinlock to the virtual interrupt controller.  Since we
can't send an IPI under the spinlock (we also take the same spinlock in
an irq disabled context), we defer the IPI until the spinlock is released.
Similarly, we defer irq ack notifications until after spinlock release to
avoid lock recursion.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:48 +02:00
Avi Kivity
25e2343246 KVM: MMU: Don't treat a global pte as such if cr4.pge is cleared
The pte.g bit is meaningless if global pages are disabled; deferring
mmu page synchronization on these ptes will lead to the guest using stale
shadow ptes.

Fixes Vista x86 smp bootloader failure.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:48 +02:00
Jan Kiszka
4531220b71 KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI
There is no point in doing the ready_for_nmi_injection/
request_nmi_window dance with user space. First, we don't do this for
in-kernel irqchip anyway, while the code path is the same as for user
space irqchip mode. And second, there is nothing to loose if a pending
NMI is overwritten by another one (in contrast to IRQs where we have to
save the number). Actually, there is even the risk of raising spurious
NMIs this way because the reason for the held-back NMI might already be
handled while processing the first one.

Therefore this patch creates a simplified user space NMI injection
interface, exporting it under KVM_CAP_USER_NMI and dropping the old
KVM_CAP_NMI capability. And this time we also take care to provide the
interface only on archs supporting NMIs via KVM (right now only x86).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:47 +02:00
Jan Kiszka
264ff01d55 KVM: VMX: Fix pending NMI-vs.-IRQ race for user space irqchip
As with the kernel irqchip, don't allow an NMI to stomp over an already
injected IRQ; instead wait for the IRQ injection to be completed.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:47 +02:00
Marcelo Tosatti
eb64f1e8cd KVM: MMU: check for present pdptr shadow page in walk_shadow
walk_shadow assumes the caller verified validity of the pdptr pointer in
question, which is not the case for the invlpg handler.

Fixes oops during Solaris 10 install.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:46 +02:00
Avi Kivity
ca9edaee1a KVM: Consolidate userspace memory capability reporting into common code
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:46 +02:00
Eduardo Habkost
e93353c93a x86: KVM guest: kvm_get_tsc_khz: return khz, not lpj
kvm_get_tsc_khz() currently returns the previously-calculated preset_lpj
value, but it is in loops-per-jiffy, not kHz. The current code works
correctly only when HZ=1000.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:44 +02:00
Marcelo Tosatti
ad218f85e3 KVM: MMU: prepopulate the shadow on invlpg
If the guest executes invlpg, peek into the pagetable and attempt to
prepopulate the shadow entry.

Also stop dirty fault updates from interfering with the fork detector.

2% improvement on RHEL3/AIM7.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:44 +02:00
Marcelo Tosatti
6cffe8ca4a KVM: MMU: skip global pgtables on sync due to cr3 switch
Skip syncing global pages on cr3 switch (but not on cr4/cr0). This is
important for Linux 32-bit guests with PAE, where the kmap page is
marked as global.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:44 +02:00
Marcelo Tosatti
b1a368218a KVM: MMU: collapse remote TLB flushes on root sync
Collapse remote TLB flushes on root sync.

kernbench is 2.7% faster on 4-way guest. Improvements have been seen
with other loads such as AIM7.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:43 +02:00
Marcelo Tosatti
60c8aec6e2 KVM: MMU: use page array in unsync walk
Instead of invoking the handler directly collect pages into
an array so the caller can work with it.

Simplifies TLB flush collapsing.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:43 +02:00
Amit Shah
fbce554e94 KVM: x86 emulator: Fix handling of VMMCALL instruction
The VMMCALL instruction doesn't get recognised and isn't processed
by the emulator.

This is seen on an Intel host that tries to execute the VMMCALL
instruction after a guest live migrates from an AMD host.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:43 +02:00
Guillaume Thouvenin
9bf8ea42fe KVM: x86 emulator: add the emulation of shld and shrd instructions
Add emulation of shld and shrd instructions

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:43 +02:00
Guillaume Thouvenin
d175226a5f KVM: x86 emulator: add the assembler code for three operands
Add the assembler code for instruction with three operands and one
operand is stored in ECX register

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:42 +02:00
Guillaume Thouvenin
bfcadf83ec KVM: x86 emulator: add a new "implied 1" Src decode type
Add SrcOne operand type when we need to decode an implied '1' like with
regular shift instruction

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:42 +02:00
Guillaume Thouvenin
0dc8d10f7d KVM: x86 emulator: add Src2 decode set
Instruction like shld has three operands, so we need to add a Src2
decode set. We start with Src2None, Src2CL, and Src2ImmByte, Src2One to
support shld/shrd and we will expand it later.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:42 +02:00
Guillaume Thouvenin
45ed60b371 KVM: x86 emulator: Extend the opcode descriptor
Extend the opcode descriptor to 32 bits. This is needed by the
introduction of a new Src2 operand type.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:41 +02:00
Hannes Eder
efff9e538f KVM: VMX: fix sparse warning
Impact: make global function static

  arch/x86/kvm/vmx.c:134:3: warning: symbol 'vmx_capability' was not declared. Should it be static?

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:06 +02:00
Avi Kivity
f3fd92fbdb KVM: Remove extraneous semicolon after do/while
Notices by Guillaume Thouvenin.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:05 +02:00
Avi Kivity
2b48cc75b2 KVM: x86 emulator: fix popf emulation
Set operand type and size to get correct writeback behavior.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:05 +02:00
Avi Kivity
cf5de4f886 KVM: x86 emulator: fix ret emulation
'ret' did not set the operand type or size for the destination, so
writeback ignored it.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:05 +02:00
Avi Kivity
8a09b6877f KVM: x86 emulator: switch 'pop reg' instruction to emulate_pop()
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:04 +02:00
Avi Kivity
781d0edc5f KVM: x86 emulator: allow pop from mmio
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:04 +02:00
Avi Kivity
faa5a3ae39 KVM: x86 emulator: Extract 'pop' sequence into a function
Switch 'pop r/m' instruction to use the new function.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:04 +02:00
Avi Kivity
6b7ad61ffb KVM: x86 emulator: consolidate emulation of two operand instructions
No need to repeat the same assembly block over and over.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:03 +02:00
Avi Kivity
dda96d8f1b KVM: x86 emulator: reduce duplication in one operand emulation thunks
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:03 +02:00
Marcelo Tosatti
ecc5589f19 KVM: MMU: optimize set_spte for page sync
The write protect verification in set_spte is unnecessary for page sync.

Its guaranteed that, if the unsync spte was writable, the target page
does not have a write protected shadow (if it had, the spte would have
been write protected under mmu_lock by rmap_write_protect before).

Same reasoning applies to mark_page_dirty: the gfn has been marked as
dirty via the pagefault path.

The cost of hash table and memslot lookups are quite significant if the
workload is pagetable write intensive resulting in increased mmu_lock
contention.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:02 +02:00
Glauber Costa
423cd25a5a x86: KVM guest: sign kvmclock as paravirt
Currently, we only set the KVM paravirt signature in case
of CONFIG_KVM_GUEST. However, it is possible to have it turned
off, while CONFIG_KVM_CLOCK is turned on. This is also a paravirt
case, and should be shown accordingly.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:00 +02:00
Avi Kivity
df203ec9a7 KVM: VMX: Conditionally request interrupt window after injecting irq
If we're injecting an interrupt, and another one is pending, request
an interrupt window notification so we don't have excess latency on the
second interrupt.

This shouldn't happen in practice since an EOI will be issued, giving a second
chance to request an interrupt window, but...

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:00 +02:00
Eduardo Habkost
d176720d34 x86: disable VMX on all CPUs on reboot
On emergency_restart, we may need to use an NMI to disable virtualization
on all CPUs. We do that using nmi_shootdown_cpus() if VMX is enabled.

Note: With this patch, we will run the NMI stuff only when the CPU where
emergency_restart() was called has VMX enabled. This should work on most
cases because KVM enables VMX on all CPUs, but we may miss the small
window where KVM is doing that. Also, I don't know if all code using
VMX out there always enable VMX on all CPUs like KVM does. We have two
other alternatives for that:

a) Have an API that all code that enables VMX on any CPU should use
   to tell the kernel core that it is going to enable VMX on the CPUs.
b) Always call nmi_shootdown_cpus() if the CPU supports VMX. This is
   a bit intrusive and more risky, as it would run nmi_shootdown_cpus()
   on emergency_reboot() even on systems where virtualization is never
   enabled.

Finding a proper point to hook the nmi_shootdown_cpus() call isn't
trivial, as the non-emergency machine_restart() (that doesn't need the
NMI tricks) uses machine_emergency_restart() directly.

The solution to make this work without adding a new function or argument
to machine_ops was setting a 'reboot_emergency' flag that tells if
native_machine_emergency_restart() needs to do the virt cleanup or not.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:54:58 +02:00
Eduardo Habkost
2340b62f77 kdump: forcibly disable VMX and SVM on machine_crash_shutdown()
We need to disable virtualization extensions on all CPUs before booting
the kdump kernel, otherwise the kdump kernel booting will fail, and
rebooting after the kdump kernel did its task may also fail.

We do it using cpu_emergency_vmxoff() and cpu_emergency_svm_disable(),
that should always work, because those functions check if the CPUs
support SVM or VMX before doing their tasks.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:30 +02:00
Eduardo Habkost
0f3e9eeba0 x86: cpu_emergency_svm_disable() function
This function can be used by the reboot or kdump code to forcibly
disable SVM on the CPU.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:30 +02:00
Eduardo Habkost
2c8dceebb2 KVM: SVM: move svm_hardware_disable() code to asm/virtext.h
Create cpu_svm_disable() function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:30 +02:00
Eduardo Habkost
63d1142f8f KVM: SVM: move has_svm() code to asm/virtext.h
Use a trick to keep the printk()s on has_svm() working as before. gcc
will take care of not generating code for the 'msg' stuff when the
function is called with a NULL msg argument.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:29 +02:00
Eduardo Habkost
6aa07a0d77 x86: cpu_emergency_vmxoff() function
Add cpu_emergency_vmxoff() and its friends: cpu_vmx_enabled() and
__cpu_emergency_vmxoff().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:29 +02:00
Eduardo Habkost
710ff4a855 KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable()
Along with some comments on why it is different from the core cpu_vmxoff()
function.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:29 +02:00
Eduardo Habkost
1e9931146c x86: asm/virtext.h: add cpu_vmxoff() inline function
Unfortunately we can't use exactly the same code from vmx
hardware_disable(), because the KVM function uses the
__kvm_handle_fault_on_reboot() tricks.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:29 +02:00
Eduardo Habkost
6210e37b12 KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.h
It will be used by core code on kdump and reboot, to disable
vmx if needed.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:28 +02:00
Eduardo Habkost
eca70fc567 KVM: VMX: move ASM_VMX_* definitions from asm/kvm_host.h to asm/vmx.h
Those definitions will be used by code outside KVM, so move it outside
of a KVM-specific source file.

Those definitions are used only on kvm/vmx.c, that already includes
asm/vmx.h, so they can be moved safely.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:28 +02:00
Eduardo Habkost
c2cedf7be2 KVM: SVM: move svm.h to include/asm
svm.h will be used by core code that is independent of KVM, so I am
moving it outside the arch/x86/kvm directory.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:28 +02:00
Eduardo Habkost
13673a90f1 KVM: VMX: move vmx.h to include/asm
vmx.h will be used by core code that is independent of KVM, so I am
moving it outside the arch/x86/kvm directory.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:27 +02:00
Nitin A Kamble
0fdf8e59fa KVM: Fix cpuid iteration on multiple leaves per eac
The code to traverse the cpuid data array list for counting type of leaves is
currently broken.

This patches fixes the 2 things in it.

 1. Set the 1st counting entry's flag KVM_CPUID_FLAG_STATE_READ_NEXT. Without
    it the code will never find a valid entry.

 2. Also the stop condition in the for loop while looking for the next unflaged
    entry is broken. It needs to stop when it find one matching entry;
    and in the case of count of 1, it will be the same entry found in this
    iteration.

Signed-Off-By: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:24 +02:00
Nitin A Kamble
0853d2c1d8 KVM: Fix cpuid leaf 0xb loop termination
For cpuid leaf 0xb the bits 8-15 in ECX register define the end of counting
leaf.      The previous code was using bits 0-7 for this purpose, which is
a bug.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:24 +02:00
Izik Eidus
2843099fee KVM: MMU: Fix aliased gfns treated as unaliased
Some areas of kvm x86 mmu are using gfn offset inside a slot without
unaliasing the gfn first.  This patch makes sure that the gfn will be
unaliased and add gfn_to_memslot_unaliased() to save the calculating
of the gfn unaliasing in case we have it unaliased already.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:50 +02:00
Sheng Yang
6eb55818c0 KVM: Enable Function Level Reset for assigned device
Ideally, every assigned device should in a clear condition before and after
assignment, so that the former state of device won't affect later work.
Some devices provide a mechanism named Function Level Reset, which is
defined in PCI/PCI-e document. We should execute it before and after device
assignment.

(But sadly, the feature is new, and most device on the market now don't
support it. We are considering using D0/D3hot transmit to emulate it later,
but not that elegant and reliable as FLR itself.)

[Update: Reminded by Xiantao, execute FLR after we ensure that the device can
be assigned to the guest.]

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:49 +02:00
Guillaume Thouvenin
1d5a4d9b92 KVM: VMX: Handle mmio emulation when guest state is invalid
If emulate_invalid_guest_state is enabled, the emulator is called
when guest state is invalid.  Until now, we reported an mmio failure
when emulate_instruction() returned EMULATE_DO_MMIO.  This patch adds
the case where emulate_instruction() failed and an MMIO emulation
is needed.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:48 +02:00
Guillaume Thouvenin
e93f36bcfa KVM: allow emulator to adjust rip for emulated pio instructions
If we call the emulator we shouldn't call skip_emulated_instruction()
in the first place, since the emulator already computes the next rip
for us. Thus we move ->skip_emulated_instruction() out of
kvm_emulate_pio() and into handle_io() (and the svm equivalent). We
also replaced "return 0" by "break" in the "do_io:" case because now
the shadow register state needs to be committed. Otherwise eip will never
be updated.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:48 +02:00
Amit Shah
c0d09828c8 KVM: SVM: Set the 'busy' flag of the TR selector
The busy flag of the TR selector is not set by the hardware. This breaks
migration from amd hosts to intel hosts.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:48 +02:00
Amit Shah
25022acc3d KVM: SVM: Set the 'g' bit of the cs selector for cross-vendor migration
The hardware does not set the 'g' bit of the cs selector and this breaks
migration from amd hosts to intel hosts. Set this bit if the segment
limit is beyond 1 MB.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:48 +02:00
Amit Shah
b8222ad2e5 KVM: x86: Fix typo in function name
get_segment_descritptor_dtable() contains an obvious type.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:47 +02:00
Jan Kiszka
cc6e462cd5 KVM: x86: Optimize NMI watchdog delivery
As suggested by Avi, this patch introduces a counter of VCPUs that have
LVT0 set to NMI mode. Only if the counter > 0, we push the PIT ticks via
all LAPIC LVT0 lines to enable NMI watchdog support.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:47 +02:00
Jan Kiszka
8fdb2351d5 KVM: x86: Fix and refactor NMI watchdog emulation
This patch refactors the NMI watchdog delivery patch, consolidating
tests and providing a proper API for delivering watchdog events.

An included micro-optimization is to check only for apic_hw_enabled in
kvm_apic_local_deliver (the test for LVT mask is covering the
soft-disabled case already).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:46 +02:00
Guillaume Thouvenin
291fd39bfc KVM: x86 emulator: Add decode entries for 0x04 and 0x05 opcodes (add acc, imm)
Add decode entries for 0x04 and 0x05 (ADD) opcodes, execution is already
implemented.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:46 +02:00
Sheng Yang
6fe639792c KVM: VMX: Move private memory slot position
PCI device assignment would map guest MMIO spaces as separate slot, so it is
possible that the device has more than 2 MMIO spaces and overwrite current
private memslot.

The patch move private memory slot to the top of userspace visible memory slots.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:46 +02:00
Sheng Yang
291f26bc0f KVM: MMU: Extend kvm_mmu_page->slot_bitmap size
Otherwise set_bit() for private memory slot(above KVM_MEMORY_SLOTS) would
corrupted memory in 32bit host.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:45 +02:00
Sheng Yang
d73fa29a9b KVM: Clean up kvm_x86_emulate.h
Remove one left improper comment of removed CR2.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:45 +02:00
Sheng Yang
64d4d52175 KVM: Enable MTRR for EPT
The effective memory type of EPT is the mixture of MSR_IA32_CR_PAT and memory
type field of EPT entry.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:45 +02:00
Sheng Yang
74be52e3e6 KVM: Add local get_mtrr_type() to support MTRR
For EPT memory type support.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:45 +02:00
Sheng Yang
468d472f3f KVM: VMX: Add PAT support for EPT
GUEST_PAT support is a new feature introduced by Intel Core i7 architecture.
With this, cpu would save/load guest and host PAT automatically, for EPT memory
type in guest depends on MSR_IA32_CR_PAT.

Also add save/restore for MSR_IA32_CR_PAT.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Sheng Yang
0bed3b568b KVM: Improve MTRR structure
As well as reset mmu context when set MTRR.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Sheng Yang
932d27a791 x86: Export some definition of MTRR
For KVM can reuse the type define, and need them to support shadow MTRR.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Sheng Yang
b558bc0a25 x86: Rename mtrr_state struct and macro names
Prepare for exporting them.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:44 +02:00
Gleb Natapov
5f179287fa KVM: call kvm_arch_vcpu_reset() instead of the kvm_x86_ops callback
Call kvm_arch_vcpu_reset() instead of directly using arch callback.
The function does additional things.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:43 +02:00
Jan Kiszka
3b86cd9967 KVM: VMX: work around lacking VNMI support
Older VMX supporting CPUs do not provide the "Virtual NMI" feature for
tracking the NMI-blocked state after injecting such events. For now
KVM is unable to inject NMIs on those CPUs.

Derived from Sheng Yang's suggestion to use the IRQ window notification
for detecting the end of NMI handlers, this patch implements virtual
NMI support without impact on the host's ability to receive real NMIs.
The downside is that the given approach requires some heuristics that
can cause NMI nesting in vary rare corner cases.

The approach works as follows:
 - inject NMI and set a software-based NMI-blocked flag
 - arm the IRQ window start notification whenever an NMI window is
   requested
 - if the guest exits due to an opening IRQ window, clear the emulated
   NMI-blocked flag
 - if the guest net execution time with NMI-blocked but without an IRQ
   window exceeds 1 second, force NMI-blocked reset and inject anyway

This approach covers most practical scenarios:
 - succeeding NMIs are seperated by at least one open IRQ window
 - the guest may spin with IRQs disabled (e.g. due to a bug), but
   leaving the NMI handler takes much less time than one second
 - the guest does not rely on strict ordering or timing of NMIs
   (would be problematic in virtualized environments anyway)

Successfully tested with the 'nmi n' monitor command, the kgdbts
testsuite on smp guests (additional patches required to add debug
register support to kvm) + the kernel's nmi_watchdog=1, and a Siemens-
specific board emulation (+ guest) that comes with its own NMI
watchdog mechanism.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:43 +02:00
Jan Kiszka
487b391d6e KVM: VMX: Provide support for user space injected NMIs
This patch adds the required bits to the VMX side for user space
injected NMIs. As with the preexisting in-kernel irqchip support, the
CPU must provide the "virtual NMI" feature for proper tracking of the
NMI blocking state.

Based on the original patch by Sheng Yang.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:43 +02:00
Jan Kiszka
c4abb7c9cd KVM: x86: Support for user space injected NMIs
Introduces the KVM_NMI IOCTL to the generic x86 part of KVM for
injecting NMIs from user space and also extends the statistic report
accordingly.

Based on the original patch by Sheng Yang.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:42 +02:00
Jan Kiszka
26df99c6c5 KVM: Kick NMI receiving VCPU
Kick the NMI receiving VCPU in case the triggering caller runs in a
different context.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:42 +02:00
Jan Kiszka
0496fbb973 KVM: x86: VCPU with pending NMI is runnabled
Ensure that a VCPU with pending NMIs is considered runnable.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:42 +02:00
Jan Kiszka
23930f9521 KVM: x86: Enable NMI Watchdog via in-kernel PIT source
LINT0 of the LAPIC can be used to route PIT events as NMI watchdog ticks
into the guest. This patch aligns the in-kernel irqchip emulation with
the user space irqchip with already supports this feature. The trick is
to route PIT interrupts to all LAPIC's LVT0 lines.

Rebased and slightly polished patch originally posted by Sheng Yang.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:41 +02:00
Jan Kiszka
66a5a347c2 KVM: VMX: fix real-mode NMI support
Fix NMI injection in real-mode with the same pattern we perform IRQ
injection.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:41 +02:00
Jan Kiszka
f460ee43e2 KVM: VMX: refactor IRQ and NMI window enabling
do_interrupt_requests and vmx_intr_assist go different way for
achieving the same: enabling the nmi/irq window start notification.
Unify their code over enable_{irq|nmi}_window, get rid of a redundant
call to enable_intr_window instead of direct enable_nmi_window
invocation and unroll enable_intr_window for both in-kernel and user
space irq injection accordingly.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:41 +02:00
Jan Kiszka
33f089ca5a KVM: VMX: refactor/fix IRQ and NMI injectability determination
There are currently two ways in VMX to check if an IRQ or NMI can be
injected:
 - vmx_{nmi|irq}_enabled and
 - vcpu.arch.{nmi|interrupt}_window_open.
Even worse, one test (at the end of vmx_vcpu_run) uses an inconsistent,
likely incorrect logic.

This patch consolidates and unifies the tests over
{nmi|interrupt}_window_open as cache + vmx_update_window_states
for updating the cache content.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:40 +02:00
Jan Kiszka
448fa4a9c5 KVM: x86: Reset pending/inject NMI state on CPU reset
CPU reset invalidates pending or already injected NMIs, therefore reset
the related state variables.

Based on original patch by Gleb Natapov.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:40 +02:00
Jan Kiszka
60637aacfd KVM: VMX: Support for NMI task gates
Properly set GUEST_INTR_STATE_NMI and reset nmi_injected when a
task-switch vmexit happened due to a task gate being used for handling
NMIs. Also avoid the false warning about valid vectoring info in
kvm_handle_exit.

Based on original patch by Gleb Natapov.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:40 +02:00
Jan Kiszka
e4a41889ec KVM: VMX: Use INTR_TYPE_NMI_INTR instead of magic value
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:40 +02:00
Jan Kiszka
a26bf12afb KVM: VMX: include all IRQ window exits in statistics
irq_window_exits only tracks IRQ window exits due to user space
requests, nmi_window_exits include all exits. The latter makes more
sense, so let's adjust irq_window_exits accounting.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:39 +02:00
Guillaume Thouvenin
2786b014ec KVM: x86 emulator: consolidate push reg
This patch consolidate the emulation of push reg instruction.

Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:39 +02:00
Martin Schwidefsky
79741dd357 [PATCH] idle cputime accounting
The cpu time spent by the idle process actually doing something is
currently accounted as idle time. This is plain wrong, the architectures
that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
time spent doing nothing and the time spent by idle doing work. The first
is accounted with account_idle_time and the second with account_system_time.
The architectures that use the account_xxx_time interface directly and not
the account_xxx_ticks interface now need to do the check for the idle
process in their arch code. In particular to improve the system vs true
idle time accounting the arch code needs to measure the true idle time
instead of just testing for the idle process.
To improve the tick based accounting as well we would need an architecture
primitive that can tell us if the pt_regs of the interrupted context
points to the magic instruction that halts the cpu.

In addition idle time is no more added to the stime of the idle process.
This field now contains the system time of the idle process as it should
be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
every tick that occurs while idle is running will be accounted as idle
time.

This patch contains the necessary common code changes to be able to
distinguish idle system time and true idle time. The architectures with
support for VIRT_CPU_ACCOUNTING need some changes to exploit this.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-31 15:11:46 +01:00
Rusty Russell
2ca1a61583 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	arch/x86/kernel/io_apic.c
2008-12-31 23:05:57 +10:30
Linus Torvalds
ab70537c32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: struct device - replace bus_id with dev_name()
  lguest: move the initial guest page table creation code to the host
  kvm-s390: implement config_changed for virtio on s390
  virtio_console: support console resizing
  virtio: add PCI device release() function
  virtio_blk: fix type warning
  virtio: block: dynamic maximum segments
  virtio: set max_segment_size and max_sectors to infinite.
  virtio: avoid implicit use of Linux page size in balloon interface
  virtio: hand virtio ring alignment as argument to vring_new_virtqueue
  virtio: use KVM_S390_VIRTIO_RING_ALIGN instead of relying on pagesize
  virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize
  virtio: Don't use PAGE_SIZE for vring alignment in virtio_pci.
  virtio: rename 'pagesize' arg to vring_init/vring_size
  virtio: Don't use PAGE_SIZE in virtio_pci.c
  virtio: struct device - replace bus_id with dev_name(), dev_set_name()
  virtio-pci queue allocation not page-aligned
2008-12-30 17:37:25 -08:00
Linus Torvalds
526ea064f9 Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  oprofile: select RING_BUFFER
  ring_buffer: adding EXPORT_SYMBOLs
  oprofile: fix lost sample counter
  oprofile: remove nr_available_slots()
  oprofile: port to the new ring_buffer
  ring_buffer: add remaining cpu functions to ring_buffer.h
  oprofile: moving cpu_buffer_reset() to cpu_buffer.h
  oprofile: adding cpu_buffer_entries()
  oprofile: adding cpu_buffer_write_commit()
  oprofile: adding cpu buffer r/w access functions
  ftrace: remove unused function arg in trace_iterator_increment()
  ring_buffer: update description for ring_buffer_alloc()
  oprofile: set values to default when creating oprofilefs
  oprofile: implement switch/case in buffer_sync.c
  x86/oprofile: cleanup IBS init/exit functions in op_model_amd.c
  x86/oprofile: reordering IBS code in op_model_amd.c
  oprofile: fix typo
  oprofile: whitspace changes only
  oprofile: update comment for oprofile_add_sample()
  oprofile: comment cleanup
2008-12-30 17:31:25 -08:00
Linus Torvalds
179475a3b4 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, sparseirq: clean up Kconfig entry
  x86: turn CONFIG_SPARSE_IRQ off by default
  sparseirq: fix numa_migrate_irq_desc dependency and comments
  sparseirq: add kernel-doc notation for new member in irq_desc, -v2
  locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
  sparseirq, xen: make sure irq_desc is allocated for interrupts
  sparseirq: fix !SMP building, #2
  x86, sparseirq: move irq_desc according to smp_affinity, v7
  proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
  sparse irqs: add irqnr.h to the user headers list
  sparse irqs: handle !GENIRQ platforms
  sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
  sparseirq: fix Alpha build failure
  sparseirq: fix typo in !CONFIG_IO_APIC case
  x86, MSI: pass irq_cfg and irq_desc
  x86: MSI start irq numbering from nr_irqs_gsi
  x86: use NR_IRQS_LEGACY
  sparse irq_desc[] array: core kernel and x86 changes
  genirq: record IRQ_LEVEL in irq_desc[]
  irq.h: remove padding from irq_desc on 64bits
2008-12-30 16:20:19 -08:00
Linus Torvalds
bb758e9637 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimers: fix warning in kernel/hrtimer.c
  x86: make sure we really have an hpet mapping before using it
  x86: enable HPET on Fujitsu u9200
  linux/timex.h: cleanup for userspace
  posix-timers: simplify de_thread()->exit_itimers() path
  posix-timers: check ->it_signal instead of ->it_pid to validate the timer
  posix-timers: use "struct pid*" instead of "struct task_struct*"
  nohz: suppress needless timer reprogramming
  clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
  nohz: no softirq pending warnings for offline cpus
  hrtimer: removing all ur callback modes, fix
  hrtimer: removing all ur callback modes, fix hotplug
  hrtimer: removing all ur callback modes
  x86: correct link to HPET timer specification
  rtc-cmos: export second NVRAM bank

Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
manually.
2008-12-30 16:16:21 -08:00
Linus Torvalds
5f34fe1cfc Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
  stacktrace: provide save_stack_trace_tsk() weak alias
  rcu: provide RCU options on non-preempt architectures too
  printk: fix discarding message when recursion_bug
  futex: clean up futex_(un)lock_pi fault handling
  "Tree RCU": scalable classic RCU implementation
  futex: rename field in futex_q to clarify single waiter semantics
  x86/swiotlb: add default swiotlb_arch_range_needs_mapping
  x86/swiotlb: add default phys<->bus conversion
  x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
  x86: add swiotlb allocation functions
  swiotlb: consolidate swiotlb info message printing
  swiotlb: support bouncing of HighMem pages
  swiotlb: factor out copy to/from device
  swiotlb: add arch hook to force mapping
  swiotlb: allow architectures to override phys<->bus<->phys conversions
  swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
  rcu: fix rcutorture behavior during reboot
  resources: skip sanity check of busy resources
  swiotlb: move some definitions to header
  swiotlb: allow architectures to override swiotlb pool allocation
  ...

Fix up trivial conflicts in
  arch/x86/kernel/Makefile
  arch/x86/mm/init_32.c
  include/linux/hardirq.h
as per Ingo's suggestions.
2008-12-30 16:10:19 -08:00
Jaswinder Singh Rajput
7820b75643 x86: xsave.c: restore_user_xstate should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:
arch/x86/kernel/xsave.c:162:5: warning: symbol 'restore_user_xstate' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-30 13:31:41 -08:00
Jaswinder Singh Rajput
fa95826fe0 x86: uv_bau.h: fix dubious bitfield
Impact: cleanup, avoid sparse warnings

declare bitfield as unsigned to avoid dubious bitfield issue

 CHECK   arch/x86/kernel/tlb_64.c
arch/x86/include/asm/uv/uv_bau.h:136:22: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:138:25: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:140:15: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:143:14: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:146:14: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:149:18: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:151:18: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:155:14: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:159:18: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:173:19: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:181:16: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:185:18: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:188:16: error: dubious one-bit signed bitfield

 CHECK   arch/x86/kernel/tlb_uv.c
arch/x86/include/asm/uv/uv_bau.h:136:22: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:138:25: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:140:15: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:143:14: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:146:14: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:149:18: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:151:18: warning: dubious bitfield without explicit `signed' or `unsigned'
arch/x86/include/asm/uv/uv_bau.h:155:14: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:159:18: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:173:19: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:181:16: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:185:18: error: dubious one-bit signed bitfield
arch/x86/include/asm/uv/uv_bau.h:188:16: error: dubious one-bit signed bitfield

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-30 13:31:37 -08:00
Jaswinder Singh Rajput
ec8c842a52 x86: apic.c: xapic_icr_read and x2apic_icr_read should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:
arch/x86/kernel/apic.c:270:5: warning: symbol 'x2apic_icr_read' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-30 13:31:28 -08:00
Jaswinder Singh Rajput
c62e9d56ea x86: bios_uv.c: uv_systab should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:
arch/x86/kernel/bios_uv.c:28:18: warning: symbol 'uv_systab' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-29 22:08:28 -08:00
Jaswinder Singh Rajput
4d08d97f52 x86: genx2apic_phys.c: x2apic_send_IPI_self and init_x2apic_ldr should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warnings

Fixes sparse warnings:
arch/x86/kernel/genx2apic_phys.c:164:6: warning: symbol 'x2apic_send_IPI_self' was not declared. Should it be static?
arch/x86/kernel/genx2apic_phys.c:169:6: warning: symbol 'init_x2apic_ldr' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-29 22:08:22 -08:00
Jaswinder Singh Rajput
557f687c87 x86: amd_iommu.c: prealloc_protection_domains should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:
arch/x86/kernel/amd_iommu.c:1299:6: warning: symbol 'prealloc_protection_domains' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-29 22:08:17 -08:00
Jaswinder Singh Rajput
412a1be265 x86: amd_iommu_init.c: iommu_enable and iommu_enable_event_logging should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:
arch/x86/kernel/amd_iommu_init.c:246:13: warning: symbol 'iommu_enable' was not declared. Should it be static?
arch/x86/kernel/amd_iommu_init.c:259:13: warning: symbol 'iommu_enable_event_logging' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-29 22:08:12 -08:00
Matias Zabaljauregui
58a2456644 lguest: move the initial guest page table creation code to the host
This patch moves the initial guest page table creation code to the host,
so the launcher keeps working with PAE enabled configs.

Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30 09:26:11 +10:30
Rusty Russell
33edcf133b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-12-30 08:02:35 +10:30
Jaswinder Singh Rajput
824877111c x86, pci: move arch/x86/pci/pci.h to arch/x86/include/asm/pci_x86.h
Impact: cleanup

Now that arch/x86/pci/pci.h is used in a number of other places as well,
move the lowlevel x86 pci definitions into the architecture include files.
(not to be confused with the existing arch/x86/include/asm/pci.h file,
which provides public details about x86 PCI)

Tested on: X86_32_UP, X86_32_SMP and X86_64_SMP

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 18:17:36 +01:00
Jaswinder Singh Rajput
c854c91979 x86_64: pci-gart_64.c iommu_fullflush should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:

  arch/x86/kernel/pci-gart_64.c:55:5: warning: symbol 'iommu_fullflush' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 18:17:33 +01:00
Jaswinder Singh Rajput
cbafbc826b x86: efi.c declare add_efi_memmap before they get used
Impact: cleanup, avoid sparse warning

Fixes this sparse warning:

  arch/x86/kernel/efi.c:67:5: warning: symbol 'add_efi_memmap' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 18:17:32 +01:00
Jaswinder Singh Rajput
7f3e632f9d x86: io_apic.c io_apic_sync should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:

  arch/x86/kernel/io_apic.c:709:6: warning: symbol 'io_apic_sync' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 18:17:31 +01:00
Jaswinder Singh Rajput
a1ae299dfb x86: apic.c declare pic_mode before they get used
Impact: cleanup, avoid sparse warning

In asm/mpspec.h moved out pic_mode from CONFIG_X86_32 as it is common
for both 32 and 64 bit.

Fixes this sparse warning for x86_64:

  arch/x86/kernel/apic.c:128:5: warning: symbol 'pic_mode' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 18:17:30 +01:00
Robert Richter
83bd924395 x86/oprofile: fix pci_dev use count for AMD northbridge devices
This patch fixes the PCI device use count for AMD northbridge
devices. In case of an IBS LVT initialization failure, the PCI device
is released now by calling pci_dev_put().

If there are no initialization errors, the devices are released in
pci_get_device() while iterating.

Signed-off-by: Robert Richter <robert.richter@amd.com>
2008-12-29 15:19:32 +01:00
Jaswinder Singh Rajput
2f06de0671 x86: introducing asm/sys_ia32.h
Impact: cleanup, avoid 44 sparse warnings, new file asm/sys_ia32.h

Fixes following sparse warnings:

  CHECK   arch/x86/ia32/sys_ia32.c
arch/x86/ia32/sys_ia32.c:53:17: warning: symbol 'sys32_truncate64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:60:17: warning: symbol 'sys32_ftruncate64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:98:17: warning: symbol 'sys32_stat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:109:17: warning: symbol 'sys32_lstat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:119:17: warning: symbol 'sys32_fstat64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:128:17: warning: symbol 'sys32_fstatat' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:164:17: warning: symbol 'sys32_mmap' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:195:17: warning: symbol 'sys32_mprotect' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:201:17: warning: symbol 'sys32_pipe' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:215:17: warning: symbol 'sys32_rt_sigaction' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:291:17: warning: symbol 'sys32_sigaction' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:330:17: warning: symbol 'sys32_rt_sigprocmask' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:370:17: warning: symbol 'sys32_alarm' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:383:17: warning: symbol 'sys32_old_select' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:393:17: warning: symbol 'sys32_waitpid' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:401:17: warning: symbol 'sys32_sysfs' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:406:17: warning: symbol 'sys32_sched_rr_get_interval' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:421:17: warning: symbol 'sys32_rt_sigpending' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:445:17: warning: symbol 'sys32_rt_sigqueueinfo' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:472:17: warning: symbol 'sys32_sysctl' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:517:17: warning: symbol 'sys32_pread' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:524:17: warning: symbol 'sys32_pwrite' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:532:17: warning: symbol 'sys32_personality' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:545:17: warning: symbol 'sys32_sendfile' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:565:17: warning: symbol 'sys32_mmap2' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:589:17: warning: symbol 'sys32_olduname' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:626:6: warning: symbol 'sys32_uname' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:641:6: warning: symbol 'sys32_ustat' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:663:17: warning: symbol 'sys32_execve' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:678:17: warning: symbol 'sys32_clone' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:693:6: warning: symbol 'sys32_lseek' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:698:6: warning: symbol 'sys32_kill' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:703:6: warning: symbol 'sys32_fadvise64_64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:712:6: warning: symbol 'sys32_vm86_warning' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:726:6: warning: symbol 'sys32_lookup_dcookie' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:732:20: warning: symbol 'sys32_readahead' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:738:17: warning: symbol 'sys32_sync_file_range' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:746:17: warning: symbol 'sys32_fadvise64' was not declared. Should it be static?
arch/x86/ia32/sys_ia32.c:753:17: warning: symbol 'sys32_fallocate' was not declared. Should it be static?
  CHECK   arch/x86/ia32/ia32_signal.c
arch/x86/ia32/ia32_signal.c:126:17: warning: symbol 'sys32_sigsuspend' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:141:17: warning: symbol 'sys32_sigaltstack' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:249:17: warning: symbol 'sys32_sigreturn' was not declared. Should it be static?
arch/x86/ia32/ia32_signal.c:279:17: warning: symbol 'sys32_rt_sigreturn' was not declared. Should it be static?
  CHECK   arch/x86/ia32/ipc32.c
arch/x86/ia32/ipc32.c:12:17: warning: symbol 'sys32_ipc' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 13:18:40 +01:00
Cyrill Gorcunov
c805b7300e x86: mach-default setup.c cleanups
Impact: cleanup

- Break long lines into shorter form.
- Use pr_ macros instead of plain printk.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 13:18:39 +01:00
Sergio Luis
6092848a2a x86: mark get_cpu_leaves() with __cpuinit annotation
Impact: fix section mismatch warning

Commit b2bb855491 ("x86: Remove cpumask games
in x86/kernel/cpu/intel_cacheinfo.c") introduced get_cpu_leaves(), which
references __cpuinit cpuid4_cache_lookup().

Mark get_cpu_leaves() with a __cpuinit annotation.

Signed-off-by: Sergio Luis <sergio@larces.uece.br>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 13:11:18 +01:00
Ingo Molnar
75329f1f0c Merge branch 'linus' into x86/cleanups 2008-12-29 13:09:40 +01:00
Linus Torvalds
96faec945f Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (25 commits)
  allow stripping of generated symbols under CONFIG_KALLSYMS_ALL
  kbuild: strip generated symbols from *.ko
  kbuild: simplify use of genksyms
  kernel-doc: check for extra kernel-doc notations
  kbuild: add headerdep used to detect inclusion cycles in header files
  kbuild: fix string equality testing in tags.sh
  kbuild: fix make tags/cscope
  kbuild: fix make incompatibility
  kbuild: remove TAR_IGNORE
  setlocalversion: add git-svn support
  setlocalversion: print correct subversion revision
  scripts: improve the decodecode script
  scripts/package: allow custom options to rpm
  genksyms: allow to ignore symbol checksum changes
  genksyms: track symbol checksum changes
  tags and cscope support really belongs in a shell script
  kconfig: fix options to check-lxdialog.sh
  kbuild: gen_init_cpio expands shell variables in file names
  remove bashisms from scripts/extract-ikconfig
  kbuild: teach mkmakfile to be silent
  ...
2008-12-28 15:13:48 -08:00
Linus Torvalds
1db2a5c11e Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits)
  [S390] provide documentation for hvc_iucv kernel parameter.
  [S390] convert ctcm printks to dev_xxx and pr_xxx macros.
  [S390] convert zfcp printks to pr_xxx macros.
  [S390] convert vmlogrdr printks to pr_xxx macros.
  [S390] convert zfcp dumper printks to pr_xxx macros.
  [S390] convert cpu related printks to pr_xxx macros.
  [S390] convert qeth printks to dev_xxx and pr_xxx macros.
  [S390] convert sclp printks to pr_xxx macros.
  [S390] convert iucv printks to dev_xxx and pr_xxx macros.
  [S390] convert ap_bus printks to pr_xxx macros.
  [S390] convert dcssblk and extmem printks messages to pr_xxx macros.
  [S390] convert monwriter printks to pr_xxx macros.
  [S390] convert s390 debug feature printks to pr_xxx macros.
  [S390] convert monreader printks to pr_xxx macros.
  [S390] convert appldata printks to pr_xxx macros.
  [S390] convert setup printks to pr_xxx macros.
  [S390] convert hypfs printks to pr_xxx macros.
  [S390] convert time printks to pr_xxx macros.
  [S390] convert cpacf printks to pr_xxx macros.
  [S390] convert cio printks to pr_xxx macros.
  ...
2008-12-28 12:33:21 -08:00
Linus Torvalds
a39b863342 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  sched: fix warning in fs/proc/base.c
  schedstat: consolidate per-task cpu runtime stats
  sched: use RCU variant of list traversal in for_each_leaf_rt_rq()
  sched, cpuacct: export percpu cpuacct cgroup stats
  sched, cpuacct: refactoring cpuusage_read / cpuusage_write
  sched: optimize update_curr()
  sched: fix wakeup preemption clock
  sched: add missing arch_update_cpu_topology() call
  sched: let arch_update_cpu_topology indicate if topology changed
  sched: idle_balance() does not call load_balance_newidle()
  sched: fix sd_parent_degenerate on non-numa smp machine
  sched: add uid information to sched_debug for CONFIG_USER_SCHED
  sched: move double_unlock_balance() higher
  sched: update comment for move_task_off_dead_cpu
  sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares
  sched/rt: removed unneeded defintion
  sched: add hierarchical accounting to cpu accounting controller
  sched: include group statistics in /proc/sched_debug
  sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER
  sched: clean up SCHED_CPUMASK_ALLOC
  ...
2008-12-28 12:27:58 -08:00
Linus Torvalds
b0f4b285d7 Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
  sched, trace: update trace_sched_wakeup()
  tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
  Revert "x86: disable X86_PTRACE_BTS"
  ring-buffer: prevent false positive warning
  ring-buffer: fix dangling commit race
  ftrace: enable format arguments checking
  x86, bts: memory accounting
  x86, bts: add fork and exit handling
  ftrace: introduce tracing_reset_online_cpus() helper
  tracing: fix warnings in kernel/trace/trace_sched_switch.c
  tracing: fix warning in kernel/trace/trace.c
  tracing/ring-buffer: remove unused ring_buffer size
  trace: fix task state printout
  ftrace: add not to regex on filtering functions
  trace: better use of stack_trace_enabled for boot up code
  trace: add a way to enable or disable the stack tracer
  x86: entry_64 - introduce FTRACE_ frame macro v2
  tracing/ftrace: add the printk-msg-only option
  tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
  x86, bts: correctly report invalid bts records
  ...

Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
2008-12-28 12:21:10 -08:00
Linus Torvalds
be9c5ae4ee Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (246 commits)
  x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32
  x86: PAT: fix address types in track_pfn_vma_new()
  x86: prioritize the FPU traps for the error code
  x86: PAT: pfnmap documentation update changes
  x86: PAT: move track untrack pfnmap stubs to asm-generic
  x86: PAT: remove follow_pfnmap_pte in favor of follow_phys
  x86: PAT: modify follow_phys to return phys_addr prot and return value
  x86: PAT: clarify is_linear_pfn_mapping() interface
  x86: ia32_signal: remove unnecessary declaration
  x86: common.c boot_cpu_stack and boot_exception_stacks should be static
  x86: fix intel x86_64 llc_shared_map/cpu_llc_id anomolies
  x86: fix warning in arch/x86/kernel/microcode_amd.c
  x86: ia32.h: remove unused struct sigfram32 and rt_sigframe32
  x86: asm-offset_64: use rt_sigframe_ia32
  x86: sigframe.h: include headers for dependency
  x86: traps.c declare functions before they get used
  x86: PAT: update documentation to cover pgprot and remap_pfn related changes - v3
  x86: PAT: add pgprot_writecombine() interface for drivers - v3
  x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3
  x86: PAT: implement track/untrack of pfnmap regions for x86 - v3
  ...
2008-12-28 12:07:57 -08:00
Linus Torvalds
bb26c6c29b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (105 commits)
  SELinux: don't check permissions for kernel mounts
  security: pass mount flags to security_sb_kern_mount()
  SELinux: correctly detect proc filesystems of the form "proc/foo"
  Audit: Log TIOCSTI
  user namespaces: document CFS behavior
  user namespaces: require cap_set{ug}id for CLONE_NEWUSER
  user namespaces: let user_ns be cloned with fairsched
  CRED: fix sparse warnings
  User namespaces: use the current_user_ns() macro
  User namespaces: set of cleanups (v2)
  nfsctl: add headers for credentials
  coda: fix creds reference
  capabilities: define get_vfs_caps_from_disk when file caps are not enabled
  CRED: Allow kernel services to override LSM settings for task actions
  CRED: Add a kernel_service object class to SELinux
  CRED: Differentiate objective and effective subjective credentials on a task
  CRED: Documentation
  CRED: Use creds in file structs
  CRED: Prettify commoncap.c
  CRED: Make execve() take advantage of copy-on-write credentials
  ...
2008-12-28 11:43:54 -08:00
FUJITA Tomonori
1da4f9894c swiotlb: replace architecture-specific swiotlb.h with linux/swiotlb.h
Impact: cleanup

This replaces architecture-specific swiotlb.h (X86 and IA64) with
linux/swiotlb.h.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 10:04:00 +01:00
Jeremy Fitzhardinge
70a7d3cc13 swiotlb: add hwdev to swiotlb_phys_to_bus() / swiotlb_sg_to_bus()
Impact: extend functions with a (yet unused) parameter, update callsites

Some architectures need it - in preparation for highmem swiotlb.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-28 09:54:52 +01:00
Yinghai Lu
13a0c3c269 sparseirq: work around compiler optimizing away __weak functions
Impact: fix panic on null pointer with sparseirq

Some GCC versions seem to inline the weak global function,
when that function is empty.

Work it around, by making the functions return a (dummy) integer.

Signed-off-by: Yinghai <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-27 13:24:00 +01:00
Jaswinder Singh
b6b301aa9f x86: apic.c x2apic_preenabled and disable_x2apic should be static
Impact: cleanup, reduce kernel size a bit, avoid sparse warning

Fixes sparse warning:

  arch/x86/kernel/apic.c:103:5: warning: symbol 'disable_x2apic' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-27 11:30:18 +01:00
Jaswinder Singh
0b936bfdeb x86: reboot.c declare port_cf9_safe before they get used
Impact: cleanup, avoid sparse warning

Include "../pci/pci.h" for port_cf9_safe

Fixes this sparse warning:

  arch/x86/kernel/reboot.c:43:6: warning: symbol 'port_cf9_safe' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-27 11:30:17 +01:00
Ingo Molnar
34bf5d0ff5 Merge branch 'x86/core' into x86/cleanups 2008-12-27 11:30:05 +01:00
Rusty Russell
030bb203e0 cpumask: cpu_coregroup_mask(): x86
Impact: New API

Like cpu_coregroup_map, but returns a (const) pointer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
2008-12-26 22:23:41 +10:30
Rusty Russell
393d68fb99 cpumask: x86: Introduce cpumask_of_{node,pcibus} to replace {node,pcibus}_to_cpumask
Impact: New APIs

The old node_to_cpumask/node_to_pcibus returned a cpumask_t: these
return a pointer to a struct cpumask.  Part of removing cpumasks from
the stack.

Also makes __pcibus_to_node take a const pointer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Ingo Molnar <mingo@elte.hu>
2008-12-26 22:23:38 +10:30
KOSAKI Motohiro
18eefedfe8 irq: simplify for_each_irq_desc() usage
Impact: cleanup

all for_each_irq_desc() usage point have !desc check.
then its check can move into for_each_irq_desc() macro.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-26 09:48:18 +01:00
Ingo Molnar
bd8b96dfc2 x86: clean up comment style in arch/x86/kernel/traps.c
Impact: cleanup

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-26 09:21:13 +01:00
Ingo Molnar
c656d9ca48 Merge branch 'x86/fpu' into x86/cleanups 2008-12-26 09:21:05 +01:00
Rusty Russell
71ab6b245f x86: remove impossible test in mtrr/main.c
Impact: cleanup

enable_mtrr_cleanup is static, and is never set to anything but 0 or 1.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-25 12:46:06 -08:00
H. Peter Anvin
a73ad3331f x86: unify the implementation of FPU traps
On 32 bits, we may suffer IRQ 13, or supposedly we might have a buggy
implementation which gives spurious trap 16.  We did not check for
this on 64 bits, but there is no reason we can't make the code the
same in both cases.  Furthermore, this is presumably rare, so do the
spurious check last, instead of first.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-25 10:39:01 -08:00
Ingo Molnar
32e8d18683 Merge branches 'timers/clocksource', 'timers/hpet', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core 2008-12-25 18:02:25 +01:00
Ingo Molnar
860cf8894b Merge branches 'irq/sparseirq', 'irq/genirq' and 'irq/urgent'; commit 'v2.6.28' into irq/core 2008-12-25 16:27:54 +01:00
Ingo Molnar
973656fe1a x86, sparseirq: clean up Kconfig entry
Impact: improve help text

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25 16:26:47 +01:00
Ingo Molnar
6638101c11 Merge branches 'core/debugobjects', 'core/iommu', 'core/locking', 'core/printk', 'core/rcu', 'core/resources', 'core/softirq' and 'core/stacktrace' into core/core 2008-12-25 14:06:29 +01:00
Ingo Molnar
0b271ef452 Merge commit 'v2.6.28' into core/core 2008-12-25 13:51:46 +01:00
Ingo Molnar
4e202284e6 Merge branch 'sched/urgent'; commit 'v2.6.28' into sched/core 2008-12-25 13:42:23 +01:00
Martin Schwidefsky
fc5243d98a [S390] arch_setup_additional_pages arguments
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.

What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:54 +01:00
Ingo Molnar
5250d329e3 Merge branches 'tracing/ftrace', 'tracing/hw-branch-tracing' and 'tracing/ring-buffer'; commit 'v2.6.28' into tracing/core 2008-12-25 13:11:00 +01:00
Ingo Molnar
a3eeeefbf1 Merge branch 'x86/tsc' into tracing/core
Merge it to resolve this incidental conflict between the BTS fixes/cleanups
and changes in x86/tsc:

Conflicts:
	arch/x86/kernel/cpu/intel.c
2008-12-25 12:48:18 +01:00
Ingo Molnar
4e17fee24a x86: turn CONFIG_SPARSE_IRQ off by default
New feature - lets disable it by default first - can flip it around
later.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25 12:04:17 +01:00
Ingo Molnar
79a66b96c3 Merge branches 'x86/pat2' and 'x86/fpu'; commit 'v2.6.28' into x86/core 2008-12-25 11:50:41 +01:00
Jaswinder Singh
1fcccb008b x86: traps.c replace #if CONFIG_X86_32 with #ifdef CONFIG_X86_32
Impact: cleanup, avoid warning on X86_64

Fixes this warning on X86_64:

  CC      arch/x86/kernel/traps.o
  arch/x86/kernel/traps.c:695:5: warning: "CONFIG_X86_32" is not defined

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25 11:49:55 +01:00
Frederic Weisbecker
0ca59dd948 tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Impact: fix a crash/hard-reboot on certain configs while enabling cpu runtime

On some archs, the boot of a secondary cpu can have an early fragile state.
On x86-64, the pda is not initialized on the first stage of a cpu boot but
it is needed to get the cpu number and the current task pointer. This data
is needed during tracing. As they were dereferenced at this stage, we got a
crash while tracing a cpu being enabled at runtime.

Some other archs like ia64 can have such kind of issue too.

Changes on v2:

We dropped the previous solution of a per-arch called function to guess the
current state of a cpu. That could slow down the tracing.

This patch removes the -pg flag on arch/x86/kernel/cpu/common.c where
the low level cpu boot functions exist, on start_secondary() and a helper
function used at this stage.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-25 09:39:22 +01:00
James Morris
cbacc2c7f0 Merge branch 'next' into for-linus 2008-12-25 11:40:09 +11:00
Herbert Xu
b7e8bdadce crypto: crc32c-intel - Switch to shash
This patch changes crc32c-intel to the new shash interface.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-25 11:01:37 +11:00
Kent Liu
1c06da81a5 crypto: crc32c-intel - Update copyright head
The original copyright head for crc32c-intel.c is incorrect. Please merge
the patch to update it.

Signed-Off-By: Kent Liu <kent.liu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-12-25 11:01:18 +11:00
Ingo Molnar
67be403d89 Revert "x86: disable X86_PTRACE_BTS"
This reverts commit 40f15ad8aa.

The CONFIG_X86_PTRACE_BTS bugs have been fixed via:

 c5dee61: x86, bts: memory accounting
 bf53de9: x86, bts: add fork and exit handling

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-24 21:08:37 +01:00
Ingo Molnar
db8862eafe Merge branch 'linus' into tracing/hw-branch-tracing 2008-12-24 21:08:26 +01:00
Ingo Molnar
40f15ad8aa x86: disable X86_PTRACE_BTS
there's a new ptrace arch level feature in .28:

  config X86_PTRACE_BTS
  bool "Branch Trace Store"

it has broken fork() handling: the old DS area gets copied over into
a new task without clearing it.

Fixes exist but they came too late:

  c5dee61: x86, bts: memory accounting
  bf53de9: x86, bts: add fork and exit handling

and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-24 10:49:51 +01:00
H. Peter Anvin
c1c15b65ec x86: PAT: fix address types in track_pfn_vma_new()
Impact: cleanup, fix warning

This warning:

 arch/x86/mm/pat.c: In function track_pfn_vma_copy:
 arch/x86/mm/pat.c:701: warning: passing argument 5 of follow_phys from incompatible pointer type

Triggers because physical addresses are resource_size_t, not u64.

This really matters when calling an interface like follow_phys() which
takes a pointer to a physical address -- although on x86, being
littleendian, it would generally work anyway as long as the memory region
wasn't completely uninitialized.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-24 10:40:19 +01:00
Ingo Molnar
c3d80000e3 x86: export vector_used_by_percpu_irq
Impact: build fix

lguest can be built as a module and makes use of this new symbol:

ERROR: "vector_used_by_percpu_irq" [drivers/lguest/lg.ko] undefined!

export it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 22:37:31 +01:00
Suresh Siddha
7d87d53655 x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
These commits:

	commit 95d313cf1c
	Author: Mike Travis <travis@sgi.com>
	Date:   Tue Dec 16 17:33:54 2008 -0800

	    x86: Add cpu_mask_to_apicid_and

and
	commit 6eeb7c5a99
	Author: Mike Travis <travis@sgi.com>
	Date:   Tue Dec 16 17:33:55 2008 -0800

	    x86: update add-cpu_mask_to_apicid_and to use struct cpumask*

broke interrupt delivery on x2apic platforms.  As x2apic cluster mode uses
logical delivery mode, we need to use logical apicid instead of physical apicid
in x2apic_cpu_mask_to_apicid_and()

Impact: fixes the broken interrupt delivery issue on generic x2apic platforms.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 22:37:30 +01:00
Yinghai Lu
b77b881f21 x86: fix lguest used_vectors breakage, -v2
Impact: fix lguest, clean up

32-bit lguest used used_vectors to record vectors, but that model of
allocating vectors changed and got broken, after we changed vector
allocation to a per_cpu array.

Try enable that for 64bit, and the array is used for all vectors that
are not managed by vector_irq per_cpu array.

Also kill system_vectors[], that is now a duplication of the
used_vectors bitmap.

[ merged in cpus4096 due to io_apic.c cpumask changes. ]
[ -v2, fix build failure ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-23 22:37:28 +01:00
Ingo Molnar
bed4f13065 Merge branch 'x86/irq' into x86/core 2008-12-23 16:30:31 +01:00
Ingo Molnar
3e5621edb3 Merge branch 'x86/iommu' into x86/core 2008-12-23 16:30:27 +01:00
Ingo Molnar
be9a1d3c2e Merge branch 'x86/tsc' into x86/core 2008-12-23 16:30:20 +01:00
Ingo Molnar
7e3cbc3f77 Merge branch 'x86/ptrace' into x86/tsc
Conflicts:
	arch/x86/kernel/cpu/intel.c
2008-12-23 16:29:31 +01:00
Ingo Molnar
fa623d1b02 Merge branches 'x86/apic', 'x86/cleanups', 'x86/cpufeature', 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/detect-hyper', 'x86/doc', 'x86/dumpstack', 'x86/early-printk', 'x86/fpu', 'x86/idle', 'x86/io', 'x86/memory-corruption-check', 'x86/microcode', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/pat2', 'x86/pci-ioapic-boot-irq-quirks', 'x86/ptrace', 'x86/quirks', 'x86/reboot', 'x86/setup-memory', 'x86/signal', 'x86/sparse-fixes', 'x86/time', 'x86/uv' and 'x86/xen' into x86/core 2008-12-23 16:27:23 +01:00
Ingo Molnar
bf8bd66d05 Merge branch 'x86/apic' into x86/irq
Conflicts:
	arch/x86/kernel/apic.c
2008-12-23 16:24:15 +01:00
Ingo Molnar
1ccedb7cdb Merge commit 'v2.6.28-rc9' into x86/apic 2008-12-23 16:23:23 +01:00
Len Brown
7b37b5fd9b ACPI: disable MPS when NO APIC-table found
When ACPI is asked to find an MADT (APIC table)
and fails, then ACPI expects to run in PIC mode.

However, if an MP Table is was found, IRQs will be
registered as if an IOAPIC is being used, even
though ACPI is configuring interrupt links links for PIC mode.

In this scenario, disable MPS so that IRQs
are registered in PIC mode, consistent with ACPI.

http://bugzilla.kernel.org/show_bug.cgi?id=12257

Signed-off-by: Len Brown <len.brown@intel.com>
2008-12-23 01:47:42 -05:00
H. Peter Anvin
adf77bac05 x86: prioritize the FPU traps for the error code
In the case of multiple FPU errors, prioritize the error codes,
instead of returning __SI_FAULT, which ends up pushing a 0 as the
error code to userspace, a POSIX violation.

For i386, we will simply return if there are no errors at all; for
x86-64 this is probably a "can't happen" (and the code should be
unified), but for this patch, return __SI_FAULT|SI_KERNEL if this ever
happens.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-22 18:00:18 -08:00
Dmitry Adamushko
280a9ca5d0 x86: fix resume (S2R) broken by Intel microcode module, on A110L
Impact: fix deadlock

This is in response to the following bug report:

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12100
Subject         : resume (S2R) broken by Intel microcode module, on A110L
Submitter       : Andreas Mohr <andi@lisas.de>
Date            : 2008-11-25 08:48 (19 days old)
Handled-By      : Dmitry Adamushko <dmitry.adamushko@gmail.com>

[ The deadlock scenario has been discovered by Andreas Mohr ]

I think I might have a logical explanation why the system:

  (http://bugzilla.kernel.org/show_bug.cgi?id=12100)

might hang upon resuming, OTOH it should have likely hanged each and every time.

(1) possible deadlock in microcode_resume_cpu() if either 'if' section is
taken;

(2) now, I don't see it in spec. and can't experimentally verify it (newer
ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
be back to its original one (we need to reload ucode anyway so it doesn't seem
logical if a cpu doesn't drop the version)... if so, the comparison with
memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
the aforementioned 'if' sections might have been triggered - leading to a
deadlock.

Obviously, in my tests I simulated loading/resuming with the ucode of the same
version (just to see that the file is loaded/re-loaded upon resuming) so this
issue has never popped up.

I'd appreciate if someone with an appropriate system might give a try to the
2nd patch (titled "fix a comparison && deadlock...").

In any case, the deadlock situation is a must-have fix.

Reported-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Tested-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-20 14:29:20 +01:00
Markus Metzger
c5dee6177f x86, bts: memory accounting
Impact: move the BTS buffer accounting to the mlock bucket

Add alloc_locked_buffer() and free_locked_buffer() functions to mm/mlock.c
to kalloc a buffer and account the locked memory to current.

Account the memory for the BTS buffer to the tracer.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-20 09:15:47 +01:00
Markus Metzger
bf53de907d x86, bts: add fork and exit handling
Impact: introduce new ptrace facility

Add arch_ptrace_untrace() function that is called when the tracer
detaches (either voluntarily or when the tracing task dies);
ptrace_disable() is only called on a voluntary detach.

Add ptrace_fork() and arch_ptrace_fork(). They are called when a
traced task is forked.

Clear DS and BTS related fields on fork.

Release DS resources and reclaim memory in ptrace_untrace(). This
releases resources already when the tracing task dies. We used to do
that when the traced task dies.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-20 09:15:46 +01:00
venkatesh.pallipadi@intel.com
34801ba9bf x86: PAT: move track untrack pfnmap stubs to asm-generic
Impact: Cleanup and branch hints only.

Move the track and untrack pfn stub routines from memory.c to asm-generic.
Also add unlikely to pfnmap related calls in fork and exit path.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-19 15:40:30 -08:00
venkatesh.pallipadi@intel.com
982d789ab7 x86: PAT: remove follow_pfnmap_pte in favor of follow_phys
Impact: Cleanup - removes a new function in favor of a recently modified older one.

Replace follow_pfnmap_pte in pat code with follow_phys. follow_phys lso
returns protection eliminating the need of pte_pgprot call. Using follow_phys
also eliminates the need for pte_pa.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-12-19 15:40:30 -08:00
Hiroshi Shimamoto
8403295e0f x86: ia32_signal: remove unnecessary declaration
Impact: cleanup

No need to declare do_signal().

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19 23:33:59 +01:00
Jaswinder Singh
34945ede31 x86: common.c boot_cpu_stack and boot_exception_stacks should be static
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit

Fixes these sparse warnings:

 arch/x86/kernel/cpu/common.c:869:6: warning: symbol 'boot_cpu_stack' was not declared. Should it be static?
 arch/x86/kernel/cpu/common.c:910:6: warning: symbol 'boot_exception_stacks' was not declared. Should it be static?

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19 23:16:08 +01:00
Yinghai Lu
b909895739 sparseirq: fix numa_migrate_irq_desc dependency and comments
Impact: reduce kconfig variable scope and clean up

Bartlomiej pointed out that the config dependencies and comments are not right.

update it depend to NUMA, and fix some comments

Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-19 22:56:02 +01:00
Jan Beulich
9bb482476c allow stripping of generated symbols under CONFIG_KALLSYMS_ALL
Building upon parts of the module stripping patch, this patch
introduces similar stripping for vmlinux when CONFIG_KALLSYMS_ALL=y.
Using CONFIG_KALLSYMS_STRIP_GENERATED reduces the overhead of
CONFIG_KALLSYMS_ALL from 245k/310k to 65k/80k for the (i386/x86-64)
kernels I tested with.

The patch also does away with the need to special case the kallsyms-
internal symbols by making them available even in the first linking
stage.

While it is a generated file, the patch includes the changes to
scripts/genksyms/keywords.c_shipped, as I'm unsure what the procedure
here is.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-12-19 22:47:10 +01:00