ps3av:
- Move the definition of struct ps3av to ps3av.c, as it's locally used only.
- Kill ps3av.sem, use the existing ps3av.mutex instead.
- Make the 512-byte buffer in ps3av_do_pkt() static to reduce stack usage.
Its use is protected by a semaphore anyway.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ps3av: Replace the kernel_thread and the ping pong semaphores by a singlethread
workqueue and a completion.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ps3fb: Replace the kernel_thread and the semaphore by a proper kthread, which
is simply woken up when the screen must be updated.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The recent conversion from `memcpy' to `skb_copy_from_linear_data' removed a
few casts, which were needed to silence compiler warnings. Re-add them.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kill resource_size_t warnings by casting resource_size_t to unsigned long when
formatting Zorro bus resources, as they are always 32-bit.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
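A minimal userspace sketch of the cast described in the commit above, assuming
resource_size_t is a 64-bit type in the configuration that warned. The helper
name format_zorro_range() is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: resource_size_t is 64 bits wide in this configuration. */
typedef uint64_t resource_size_t;

/* Hypothetical helper: format a Zorro resource range into buf. */
static int format_zorro_range(char *buf, size_t len,
                              resource_size_t start, resource_size_t end)
{
    /* Zorro addresses fit in 32 bits, so narrowing loses nothing. */
    assert(start <= UINT32_MAX && end <= UINT32_MAX);
    /* The casts make the arguments match the %lx conversions. */
    return snprintf(buf, len, "0x%08lx-0x%08lx",
                    (unsigned long)start, (unsigned long)end);
}
```

Without the casts, passing a 64-bit value to %lx is a format mismatch on
platforms where unsigned long is 32 bits wide.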
Install the built-in macsonic interrupt handler on both IRQs when using
via_alt_mapping. Otherwise the rare interrupt that still comes from the
nubus slot will wedge the nubus.
$ cat /proc/interrupts
auto 2: 89176 via2
auto 3: 744367 sonic
auto 4: 0 scc
auto 6: 318363 via1
auto 7: 0 NMI
mac 9: 119413 framebuffer vbl
mac 10: 1971 ADB
mac 14: 198517 timer
mac 17: 89104 nubus
mac 19: 72 Mac ESP SCSI
mac 56: 629 sonic
mac 62: 1142593 ide0
Version 1 of this patch had a bug where a nubus sonic card would register
two interrupt handlers. Only a built-in sonic needs both.
Versions 2 and 3 needed some cleanups, as Raylynn Knight and Christoph
Hellwig pointed out (thanks).
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a potential problem in the timeout handling: don't free the DMA buffers
before resetting the chip.
Also a trivial cleanup. Bring macsonic and jazzsonic into sync.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a race condition in the transmit code, where the dma interrupt could update
the free tx buffer count concurrently and wedge the tx queue.
Fix the misuse of the rx frame status and rx frame length registers: no more
"fifo overrun" errors caused by the OFLOW bit being tested in the frame length
register (instead of the status register), and no more missed packets due to
incorrect length taken from status register (instead of the frame length
register).
Fix a panic (skb_over_panic BUG) caused by allocating and then copying an
incoming packet while the packet length register was changing.
Cut-and-paste the reset code from the powermac mace driver (mace.c), so the NIC
functions when MacOS does not initialise it (important for anyone wanting to
use the Emile boot loader).
Cut-and-paste the error counting and timeout recovery code from mace.c.
Fix over allocation of rx buffer memory (it's page order, not page count).
Converted to driver model.
Converted to DMA API.
Since I've run out of ways to make it fail, and since it performs well now,
promote the driver from EXPERIMENTAL status. Tested on both quadra 840av and
660av.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
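The "page order, not page count" remark in the commit above can be sketched as
follows. This is a userspace illustration only: the 4096-byte page size and
the helper names are assumptions, not code from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumed page size */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;

    assert(size > 0);
    while ((SKETCH_PAGE_SIZE << order) < size)
        order++;
    return order;
}

/* Number of pages an order-based allocator hands out for a given order. */
static unsigned long pages_for_order(unsigned int order)
{
    return 1UL << order;
}
```

Eight pages of buffer need order 3. Passing the count 8 where an order is
expected would request 256 pages instead.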
Fix the flakiness in the CUDA ADB driver on m68k macs (keypresses getting
wedged down or ADB just going AWOL altogether).
The only IRQ used by this driver is the VIA shift register IRQ. The PowerMac
conditional code disables the other VIA IRQ sources, so don't mess with the
other IRQ flags in the common code -- m68k macs need them.
When polling, don't disable local interrupts when we only need to disable the
CUDA interrupt.
Unless polling, don't clear the shift register IRQ flag. On m68k macs this
creates a race that often breaks CUDA ADB.
Tested on Quadra 840av and LC630 (both m68k); also Beige G3 (powerpc).
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a crash caused by requests placed in the queue with the completed flag
already set. This led to some ADB_SYNC requests returning early and their
request structs being popped off the stack while still queued. Stack corruption
ensued or an invalid request callback pointer was invoked or both. Eliminate
macii_retransmit() and its buggy implementation of macii_write(). Have
macii_queue_poll() fully initialise the request queues.
Fix a bug in macii_queue_poll() where the last_req pointer was not being set.
This caused some requests to leave the queue before being completed (and would
also corrupt the stack under certain conditions).
Fix a race in macii_start that could set the state machine to "reading" while
current_req was null.
No longer send poll commands with the ADBREQ_REPLY flag -- doing that caused
the replies to be stored in the request buffer where they were forgotten
about.
Don't autopoll by continuously sending new Talk commands. Get the controller to
do that for us. This reduces the ADB interrupt rate on an idle bus to about 5
per second. Only autopoll the devices that were probed.
Explicitly clear the interrupt flag when polling.
Use disable_irq rather than local_irq_save when polling.
Remove excess local_irq_save/restore pairs.
Improve bus timeout and service request detection.
Remove unused code (last_reply, adb_dir etc) and unneeded code (prefix_len,
first_byte etc).
Change TIP and TACK to their correct names on this ADB controller (ST_EVEN and
ST_ODD).
Add some commentary.
Add a generous quantity of sanity checks (BUG_ONs).
Let m68k macs use the adb_sync boot param too.
Tested on Mac II, Mac IIci, Quadra 650, Quadra 700 etc.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the support for C/NET nubus ethernet cards etc. Sync up the DP8390 driver
with the latest code in the mac68k repo.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sync the nubus defines with the latest code in the mac68k repo. Some of these
are needed for DP8390 driver update in the next patch.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Macintosh CS89x0 Ethernet: Netif updates
Addition of netif_stop_queue() before transmission by Michael Schmitz
skb_copy_{from,to}_linear_data() conversion by Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update the atari fb to 2.6 by Michael Schmitz,
Reformatting and rewrite of bit plane functions by Roman Zippel,
A few more fixes by Geert Uytterhoeven.
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atari keyboard and mouse support.
(reformatting and Kconfig fixes by Roman Zippel)
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SCSI should be working on a TT (but someone should really try!) but causes
trouble on a Falcon (as in: it ate a filesystem of mine) at least when
used concurrently with IDE. I have the notion it's because locking of the
ST-DMA interrupt by IDE is broken in 2.6 (the IDE driver always complains
about trying to release an already-released ST-DMA). Needs more work, but
that's on the IDE or m68k interrupt side rather than SCSI.
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SCO buffer size values for Bluetooth chips from Broadcom are wrong
and the USB Bluetooth driver has to set a quirk to correct these SCO
buffer size values.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds the vendor and product id of the Targus ACB10US
dongle and sets a flag to send HCI_Reset as the first command.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Instead of the deprecated read_conf_data(), implement a new function
tape_3590_read_dev_chars().
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Instead of the deprecated read_conf_data(), implement a new function
qeth_read_conf_data().
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Instead of the deprecated read_dev_chars() and read_conf_data_lpm(),
implement dasd_generic_read_dev_chars() and dasd_eckd_read_conf_lpm().
These should even recover better from error than the original cio
functions.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use atomic_t/atomic64_t to make qdio performance statistics smp safe.
Temporarily remove the calculation of "total time of inbound actions".
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the check for skb->len greater than MTU when doing TSO. When
the destination has a smaller MSS than the source, a TSO packet may
be smaller than the MTU at the source and we still need to process it
as a TSO packet.
Thanks to Brian Ristuccia <bristuccia@starentnetworks.com> for
reporting the problem.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
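A small sketch of the check described in the commit above. The struct pkt type
and both function names are hypothetical stand-ins for the skb fields involved;
the point is only that a nonzero MSS, not the packet length, identifies a TSO
packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields of an skb that matter here. */
struct pkt {
    unsigned int len;       /* total packet length in bytes */
    unsigned int gso_size;  /* MSS requested by the stack, 0 if not TSO */
};

/* Old test: misses TSO packets that are no larger than MTU plus header. */
static bool is_tso_old(const struct pkt *p, unsigned int mtu, unsigned int hdr)
{
    assert(p != NULL);
    return p->len > mtu + hdr && p->gso_size != 0;
}

/* New test: a nonzero MSS alone marks the packet as TSO. */
static bool is_tso_new(const struct pkt *p)
{
    assert(p != NULL);
    return p->gso_size != 0;
}
```

A 1200-byte packet with an MSS of 536 is below a 1500-byte MTU, so the old
test skips it while the new test still treats it as TSO.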
Clean up the use of the dev_base list, with the aim of simplifying the move to
a per-namespace device list. In almost every case, use of the dev_base variable
and the dev->next pointer could easily be replaced by a for_each_netdev
loop. A few of the most complicated places were converted to use
first_netdev()/next_netdev().
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
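The conversion pattern in the commit above can be illustrated in userspace
with a miniature list. Everything here (struct netdev, for_each_dev(),
count_devs(), find_dev()) is a hypothetical sketch in the spirit of
for_each_netdev, not the kernel definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a net_device on a singly linked list. */
struct netdev {
    const char *name;
    struct netdev *next;
};

/* Iterator macro: callers no longer touch the head or ->next directly. */
#define for_each_dev(head, d) \
    for ((d) = (head); (d) != NULL; (d) = (d)->next)

static int count_devs(struct netdev *head)
{
    struct netdev *d;
    int n = 0;

    for_each_dev(head, d)
        n++;
    return n;
}

static struct netdev *find_dev(struct netdev *head, const char *name)
{
    struct netdev *d;

    assert(name != NULL);
    for_each_dev(head, d)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

Hiding the traversal behind one macro means the list representation can later
change in a single place, which is the stated aim of the cleanup.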
Fix the code to print PCI or PCIE bus information for all devices.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips. In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.
Put the request_irq/free_irq logic in common procedures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restructure by adding bnx2_phy_event_is_set() to make code cleaner
and easier to understand.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
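Why an indirect access needs a lock can be sketched with a simulated device.
This is an assumption-laden userspace model: struct dev, its window register
and the function names are hypothetical, and a C11 atomic_flag stands in for
the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NREGS 16

/* Hypothetical device: one address window plus a data port behind it. */
struct dev {
    atomic_flag lock;      /* stand-in for a kernel spinlock */
    uint32_t window_addr;  /* selects which register the data port hits */
    uint32_t regs[NREGS];
};

static void dev_lock(struct dev *d)
{
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void dev_unlock(struct dev *d)
{
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Select and read must not be interleaved with another caller's select. */
static uint32_t reg_rd_ind(struct dev *d, uint32_t offset)
{
    uint32_t val;

    assert(offset < NREGS);
    dev_lock(d);
    d->window_addr = offset;        /* step 1: select the register */
    val = d->regs[d->window_addr];  /* step 2: read through the window */
    dev_unlock(d);
    return val;
}

static void reg_wr_ind(struct dev *d, uint32_t offset, uint32_t val)
{
    assert(offset < NREGS);
    dev_lock(d);
    d->window_addr = offset;
    d->regs[d->window_addr] = val;
    dev_unlock(d);
}
```

Because the access is two steps, a second caller that changed the window
between them would redirect the first caller's data access; the lock makes
the pair atomic with respect to other callers.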
Add PCI ID and code to support the 5709 Serdes PHY.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add some common procedures to handle enabling and disabling 2.5G.
Add some missing code to resolve flow control.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5709 Serdes device uses non-standard MII register offsets. This
re-structuring will make it easier to support 5709 Serdes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the problem of not counting all dropped multicast packets.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed to save the MSI state which will be lost during
suspend.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hot-plug scripts can call bnx2_open() as soon as register_netdev() is
called in bnx2_init_one(). We need to call pci_set_drvdata() and
setup everything before calling register_netdev(). netif_carrier_off()
also needs to be moved to bnx2_open() to avoid race conditions with
the irq.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The internal PCIE-to-PCIX bridge of the 5708 has the same 40-bit DMA
limitation as some of the tg3 chips. Set dma_mask and persistent DMA
mask to 40-bit to work around this.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
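What a 40-bit DMA mask permits can be shown with a little arithmetic. The
helpers dma_bit_mask() and dma_reachable() are hypothetical names for this
sketch and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, for 0 < n < 64. */
static uint64_t dma_bit_mask(unsigned int n)
{
    assert(n > 0 && n < 64);
    return (UINT64_C(1) << n) - 1;
}

/* True if a buffer of len bytes starting at addr is reachable under mask. */
static bool dma_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);
    /* Written this way to avoid overflow in addr + len. */
    return addr <= mask && len - 1 <= mask - addr;
}
```

With a 40-bit mask the highest reachable byte address is 0xffffffffff, so a
buffer that ends one byte later is out of range for the device.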
The device may be in D3hot state and should not allow MII register
access.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the i.MX UART register descriptions from
include/asm-arm/arch-imx/imx-regs.h to the serial driver itself.
This helps using the driver on other architectures like mx31
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add pata_platform device for RiscPC, thereby converting the primary
IDE channel on the machine to PATA.
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Allow slower serial baud-rates by switching the UART clock from MCK to
MCK/8.
Based on patches by Mike Wolfram and Russell King.
Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
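A sketch of the divisor selection the commit above describes, under two
assumptions that are not stated in the message: the clock divisor field is
16 bits wide, and the baud rate is clock / (16 * divisor). The names
pick_divisor() and struct baud_cfg are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define CD_MAX 65535u  /* assumed width of the clock divisor field: 16 bits */

struct baud_cfg {
    unsigned int cd;  /* clock divisor to program */
    bool use_div8;    /* true: clock the baud generator from MCK/8 */
};

/* Assumes 16x oversampling, i.e. baud = clock / (16 * cd). */
static struct baud_cfg pick_divisor(unsigned long mck, unsigned int baud)
{
    struct baud_cfg cfg = { 0, false };
    unsigned long quot;

    assert(baud > 0);
    quot = (mck + 8UL * baud) / (16UL * baud);  /* rounded divisor */
    if (quot > CD_MAX) {
        quot /= 8;            /* too slow for MCK: fall back to MCK/8 */
        cfg.use_div8 = true;
    }
    cfg.cd = (unsigned int)quot;
    return cfg;
}
```

With a 60 MHz master clock, 50 baud would need a divisor of 75000, which does
not fit in 16 bits; switching to MCK/8 brings it down to 9375.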
It is illegal to return from a pio or mmio request without completing
it, as mmio or pio is an atomic operation. Therefore, we can simplify
the userspace interface by avoiding the completion indication.
Signed-off-by: Avi Kivity <avi@qumranet.com>
When emulating an mmio read, we actually emulate twice: once to determine
the physical address of the mmio, and, after we've exited to userspace to
get the mmio value, we emulate again to place the value in the result
register and update any flags.
But we don't really need to enter the guest again for that, only to take
an immediate vmexit. So, if we detect that we're doing an mmio read,
emulate a single instruction before entering the guest again.
Signed-off-by: Avi Kivity <avi@qumranet.com>
We only have to save/restore MSR_GS_BASE on every VMEXIT. The rest can be
saved/restored when we leave the VCPU. Since we don't emulate the DEBUGCTL
MSRs and the guest cannot write to them, we don't have to worry about
saving/restoring them at all.
This shaves a whopping 40% off raw vmexit costs on AMD.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
It might have worked in this case since PT_PRESENT_MASK is 1, but let's
express this correctly.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Only save/restore the FPU host state when the guest is actually using the
FPU.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Set all of the host mask bits for CR0 so that we can maintain a proper
shadow of CR0. This exposes CR0.TS, paving the way for lazy fpu handling.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Avoid saving and restoring the guest fpu state on every exit. This
shaves ~100 cycles off the guest/host switch.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Make the exit statistics per-vcpu instead of global. This gives a 3.5%
boost when running one virtual machine per core on my two socket dual core
(4 cores total) machine.
Signed-off-by: Avi Kivity <avi@qumranet.com>
By checking if a reschedule is needed, we avoid dropping the vcpu.
[With changes by me, based on Anthony Liguori's observations]
Signed-off-by: Avi Kivity <avi@qumranet.com>
Intel hosts only support syscall/sysret in long mode (and only if efer.sce
is enabled), so only reload the related MSR_K6_STAR if the guest will
actually be able to use it.
This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some msrs are only used by x86_64 instructions, and are therefore
not needed when the guest is legacy mode. By not bothering to switch
them, we reduce vmexit latency by 2400 cycles (from about 8800) when
running a 32-bit guest on a 64-bit host.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The automatically switched msrs are never changed on the host (with
the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save
them on every vm entry.
This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%)
on x86_64.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Usually, guest page faults are detected by the kvm page fault handler,
which detects if they are shadow faults, mmio faults, pagetable faults,
or normal guest page faults.
However, in certain circumstances, we can detect a page fault much later.
One of these events is the following combination:
- A two memory operand instruction (e.g. movsb) is executed.
- The first operand is in mmio space (which is the fault reported to kvm)
- The second operand is in an unmapped address (e.g. a guest page fault)
The Windows 2000 installer does such an access, and promptly hangs. Fix
by adding the missing page fault injection on that path.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some guests (Solaris) do not set up all four pdptrs, but leave some invalid.
kvm incorrectly treated these as valid page directories, pinning the
wrong pages and causing general confusion.
Fix by checking the valid bit of a pae pdpte. This closes sourceforge bug
1698922.
Signed-off-by: Avi Kivity <avi@qumranet.com>
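The valid-bit check from the commit above, as a minimal sketch. It assumes
the present bit is bit 0 of a PAE pdpte; count_present_pdptes() is a
hypothetical helper, not a kvm function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_PRESENT_MASK UINT64_C(1)  /* bit 0 of a PAE pdpte */

/* Count the entries of a 4-entry PAE pdpt that are actually present. */
static int count_present_pdptes(const uint64_t *pdpt)
{
    int i, n = 0;

    assert(pdpt != NULL);
    for (i = 0; i < 4; i++)
        if (pdpt[i] & PT_PRESENT_MASK)  /* skip entries the guest left invalid */
            n++;
    return n;
}
```

An entry with a plausible-looking address but a clear present bit must be
ignored rather than treated as a page directory.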
Solaris panics if it sees a cpu with no fpu, and it seems to rely on this
bit. Closes sourceforge bug 1698920.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The expression
sp - 6 < sp
where sp is a u16 is undefined in C since 'sp - 6' is promoted to int,
and signed overflow is undefined in C. gcc 4.2 actually warns about it.
Replace with a simpler test.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
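A small sketch of the promotion issue in the commit above, assuming a platform
where int is wider than 16 bits. The subtraction then happens in int, so the
comparison cannot observe a 16-bit wraparound. The replacement test shown,
sp >= 6, is an assumption for illustration and not necessarily what the patch
uses.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(int) > sizeof(uint16_t),
              "sketch assumes int is wider than 16 bits");

/* The original form: both operands are promoted to int before comparing. */
static int wrap_check_promoted(uint16_t sp)
{
    return sp - 6 < sp;  /* int arithmetic: true for every u16 value */
}

/* One possible simpler test (an assumption, see the text above). */
static int has_room_for_6(uint16_t sp)
{
    return sp >= 6;
}
```

For sp == 3 the promoted expression compares -3 with 3, so it reports
"no wrap" even though a 16-bit subtraction would have wrapped.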
This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
With this, we can specify that accesses to one physical memory range will
be remapped to another. This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.
Signed-off-by: Avi Kivity <avi@qumranet.com>
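The remapping described in the commit above can be sketched as a table lookup
on guest frame numbers. The struct layout, the function name and the example
target frame 0xe0000 are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical alias record: npages starting at base_gfn map to target_gfn. */
struct mem_alias {
    gfn_t base_gfn;
    uint64_t npages;
    gfn_t target_gfn;
};

/* Translate gfn through the alias table; unaliased frames pass through. */
static gfn_t unalias_gfn(const struct mem_alias *aliases, size_t n, gfn_t gfn)
{
    size_t i;

    assert(aliases != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const struct mem_alias *a = &aliases[i];

        if (gfn >= a->base_gfn && gfn - a->base_gfn < a->npages)
            return a->target_gfn + (gfn - a->base_gfn);
    }
    return gfn;
}
```

Moving the vga window then amounts to updating target_gfn in one alias entry,
while guest accesses keep using the frames at 0xa0000.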
Mapping a guest page to a host page is a common operation. Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).
This is clumsy, and also won't work well with memory aliases. So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory. Both the mmu cache and the processor tlb
are cleared.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
On x86, bit operations operate on a string of bits that can reside in
multiple words. For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.
The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1). This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.
Now, a 32-bit guest goes and fork()s a process. It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).
The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry. The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.
Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.
Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.
Signed-off-by: Avi Kivity <avi@qumranet.com>
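The address adjustment described in the commit above, and why the word size
matters, can be sketched as follows. This is a userspace model with
hypothetical names; it handles only non-negative bit offsets and assumes
4096-byte pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

struct bit_access {
    uint64_t addr;       /* address of the word that holds the bit */
    unsigned int bit;    /* bit index within that word */
    unsigned int bytes;  /* width of the memory access */
};

/* Fold a (possibly large) bit offset into an address plus in-word bit. */
static struct bit_access fold_bit_offset(uint64_t addr, unsigned int bit_offset,
                                         unsigned int word_bytes)
{
    unsigned int word_bits = word_bytes * 8;
    struct bit_access a;

    assert(word_bytes == 2 || word_bytes == 4 || word_bytes == 8);
    a.addr = addr + (uint64_t)(bit_offset / word_bits) * word_bytes;
    a.bit = bit_offset % word_bits;
    a.bytes = word_bytes;
    return a;
}

/* True if the access touches bytes on both sides of a page boundary. */
static bool crosses_page(const struct bit_access *a)
{
    return (a->addr % SKETCH_PAGE_SIZE) + a->bytes > SKETCH_PAGE_SIZE;
}
```

At page offset 0xffc with bit offset 1, an 8-byte word spills into the next
page, while the 4-byte operand size keeps the write inside the page table.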
Remove unused function
CC drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.
As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>
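The offset adjustment in the commit above can be modelled in a few lines. The
struct and function names are hypothetical, host TSC values are passed in as
parameters so the sketch runs anywhere, and guest TSC is assumed to be host
TSC plus tsc_offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu state; guest TSC = host TSC + tsc_offset. */
struct vcpu {
    int cpu;                /* host cpu the vcpu last ran on, -1 if none */
    uint64_t host_tsc_put;  /* host TSC recorded when the cpu was dropped */
    uint64_t tsc_offset;
};

static uint64_t guest_tsc(const struct vcpu *v, uint64_t host_tsc)
{
    return host_tsc + v->tsc_offset;
}

/* Called when the vcpu stops running on its current cpu. */
static void vcpu_put(struct vcpu *v, uint64_t host_tsc_now)
{
    v->host_tsc_put = host_tsc_now;
}

/* Called when the vcpu starts running on cpu. */
static void vcpu_load(struct vcpu *v, int cpu, uint64_t host_tsc_now)
{
    assert(cpu >= 0);
    if (v->cpu != -1 && v->cpu != cpu)
        /* Unsigned wraparound makes this correct in either direction. */
        v->tsc_offset += v->host_tsc_put - host_tsc_now;
    v->cpu = cpu;
}
```

After a move to a cpu whose counter reads lower, the guest resumes from the
value it last saw instead of jumping backwards.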
The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page. This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).
However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page. If the access through the more
permissive pde occurs after the access to the strict pde, an endless pagefault
loop will be generated and the guest will make no progress.
Fix by making the access permissions part of the cache lookup key.
The fix allows Xen pae to boot on kvm and run guest domains.
Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
Signed-off-by: Avi Kivity <avi@qumranet.com>
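Making the permissions part of the lookup key, as the commit above describes,
might look like the sketch below. The fixed-size cache, the key layout and the
meaning of the access bits are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8

/* Hypothetical lookup key: the access bits are part of the identity. */
struct sp_key {
    uint64_t gfn;
    unsigned int access;  /* e.g. bit 0 = writable, bit 1 = user */
};

struct shadow_cache {
    struct sp_key keys[CACHE_SLOTS];
    size_t used;
};

/* Return the slot index for the key, creating a new slot on a miss. */
static size_t sp_lookup(struct shadow_cache *c, uint64_t gfn,
                        unsigned int access)
{
    size_t i;

    for (i = 0; i < c->used; i++)
        if (c->keys[i].gfn == gfn && c->keys[i].access == access)
            return i;
    assert(c->used < CACHE_SLOTS);
    c->keys[c->used].gfn = gfn;
    c->keys[c->used].access = access;
    return c->used++;
}
```

Two pdes that map the same frame with different permissions now get separate
shadow pages, so the stricter mapping is never reused for the permissive one.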
This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some older (~2.6.7) kernels write MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
Following patch adds a "nop" handler for this.
Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints. We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 compatible values.
This fixes reboot on vmx.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers. Since segment handling is modified by the mode on
Intel processors, update the segment registers after the mode switch has taken
place.
Signed-off-by: Avi Kivity <avi@qumranet.com>
set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
As we now cache the protected mode values on entry to real mode, this
isn't an issue anymore, and it interferes with reboot (which usually _is_
a modeswitch).
Signed-off-by: Avi Kivity <avi@qumranet.com>
The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000,
which aren't compatible with vm86 mode, which is used for real mode
virtualization.
When we create a vcpu, we set cs.base to 0xf0000, but if we get there by
way of a reset, the values are inconsistent and vmx refuses to enter
guest mode.
Work around this by detecting the state and munging it appropriately.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The initial, noncaching, version of the kvm mmu flushed all nonglobal
shadow page table translations (much like a native tlb flush). The new
implementation flushes translations only when they change, rendering global
pte tracking superfluous.
This removes the unused tracking mechanism and storage space.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions. This
interface cannot fully support guest smp, as the check needs to take into
account two pages at once in case an unaligned string transfer straddles a
page boundary.
Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there. The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some ioctls ignore their arguments. By requiring them to be zero now,
we allow a nonzero value to have some special meaning in the future.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped. As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Allow a special signal mask to be used while executing in guest mode. This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive. Userspace still
receives -EINTR and can get the signal via sigwait().
Signed-off-by: Avi Kivity <avi@qumranet.com>
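Only the userspace half of the commit above is sketched here: how a blocked
signal can be collected with sigwait() instead of being delivered to a
handler. The kvm ioctl itself is not shown, and collect_blocked_signal() is a
hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Block signo, make it pending, then collect it synchronously.
 * Returns the signal number obtained from sigwait(), or -1 on error.
 */
static int collect_blocked_signal(int signo)
{
    sigset_t set, old;
    int got = 0;

    assert(signo > 0);
    if (sigemptyset(&set) != 0 || sigaddset(&set, signo) != 0)
        return -1;
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;
    if (raise(signo) != 0) {       /* stays pending because it is blocked */
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (sigwait(&set, &got) != 0)  /* consumes it; no handler runs */
        got = -1;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return got;
}
```

The signal is consumed without a handler ever running, which avoids the
delivery cost the commit message refers to.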
This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more). So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM used to handle cpuid by letting userspace decide what values to
return to the guest. We now handle cpuid completely in the kernel. We
still let userspace decide which values the guest will see by having
userspace set up the value table beforehand (this is necessary to allow
management software to set the cpu features to the least common denominator,
so that live migration can work).
The motivation for the change is that kvm kernel code can be impacted by
cpuid features, for example the x86 emulator.
Signed-off-by: Avi Kivity <avi@qumranet.com>
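The table-driven approach in the commit above can be sketched as a lookup over
entries that userspace filled in beforehand. The entry layout and the function
name cpuid_lookup() are hypothetical and do not reproduce the kvm interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry set up by userspace ahead of time. */
struct cpuid_entry {
    uint32_t function;
    uint32_t eax, ebx, ecx, edx;
};

/* Kernel-side lookup: find the entry for a leaf, or NULL if absent. */
static const struct cpuid_entry *
cpuid_lookup(const struct cpuid_entry *tbl, size_t n, uint32_t function)
{
    size_t i;

    assert(tbl != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (tbl[i].function == function)
            return &tbl[i];
    return NULL;
}
```

Because the table lives in the kernel, other kernel code such as the emulator
can consult the same values the guest sees.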
Currently when passing a PIO emulation request to userspace, we
rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx
(on string instructions). This (a) requires two extra ioctls for getting
and setting the registers and (b) is unfriendly to non-x86 archs, when
they get kvm ports.
So fix by doing the register fixups in the kernel and passing to userspace
only an abstract description of the PIO to be done.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Instead of passing a 'struct kvm_run' back and forth between the kernel and
userspace, allocate a page and allow the user to mmap() it. This reduces
needless copying and makes the interface expandable by providing lots of
free space.
Signed-off-by: Avi Kivity <avi@qumranet.com>
When auditing a 32-bit guest on a 64-bit host, sign extension of the page
table directory pointer table index caused bogus addresses to be shown on
audit errors.
Fix by declaring the index unsigned.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Instead of twiddling the rip registers directly, use the
skip_emulated_instruction() function to do that for us.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
The hypercall code mixes up the ->cache_regs() and ->decache_regs()
callbacks, resulting in guest register corruption.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
pci_create_sysfs_dev_files() should call pci_remove_resource_files() in
its error path, to match the call it makes to pci_create_resource_files().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Cc: Scott Murray <scottm@somanetworks.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
cc: Philip Guo <pg@cs.stanford.edu>
Here's a small patch against the current git tree for the ZT5550 CPCI
hotplug driver to fix an issue with port freeing that Philip Guo found.
Signed-off-by: Scott Murray <scottm@somanetworks.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove the semaphores from the get routine. These do not
appear to be protecting anything that I can make out,
and they also do not seem to be required by the hotplug
driver.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Calls to pcibios_add should be symmetric with calls to pcibios_remove.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
At first blush, the disable_slot() routine does not look
at all like it's symmetric with the enable_slot() routine;
as it seems to call a very different set of routines.
However, this is easily fixed: pcibios_remove_pci_devices()
does the right thing.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix up the documentation: the rpaphp_add_slot() does not actually
handle embedded slots: in fact, it ignores them. Fix the flow of
control in the routine that checks for embedded slots.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Document some of the interaction between dlpar and hotplug.
Viz., a dlpar remove of a hotplug slot uses hotplug to remove it.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rename rpaphp_register_pci_slot() because it's easy to confuse
with rpaphp_register_slot() even though it does something
completely different. Rename it to rpaphp_enable_slot() because
it's almost identical to enable_slot().
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eliminate the tail call to rpaphp_register_slot()
by placing it in the caller. This will help later
dis-entanglement.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The rpaphp_set_attention_status() routine seems to be a wrapper
around a single rtas call. Abolish it.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The debug function print_slot_pci_funcs() is a large wrapper
around two debug print statements. Just invoke these directly.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The setup_pci_slot() routine appears to be nothing else than
a big, complicated wrapper around pcibios_add_pci_devices().
Remove the wrapping, and call pcibios_add_pci_devices() directly.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Delete another stovepipe: a call to a routine which does nothing.
Remove un-needed semaphore as well.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove another stove-pipe; this function was called from
two different places, with a compile-time const that is
then run-time checked to perform two different things.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove another stovepipe: a call which wraps another call, and
just adds printks.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove a stove-pipe: a function that is called from only one place
and does nothing but wrap another function with debug printk's.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a memleak; the slot->location string was never freed.
Fix some whitespace and overlong-line problems while we're here.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The routine that calls an alloc should be the same routine that
calls the matching free if anything in the middle fails.
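The ownership rule can be sketched in plain user-space C. This is only
an illustration, not the driver's code: struct slot_demo, the
slot_demo_*() helpers and the "should_fail" knob are all hypothetical
names made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's slot structure. */
struct slot_demo {
	char *name;
	char *location;
};

/* Small strdup() replacement so the sketch needs only ISO C. */
static char *slot_demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Frees everything the alloc routine set up, including ->location. */
static void slot_demo_free(struct slot_demo *slot)
{
	if (!slot)
		return;
	free(slot->location);
	free(slot->name);
	free(slot);
}

static struct slot_demo *slot_demo_alloc(const char *name, const char *loc)
{
	struct slot_demo *slot = calloc(1, sizeof(*slot));

	if (!slot)
		return NULL;
	slot->name = slot_demo_strdup(name);
	slot->location = slot_demo_strdup(loc);
	if (!slot->name || !slot->location) {
		slot_demo_free(slot);
		return NULL;
	}
	return slot;
}

/* Hypothetical middle step; it may fail but never frees the slot. */
static int slot_demo_register(struct slot_demo *slot, int should_fail)
{
	(void)slot;
	return should_fail ? -1 : 0;
}

/* The routine that allocates is also the one that frees on failure. */
static int slot_demo_add(const char *name, const char *loc,
			 int should_fail, struct slot_demo **out)
{
	struct slot_demo *slot = slot_demo_alloc(name, loc);
	int rc;

	*out = NULL;
	if (!slot)
		return -2;
	rc = slot_demo_register(slot, should_fail);
	if (rc) {
		slot_demo_free(slot);
		return rc;
	}
	*out = slot;
	return 0;
}
```

Keeping the free next to the alloc means the middle step never has to
know who owns the slot.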
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Clean up cruft: remove the global "num_slots" variable;
although scattered across multiple files, it is used only
once, in a debug statement.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Clean up the flow of control for rpaphp_add_slot(), so as to
make it easier to read. The next patch will fix a bug in this
same code.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Cc: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes the PCI_MULTITHREAD_PROBE option that had already
been marked as broken.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch introduces an optional function, arch_teardown_msi_irqs(),
which gives an arch the opportunity to do per-device teardown for
MSI/X. If that's not required, the default version simply calls
arch_teardown_msi_irq() for each msi irq required.
arch_teardown_msi_irqs() is simply passed a pdev. Attached to the pdev
is a list of msi_descs; it is up to the arch to free the irq associated
with each of these as appropriate.
For archs that _don't_ implement arch_teardown_msi_irqs(), all msi_descs
with irq == 0 are considered unallocated, and the arch teardown routine
is not called on them.
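A user-space sketch of the default behaviour described above, under
stated assumptions: the td_* names are hypothetical, a fixed array
stands in for the list of msi_descs, and the per-irq hook only counts
its calls.

```c
#include <stddef.h>

#define TD_MAX_MSI 8

/* Model descriptor: irq == 0 means "unallocated". */
struct td_msi_desc {
	unsigned int irq;
};

/* Model device; the array stands in for the list hanging off the pdev. */
struct td_pdev {
	struct td_msi_desc descs[TD_MAX_MSI];
	size_t ndescs;
};

static unsigned int td_teardown_calls;

/* Stand-in for the per-irq arch hook; here it only counts calls. */
static void td_arch_teardown_msi_irq(unsigned int irq)
{
	(void)irq;
	td_teardown_calls++;
}

/* Default per-device teardown: skip descriptors that have no irq. */
static void td_arch_teardown_msi_irqs(struct td_pdev *pdev)
{
	size_t i;

	for (i = 0; i < pdev->ndescs; i++) {
		if (pdev->descs[i].irq == 0)
			continue;
		td_arch_teardown_msi_irq(pdev->descs[i].irq);
	}
}
```

With three descriptors holding irqs 16, 0 and 17, the per-irq hook in
this model runs twice.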
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch introduces an optional function, arch_setup_msi_irqs(),
(note the plural) which gives an arch the opportunity to do per-device
setup for MSI/X and then allocate all the requested MSI/Xs at once.
If that's not required by the arch, the default version simply calls
arch_setup_msi_irq() for each MSI irq required.
arch_setup_msi_irqs() is passed a pdev. Attached to the pdev is a list
of msi_descs with irq == 0; it is up to the arch to connect these up to
an irq (via set_irq_msi()) or return an error. For convenience the number
of vectors and the type are also passed.
All msi_descs with irq != 0 are considered allocated, and the arch
teardown routine will be called on them when necessary.
The existing semantics of pci_enable_msix() are that if the requested
number of irqs can not be allocated, the maximum number that _could_ be
allocated is returned. To support that, we define that in case of an
error from arch_setup_msi_irqs(), all msi_descs with irq != 0 are
considered allocated, and are counted toward the "max that could be
allocated".
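The counting rule can be modelled in user-space C. This is a sketch
under assumptions, not the kernel implementation: the su_* names, the
array standing in for the descriptor list, and the "irqs_left" budget
used to force a failure are all hypothetical.

```c
#include <stddef.h>

#define SU_MAX_MSI 8

/* Model descriptor: irq == 0 means "not yet allocated". */
struct su_msi_desc {
	unsigned int irq;
};

/* Model device plus a tiny irq allocator used only by this sketch. */
struct su_pdev {
	struct su_msi_desc descs[SU_MAX_MSI];
	size_t ndescs;
	unsigned int next_irq;	/* next irq number to hand out */
	unsigned int irqs_left;	/* how many irqs the model arch still has */
};

/* Stand-in for the per-irq hook: 0 on success, nonzero on failure. */
static int su_arch_setup_msi_irq(struct su_pdev *pdev,
				 struct su_msi_desc *desc)
{
	if (pdev->irqs_left == 0)
		return -1;
	pdev->irqs_left--;
	desc->irq = pdev->next_irq++;	/* models the set_irq_msi() hookup */
	return 0;
}

/* Default per-device setup: call the per-irq hook for each descriptor. */
static int su_arch_setup_msi_irqs(struct su_pdev *pdev)
{
	size_t i;

	for (i = 0; i < pdev->ndescs; i++) {
		int rc = su_arch_setup_msi_irq(pdev, &pdev->descs[i]);

		if (rc)
			return rc;
	}
	return 0;
}

/* After an error, descriptors with irq != 0 count as allocated. */
static size_t su_count_allocated(const struct su_pdev *pdev)
{
	size_t i, n = 0;

	for (i = 0; i < pdev->ndescs; i++)
		if (pdev->descs[i].irq != 0)
			n++;
	return n;
}
```

If four vectors are requested but the model arch can only supply two,
su_arch_setup_msi_irqs() fails and su_count_allocated() reports 2,
which is the "max that could be allocated" in this model.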
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.
set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with its MSI desc and vice versa.
The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.
Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.
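A minimal user-space model of that contract, assuming hypothetical
lk_* names and a small static irq table; it is meant to show the
two-way hookup and the 0/nonzero return convention, not the real
set_irq_msi().

```c
#include <stddef.h>

#define LK_NR_IRQS 32

/* Model MSI descriptor: remembers which irq it is connected to. */
struct lk_msi_desc {
	unsigned int irq;
};

/* Model irq descriptor: remembers which MSI descriptor it serves. */
struct lk_irq_desc {
	struct lk_msi_desc *msi_desc;
};

static struct lk_irq_desc lk_irq_table[LK_NR_IRQS];

/* Model of the single call that makes both connections. */
static int lk_set_irq_msi(unsigned int irq, struct lk_msi_desc *entry)
{
	if (irq == 0 || irq >= LK_NR_IRQS)
		return -1;
	lk_irq_table[irq].msi_desc = entry;	/* irq -> msi_desc */
	if (entry)
		entry->irq = irq;		/* msi_desc -> irq */
	return 0;
}

/*
 * Model arch setup routine: returns 0 for success, anything else for
 * failure, and only connects the irq once nothing else can fail.
 * "hw_ok" is a hypothetical knob that simulates an allocation failure.
 */
static int lk_arch_setup_msi_irq(struct lk_msi_desc *entry,
				 unsigned int candidate_irq, int hw_ok)
{
	if (!hw_ok)
		return -1;	/* fail before anything is connected */
	return lk_set_irq_msi(candidate_irq, entry);
}
```

On the failure path nothing has been connected, so the descriptor still
has irq == 0 and there is nothing to undo.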
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Allows architectures to advertise that they support MSI rather than listing
each architecture as a PCI_MSI dependency.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that we keep a list of msi descriptors, we don't need first_msi_irq
in the pci dev.
If we somehow have zero MSIs configured, list_entry() will give us weird
oopses or nice memory corruption bugs. So be paranoid. Add BUG_ONs and also
a check in pci_msi_check_device() to make sure nvec > 0.
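Why an empty list is dangerous can be seen in a small user-space
model. The ck_* names are hypothetical, and where the patch uses
BUG_ON the sketch simply returns NULL so that it can be exercised.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list node, modelled on list_head. */
struct ck_list_head {
	struct ck_list_head *next, *prev;
};

struct ck_msi_desc {
	unsigned int irq;
	struct ck_list_head list;
};

struct ck_pdev {
	struct ck_list_head msi_list;
};

static void ck_pdev_init(struct ck_pdev *pdev)
{
	pdev->msi_list.next = &pdev->msi_list;
	pdev->msi_list.prev = &pdev->msi_list;
}

/* Model of the nvec sanity check: reject anything below one vector. */
static int ck_check_nvec(int nvec)
{
	return nvec < 1 ? -1 : 0;
}

/*
 * On an empty list, head->next points back at the head itself, so
 * converting it to a descriptor would yield a bogus pointer into the
 * middle of the pdev. Check for emptiness before converting.
 */
static struct ck_msi_desc *ck_first_desc(struct ck_pdev *pdev)
{
	struct ck_list_head *first = pdev->msi_list.next;

	if (first == &pdev->msi_list)
		return NULL;
	return (struct ck_msi_desc *)
		((char *)first - offsetof(struct ck_msi_desc, list));
}
```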
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The msi descriptors are linked together with what looks a lot like
a linked list, but isn't a struct list_head list. Make it one.
The only complication is that previously we walked a list of irqs, and
got the descriptor for each with get_irq_msi(). Now we have a list of
descriptors and need to get the irq out of it, so it needs to be in the
actual struct msi_desc. We use 0 to indicate no irq is set up.
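The new shape can be sketched in user-space C: each descriptor embeds a
list node and carries its own irq, so walking the list yields the irq
directly. The ls_* names below are hypothetical and the list helpers
are a simplified imitation of struct list_head, not the kernel's
<linux/list.h>.

```c
#include <stddef.h>

/* Simplified imitation of the kernel's circular list node. */
struct ls_list_head {
	struct ls_list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its list node. */
#define LS_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Model descriptor: the irq now lives in the descriptor itself. */
struct ls_msi_desc {
	unsigned int irq;	/* 0 means no irq is set up */
	struct ls_list_head list;
};

static void ls_init(struct ls_list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void ls_add_tail(struct ls_list_head *node, struct ls_list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Walk the descriptors and count those that have an irq set up. */
static unsigned int ls_count_with_irq(struct ls_list_head *head)
{
	struct ls_list_head *pos;
	unsigned int n = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct ls_msi_desc *desc =
			LS_CONTAINER_OF(pos, struct ls_msi_desc, list);

		if (desc->irq != 0)
			n++;
	}
	return n;
}
```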
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>