Commit Graph

30529 Commits

Author SHA1 Message Date
Bartlomiej Zolnierkiewicz
88ae4d8c38 sc1200: fix ->dma_base equal zero handling
Set hwif->atapi_dma/{ultra,mwdma}_mask and drive->autodma after checking that
->dma_base exists.  If ->dma_base is not set (== PCI BAR4 cannot be reserved)
then DMA hooks shouldn't be initialized or bad things will happen.

OTOH hwif->set_{pio,dma}_mode hooks should be set even if hwif->dma_base == 0.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:53 +02:00
Bartlomiej Zolnierkiewicz
dfb2311226 cs5520: fix ->dma_base equal zero handling
Set hwif->ide_dma_{check,on} and hwif->autodma to 1 after checking that
->dma_base exists.  If ->dma_base is not set (== PCI BAR4 cannot be reserved)
then DMA hooks shouldn't be initialized or bad things will happen.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:52 +02:00
Bartlomiej Zolnierkiewicz
b9d9e61abb sgiioc4: add missing ->dma_base check
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:52 +02:00
Bartlomiej Zolnierkiewicz
7bda292d12 cs5535: add missing ->dma_base check
If ->dma_base is not set (== PCI BAR4 cannot be reserved) then DMA hooks
shouldn't be initialized or bad things will happen.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:52 +02:00
Bartlomiej Zolnierkiewicz
76bb7782c6 ide: remove CONFIG_IDEDMA_IVB config option
Devices which don't set word 93 validation bit should be now handled by
ide-iops.c::ivb_list[] and for debugging purposes cable detection can be
completely overriden with "idex=ata66" parameter.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:52 +02:00
Bartlomiej Zolnierkiewicz
b140b99c41 ide: change master/slave IDENTIFY order
Need to probe slave device first to make it release PDIAG-
(this is required for correct device side cable detection).

Based on libata commit f31f0cc2f0.

Thanks to Craig for testing this patch.

Cc: Craig Block <chblock3@yahoo.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
88b2b32bab ide: move ide_config_drive_speed() calls to upper layers (take 2)
* Convert {ide_hwif_t,ide_pci_device_t}->host_flag to be u16.

* Add IDE_HFLAG_POST_SET_MODE host flag to indicate the need to program 
  the host for the transfer mode after programming the device.  Set it
  in au1xxx-ide, amd74xx, cs5530, cs5535, pdc202xx_new, sc1200, pmac
  and via82cxxx host drivers.

* Add IDE_HFLAG_NO_SET_MODE host flag to indicate the need to completely
  skip programming of host/device for the transfer mode ("smart" hosts).
  Set it in it821x host driver and check it in ide_tune_dma().

* Add ide_set_pio_mode()/ide_set_dma_mode() helpers and convert all
  direct ->set_pio_mode/->speedproc users to use these helpers.

* Move ide_config_drive_speed() calls from ->set_pio_mode/->speedproc
  methods to callers.

* Rename ->speedproc method to ->set_dma_mode, make it void and update
  all implementations accordingly.

* Update ide_set_xfer_rate() comments.

* Unexport ide_config_drive_speed().

v2:
* Fix issues noticed by Sergei:
  - export ide_set_dma_mode() instead of moving ->set_pio_mode abuse wrt
    to setting DMA modes from sc1200_set_pio_mode() to do_special()
  - check IDE_HFLAG_NO_SET_MODE in ide_tune_dma()
  - check for (hwif->set_pio_mode) == NULL in ide_set_pio_mode()
  - check for (hwif->set_dma_mode) == NULL in ide_set_dma_mode()
  - return -1 from ide_set_{pio,dma}_mode() if ->set_{pio,dma}_mode == NULL
  - don't set ->set_{pio,dma}_mode on it821x in "smart" mode
  - fix build problem in pmac.c
  - minor fixes in au1xxx-ide.c/cs5530.c/siimage.c
  - improve patch description

Changes in behavior caused by this patch:
- HDIO_SET_PIO_MODE ioctl would now return -ENOSYS for attempts to change
  PIO mode if it821x controller is in "smart" mode
- removal of two debugging printk-s (from cs5530.c and sc1200.c)
- transfer modes 0x00-0x07 passed from user space may be programmed twice on
  the device (not really an issue since 0x00 is not supported correctly by
  any host driver ATM, 0x01 is not supported at all and 0x02-0x07 are invalid)

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
6e249395ea pdc202xx_new: check ide_config_drive_speed() return value
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
249aa4ff17 cs5535: check ide_config_drive_speed() return value
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
3b4024d429 amd74xx/via82cxxx: check ide_config_drive_speed() return value
* Check ide_config_drive_speed() return value.

* While at also call ide_config_drive_speed() if the transfer mode is
  XFER_PIO_SLOW (this case happens iff the transfer mode has already been
  set on the device by ide-proc.c::set_xfer_rate()) and remove redundant
  setting of ->{init,current}_speed.

* Bump driver version.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
0f458943e0 au1xxx: fix au1xxx_set_pio_mode()
Set transfer mode on the device before programming the host controller for
the new timings (matches what auide_tune_chipset() is doing wrt DMA modes).

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:51 +02:00
Bartlomiej Zolnierkiewicz
75d7d963e3 icside: use ide_tune_dma()
* Add "good DMA drives" hack for icside to ide-dma.c::ide_find_dma_mode()
  (in the long-term it should be either removed or generalized for all hosts).

* Use ide_tune_dma() in icside.c::icside_dma_check().

  This results in the following changes in behavior:
  - pre-EIDE SWDMA modes are now also respected
  - drive->autodma is checked instead of hwif->autodma
    (doesn't really matter as icside sets both to "1")

* Make ide-dma.c::__ide_dma_good_drive() static and drop "__" prefix.

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:50 +02:00
Benjamin Herrenschmidt
0b46ff2ea2 ide-pmac: fix PIO setup and enable autotune
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:50 +02:00
Bartlomiej Zolnierkiewicz
254bb55036 ide-pmac: use ide_tune_dma() (take 2)
* Add missing initialization of hwif->autodma and drive->autodma to
  pmac_ide_setup_dma().

* Use ide_tune_dma() in pmac_ide_dma_check().

While at it:

* Fix pmac_ide_dma_check() return value if DMA mode is not programmed
  (should be "-1" otherwise ide_set_dma() will try to enable DMA).

* Remove unnecessary drive->using_dma fiddling (->dma_off_quietly is always
  called before ide_set_dma() call and ide_set_dma() calls ->ide_dma_on if
  ->ide_dma_check returns "0").

v2:
* No reason to blacklist all ide_floppy devices and the old code was always
  enabling DMA anyway (without even programming controller/device if the
  device was ide_floppy).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:50 +02:00
Bartlomiej Zolnierkiewicz
aedea5910c ide-pmac: remove pmac_ide_do_setfeature() (take 2)
Use ide_config_drive_speed() instead of pmac_ide_do_setfeature() and remove
the latter, also  ide-iops.c::__ide_wait_stat() could be static again.

Since for IDE PMAC host driver IDE_CONTROL_REG is always true, device's
->quirk_list is always zero and ->ide_dma_host_{on,off} are nops than
the only changes in behavior are:

* if PIO mode is set then ->dma_off_queitly is called to disable DMA

* if setting transfer mode fails ide_dump_status() is called to dump status

v2:
* IDE PMAC controllers allow separate PIO and DMA timings and PPC userland
  depends on this fact, and calls "hdparm -p" without calling "hdparm -d".

  Therefore to compensate for DMA being disabled by ide_config_drive_speed()
  for PIO modes:

  - add IDE_HFLAG_SET_PIO_MODE_KEEP_DMA flag and set it in PMAC host driver

  - add handling of the new flag to ide-io.c::do_special()

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:50 +02:00
Bartlomiej Zolnierkiewicz
3b2d0093b8 ide-pmac: remove nIEN clearing from pmac_ide_do_setfeature()
Upper layers are responsible for controlling nIEN so don't clear nIEN after
command execution in pmac_ide_do_setfeature().

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
ddf151026a ide-pmac: use __ide_wait_stat()
* Use __ide_wait_stat() instead of wait_for_ready() in pmac_ide_do_setfeature().

While at it do following changes to match __ide_wait_stat() call in
ide_config_drive_speed():

* Wait WAIT_CMD time (20 sec) instead of 2 sec for device to clear BUSY_STAT.

* Check DRQ_STAT bit (shouldn't be set for good device status).

Also remove no longer needed wait_for_ready() from ide-iops.c.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
218ee5f364 ide-pmac: remove extra good status wait from pmac_ide_do_setfeature()
Don't check for good device status before executing the command in
pmac_ide_do_setfeature() (ide_config_drive_speed() doesn't do this).

It is a job of upper layers to guarantee that the device is ready to
accept new command before we get here.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
74af21cf4d ide: add __ide_wait_stat() helper
* Split off checking of the status register from ide_wait_stat() to
  __ide_wait_stat() helper.

* Use the new helper in ide_config_drive_speed().  The only change in the
  functionality is that the function now fails if after 20 sec (WAIT_CMD)
  device is still busy (BUSY_STAT bit is set) while previously instead of
  failing the function continued with checking for the correct device status
  (which would give the device additional 10 usec to clear BUSY_STAT bit).

* Remove stale comment for ide_config_drive_speed().

* Remove duplicate comment for ide_wait_stat() from <linux/ide.h>.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
fd553ce868 ide-pmac: remove pmac_ide_{m,u}dma_enable() (take 2)
* Fix pmac_ide_dma_check() to use pmac_ide_tune_chipset() and remove no longer
  necessary pmac_ide_{m,u}dma_enable().

* While at it remove some dead code from pmac_ide_dma_check() (leftovers from
  conversion to use ide_max_dma_mode()).

There should be no functionality changes caused by this patch.

v2:
* Fix compile by replacing "id" with "drive->id" in pmac_ide_dma_check()
  (Noticed by Ben).

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
78103940e4 ide-pmac: remove control register messing from pmac_ide_dma_check()
pmac_ide_do_setfeature() contains matching nIEN setting/clearing so this
Device Control register messing in pmac_ide_dma_check() is totally unnecessary.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:49 +02:00
Bartlomiej Zolnierkiewicz
90f72eca36 ide-pmac: fix set_timings_mdma()
* Move adjusting of cycle time for devices providing explicit DMA cycle time
  from pmac_ide_mdma_enable() to set_timings_mdma().

* Remove no longer needed drive_cycle_time argument.

* BUG() if unsupported speed argument value is passed (shouldn't happen).

* Matching access/recovery timings always exist so remove redundant check.

* Make set_timings_mdma() void.

* Update pmac_ide_tune_chipset()'s comment.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:48 +02:00
Bartlomiej Zolnierkiewicz
085798b12f ide-pmac: pmac_ide_tune_chipset() fixes
* Don't check check for pmif == NULL (it should never be NULL if we got here).

* Make a local copy of the timings and set the pmif->timings[] only after
  setting the transfer mode on the device (otherwise SELECT_DRIVE() call in
  pmac_ide_do_setfeature() will program new timings before the transfer mode
  is set on the device - this was pointed out by Sergei).  This change makes
  pmac_ide_tune_chipset() behavior match this of pmac_ide_{m,u}dma_enable().

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:48 +02:00
Bartlomiej Zolnierkiewicz
90a87ea480 ide-pmac: don't check kauai_lookup_timing() return value
kauai_lookup_timing() should always return non-zero return value:

* BUG() in kauai_lookup_timing() if the timing info cannot be found.

* Remove code checking for zero return value from all callers.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:48 +02:00
Adrian Bunk
39e5f590b6 ide: unexport ide_acpi_set_state
This patch removes the unused EXPORT_SYMBOL_GPL(ide_acpi_set_state)

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:48 +02:00
Bartlomiej Zolnierkiewicz
ceec1827e2 ide_platform: set hwif->chipset
We need to set hwif->chipset or IDE PCI host drivers may try to claim
our ide_hwifs[] slot.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2007-10-13 17:47:47 +02:00
David Woodhouse
ebf8889bd1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2007-10-13 14:58:23 +01:00
David Woodhouse
b160292cc2 Merge Linux 2.6.23 2007-10-13 14:43:54 +01:00
Bryan Wu
b37bde1478 [MTD] [NAND] Blackfin on-chip NAND Flash Controller driver
This is the driver for latest Blackfin on-chip nand flash controller

 - use nand_chip and mtd_info common nand driver interface
 - provide both PIO and dma operation
 - compiled with ezkit bf548 configuration
 - use hardware 1-bit ECC
 - tested with YAFFS2 and can mount YAFFS2 filesystem as rootfs

ChangeLog from try#1
 - use hweight32() instead of count_bits()
 - replace bf54x with bf5xx and BF54X with BF5XX
 - compare against plat->page_size in 2 cases when enable hardware ECC

ChangeLog from try#2
 - passed nand_test suites
 - use cpu_relax() instead of busy wait loop
 - some coding style issue pointed out by Andrew

Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-10-13 14:36:49 +01:00
Kevin Hao
c4a9f88daf [MTD] [NOR] fix ctrl-alt-del can't reboot for intel flash bug
When we press ctrl-alt-del,kernel_restart_prepare will invoke 
cfi_intelext_reboot which will set flash to read array mode, but later 
when device_shutdown is invoked which may put current work queue to 
sleep and other process may be scheduled to running and programming 
flash in not FL_READY mode again. So we can't boot up if this flash is 
used for bootloader.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-10-13 14:36:18 +01:00
akpm@linux-foundation.org
f96880d1e8 [MTD] [NAND] Fix compiler warning in Alauda driver
drivers/mtd/nand/alauda.c: In function 'alauda_bounce_read':
drivers/mtd/nand/alauda.c:412: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-10-13 14:33:27 +01:00
Kyungmin Park
1437085c37 [MTD] [OneNAND] Fix typo related with recent commit
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2007-10-13 11:26:44 +01:00
Avi Kivity
0967b7bf1c KVM: Skip pio instruction when it is emulated, not executed
If we defer updating rip until pio instructions are executed, we have a
problem with reset:  a pio reset updates rip, and when the instruction
completes we skip the emulated instruction, pointing rip somewhere completely
unrelated.

Fix by updating rip when we see decode the instruction, not after emulation.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
535eabcf0e KVM: x86 emulator: popf
Implement emulation of instruction:
    popf
    opcode:  0x9d

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
12fa272e31 KVM: x86 emulator: fix src, dst value initialization
Some operand fetches are less than the machine word size and can result in
stale bits if used together with operands of different sizes.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
26a3e983d1 KVM: x86 emulator: jmp abs
Implement emulation of instruction:
    jump absolute r/m
    opcode: 0xff /4

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
7e0b54b149 KVM: x86 emulator: lea
Implement emulation of instruction
    lea r16/r32, m
    opcode:  0x8d:

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
55bebde45e KVM: X86 emulator: jump conditional short
Implement emulation of more jump conditional instructions
    jcc shortrel
    opcodes: 0x70 - 0x7f

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:29 +02:00
Nitin A Kamble
bbe9abbdac KVM: x86 emulator: imlpement jump conditional relative
Implement emulation of instruction:
    jump conditional rel
    opcodes: 0x0f 0x80 - 0x0f 0x8f

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Nitin A Kamble
7de752482c KVM: x86 emulator: sort opcodes into ascending order
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Avi Kivity
054b136967 KVM: Improve emulation failure reporting
Report failed opcodes from all locations.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Nitin A Kamble
fd2a760865 KVM: x86 emulator: pushf
Implement emulation of instruction
	pushf
	opcode: 0x9c

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Nitin A Kamble
f6eed39135 KVM: x86 emulator: call near
Implement emulation of instruction
	opcode: 0xe8
	call (near)

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Nitin A Kamble
7d31691163 KVM: x86 emulator: push imm8
Implement the instruction

    	push imm8
    	opcode: 0x6a

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
He, Qing
bfdaab0903 KVM: VMX: Fix exit qualification width on i386
According to Intel Software Developer's Manual, Vol. 3B, Appendix H.4.2,
exit qualification should be of natural width. However, current code
uses u64 as the data type for this register, which occasionally
introduces invalid value to VMExit handling logics. This patch fixes
this bug.

I have tested Windows and Linux guest on i386 host, and they can boot
successfully with this patch.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Avi Kivity
04d2cc7780 KVM: Move main vcpu loop into subarch independent code
This simplifies adding new code as well as reducing overall code size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:28 +02:00
Avi Kivity
29bd8a7808 KVM: VMX: Move vm entry failure handling to the exit handler
This will help moving the main loop to subarch independent code.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Avi Kivity
2e3e5882dc KVM: MMU: Don't do GFP_NOWAIT allocations
Before preempt notifiers, kvm needed to allocate memory with GFP_NOWAIT so
as not to have to enable preemption and take a heavyweight exit.  On oom, we'd
fall back to a GFP_KERNEL allocation.

With preemption notifiers, we can do a GFP_KERNEL allocation, and perform
the heavyweight exit only if the kernel decides to put us to sleep.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Christian Ehrhardt
cbdd1bea2a KVM: Rename kvm_arch_ops to kvm_x86_ops
This patch just renames the current (misnamed) _arch namings to _x86 to
ensure better readability when a real arch layer takes place.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Laurent Vivier
0d8d2bd4f2 KVM: Simplify memory allocation
The mutex->splinlock convertion alllows us to make some code simplifications.
As we can keep the lock longer, we don't have to release it and then
have to check if the environment has not been modified before re-taking it. We
can remove kvm->busy and kvm->memory_config_version.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Rusty Russell
1747fb71fd KVM: Hoist SVM's get_cs_db_l_bits into core code.
SVM gets the DB and L bits for the cs by decoding the segment.  This
is in fact the completely generic code, so hoist it for kvm-lite to use.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Rusty Russell
81f50e3bfd KVM: Keep control regs in sync
We don't update the vcpu control registers in various places.  We
should do so.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Rusty Russell
b85b9ee925 KVM: Clean up unloved invlpg emulation
invlpg shouldn't fetch the "src" address, since it may not be valid,
however SVM's "solution" which neuters emulation of all group 7
instruction is horrible and breaks kvm-lite.  The simplest fix is to
put a special check in for invlpg.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Rusty Russell
c9a1185c94 KVM: Remove the unused invlpg member of struct kvm_arch_ops.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
Amit Shah
380102c8e4 KVM: Set the ET flag in CR0 after initializing FX
This was missed when moving stuff around in fbc4f2e

Fixes Solaris guests and bug #1773613

Signed-off-by: Amit Shah <amit.shah@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:27 +02:00
He, Qing
c5ec153402 KVM: enable in-kernel APIC INIT/SIPI handling
This patch enables INIT/SIPI handling using in-kernel APIC by
introducing a ->mp_state field to emulate the SMP state transition.

[avi: remove smp_processor_id() warning]

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Xin Li <xin.b.li@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
He, Qing
932f72adbe KVM: round robin for APIC lowest priority delivery mode
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Qing He
40487c680d KVM: deliver PIC interrupt only to vcpu0
This patch changes the PIC interrupts delivery. Now it is only delivered
to vcpu0 when either condition is met (on vcpu0):
  1. local APIC is hardware disabled
  2. LVT0 is unmasked and configured to delivery mode ExtInt

It fixes the 2x faster wall clock on x86_64 and SMP i386 Linux guests

Signed-off-by: Eddie (Yaozu) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
He, Qing
5cd4f6fd85 KVM: disable tpr/cr8 sync when in-kernel APIC is used
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Eddie Dong
a3d7f85f47 KVM: Migrate lapic hrtimer when vcpu moves to another cpu
This reduces overhead by accessing cachelines from the wrong node, as well
as simplifying locking.

[Qing: fix for inactive or expired one-shot timer]

Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Eddie Dong
1b9778dae7 KVM: Keep track of missed timer irq injections
APIC timer IRQ is set every time when a certain period
expires at host time, but the guest may be descheduled
at that time and thus the irq be overwritten by later fire.
This patch keep track of firing irq numbers and decrease
only when the IRQ is injected to guest or buffered in
APIC.

Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Yang, Sheng
6e5d865c0b KVM: VMX: Use shadow TPR/cr8 for 64-bits guests
This patch enables TPR shadow of VMX on CR8 access. 64bit Windows using
CR8 access TPR frequently. The TPR shadow can improve the performance of
access TPR by not causing vmexit.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Eddie Dong
2a8067f17b KVM: pending irq save/restore
Add in kernel irqchip save/restore support for pending vectors.

[avi: fix compile warning on i386]
[avi: remove printk]

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:26 +02:00
Eddie Dong
96ad2cc613 KVM: in-kernel LAPIC save and restore support
This patch adds a new vcpu-based IOCTL to save and restore the local
apic registers for a single vcpu. The kernel only copies the apic page as
a whole, extraction of registers is left to userspace side. On restore, the
APIC timer is restarted from the initial count, this introduces a little
delay, but works fine.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
He, Qing
6bf9e962d1 KVM: in-kernel IOAPIC save and restore support
This patch adds support for in-kernel ioapic save and restore (to
and from userspace). It uses the same get/set_irqchip ioctl as
in-kernel PIC.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
He, Qing
c52fb35a8b KVM: Bypass irq_pending get/set when using in kernel irqchip
vcpu->irq_pending is saved in get/set_sreg IOCTL, but when in-kernel
local APIC is used, doing this may occasionally overwrite vcpu->apic to
an invalid value, as in the vm restore path.

Signed-off-by: Qing He <qing.he@intel.com>
2007-10-13 10:18:25 +02:00
He, Qing
6ceb9d791e KVM: Add get/set irqchip ioctls for in-kernel PIC live migration support
This patch adds two new ioctls to dump and write kernel irqchips for
save/restore and live migration. PIC s/r and l/m is implemented in this
patch.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
9cf98828d1 KVM: Protect in-kernel pio using kvm->lock
pio operation and IRQ_LINE kvm_vm_ioctl is not kvm->lock
protected.  Add lock to same with IOAPIC MMIO operations.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
b6958ce44a KVM: Emulate hlt in the kernel
By sleeping in the kernel when hlt is executed, we simplify the in-kernel
guest interrupt path considerably.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
1fd4f2a5ed KVM: In-kernel I/O APIC model
This allows in-kernel host-side device drivers to raise guest interrupts
without going to userspace.

[avi: fix level-triggered interrupt redelivery on eoi]
[avi: add missing #include]
[avi: avoid redelivery of edge-triggered interrupt]
[avi: implement polarity]
[avi: don't deliver edge-triggered interrupts when unmasking]
[avi: fix host oops on invalid guest access]

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
97222cc831 KVM: Emulate local APIC in kernel
Because lightweight exits (exits which don't involve userspace) are many
times faster than heavyweight exits, it makes sense to emulate high usage
devices in the kernel.  The local APIC is one such device, especially for
Windows and for SMP, so we add an APIC model to kvm.

It also allows in-kernel host-side drivers to inject interrupts without
going through userspace.

[compile fix on i386 from Jindrich Makovicka]

Signed-off-by: Yaozu (Eddie) Dong <Eddie.Dong@intel.com>
Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
7017fc3d1a KVM: Define and use cr8 access functions
This patch is to wrap APIC base register and CR8 operation which can
provide a unique API for user level irqchip and kernel irqchip.
This is a preparation of merging lapic/ioapic patch.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:25 +02:00
Eddie Dong
85f455f7dd KVM: Add support for in-kernel PIC emulation
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Laurent Vivier
152d3f2f24 KVM: VMX: Split segments reload in vmx_load_host_state()
vmx_load_host_state() bundles fs, gs, ldt, and tss reloading into
one in the hope that it is infrequent. With smp guests, fs reloading is
frequent due to fs being used by threads.

Unbundle the reloads so reduce expensive gs reloads.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Avi Kivity
d39dba54ce KVM: X86 emulator: fix 'push reg' writeback
Pointed out by Rusty Russell.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Izik Eidus
2e2c618dad KVM: Support more memory slots
Needed for mapping memory at 4GB.

Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Izik Eidus
33f5fa1664 KVM: VMX: allow rmode_tss_base() to work with >2G of guest memory
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Nitin A Kamble
7e778161fb KVM: x86 emulator: implement 'push reg' (opcodes 0x50-0x57)
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Nitin A Kamble
c53ce170a9 KVM: x86 emulator: Implement 'jmp rel short' instruction (opcode 0xeb)
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Nitin A Kamble
098c937ba3 KVM: x86 emulator: implement 'jmp rel' instruction (opcode 0xe9)
Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:24 +02:00
Nitin A Kamble
19eb938e01 KVM: x86 emulator: implement 'and $imm, %{al|ax|eax}'
Implement emulation of instruction
    and al imm8 (opcode 0x24)
    and ax/eax imm16/imm32 (opcode 0x25)

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Yang, Sheng
253abdee5e KVM: Communicate cr8 changes to userspace
This allows running 64-bit Windows.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Avi Kivity
7e66f350cf KVM: Close minor race in signal handling
We need to check for signals inside the critical section, otherwise a
signal can be sent which we will not notice.  Also move the check
before entry, so that if the signal happens before the first entry,
we exit immediately instead of waiting for something to happen to the
guest.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Laurent Vivier
3090dd7377 KVM: Clean up kvm_setup_pio()
Split kvm_setup_pio() into two functions, one to setup in/out pio
(kvm_emulate_pio()) and one to setup ins/outs pio (kvm_emulate_pio_string()).

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Laurent Vivier
e70669abd4 KVM: Cleanup string I/O instruction emulation
Both vmx and svm decode the I/O instructions, and both botch the job,
requiring the instruction prefixes to be fetched in order to completely
decode the instruction.

So, if we see a string I/O instruction, use the x86 emulator to decode it,
as it already has all the prefix decoding machinery.

This patch defines ins/outs opcodes in x86_emulate.c and calls
emulate_instruction() from io_interception() (svm.c) and from handle_io()
(vmx.c).  It removes all vmx/svm prefix instruction decoders
(get_addr_size(), io_get_override(), io_address(), get_io_count())

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Laurent Vivier
9fdaaac38e KVM: Remove useless assignment
Line 1809 of kvm_main.c is useless, value is overwritten in line 1815:

1809         now = min(count, PAGE_SIZE / size);
1810
1811         if (!down)
1812                 in_page = PAGE_SIZE - offset_in_page(address);
1813         else
1814                 in_page = offset_in_page(address) + size;
1815         now = min(count, (unsigned long)in_page / size);
1816         if (!now) {

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Li, Xin B
1e4e6e0013 KVM: VMX: Remove a duplicated ia32e mode vm entry control
Remove a duplicated ia32e mode VM Entry control definition and use the
proper one.

Signed-off-by: Xin Li <xin.b.li@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Rusty Russell
a477034750 KVM: Use kmem_cache_free for kmem_cache_zalloc'ed objects
We use kfree in svm.c and vmx.c, and this works, but it could break at
any time.  kfree() is supposed to match up with kmalloc().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:23 +02:00
Rusty Russell
f02424785a KVM: Add and use pr_unimpl for standard formatting of unimplemented features
All guest-invokable printks should be ratelimited to prevent malicious
guests from flooding logs.  This is a start.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
33830b4f5b KVM: Remove unneeded kvm_dev_open and kvm_dev_release functions.
Devices don't need open or release functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
3dea7ca716 KVM: Remove stat_set from debugfs
We shouldn't define stat_set on the debug attributes, since that will
cause silent failure on writing: without a set argument, userspace
will get -EACCESS.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Gabriel C
54e11fa1f8 KVM: Fix defined but not used warning in drivers/kvm/vmx.c
move_msr_up() is used only on X86_64 and generates a warning on !X86_64

Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
37c00051b5 KVM: Remove redundant alloc_vmcs_cpu declaration
alloc_vmcs_cpu is already declared (static) above, no need to
redeclare.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
bfc733a7a3 KVM: SVM: Make set_msr_interception more reliable
set_msr_interception() is used by svm to set up which MSRs should be
intercepted.  It can only fail if someone has changed the code to try
to intercept an MSR without updating the array of ranges.

The return value is ignored anyway: it should just BUG() if it doesn't
work.  (A build-time failure would be better, but that's tricky).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
7e9d619d2a KVM: Cleanup mark_page_dirty
For some reason, mark_page_dirty open-codes __gfn_to_memslot().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
fb76441649 KVM: Don't assign vcpu->cr3 if it's invalid: check first, set last
sSigned-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Yang, Sheng
002c7f7c32 KVM: VMX: Add cpu consistency check
All the physical CPUs on the board should support the same VMX feature
set.  Add check_processor_compatibility to kvm_arch_ops for the consistency
check.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:22 +02:00
Rusty Russell
39214915f5 KVM: kvm_vm_ioctl_get_dirty_log restore "nothing dirty" optimization
kvm_vm_ioctl_get_dirty_log scans bitmap to see it it's all zero, but
doesn't use that information.

Avi says:
	Looks like it was used to guard	kvm_mmu_slot_remove_write_access();
	optimizing the case where the guest just leaves the screen alone (which
	it usually does, especially in benchmarks).

	I'd rather reinstate that optimization.  See
	90cb0529dd where the damage was done.

It's pretty simple: if the bitmap is all zero, we don't need to do anything to
clean it.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
b114b0804d KVM: Use alignment properties of vcpu to simplify FPU ops
Now we use a kmem cache for allocating vcpus, we can get the 16-byte
alignment required by fxsave & fxrstor instructions, and avoid
manually aligning the buffer.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
c16f862d02 KVM: Use kmem cache for allocating vcpus
Avi wants the allocations of vcpus centralized again.  The easiest way
is to add a "size" arg to kvm_init_arch, and expose the thus-prepared
cache to the modules.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Laurent Vivier
e7d5d76cae KVM: Remove kvm_{read,write}_guest()
... in favor of the more general emulator_{read,write}_*.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Laurent Vivier
cebff02b11 KVM: Change the emulator_{read,write,cmpxchg}_* functions to take a vcpu
... instead of a x86_emulate_ctxt, so that other callers can use it easily.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
0e5017d4ae KVM: SVM: internal function name cleanup
Changes some svm.c internal function names:
1) io_adress -> io_address  (de-germanify the spelling)
2) kvm_reput_irq -> reput_irq  (it's not a generic kvm function)
3) kvm_do_inject_irq -> (it's not a generic kvm function)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
e756fc626d KVM: SVM: de-containization
container_of is wonderful, but not casting at all is better.  This
patch changes svm.c's internal functions to pass "struct vcpu_svm"
instead of "struct kvm_vcpu" and using container_of.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
3077c4513c KVM: Remove three magic numbers
There are several places where hardcoded numbers are used in place of
the easily-available constant, which is poor form.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
8b9cf98cc7 KVM: VMX: pass vcpu_vmx internally
container_of is wonderful, but not casting at all is better.  This
patch changes vmx.c's internal functions to pass "struct vcpu_vmx"
instead of "struct kvm_vcpu" and using container_of.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:21 +02:00
Rusty Russell
9bd01506ee KVM: fx_init() needs preemption disabled while it plays with the FPU state
Now that kvm generally runs with preemption enabled, we need to protect
the fpu intialization sequence.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Shaohua Li
11ec280471 KVM: Convert vm lock to a mutex
This allows the kvm mmu to perform sleepy operations, such as memory
allocation.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Avi Kivity
15ad71460d KVM: Use the scheduler preemption notifiers to make kvm preemptible
Current kvm disables preemption while the new virtualization registers are
in use.  This of course is not very good for latency sensitive workloads (one
use of virtualization is to offload user interface and other latency
insensitive stuff to a container, so that it is easier to analyze the
remaining workload).  This patch re-enables preemption for kvm; preemption
is now only disabled when switching the registers in and out, and during
the switch to guest mode and back.

Contains fixes from Shaohua Li <shaohua.li@intel.com>.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Jeff Dike
519ef35341 KVM: add hypercall nr to kvm_run
Add the hypercall number to kvm_run and initialize it.  This changes the ABI,
but as this particular ABI was unusable before this no users are affected.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Yang, Sheng
1c3d14fe0a KVM: VMX: Improve the method of writing vmcs control
Put cpu feature detecting part in hardware_setup, and stored the vmcs
condition in global variable for further check.

[glommer: fix for some i386-only machines not supporting CR8 load/store
 exiting]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Rusty Russell
fb3f0f51d9 KVM: Dynamically allocate vcpus
This patch converts the vcpus array in "struct kvm" to a pointer
array, and changes the "vcpu_create" and "vcpu_setup" hooks into one
"vcpu_create" call which does the allocation and initialization of the
vcpu (calling back into the kvm_vcpu_init core helper).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Gregory Haskins
a2fa3e9f52 KVM: Remove arch specific components from the general code
struct kvm_vcpu has vmx-specific members; remove them to a private structure.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Rusty Russell
c820c2aa27 KVM: load_pdptrs() cleanups
load_pdptrs can be handed an invalid cr3, and it should not oops.
This can happen because we injected #gp in set_cr3() after we set
vcpu->cr3 to the invalid value, or from kvm_vcpu_ioctl_set_sregs(), or
memory configuration changes after the guest did set_cr3().

We should also copy the pdpte array once, before checking and
assigning, otherwise an SMP guest can potentially alter the values
between the check and the set.

Finally one nitpick: ret = 1 should be done as late as possible: this
allows GCC to check for unset "ret" should the function change in
future.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:20 +02:00
Aurelien Jarno
3ccb8827fb KVM: Remove dead code in the cmpxchg instruction emulation
The writeback fixes (02c03a326a) let
some dead code in the cmpxchg instruction emulation. Remove it.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Yang, Sheng
62b3ffb8b3 KVM: VMX: Import some constants of vmcs from IA32 SDM
This patch mainly imports some constants and rename two exist constants
of vmcs according to IA32 SDM.

It also adds two constants to indicate Lock bit and Enable bit in
MSR_IA32_FEATURE_CONTROL, and replace the hardcode _5_ with these two
bits.

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Shaohua Li
fe55188194 KVM: Move gfn_to_page out of kmap/unmap pairs
gfn_to_page might sleep with swap support. Move it out of the kmap calls.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Shaohua Li
9ae0448f53 KVM: Hoist kvm_mmu_reload() out of the critical section
vmx_cpu_run doesn't handle error correctly and kvm_mmu_reload might
sleep with mutex changes, so I move it above.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Rusty Russell
310bc76c2b KVM: Return if the pdptrs are invalid when the guest turns on PAE.
Don't fall through and turn on PAE in this case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Avi Kivity
394b6e5944 KVM: x86 emulator: fix faulty check for two-byte opcode
Right now, the bug is harmless as we never emulate one-byte 0xb6 or 0xb7.
But things may change.

Noted by the mysterious Gabriel C.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Avi Kivity
e3243452f4 KVM: x86 emulator: fix cmov for writeback changes
The writeback fixes (02c03a326a) broke
cmov emulation.  Fix.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Rusty Russell
7075bc816c KVM: Use standard CR8 flags, and fix TPR definition
Intel manual (and KVM definition) say the TPR is 4 bits wide.  Also fix
CR8_RESEVED_BITS typo.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Jeff Dike
8fc0d085f5 KVM: Set exit_reason to KVM_EXIT_MMIO where run->mmio is initialized.
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:19 +02:00
Rusty Russell
9eb829ced8 KVM: Trivial: Use standard BITMAP macros, open-code userspace-exposed header
Creating one's own BITMAP macro seems suboptimal: if we use manual
arithmetic in the one place exposed to userspace, we can use standard
macros elsewhere.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
66aee91aaa KVM: Use standard CR4 flags, tighten checking
On this machine (Intel), writing to the CR4 bits 0x00000800 and
0x00001000 cause a GPF.  The Intel manual is a little unclear, but
AFIACT they're reserved, too.

Also fix spelling of CR4_RESEVED_BITS.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
f802a307cb KVM: Use standard CR3 flags, tighten checking
The kernel now has asm/cpu-features.h: use those macros instead of inventing
our own.

Also spell out definition of CR3_RESEVED_BITS, fix spelling and
tighten it for the non-PAE case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
707d92fa72 KVM: Trivial: Use standard CR0 flags macros from asm/cpu-features.h
The kernel now has asm/cpu-features.h: use those macros instead of
inventing our own.

Also spell out definition of CR0_RESEVED_BITS (no code change) and fix typo.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
9a2b85c620 KVM: Trivial: Avoid hardware_disable predeclaration
Don't pre-declare hardware_disable: shuffle the reboot hook down.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
dcc0766b22 KVM: Trivial: Comment spelling may escape grep
Speling error in comment.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
1e3c5cb0d5 KVM: Trivial: Make decode_register() static
I have shied away from touching x86_emulate.c (it could definitely use
some love, but it is forked from the Xen code, and it would be more
productive to cross-merge fixes).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Rusty Russell
5eb549a085 KVM: Trivial: Remove unused struct cpu_user_regs declaration
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:18 +02:00
Eddie Dong
65619eb5a8 KVM: In-kernel string pio write support
Add string pio write support to support some version of Windows.

Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:17 +02:00
Qing He
dad3795d2b KVM: SMP: Add vcpu_id field in struct vcpu
This patch adds a `vcpu_id' field in `struct vcpu', so we can
differentiate BSP and APs without pointer comparison or arithmetic.

Signed-off-by: Qing He <qing.he@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:17 +02:00
Nguyen Anh Quynh
cd0d913797 KVM: Fix *nopage() in kvm_main.c
*nopage() in kvm_main.c should only store the type of mmap() fault if
the pointers are not NULL. This patch fixes the problem.

Signed-off-by: Nguyen Anh Quynh <aquynh@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-13 10:18:17 +02:00
Linus Torvalds
ab9c232286 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (119 commits)
  [libata] struct pci_dev related cleanups
  libata: use ata_exec_internal() for PMP register access
  libata: implement ATA_PFLAG_RESETTING
  libata: add @timeout to ata_exec_internal[_sg]()
  ahci: fix notification handling
  ahci: clean up PORT_IRQ_BAD_PMP enabling
  ahci: kill leftover from enabling NCQ over PMP
  libata: wrap schedule_timeout_uninterruptible() in loop
  libata: skip suppress reporting if ATA_EHI_QUIET
  libata: clear ehi description after initial host report
  pata_jmicron: match vendor and class code only
  libata: add ST9160821AS / 3.ALD to NCQ blacklist
  pata_acpi: ACPI driver support
  libata-core: Expose gtm methods for driver use
  libata: add HDT722516DLA380 to NCQ blacklist
  libata: blacklist NCQ on Seagate Barracuda ST380817AS
  [libata] Turn on ACPI by default
  libata_scsi: Fix ATAPI transfer lengths
  libata: correct handling of SRST reset sequences
  libata: Integrate ACPI-based PATA/SATA hotplug - version 5
  ...
2007-10-12 16:16:41 -07:00
Linus Torvalds
6a84258e5f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (37 commits)
  PCI: merge almost all of pci_32.h and pci_64.h together
  PCI: X86: Introduce and enable PCI domain support
  PCI: Add 'nodomains' boot option, and pci_domains_supported global
  PCI: modify PCI bridge control ISA flag for clarity
  PCI: use _CRS for PCI resource allocation
  PCI: avoid P2P prefetch window for expansion ROMs
  PCI: skip ISA ioresource alignment on some systems
  PCI: remove transparent bridge sizing
  pci: write file size to inode on proc bus file write
  pci: use size stored in proc_dir_entry for proc bus files
  pci: implement "pci=noaer"
  PCI: fix IDE legacy mode resources
  MSI: Use correct data offset for 32-bit MSI in read_msi_msg()
  PCI: Fix incorrect argument order to list_add_tail() in PCI dynamic ID code
  PCI: i386: Compaq EVO N800c needs PCI bus renumbering
  PCI: Remove no longer correct documentation regarding MSI vector assignment
  PCI: re-enable onboard sound on "MSI K8T Neo2-FIR"
  PCI: quirk_vt82c586_acpi: Omit reading PCI revision ID
  PCI: quirk amd_8131_mmrbc: Omit reading pci revision ID
  cpqphp: Use PCI_CLASS_REVISION instead of PCI_REVISION_ID for read
  ...
2007-10-12 15:50:23 -07:00
Linus Torvalds
efefc6eb38 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits)
  PM: merge device power-management source files
  sysfs: add copyrights
  kobject: update the copyrights
  kset: add some kerneldoc to help describe what these strange things are
  Driver core: rename ktype_edd and ktype_efivar
  Driver core: rename ktype_driver
  Driver core: rename ktype_device
  Driver core: rename ktype_class
  driver core: remove subsystem_init()
  sysfs: move sysfs file poll implementation to sysfs_open_dirent
  sysfs: implement sysfs_open_dirent
  sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir
  sysfs: make sysfs_root a regular directory dirent
  sysfs: open code sysfs_attach_dentry()
  sysfs: make s_elem an anonymous union
  sysfs: make bin attr open get active reference of parent too
  sysfs: kill unnecessary NULL pointer check in sysfs_release()
  sysfs: kill unnecessary sysfs_get() in open paths
  sysfs: reposition sysfs_dirent->s_mode.
  sysfs: kill sysfs_update_file()
  ...
2007-10-12 15:49:37 -07:00
Linus Torvalds
117494a1b6 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (142 commits)
  USB: fix race in autosuspend reschedule
  atmel_usba_udc: Keep track of the device status
  USB: Nikon D40X unusual_devs entry
  USB: serial core should respect driver requirements
  USB: documentation for USB power management
  USB: skip autosuspended devices during system resume
  USB: mutual exclusion for EHCI init and port resets
  USB: allow usbstorage to have LUNS greater than 2Tb
  USB: Adding support for SHARP WS011SH to ipaq.c
  USB: add atmel_usba_udc driver
  USB: ohci SSB bus glue
  USB: ehci build fixes on au1xxx, ppc-soc
  USB: add runtime frame_no quirk for big-endian OHCI
  USB: funsoft: Fix termios
  USB: visor: termios bits
  USB: unusual_devs entry for Nikon DSC D2Xs
  USB: re-remove <linux/usb_sl811.h>
  USB: move <linux/usb_gadget.h> to <linux/usb/gadget.h>
  USB: Export URB statistics for powertop
  USB: serial gadget: Disable endpoints on unload
  ...
2007-10-12 15:49:10 -07:00
Linus Torvalds
4d5709a7b7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Don't take semaphore in cpufreq_quick_get()
  [CPUFREQ] Support different families in fid/did to frequency conversion
  [CPUFREQ] cpufreq_stats: misc cpuinit section annotations
  [CPUFREQ] implement !CONFIG_CPU_FREQ stub for  cpufreq_unregister_notifier()
  [CPUFREQ] mark hotplug notifier callback as __cpuinit
  [CPUFREQ] Only check for transition latency on problematic governors (kconfig fix)
  [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default
  [CPUFREQ] move policy's governor initialisation out of low-level drivers into cpufreq core
  [CPUFREQ] Longhaul - Add support for PM133 northbridge
  [CPUFREQ] x86: use num_online_nodes to get physical cpus numbers for
2007-10-12 15:42:01 -07:00
Linus Torvalds
57c5b9998e Merge git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86
* git://git.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (40 commits)
  x86: HPET add another ICH7 PCI id
  x86: HPET force enable ICH5 suspend/resume fix
  x86: HPET force enable for ICH5
  x86: HPET try to activate force detected hpet
  x86: HPET force enable o ICH7 and later
  x86: HPET restructure hpet code for hpet force enable
  clock events: allow replacement of broadcast timer
  i386/x8664: cleanup the shared hpet code
  i386: Remove the useless #ifdef in i8253.h
  ACPI: remove the now unused ifdef code
  jiffies: remove unused macros
  x86_64: cleanup apic.c after clock events switch
  x86_64: remove now unused code
  x86: unify timex.h variants
  x86: kill 8253pit.h
  x86: disable apic timer for AMD C1E enabled CPUs
  x86: Fix irq0 / local apic timer accounting
  x86_64: convert to clock events
  x86_64: Add (not yet used) clock event functions
  x86_64: prepare idle loop for dynamic ticks
  ...
2007-10-12 15:39:39 -07:00
Linus Torvalds
42f04b6d4c Merge branch 'isdn-cleanups' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'isdn-cleanups' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  [ISDN] HiSax diva: split setup into three smaller functions
  [ISDN] HiSax sedlbauer: move ISAPNP and PCI code into functions of their own
  [ISDN] HiSax elsa: split huge setup function into four smaller functions
  [ISDN] HiSax avm_pci: split setup into three smaller functions
  [ISDN] Remove CONFIG_PCI ifdefs from 100% PCI source code
2007-10-12 15:03:35 -07:00
Jeff Garzik
32a2eea795 PCI: Add 'nodomains' boot option, and pci_domains_supported global
* Introduce pci_domains_supported global, hardcoded to zero if
  !CONFIG_PCI_DOMAINS.

* Introduce 'nodomains' boot option, which clears pci_domains_supported
  on platforms that enable it by default (x86, x86-64, and others when
  they are converted to use this).

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:18 -07:00
Gary Hade
11949255d9 PCI: modify PCI bridge control ISA flag for clarity
Modify PCI Bridge Control ISA flag for clarity

This patch changes PCI_BRIDGE_CTL_NO_ISA to PCI_BRIDGE_CTL_ISA
and modifies it's clarifying comment and locations where used.
The change reduces the chance of future confusion since it makes
the set/unset meaning of the bit the same in both the bridge
control register and bridge_ctl field of the pci_bus struct.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:18 -07:00
Gary Hade
fd64cb4606 PCI: avoid P2P prefetch window for expansion ROMs
Avoid creating P2P prefetch window for expansion ROMs

Because of the future possibility that P2P prefetch windows will contain
address ranges above 4GB some BIOSes are providing space in the P2P
non-prefetch windows for expansion ROMs.  This is due to expansion ROM
BAR 32-bit limitation.  When expansion ROM BARs without BIOS assigned
address(es) are currently found behind a P2P bridge, the kernel attempts
to create a P2P prefetch window for them even though space for them has
already been provided in the non-prefetch window.  _CRS on some systems
with certain resource conservation conscious BIOSes may not provide the
extra 1MB or more memory resource needed for the expansion ROM motivated
prefetch window causing resource allocation errors.

This change corrects the problem by removing IORESOURCE_PREFETCH from
the expansion ROM flags initialization.  It also removes
IORESOURCE_CACHEABLE which seems inappropriate if only non-cacheable
memory is available.

Signed-off-by: Gary Hade <gary.hade@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:18 -07:00
Gary Hade
036fff4cf7 PCI: skip ISA ioresource alignment on some systems
Skip ISA ioresource alignment on some systems

To conserve limited PCI i/o resource on some IBM multi-node systems, the
BIOS allocates (via _CRS) and expects the kernel to use addresses in
ranges currently excluded by pcibios_align_resource() [i386/pci/i386.c].
This change allows the kernel to use the currently excluded address
ranges on the IBM x3800, x3850, and x3950.

Signed-off-by: Gary Hade <gary.hade@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:18 -07:00
Gary Hade
8fa5913d54 PCI: remove transparent bridge sizing
Remove transparent bridge sizing.

Due to code in pci_read_bridge_bases() [drivers/pci/probe.c] the child
bus of a transparent bridge already has access to the parent bus
resources so transparent bridge sizing appears unnecessary.  The bridge
sizing includes alignment and granularity adjustments that can cause
significantly more memory to be reserved from the parant bus than
required by devices on the child bus and allotted by _CRS.

Signed-off-by: Gary Hade <gary.hade@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00
David Rientjes
ecb3908046 pci: write file size to inode on proc bus file write
When a /proc/bus/pci file is written to, the size of that PCI device's
configuration space must be written to the inode.  Otherwise, it is
possible for the file to specify a size of 0 on stat if a task is holding
the same file open.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00
David Rientjes
cd68602f36 pci: use size stored in proc_dir_entry for proc bus files
On pci_proc_attach_device(), the size of the PCI configuration space is
stored in the proc_dir_entry as the size of the file.  Thus, the procfs
interface to PCI devices should use it instead of the device directly.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00
Randy Dunlap
7f78576366 pci: implement "pci=noaer"
For cases in which CONFIG_PCIEAER=y (such as distro kernels), allow users
to disable PCIE Advanced Error Reporting by using "pci=noaer" on the
kernel command line.

This can be used to work around hardware or (kernel) software problems.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00
Yoichi Yuasa
fd6e732186 PCI: fix IDE legacy mode resources
I got the following error on MIPS Cobalt.

PCI: Unable to reserve I/O region #1:8@f00001f0 for device 0000:00:09.1
pata_via 0000:00:09.1: failed to request/iomap BARs for port 0 (errno=-16)
PCI: Unable to reserve I/O region #3:8@f0000170 for device 0000:00:09.1
pata_via 0000:00:09.1: failed to request/iomap BARs for port 1 (errno=-16)
pata_via 0000:00:09.1: no available native port

The legacy mode IDE resources set the following order.

pci_setup_device()
    Legacy mode ATA controllers have fixed addresses.
    IDE resources: 0x1F0-0x1F7, 0x3F6, 0x170-0x177, 0x376
    |
    V
pcibios_fixup_bus()
    MIPS Cobalt PCI bus regions have the -0x10000000 offset from PCI resources.
    pcibios_fixup_bus() fix PCI bus regions.
    0x1F0 - 0x10000000 = 0xF00001F0
    |
    V
ata_pci_init_one()
    PCI: Unable to reserve I/O region #1:8@f00001f0 for device 0000:00:09.1

In some architectures, PCI bus regions have the offset from PCI resources. 
For this reason, pci_setup_device() should set PCI bus regions to
dev->resource[].

[akpm@linux-foundation.org: use struct initialiser]
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00