Checking source for other get_paca()->field preemption dangers found that
open_high_hpage_areas does a structure copy into its paca while preemption
is enabled: unsafe however gcc accomplishes it. Just remove that copy:
it's done safely afterwards by on_each_cpu, as in open_low_hpage_areas.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Repeated -j20 kernel builds on a G5 Quad running an SMP PREEMPT kernel
would often collapse within a day, some exec failing with "Bad address".
In each case examined, load_elf_binary was doing a kernel_read, but
generic_file_aio_read's access_ok saw current->thread.fs.seg as USER_DS
instead of KERNEL_DS.
objdump of filemap.o shows gcc 4.1.0 emitting "mr r5,r13 ... ld r9,416(r5)"
here for get_paca()->__current, instead of the expected and much more usual
"ld r9,416(r13)"; I've seen other gcc4s do the same, but perhaps not gcc3s.
So, if the task is preempted and rescheduled on a different cpu in between
the mr and the ld, r5 will be looking at a different paca_struct from the
one it's now on, pick up the wrong __current, and perhaps the wrong seg.
Presumably much worse could happen elsewhere, though that split is rare.
Other architectures appear to be safe (x86_64's read_pda is more limiting
than get_paca), but ppc64 needs to force "current" into one instruction.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Changed qe_issue_cmd() to write cmd_input to the CECDR unmodified. It
was treating cmd_input as a virtual address and tried to convert it to
a physical address.
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 10Gigabit ethernet device drivers appear to be able to chew
up all 256MB of TCE mappings on pSeries systems, as evidenced by
numerous error messages:
iommu_alloc failed, tbl c0000000010d5c48 vaddr c0000000d875eff0 npages 1
Some experimentation indicates that this is essentially because
one 1500 byte ethernet MTU gets mapped as a 64K DMA region when
the large 64K pages are enabled. Thus, it doesn't take much to
exhaust all of the available DMA mappings for a high-speed card.
This patch changes the iommu allocator to work with its own
unique, distinct page size. Although the patch is long, its
actually quite simple: it just #defines a distinct IOMMU_PAGE_SIZE
and then uses this in all the places that matter.
As a side effect, it also dramatically improves network performance
on platforms with H-calls on iommu translation inserts/removes (since
we no longer call it 16 times for a 1500 bytes packet when the iommu HW
is still 4k).
In the future, we might want to make the IOMMU_PAGE_SIZE a variable
in the iommu_table instance, thus allowing support for different HW
page sizes in the iommu itself.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fixed a compile error in building the 85xx support with oprofile, and in
the process cleaned up some issues with the fsl_booke performance monitor
code.
* Reorganized FSL Book-E performance monitoring code so that the 7450
wouldn't be built if the e500 was, and cleaned it up so it was more
self-contained.
* Added a cpu_setup function for FSL Book-E. The original
cpu_setup function prototype had no arguments, assuming that
the reg_setup function would copy the required information into
variables which represented the registers. This was silly for
e500, since it has 1 register per counter (rather than 3 for
all counters), so the code has been restructured to have
cpu_setup take the current counter config array as an argument,
with op_powerpc_setup() invoking op_powerpc_cpu_setup() through
on_each_cpu(), and op_powerpc_cpu_setup() invoking the
model-specific cpu_setup function with an argument. The
argument is ignored on all other platforms at present.
* Fixed a confusing line where a trinary operator only had two
arguments
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The e500 core generates an illegal instruction exception when it tries
to execute the lwsync instruction, which we currently use for rmb().
This fixes it by using the LWSYNC macro, which turns into a plain sync
on 32-bit machines.
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes a few issues in offb:
- A test was inverted causing the palette hack to never work
(no device node was passed down to the init function)
- Some cards seem to have their assigned-addresses property in a random
order, thus we need to try using of_get_pci_address() first, which will
fail if it's not a PCI device, and fallback to of_get_address() in that
case. of_get_pci_address() properly parsees assigned-addresses to test
the BAR number and thus will get it right whatever the order is.
- Some cards (like GXT4500) provide a linebytes of 0xffffffff in the
device-tree which does no good. This patch handles that by using the
screen width when that happens. (Also fixes btext.c while at it).
- Add detection of the GXT4500 in addition to the GXT2000 for the
palette hacks (we use the same hack, palette is linear in register space
at offset 0x6000).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ICH7M was separated from ICH6M to allow undocumented MAP value 01b
which was spotted on an ASUS notebook. However, there is also
notebooks with MAP value 01b on ICH6M. This patch re-merges ICH6M and
ICH7M entries and allows MAP value 01b for both.
This problem has been reported and initial patch provided by Jonathan
Dieter.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jonathan Dieter <jdieter@gmail.com>
Cc: Tom Deblauwe <tom.deblauwe@telenet.be>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_dev_revalidate() isn't used outside of libata core. Unexport it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hi Jeff,
I tested the PATA support on my old VAIO notebook, and it failed to find
my piix device:
00:07.1 Class 0101: 8086:7111 (rev 01) (prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 64
Region 4: I/O ports at fc90 [size=16]
This patch adds the pci id to ata_piix.c and things then work as
expected.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sis_init_one() modifies probe_ent->port_flags after allocating and
initializing it using ata_pci_init_native_mode(). This makes port_flags
for the secondary port (probe_ent->pinfo2->flags) go out of sync resulting
in misdetection of device due to incorrectly initialized SCR access flag.
This patch make probe_ent alloc/init happen after the final port flags
value is determined. This is fragile but probe_ent and all the related
mess are scheduled to go away soon for exactly this reason. We just need
to hold everything together till then.
This has been spotted and diagnosed and tested by Patrick McHardy.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Patric McHardy <kaber@trash.net>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The sky2 driver uses a single NAPI poll routine for both ports on dual ported
cards (because there is a single IRQ and status ring). Netpoll makes assumptions
about the relationship between network device and NAPI that aren't correct
on the second port, this will cause the port to never clear work.
Most systems, just have single port, so not a big issue.
The easy fix is just make the second port, not netpoll capable.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
I don't want my code to downgraded to GPLv3 because of
cut-n-pasted the comments. These files which I hold copyright
on were started before it was clear what GPLv3 was going to be.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Checks for NULL dev_alloc_skb() and returns on true to avoid subsequent
dereference.
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Christoph Hellwig <hch@infrared.org>
Signed-off-by: David Rientjes <rientjes@cs.washington.edu>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
No need to keep defining PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE
in the driver code since it is now defined in pci_ids.h.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The sky2 driver is no longer in experimental state.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
modprobe n2 with no parameters or no such devices
will get confusing error message.
# modprobe n2
... Kernel does not have module support
This patch replaces return code from -ENOSYS to -EINVAL.
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
drivers/net/wan/n2.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
- Call platform_driver_unregister() before return when no cards found.
(fixes data corruption when no cards found)
- Check platform_device_register_simple() return value
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Mike Phillips <mikep@linuxtr.net>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
drivers/net/tokenring/proteon.c | 9 +++++++--
drivers/net/tokenring/skisa.c | 9 +++++++--
2 files changed, 14 insertions(+), 4 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Flooding the console with error messages for every RX FIFO overrun,
checksum error and framing error isn't very sensible. Each of these
errors can occur during normal operation, so stop printk'ing error
messages for RX errors at all.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Ray Lehtiniemi reported that an incoming UDP packet flood can lock up
the ep93xx ethernet driver. Herbert Valerio Riedel noted that due to
the way ep93xx_eth manages the RX/TXstatus rings, it cannot distinguish
a full ring from an empty one, and correctly suggested that this was
likely to be causing this lockup to occur.
Instead of looking at the hardware's RX/TXstatus ring write pointers
to determine when to stop reading from those rings, we should just check
every individual RX/TXstatus descriptor's valid bit instead, since there
is no other way to distinguish an empty ring from a full ring, and if
there is a descriptor waiting, we take the hit of reading the descriptor
from memory anyway.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Correct definition of handle_IPI
[IA64] move SAL_CACHE_FLUSH check later in boot
[IA64] MCA recovery: Montecito support
[IA64] cpu-hotplug: Fixing confliction between CPU hot-add and IPI
[IA64] don't double >> PAGE_SHIFT pointer for /dev/kmem access
The declaration of handle_IPI in arch/ia64/kernel/smp.c was changed but
not the definition of this function. Remove struct pt_regs from
handle_IPI().
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The check to see if the firmware drops interrupts during a
SAL_CACHE_FLUSH is done to early in the boot. SAL_CACHE_FLUSH expects
to be able to make PAL calls in virtual mode, on some cell based
machines a fault occurs causing a MCA. This patch moves the check
after mmu_context_init so the TLB and VHPT are properly setup.
Signed-off-by Troy Heber <troy.heber@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The information in MCA records is filled in slightly differently on
Montecito than on Madison/McKinley. Usually, the cache check and bus
check target identifiers have the same address. On Montecito the
cache check and bus check target identifiers can be different if
a corrected error (ie SBE or unconsumed poison data) was encountered and
then an uncorrected error (ie DBE) was consumed. In that case, the
cache check target identifier is the physical address of the DBE (that
caused the MCA to surface) while the bus check target identifier is the
physical address of the SBE. This patch correctly finds the target
identifier that triggered the MCA.
If there are multiple valid cache target identifiers in the same
error record then use the one with the lowest cache level.
Signed-off-by: Russ Anderson (rja@sgi.com)
Signed-off-by: Tony Luck <tony.luck@intel.com>
Since we already moved to GENERIC_TIME, we should implement alternatives
of old do_gettimeoffset routines to get sub-jiffies resolution from
gettimeofday(). This patch includes:
* MIPS clocksource support (based on works by Manish Lachwani).
* remove unused gettimeoffset routines and related codes.
* remove unised 64bit do_div64_32().
* simplify mips_hpt_init. (no argument needed, __init tag)
* simplify c0_hpt_timer_init. (no need to write to c0_count)
* remove some hpt_init routines.
* mips_hpt_mask variable to specify bitmask of hpt value.
* convert jmr3927_do_gettimeoffset to jmr3927_hpt_read.
* convert ip27_do_gettimeoffset to ip27_hpt_read.
* convert bcm1480_do_gettimeoffset to bcm1480_hpt_read.
* simplify sb1250 hpt functions. (no need to subtract and shift)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
arch/mips/kernel/traps.c:1115: warning: int format, long unsigned int arg (arg 2)
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Several fields in an incoming MAD extended info header were passed
into the MAD_IFC firmware command at incorrect offsets (mostly off by
4 bytes). As the result, the HCA will fail to generate traps in which
this info is needed (e.g. traps which include the GRH of the incoming
packet), in violation of the IB spec.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
SCSI_QLA_ISCSI needs to depend on NET to prevent build (link) failures
that are caused by selecting SCSI_ISCSI_ATTRS.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In very rare circumstances would we be pruning a merged request and at
the same time delete the implicated cfqq from the rr_list, and not readd
it when the merged request got added. This could cause io stalls until
that process issued io again.
Fix it up by putting the rr_list add handling into cfq_add_rq_rb(),
identical to how pruning is handled in cfq_del_rq_rb(). This fixes a
hang reproducible with fsx-linux.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Partitions are not limited to live within a device. So we should range
check after partition mapping.
Note that 'maxsector' was being used for two different things. I have
split off the second usage into 'old_sector' so that maxsector can be still
be used for it's primary usage later in the function.
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the use of dget/dput calls to balance out on the lower filesystem.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is no point to calling the lower umount_begin when the eCryptfs
umount_begin is called.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Opens on lower dentry objects happen in several places in eCryptfs, and they
all involve the same steps (dget, mntget, dentry_open). This patch
consolidates the lower open events into a single function call.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update cipher block encryption code to the new crypto API.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update eCryptfs hash code to the new kernel crypto API.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up the crypto initialization code; let the crypto API take care of the
key size checks.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If there are no listeners, taskstats_exit_send() just returns because
taskstats_exit_alloc() didn't allocate *tidstats. This is wrong, each
sub-thread should do fill_tgid_exit() on exit, otherwise its ->delays is
not recorded in ->signal->stats and lost.
Q: We don't send TASKSTATS_TYPE_AGGR_TGID when single-threaded process
exits. Is it good? How can the listener figure out that it was actually a
process exit, not sub-thread?
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>