This is a simplified and actually more comprehensive form of a bug
fix from Mitch Williams <mitch.a.williams@intel.com>.
When we mask or unmask a msi-x irqs the writes may be posted because
we are writing to memory mapped region. This means the mask and
unmask don't happen immediately but at some unspecified time in the
future. Which is out of sync with how the mask/unmask logic work
for ioapic irqs.
The practical result is that we get very subtle and hard to track down
irq migration bugs.
This patch performs a read flush after writes to the MSI-X table for mask
and unmask operations. Since the SMP affinity is set while the interrupt
is masked, and since it's unmasked immediately after, no additional flushes
are required in the various affinity setting routines.
The testing by Mitch Williams on his especially problematic system should
still be valid as I have only simplified the code, not changed the
functionality.
We currently have 7 drivers: cciss, mthca, cxgb3, forceth, s2io,
pcie/portdrv_core, and qla2xxx in 2.6.21 that are affected by this
problem when the hardware they driver is plugged into the right slot.
Given the difficulty of reproducing this bug and tracing it down to
anything that even remotely resembles a cause, even if people are
being affected we aren't likely to see many meaningful bug reports, and
the people who see this bug aren't likely to be able to reproduce this
bug in a timely fashion. So it is best to get this problem fixed
as soon as we can so people don't have problems.
Then if people do have a kernel message stating "No irq for vector" we
will know it is yet another novel cause that needs a complete new
investigation.
Cc: Greg KH <greg@kroah.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SCSI]: Fix scsi_send_eh_cmnd scatterlist handling
[SPARC]: Add unsigned to unused bit field in a.out.h
This fixes a regression caused by commit:
2dc611de5a
The sense buffer code in scsi_send_eh_cmnd was changed to use
alloc_page() and a scatter list, but the sense data copy was not
updated to match so what we actually get in the sense buffer is total
grabage starting with the kernel address of the struct page we got.
Basically the stack frame of scsi_send_eh_cmd() is what ends up
in the sense buffer.
Depending upon how pointers look on a given platform, you can
end up getting sr_ioctl.c errors when you mount a cdrom. If
the CDROM gives a check condition for GPCMD_GET_CONFIGURATION issued
by drivers/cdrom/cdrom.c:cdrom_mmc_profile(), sr_ioctl will
spit out this error message in sr_do_ioctl() with the way pointers
are on sparc64:
default:
printk(KERN_ERR "%s: CDROM (ioctl) error, command: ", cd->cdi.name);
__scsi_print_command(cgc->cmd);
scsi_print_sense_hdr("sr", &sshdr);
err = -EIO;
This is the error Tom Callaway reported in:
http://marc.info/?l=linux-sparc&m=117407453208101&w=2
Anyways, fix this by using page_address(sgl.page) which is OK
because we know this is low-mem due to GFP_ATOMIC.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Christoph Hellwig <hch@lst.de>
Add unsigned to unused bit field in a.out.h to make sparse happy.
[ I took care of the sparc64 side as well -DaveM ]
Signed-off-by: Robert Reif <reif@earthlink.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nvram dword alignment logic was broken when writing less than 4
bytes on a non-aligned offset. It was missing logic to round the
length to 4 bytes.
The page erase code is also moved so that it is only called when
using non-buffered flash for better code clarity.
Update version to 1.5.7.
Based on initial patch from Tony Cureington <tony.cureington@hp.com>.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In article <20070329.142644.70222545.davem@davemloft.net> (at Thu, 29 Mar 2007 14:26:44 -0700 (PDT)), David Miller <davem@davemloft.net> says:
> From: Sridhar Samudrala <sri@us.ibm.com>
> Date: Thu, 29 Mar 2007 14:17:28 -0700
>
> > The check for length in rawv6_sendmsg() is incorrect.
> > As len is an unsigned int, (len < 0) will never be TRUE.
> > I think checking for IPV6_MAXPLEN(65535) is better.
> >
> > Is it possible to send ipv6 jumbo packets using raw
> > sockets? If so, we can remove this check.
>
> I don't see why such a limitation against jumbo would exist,
> does anyone else?
>
> Thanks for catching this Sridhar. A good compiler should simply
> fail to compile "if (x < 0)" when 'x' is an unsigned type, don't
> you think :-)
Dave, we use "int" for returning value,
so we should fix this anyway, IMHO;
we should not allow len > INT_MAX.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes the "not found" error return for the lookup
function to -ESRCH so that it can be distinguished from
the case where a rule or route resulting in -ENETUNREACH
has been found during the search.
It fixes a bug where if DECnet was compiled with routing
support, but no routes were added to the routing table,
it was failing to fall back to endnode routing.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] x86: Don't probe for DDC on VBE1.2
[PATCH] x86-64: Increase NMI watchdog probing timeout
[PATCH] x86-64: Let oprofile reserve MSR on all CPUs
[PATCH] x86-64: Disable local APIC timer use on AMD systems with C1E
The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read()
are broken on original i386 machines, where WP-works-ok == false, as
__copy_to_user_inatomic() on such systems calls functions which might
sleep and/or contain cond_resched() calls inside of a kmap_atomic()
region.
The original check for WP-works-ok was in access_ok(), but got moved
during the 2.5 series to fix a race vs. swap.
Return the number of bytes to copy in the case where we are in an atomic
region, so the non atomic code pathes in file_read_actor() and
pipe_read() are taken.
This could be optimized to avoid the kmap_atomicby moving the check for
WP-works-ok into fault_in_pages_writeable(), but this is more intrusive
and can be done later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a multiprocessor machine the VT_WAITACTIVE ioctl call may return 0 if
fg_console has already been updated in redraw_screen() but the console
switch itself hasn't been completed. Fix this by checking fg_console in
vt_waitactive() with the console sem held.
Signed-off-by: Michal Januszewski <spock@gentoo.org>
Acked-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the regression resulting from the recent change of suspend code
ordering that causes systems based on Intel x86 CPUs using the microcode
driver to hang during the resume.
The problem occurs since the microcode driver uses request_firmware() in
its CPU hotplug notifier, which is called after tasks has been frozen and
hangs. It can be fixed by telling the microcode driver to use the
microcode stored in memory during the resume instead of trying to load it
from disk.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Adrian Bunk <bunk@stusta.de>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Maxim <maximlevitsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lockdep reported cmos_suspend() and cmos_resume() calling rtc_update_irq()
with IRQs enabled; not allowed.
Also fix problems seen on some hardware, whereby false alarm IRQs could be
reported (primarily to userspace); and update two comments to match changes
in ACPI. Those make up most of this patch, by volume.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change PnP resource handling code to use proper type for resource start and
length. Fixes bogus regions reported in /proc/iomem.
I've also made some pointer constant, as they are constant...
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Adam Belay <ambx1@neo.rr.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revert e92a4d595b.
Dmitry points out
"When we block_prepare_write() failed while ext3_prepare_write() we jump to
"failure" label and call ext3_prepare_failure() witch search last mapped bh
and invoke commit_write untill it. This is wrong!! because some bh from
begining to the last mapped bh may be not uptodate. As a result we commit to
disk not uptodate page content witch contains garbage from previous usage."
and
"Unexpected file size increasing."
Call trace the same as it was in first issue but result is different.
For example we have file with i_size is zero. we want write two blocks ,
but fs has only one free block.
->ext3_prepare_write(...from == 0, to == 2048)
retry:
->block_prepare_write() == -ENOSPC# we failed but allocated one block here.
->ext3_prepare_failure()
->commit_write( from == 0, to == 1024) # after this i_size becomes 1024 :)
if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
goto retry;
Finally when all retries will be spended ext3_prepare_failure return
-ENOSPC, but i_size was increased and later block trimm procedures can't
help here.
We don't appear to have the horsepower to fix these issues, so let's put
things back the way they were for now.
Cc: Kirill Korotaev <dev@openvz.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ken Chen <kenneth.w.chen@intel.com>
Cc: Andrey Savochkin <saw@sw.ru>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the dump cannot occur most likely because of a full file system and
the page to be written is the zero page, the call to page_cache_release()
is missed.
Signed-off-by: Brian Pomerantz <bapper@mvista.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It seems that there must be at least one node in mems and at least one CPU
in cpus in order to be able to assign tasks to a cpuset. This makes sense.
And I think it would also make sense to include a mems setting in the
basic usage section of the documentation.
I also wonder if something logged to dmsg, explaining why a write failed,
would be a good enhancement. I ended up having rummage arround in cpuset.c
in order to work out why my configuration was failing.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix an off-by-one spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we have a confused udelay implementation.
* __const_udelay does not accept usecs but xloops in i386 and x86_64
* our implementation requires usecs as arg
* it gets a xloops count when called by asm/arch/delay.h
Bugs related to this (extremely long shutdown times) where reported by some
x86_64 users, especially using Device Mapper.
To hit this bug, a compile-time constant time parameter must be passed -
that's why UML seems to work most times. Fix this with a simple udelay
implementation.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We're using #ifdef CONFIG_SYSCTL, but we should be using CONFIG_PROC_SYSCTL,
so we get
fs/built-in.o: In function `proc_root_init':
/usr/src/linux/fs/proc/root.c:83: undefined reference to `proc_sys_init'
Fix that up and remove an ifdef-in-C.
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Helge Hafting <helgehaf@aitel.hist.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ADEF bits in the TSCR register have different meanings in read and
write mode. For this reason ADEF has to be reset on every
read-modify-write operation.
This patch introduces a special write function for this register, which
takes care of it.
Thanks to Holger Magnussen for pointing my nose at this problem.
Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Setting the message length to zero means to send one byte, so you need a
subtraction instead of an addition.
Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
VBE1.2 doesn't support function 15h (DDC) resulting in a 'hang' whilst
uncompressing kernel with some video cards. Make sure we check VBE version
before fiddling around with DDC.
http://bugzilla.kernel.org/show_bug.cgi?id=1458
Opened: 2003-10-30 09:12 Last update: 2007-02-13 22:03
Much thanks to Tobias Hain for help in testing and investigating the bug.
Tested on;
i386, Chips & Technologies 65548 VESA VBE 1.2
CONFIG_VIDEO_SELECT=Y
CONFIG_FIRMWARE_EDID=Y
Untested on x86_64.
Signed-off-by: Zwane Mwaikambo <zwane@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
The MSR reservation is per CPU and oprofile would only allocate them
on the CPU it was initialized on. Change this to handle all CPUs.
This also fixes a warning about unprotected use of smp_processor_id()
in preemptible kernels.
Signed-off-by: Andi Kleen <ak@suse.de>
AMD dual core laptops with C1E do not run the APIC timer correctly
when they go idle. Previously the code assumed this only happened
on C2 or deeper. But not all of these systems report support C2.
Use a AMD supplied snippet to detect C1E being enabled and then disable
local apic timer use.
This supercedes an earlier workaround using DMI detection of specific systems.
Thanks to Mark Langsdorf for the detection snippet.
Signed-off-by: Andi Kleen <ak@suse.de>
This patch:
- Switches mb/rmb/wmb back to being full-blown DMBs on ARM SMP systems,
since mb/rmb/wmb are required to order Normal memory accesses as well.
- Enables the use of DMB and ISB on XSC3 (which is an ARMv5TE ISA core
but conforms to the ARMv6 memory ordering model and supports the
various ARMv6 barriers.)
- Makes DMA coherent platforms (only ixp23xx at the moment) map
mb/rmb/wmb to dmb(), as on DMA coherent platforms, DMA consistent
mappings are done as Normal mappings, which are weakly ordered.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch addresses the following issues with the pxa2xx FIr driver:
1. increment overrun error counter and not frame error counter on ICSR1_ROR bit set in ICSR1.
2. drop frames reported with the frame error from the IC.
3. when resetting the receiver and preparing it for the next DMA in pxa_irda_fir_irq() actually clear the Rx FIFO. See description in Table 11-2 in PXA270 Developer's Manual of the RXE bit.
Correction added in version 2: clearing the IC Rx FIFO also has to be done in pxa_irda_fir_dma_tx_irq()
Signed-off-by: G. Liakhovetski <gl@dsa-ac.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 2e3646e51b changed the way the
split config tree is built, but failed to also adjust fixdep accordingly
- if changing a config option from or to m, files referencing the
respective CONFIG_..._MODULE (but not the corresponding CONFIG_...)
didn't get rebuilt.
The problem is that trisate symbol are represent with three different
symbols:
SYMBOL=n => no symbol defined
SYMBOL=y => CONFIG_SYMBOL defined to '1'
SYMBOL=m => CONFIG_SYMBOL_MODULE defined to '1'
But conf_split_config do not distingush between the =y and =m case, so
only the =y case is honoured.
This is fixed in fixdep so when a CONFIG symbol with _MODULE is found we
skip that part and only look for the CONFIG_SYMBOL version.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo reported that built-in drivers suffered bootup hangs with certain
driver unregistry sequences, due to sysfs breakage.
Do the minimal fix for v2.6.21: only wait if the driver is a module.
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] api: Flush the current page right than the next
[CRYPTO] api: Use the right value when advancing scatterwalk_copychunks
On platforms where flush_dcache_page is needed we're currently flushing
the next page right than the one we've just processed. This patch fixes
the off-by-one error.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the scatterwalk_copychunks loop, We should be advancing by
len_this_page and not nbytes. The latter is the total length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There was a typo in commit 7632fc8f80,
preventing it from working - 32bit binaries crashed hopelessly before
the below fix and work perfectly now.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the scatterwalk_copychunks loop, We should be advancing by
len_this_page and not nbytes. The latter is the total length.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
I've seen this several times on this drive, completely reproducible.
Once it has hung, power needs to be cut from the drive to recover it, a
simple reboot is not enough. So I'd suggest disabling NCQ on this
drive.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix compilation fail for ixp4xx platforms for the case when CONFIG_IXP4XX_INDIRECT_PCI is set. That is due to the check_signature() is appeared in include/linux/io.h.
Signed-off-by: Vladimir Barinov <vbarinov@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] SMTC: Fix recursion in instant IPI replay code.
[MIPS] BCM1480: Fix setting of irq affinity.
[MIPS] do_page_fault() needs to use raw_smp_processor_id().
[MIPS] SMTC: Fix false trigger of debug code on single VPE.
[MIPS] SMTC: irq_{enter,leave} and kstats keeping for relayed timer ints.
[MIPS] lockdep: Deal with interrupt disable hazard in TRACE_IRQFLAGS
[MIPS] lockdep: Handle interrupts in R3000 style c0_status register.
[MIPS] MV64340: Add missing prototype for mv64340_irq_init().
[MIPS] MT: MIPS_MT_SMTC_INSTANT_REPLAY currently conflicts with PREEMPT.
[MIPS] EV64120: Include <asm/irq.h> to fix warning.
[MIPS] Ocelot: Fix warning.
[MIPS] Ocelot: Give PMON_v1_setup a proper prototype.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Fix arch/ia64/pci/pci.c:571: warning: `return' with a value
[IA64] Speed up boot - skip unnecessary clock calibration
[IA64] bugfix stack layout upside-down
[IA64] Fix possible invalid memory access in ia64_setup_msi_irq()
local_irq_restore -> raw_local_irq_restore -> irq_restore_epilog ->
smtc_ipi_replay -> smtc_ipi_dq -> spin_unlock_irqrestore ->
_spin_unlock_irqrestore -> local_irq_restore
The recursion does abort when there is no more IPI queued for a CPU, so
this isn't usually fatal which is why we got away with this for so long
until this was discovered by code inspection.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>