The ZVC counter update threshold is currently set to a fixed value of 32.
This patch sets up the threshold depending on the number of processors and
the sizes of the zones in the system.
With the current threshold of 32, I was able to observe slight contention
when more than 130-140 processors concurrently updated the counters. The
contention vanished when I either increased the threshold to 64 or used
Andrew's idea of overstepping the interval (see ZVC overstep patch).
However, we saw contention again at 220-230 processors. So we need higher
values for larger systems.
But the current default is already a bit of an overkill for smaller
systems. Some systems have tiny zones where precision matters. For
example i386 and x86_64 have 16M DMA zones and either 900M ZONE_NORMAL or
ZONE_DMA32. These are even present on SMP and NUMA systems.
The patch here sets up a threshold based on the number of processors in the
system and the size of the zone that these counters are used for. The
threshold should grow logarithmically, so we use fls() as an easy
approximation.
Results of tests on a system with 1024 processors (4TB RAM)
The following output is from a test allocating 1GB of memory concurrently
on each processor (Forking the process. So contention on mmap_sem and the
pte locks is not a factor):
X MIN
TYPE: CPUS WALL WALL SYS USER TOTCPU
fork 1 0.552 0.552 0.540 0.012 0.552
fork 4 0.552 0.548 2.164 0.036 2.200
fork 16 0.564 0.548 8.812 0.164 8.976
fork 128 0.580 0.572 72.204 1.208 73.412
fork 256 1.300 0.660 310.400 2.160 312.560
fork 512 3.512 0.696 1526.836 4.816 1531.652
fork 1020 20.024 0.700 17243.176 6.688 17249.863
So a threshold of 32 is fine up to 128 processors. At 256 processors contention
becomes a factor.
Overstepping the counter (earlier patch) improves the numbers a bit:
fork 4 0.552 0.548 2.164 0.040 2.204
fork 16 0.552 0.548 8.640 0.148 8.788
fork 128 0.556 0.548 69.676 0.956 70.632
fork 256 0.876 0.636 212.468 2.108 214.576
fork 512 2.276 0.672 997.324 4.260 1001.584
fork 1020 13.564 0.680 11586.436 6.088 11592.523
Still contention at 512 and 1020. Contention at 1020 is down by a third.
256 still has a slight bit of contention.
After this patch the counter threshold will be set to 125 which reduces
contention significantly:
fork 128 0.560 0.548 69.776 0.932 70.708
fork 256 0.636 0.556 143.460 2.036 145.496
fork 512 0.640 0.548 284.244 4.236 288.480
fork 1020 1.500 0.588 1326.152 8.892 1335.044
[akpm@osdl.org: !SMP build fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Increments and decrements are usually grouped rather than mixed. We can
optimize the inc and dec functions for that case.
Increment and decrement the counters by 50% more than the threshold in
those cases and set the differential accordingly. This decreases the need
to update the atomic counters.
The idea came originally from Andrew Morton. The overstepping alone was
sufficient to address the contention issue found when updating the global
and the per zone counters from 160 processors.
Also remove some code in dec_zone_page_state.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When skipping to the last TD of an URB, go to the _last_ entry in the
list instead of the _first_ entry (as780). This fixes Bugzilla #6747
and possibly others.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is support LD-USB20 of the USB LAN device.
http://www2.elecom.co.jp/products/LD-USB20.html ( Japanese only )
I am using this device.
And, I confirmed work by using this patch.
Signed-off-by: Nobuhiro Iwamatsu <hemamu@t-base.ne.jp>
Acked-by: Petko Manolov <petkan@nucleusys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Patch to add VIA PCI quirk for Enhanced/Extended USB on VT8235
southbridge. It is needed in order to use EHCI/USB 2.0 with ACPI.
Without it IRQs are not routed correctly, you get an "Unlink after
no-IRQ?" error and the device is unusable.
I belive this could also be a fix for Bugzilla Bug 5835.
Signed-off-by: Mark Hindley <mark@hindley.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need to wait until any currently-running handler has completed. Fixes an
unplug-time oops reported by "Miles Lane" <miles.lane@gmail.com>.
Cc: "Petko Manolov" <petkan@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This entry was sent in by Emmanuel Vasilakis <evas@forthnet.gr>, turned
into a patch by yours truly.
Signed-off-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes the Kyocera Finecam L3 entry in unusual devices
originally submitted by Michael Krauth <michael.krauth@web.de> and
Alessandro Fracchetti <al.fracchetti@tin.it> given that Gerriet
<ger.haw@gmx.de> finds he doesn't need it and Alessandro confirms it
isn't needed anymore as well.
Signed-off-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Unlike other sorts of endpoint queues, Isochronous queues don't stop
when an error is encountered. This patch (as772) fixes the scanning
routine in uhci-hcd, to make it keep on going when it finds an Iso
error.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The new spinlock debug code turned up a spinlock recursion bug in the
Ethernet gadget driver on a disconnect path; it would show up with any
UDC driver where the cancellation of active requests was synchronous,
rather than e.g. delayed until a controller's completion IRQ.
That recursion is fixed here by creating and using a new spinlock to
protect the relevant lists.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Adds all GTCO CalComp Digitizers and InterWrite School Products to
hid-core.c blacklist.
Signed-off-by: Jeremy A. Roberson <jroberson@gtcocalcomp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It is supposed to be OK to call mthca_create_ah() and mthca_destroy_ah()
from any context. However, for mem-full HCAs, these functions use the
mthca_alloc() and mthca_free() bitmap helpers, and those helpers use
non-IRQ-safe spin_lock() internally. Lockdep correctly warns that
this could lead to a deadlock. Fix this by changing mthca_alloc() and
mthca_free() to use spin_lock_irqsave().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When I tested Linux kernel 2.6.17.7 about statistics
"ipFragFails",found that this counter couldn't increase correctly. The
criteria is RFC2011:
RFC2011
ipFragFails OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION
"The number of IP datagrams that have been discarded because
they needed to be fragmented at this entity but could not
be, e.g., because their Don't Fragment flag was set."
::= { ip 18 }
When I send big IP packet to a router with DF bit set to 1 which need to
be fragmented, and router just sends an ICMP error message
ICMP_FRAG_NEEDED but no increments for this counter(in the function
ip_fragment).
Signed-off-by: Wei Dong <weid@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the warning messages when socket allocation failures
happen. It happens under memory pressure.
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bug noticed by Remi Denis-Courmont <rdenis@simphalempin.com>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 8c74932779 ("i386: Remove
alternative_smp") did not actually compile on x86 with CONFIG_SMP.
This fixes the __build_read/write_lock helpers. I've boot tested on
SMP.
[ Andi: "Oops, I think that was a quilt unrefreshed patch. Sorry. I
fixed those before testing, but then still send out the old patch." ]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
Cleanup for include/asm-arma/arch-s3c2410/dma.h,
by using tab characters to indent items, remove the
now un-necessary changelog, and update the copyright
information.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The type naming in the s3c24xx dma code is riddled with
typedefs creating _t types, from the code import from 2.4
which is contrary to the current Kernel coding style.
This patch cleans this up, removing the typedefs and
and fixing up the resultant code changes.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from David Brownell
This adds RTC support to the csb337 default config. Both the AT91
and the ds1307 RTCs are enabled (rtc0 and rtc1 respectively).
The ds1307 is used to initialize the system time, since it's battery-backed.
From then on the AT91 RTC is used, since it's more capable (with both
alarm and update irqs, and system wakeup capability) even though it
needs manual initialization (symlink /dev/rtc to /dev/rtc0 for older
versions of hwclock, then "hwclock --systohc") in an rc script or
from inittab.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Matthew Wilcox pointed out that the generic implementation
of this is unfit for use. Here's an ARM optimised version
instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix return value from memcpy
[POWERPC] iseries: Define insw et al. so libata/ide will compile
[POWERPC] Fix irq enable/disable in smp_generic_take_timebase
[POWERPC] Fix problem with time not advancing on 32-bit platforms
[POWERPC] Restore copyright notice in arch/powerpc/kernel/fpu.S
[POWERPC] Fix up ibm_architecture_vec definition
[POWERPC] Make OF irq map code detect more error cases
[POWERPC] Support for "weird" MPICs and fixup mpc7448_hpc2
[POWERPC] Fix MPIC sense codes in documentation
[POWERPC] Fix performance regression in IRQ radix tree locking
[POWERPC] Add mpc7448hpc2 device tree source file
[POWERPC] Add MPC8349E MDS device tree source file to arch/powerpc/boot/dts
[POWERPC] modify mpc83xx platforms to use new IRQ layer
[POWERPC] Adapt ipic driver to new host_ops interface, add set_irq_type to set IRQ sense
[POWERPC] back up old school ipic.[hc] to arch/ppc
[POWERPC] Use mpc8641hpcn PIC base address from dev tree.
[POWERPC] Allow MPC8641 HPCN to build with CONFIG_PCI disabled too.
[POWERPC] Fix powerpc 44x_mmu build
[POWERPC] Remove flush_dcache_all export
This fixes a hang on ppc32.
The problem was that I was comparing a 32-bit quantity with a 64-bit
quantity, and consequently time wasn't advancing. This makes us use a
64-bit quantity on all platforms, which ends up simplifying the code
since we can now get rid of the tb_last_stamp variable (which actually
fixes another bug that Ben H and I noticed while going carefully through
the code).
This works fine on my G4 tibook. Let me know how it goes on your
machines.
Acked-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The backlight changes that went in had a bug where they could cause the
kernel to access an unitialized pointer when blanking if there is no
backlight control on a machine.
The bug affects atyfb, aty128fb, nvidiafb and rivafb. radeonfb seems to
be ok. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As pointed out by Herbert Xu <herbert@gondor.apana.org.au>, our
memcpy implementation didn't return the destination pointer as its
return value, and there is code in the kernel that expects that.
This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Increase default nodes shift to 10, nr_cpus to 1024
[IA64] remove redundant local_irq_save() calls from sn_sal.h
[IA64] panic if topology_init kzalloc fails
[IA64-SGI] Silent data corruption caused by XPC V2.
The radeon requires a VAP state flush when enabling/disabling
vertex programs on the r200 cards.
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following change from -mm is important to 2.6.18 (actually to 2.6.17
but its too late for that). This was contributed over three months ago
by VIA to Bartlomiej and nothing happened. As a result the new chipset
is now out and Linux won't run on it. By the time 2.6.18 is finalised
this will be the defacto standard VIA chipset so support would be a good
plan.
Tested in -mm for a while, its essentially a PCI ident update but for
the bridge chip because VIA do things in weird ways.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's possible to get an invalid page fault in kernel mode when we try to
write out segments from vsyscall32 when dumping core for a 32bit process if
the vsyscall32 DSO is not mapped in its address space (which can happen if,
for example, ulimit -v 100 is run).
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The values in init_tss.ist[] can change when an IST event occurs. Save
the original IST values for checking stack addresses when debugging or
doing stack traces.
Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was a bogus hunk from the genirq merge that essentially
broke stack switching for hard interrupts. Remove it since it isn't
needed.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
After all their only point is having them in user space. On x86-64
they don't even work in kernel space.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As a replacement for the earlier removal of the e820 MCFG check
we blacklist the Intel SDV with the original BIOS bug that
motivated that check. On those machines don't use MMCONFIG.
This also adds a new pci=mmconf parameter to override the blacklist.
Cc: Greg KH <gregkh@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The .fill causes miscompilations with some binutils version.
Instead just patch the lock prefix in the lock constructs. That is the
majority of the cost and should be good enough.
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The .fill causes miscompilations with some binutils version.
Instead just patch the lock prefix in the lock constructs. That is the
majority of the cost and should be good enough.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Jan Beulich.
When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.
This patch fixes this by reserving only starting with _text.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.
Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.
Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
By hard-coding the cpuid keys for alternative_smp() rather than using
the symbolic constant it turned out that incorrect values were used on
both i386 (0x68 instead of 0x69) and x86-64 (0x66 instead of 0x68).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One open question: Should this added push perhaps be made conditional
upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: not needed, these are all very slow paths]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.
After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.
Edgar Hucek did all the debugging work.
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Daniel Jacobowitz
vfp_put_double didn't work in a CONFIG_AEABI kernel. By swapping
the arguments, we arrange for them to be in the same place regardless
of ABI. I made the same change to vfp_put_float for consistency.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Calls to set a device online with path grouping may get stuck in
some cases because certain device conditions where discarded after
unsolicited interrupts.
Check subchannel activity after unsolicited interrupts and retry
the operation if the subchannel is idle.
Signed-off-by: Stefan Bader <shbader@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>