Srivatsa Vaddagiri noticed occasionally incorrect CPU usage
values in top and tracked it down to stime going below 0 in
task_stime(). Negative values are possible there due to the
sampled nature of stime/utime.
Fix suggested by Balbir Singh.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <balbir@linux.vnet.ibm.com>
The commit
commit 5cb350baf5
Author: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Date: Mon Oct 15 17:00:14 2007 +0200
sched: group scheduling, sysfs tunables
introduced the uids_mutex and the helpers to lock/unlock it.
Unfortunately, the error paths of alloc_uid() were not patched
to unlock it.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
warmbloodedcreature@gmail.com reported that an APIC-enabled
Asus a7v8x-x with an Athlon XP reboots early in the bootup:
http://bugzilla.kernel.org/show_bug.cgi?id=8723
after a long marathon of spontaneous-reboot debugging, it turns
out to be caused by sync_Arb_ids(). AMD CPUs never really needed
this sequence anyway, so just return early if we meet an AMD CPU.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Michael Kerrisk reported that a long standing bug in the adjtimex()
system call causes glibc's adjtime(3) function to deliver the wrong
results if 'delta' is NULL.
add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc
to fix this API compatibility bug.
Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761
[ mingo@elte.hu: added patch description and made it backwards compatible ]
NOTE: the new flag is defined 0xa001 so that it returns -EINVAL on
older kernels - this way glibc can use it safely. Suggested by Ulrich
Drepper.
Acked-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The latest KVM driver wants to use the empty_zero_page symbol, and it's
not exported in 32-bit x86 (although it is exported by x86_64, s390, and
uml architectures).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.com
Cc: kvm-devel@lists.sourceforge.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/kprobes_64.c: In function 'set_current_kprobe':
arch/x86/kernel/kprobes_64.c:152: sorry, unimplemented: inlining failed in call to 'is_IF_modifier': recursive inlining
arch/x86/kernel/kprobes_64.c:166: sorry, unimplemented: called from here
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: mingo@elte.hu
Cc: akpm@linux-foundation.org
Cc: tglx@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
HP ProLiant systems DL385 G2 and DL585 G2 need pci=bfsort to enumerate PCI
devices in the expected order.
Matt sayeth:
biosdevname is a userspace app I wrote to help solve this so we don't need
to patch the kernel for future systems. It's not integrated into any
distributions properly yet, but is included in openSUSE 10.3 and Fedora 8
for people who want to download and install it there. It acts as a udev
helper.
For the time being, patching the kernel is necessary. I really hope
biosdevname eliminates that need in future distributions.
http://linux.dell.com/biosdevname/
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Cc: mingo@elte.hu
Cc: andy@greyhouse.net
Cc: john.cagle@hp.com
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86: correctly set UTS_MACHINE for "make ARCH=x86"
For a kernel built with "make ARCH=x86" the following system
information is displayed when running the new kernel
$ uname -m
x86
On some i386 systems (e.g. K7) we even have the following information
$ uname -m
x66
This is weird. The usual information for "uname -m" should be "x86_64"
on 64-bit and "i386" or "i686" on 32-bit.
This patch fixes the issue by setting UTS_MACHINE to "i386" for 32-bit
kernel builds and to "x86_64" for 64-bit kernel builds. I.e., "x86"
won't be used for UTS_MACHINE anymore.
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andreas Herrmann <aherrman@arcor.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@redhat.com>
ACPI processor idle code references local_apic_timer_c2_ok, which
is not available when LOCAL_APIC is disabled.
Define local_apic_timer_c2_ok as a constant, when LOCAL_APIC=n
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
today, all oopses contain a version number of the kernel, which is nice
because the people who actually do bother to read the oops get this
vital bit of information always without having to ask the reporter in
another round trip.
However, WARN_ON() and many other dump_stack() users right now lack this
information; the patch below adds this. This information is essential
for getting people to use their time effectively when looking at these
things; in addition, it's essential for tools that try to collect
statistics about defects.
Please consider, since its so simple and important for long term kernel
quality processes.
The code is identical between 32/64 bit; a lot of this code should be
unified over time, the patch keeps the identical-ness intact.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
AMD Opteron processors before CG revision don't like C-states > 1.
This solves the long standing bugzilla #5303 and probably some more
on affected machines:
http://bugzilla.kernel.org/show_bug.cgi?id=5303
[ tglx@linutronix.de: reworked the patch so it does not wreck ia64 ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
More than 3 years ago Niclas Gustafsson reported a 'stopped time'
problem:
> Watching the /proc/interrupts with 10s apart after the "stop".
>
> [root@s151 root]# more /proc/interrupts
> CPU0
> 0: 66413955 local-APIC-edge timer
[...]
> LOC: 67355837
> ERR: 0
> MIS: 0
> [root@s151 root]# more /proc/interrupts
> CPU0
> 0: 66413955 local-APIC-edge timer
[...]
> LOC: 67379568
> ERR: 0
> MIS: 0
This may be because buggy SMM firmware messes with the 8259A (configured
for a transparent mode -- yes that rare "local-APIC-edge" mode is tricky
;-) ) insanely.
this should resolve:
http://bugzilla.kernel.org/show_bug.cgi?id=2544http://bugzilla.kernel.org/show_bug.cgi?id=6296
Patch-dusted-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sibyte SOCs only have 32-bit PCI. Due to the sparse use of the address
space only the first 1GB of memory is mapped at physical addresses
below 1GB. If a system has more than 1GB of memory 32-bit DMA will
not be able to reach all of it.
For now this patch is good enough to keep Sibyte users happy but it seems
eventually something like swiotlb will be needed for Sibyte.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In particular as-is it's not suited for multicore and mutiprocessors
systems where there is on guarantee that the counter are synchronized
or running from the same clock at all. This broke Sibyte and probably
others since the "[MIPS] Handle R4000/R4400 mfc0 from count register."
commit.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The R4000 and R4400 have an errata where if the cp0 count register is read
in the exact moment when it matches the compare register no interrupt will
be generated.
This bug may be triggered if the cp0 count register is being used as
clocksource and the compare interrupt as clockevent. So a simple
workaround is to avoid using the compare for both facilities on the
affected CPUs.
This is different from the workaround suggested in the old errata documents;
at some opportunity probably the official version should be implemented
and tested. Another thing to find out is which processor versions
exactly are affected. I only have errata documents upto R4400 V3.0
available so for the moment the code treats all R4000 and R4400 as broken.
This is potencially a problem for some machines that have no other decent
clocksource available; this workaround will cause them to fall back to
another clocksource, worst case the "jiffies" source.
The LL / SC loops in __futex_atomic_op() have the usual fixups necessary
for memory acccesses to userspace from kernel space installed:
__asm__ __volatile__(
" .set push \n"
" .set noat \n"
" .set mips3 \n"
"1: ll %1, %4 # __futex_atomic_op \n"
" .set mips0 \n"
" " insn " \n"
" .set mips3 \n"
"2: sc $1, %2 \n"
" beqz $1, 1b \n"
__WEAK_LLSC_MB
"3: \n"
" .set pop \n"
" .set mips0 \n"
" .section .fixup,\"ax\" \n"
"4: li %0, %6 \n"
" j 2b \n" <-----
" .previous \n"
" .section __ex_table,\"a\" \n"
" "__UA_ADDR "\t1b, 4b \n"
" "__UA_ADDR "\t2b, 4b \n"
" .previous \n"
: "=r" (ret), "=&r" (oldval), "=R" (*uaddr)
: "0" (0), "R" (*uaddr), "Jr" (oparg), "i" (-EFAULT)
: "memory");
The branch at the end of the fixup code, it goes back to the SC
instruction, no matter if the fault was first taken by the LL or SC
instruction resulting in an endless loop which will only terminate if
the address become valid again due to another thread setting up an
accessible mapping and the CPU happens to execute the SC instruction
successfully which due to the preceeding ERET instruction of the fault
handler would only happen if UNPREDICTABLE instruction behaviour of the
SC instruction without a preceeding LL happens to favor that outcome.
But normally processes are nice, pass valid arguments and we were just
getting away with this.
Thanks to Kaz Kylheku <kaz@zeugmasystems.com> for providing the original
report and a test case.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A new born thread starts execution not in schedule but rather in
ret_from_fork which results in it bypassing the part of the code to
load a new context written in C which are the DSP context and the
userlocal register which Linux uses for the TLS pointer. Frequently
we were just getting away with this bug for a number of reasons:
o Real world application scenarios are very unlikely to use clone or fork
in blocks of DSP code.
o Linux by default runs the child process right after the fork, so the
child by luck will find all the right context in the DSP and userlocal
registers.
o So far the rdhwr instruction was emulated on all hardware so userlocal
wasn't getting referenced at all and the emulation wasn't suffering
from the issue since it gets it's value straight from the thread's
thread_info.
Fixed by moving the code to load the context from switch_to() to
finish_arch_switch which will be called by newborn and old threads.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
None of the drives I have follows what the standard says about
transfer chunk size. Of the four SATA and six PATA ATAPI devices
tested, four ignore transfer chunk size completely and the ones which
honor it don't behave according to the spec when it's odd.
According to the spec, transfer chunk size can be odd if the amount of
data to transfer equals or is smaller than the chunk size and the
device can indicate the same odd number and transfer the whole thing
at one go with a pad byte appended. However, in reality, none of the
drives I have does that. They all indicate and transfer even number
of bytes one byte shorter than the chunk size first; then indicate and
transfer two bytes, which is clearly out of spec.
In addition to unnecessary second PIO data phase, this also creates a
weird problem when combined with SATA controllers which perform PIO
via DMA. Some of these controllers use actualy number of bytes
received to update DMA pointer so chunks which are sized 4n + 2 makes
DMA pointer off by two bytes. This causes data corruption and buffer
overruns.
This patch rounds nbytes up to the nearest even number such that ATAPI
devices don't split data transfer for the last odd byte. This
shouldn't confuse controllers which depend on transfer chunk size as
devices will report the rounded-up number, actually transfer that much
and padding buffer is there to receive them.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The #ifdef's in arp_process() were not only a mess, they were also wrong
in the CONFIG_NET_ETHERNET=n and (CONFIG_NETDEV_1000=y or
CONFIG_NETDEV_10000=y) cases.
Since they are not required this patch removes them.
Also removed are some #ifdef's around #include's that caused compile
errors after this change.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The skb_morph function only freed the data part of the dst skb, but leaked
the auxiliary data such as the netfilter fields. This patch fixes this by
moving the relevant parts from __kfree_skb to skb_release_all and calling
it in skb_morph.
It also makes kfree_skbmem static since it's no longer called anywhere else
and it now no longer does skb_release_data.
Thanks to Yasuyuki KOZAKAI for finding this problem and posting a patch for
it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The inet_ehash_locks_alloc() looks like this:
#ifdef CONFIG_NUMA
if (size > PAGE_SIZE)
x = vmalloc(...);
else
#endif
x = kmalloc(...);
Unlike it, the inet_ehash_locks_alloc() looks like this:
#ifdef CONFIG_NUMA
if (size > PAGE_SIZE)
vfree(x);
else
#else
kfree(x);
#endif
The error is obvious - if the NUMA is on and the size
is less than the PAGE_SIZE we leak the pointer (kfree is
inside the #else branch).
Compiler doesn't warn us because after the kfree(x) there's
a "x = NULL" assignment, so here's another (minor?) bug: we
don't set x to NULL under certain circumstances.
Boring explanation, I know... Patch explains it better.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The change 050f009e16
[IPSEC]: Lock state when copying non-atomic fields to user-space
caused a regression.
Ingo Molnar reports that it causes a potential dead-lock found by the
lock validator as it tries to take x->lock within xfrm_state_lock while
numerous other sites take the locks in opposite order.
For 2.6.24, the best fix is to simply remove the added locks as that puts
us back in the same state as we've been in for years. For later kernels
a proper fix would be to reverse the locking order for every xfrm state
user such that if x->lock is taken together with xfrm_state_lock then
it is to be taken within it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
If a card has no IRQ then pass no interrupt handler but allow polled
usage.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hopefully there is a better long term solution but for now lets favour
reliability.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sil24 unnecessarily used LIBATA_MAX_PRD and ATAPI sg table was short
by one entry which might cause very obscure problems. This patch
updates sg table sizing such that
* One full page is used for PRB + sg table. On 4k page,
this results in 253 sg's.
* Make ATAPI sg block properly sized.
* Make build fail if command block size doesn't equal PAGE_SIZE.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There are two bugs in disabled port handling.
* test in PORT_PATA0 is reversed
* ->prereset should return -ENOENT for disabled ports not 0
The first bug makes the PATA channel considered disabled but the
second bug saves the day by returning 0. The net result is that cable
is always left at ATA_CBL_UNKNOWN. This results in false 80c
configuration and thus transfer errors.
This patch fixes both bugs.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Since writing to two reserved bits ain't much of a housekeeping, I think it's
time we get rid of the custom error handler in this driver. ;-)
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
As it is crypto_remove_spawn may try to unregister an instance which is
yet to be registered. This patch fixes this by checking whether the
instance has been registered before attempting to remove it.
It also removes a bogus cra_destroy check in crypto_register_instance as
1) it's outside the mutex;
2) we have a check in __crypto_register_alg already.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that newer versions of gcc have regressed in their abilities to
analyse initialisations. This patch moves the initialisations up to avoid
the warnings.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The original code has striking complexity to perform a query
which can be reduced to a very simple compare.
FIN seqno may be included to write_seq but it should not make
any significant difference here compared to skb->len which was
used previously. One won't end up there with SYN still queued.
Use of write_seq check guarantees that there's a valid skb in
send_head so I removed the extra check.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: John Heffner <jheffner@psc.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that the checked range for receiver window check should
begin from the first rather than from the last skb that is going
to be included to the probe. And that can be achieved without
reference to skbs at all, snd_nxt and write_seq provides the
correct seqno already. Plus, it SHOULD account packets that are
necessary to trigger fast retransmit [RFC4821].
Location of snd_wnd < probe_size/size_needed check is bogus
because it will cause the other if() match as well (due to
snd_nxt >= snd_una invariant).
Removed dead obvious comment.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the ax88796 device driver to the r7785rp defconfig.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>