quicklists must keep even off node pages on the quicklists until the TLB
flush has been completed.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Increase the mininum number of partial slabs to keep around and put
partial slabs to the end of the partial queue so that they can add
more objects.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ Regression added by changeset:
cd40b7d398
[NET]: make netlink user -> kernel interface synchronious
-DaveM ]
nl_fib_input re-reuses incoming skb to send the reply. This means that this
packet will be freed twice, namely in:
- netlink_unicast_kernel
- on receive path
Use clone to send as a cure, the caller is responsible for kfree_skb on error.
Thanks to Alexey Dobryan, who originally found the problem.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
x86: intel_cacheinfo.c: cpu cache info entry for Intel Tolapai
x86: fix die() to not be preemptible
After reading the directory contents into the temporary buffer, we grab
each dirent and pass it to filldir witht eh current offset of the dirent.
The current offset was not being set for the first dirent in the temporary
buffer, which coul dresult in bad offsets being set in the f_pos field
result in looping and duplicate entries being returned from readdir.
SGI-PV: 974905
SGI-Modid: xfs-linux-melb:xfs-kern:30282a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
This was broken by my '[XFS] simplify xfs_create/mknod/symlink prototype',
which assigned the re-shuffled ondisk dev_t back to the rdev variable in
xfs_vn_mknod. Because of that i_rdev is set to the ondisk dev_t instead of
the linux dev_t later down the function.
Fortunately the fix for it is trivial: we can just remove the assignment
because xfs_revalidate_inode has done the proper job before unlocking the
inode.
SGI-PV: 974873
SGI-Modid: xfs-linux-melb:xfs-kern:30273a
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
This patch adds a cpu cache info entry for the Intel Tolapai cpu.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andrew "Eagle Eye" Morton noticed that we use raw_local_save_flags()
instead of raw_local_irq_save(flags) in die(). This allows the
preemption of oopsing contexts - which is highly undesirable. It also
causes CONFIG_DEBUG_PREEMPT to complain, as reported by Miles Lane.
this bug was introduced via:
commit 39743c9ef7
Author: Andi Kleen <ak@suse.de>
Date: Fri Oct 19 20:35:03 2007 +0200
x86: use raw locks during oopses
- spin_lock_irqsave(&die.lock, flags);
+ __raw_spin_lock(&die.lock);
+ raw_local_save_flags(flags);
that is not a correct open-coding of spin_lock_irqsave(): both the
ordering is wrong (irqs should be disabled _first_), and the wrong
flags-saving API was used.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When used function put_cmsg() to copy kernel information to user
application memory, if the memory length given by user application is
not enough, by the bad length calculate of msg.msg_controllen,
put_cmsg() function may cause the msg.msg_controllen to be a large
value, such as 0xFFFFFFF0, so the following put_cmsg() can also write
data to usr application memory even usr has no valid memory to store
this. This may cause usr application memory overflow.
int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data)
{
struct cmsghdr __user *cm
= (__force struct cmsghdr __user *)msg->msg_control;
struct cmsghdr cmhdr;
int cmlen = CMSG_LEN(len);
~~~~~~~~~~~~~~~~~~~~~
int err;
if (MSG_CMSG_COMPAT & msg->msg_flags)
return put_cmsg_compat(msg, level, type, len, data);
if (cm==NULL || msg->msg_controllen < sizeof(*cm)) {
msg->msg_flags |= MSG_CTRUNC;
return 0; /* XXX: return error? check spec. */
}
if (msg->msg_controllen < cmlen) {
~~~~~~~~~~~~~~~~~~~~~~~~
msg->msg_flags |= MSG_CTRUNC;
cmlen = msg->msg_controllen;
}
cmhdr.cmsg_level = level;
cmhdr.cmsg_type = type;
cmhdr.cmsg_len = cmlen;
err = -EFAULT;
if (copy_to_user(cm, &cmhdr, sizeof cmhdr))
goto out;
if (copy_to_user(CMSG_DATA(cm), data, cmlen - sizeof(struct cmsghdr)))
goto out;
cmlen = CMSG_SPACE(len);
~~~~~~~~~~~~~~~~~~~~~~~~~~~
If MSG_CTRUNC flags is set, msg->msg_controllen is less than
CMSG_SPACE(len), "msg->msg_controllen -= cmlen" will cause unsinged int
type msg->msg_controllen to be a large value.
~~~~~~~~~~~~~~~~~~~~~~~~~~~
msg->msg_control += cmlen;
msg->msg_controllen -= cmlen;
~~~~~~~~~~~~~~~~~~~~~
err = 0;
out:
return err;
}
The same promble exists in put_cmsg_compat(). This patch can fix this
problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix possible max_phys_segments violation in cloned dm-crypt bio.
In write operation dm-crypt needs to allocate new bio request
and run crypto operation on this clone. Cloned request has always
the same size, but number of physical segments can be increased
and violate max_phys_segments restriction.
This can lead to data corruption and serious hardware malfunction.
This was observed when using XFS over dm-crypt and at least
two HBA controller drivers (arcmsr, cciss) recently.
Fix it by using bio_add_page() call (which tests for other
restrictions too) instead of constructing own biovec.
All versions of dm-crypt are affected by this bug.
Cc: stable@kernel.org
Cc: dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Make sure dm honours max_hw_sectors of underlying devices
We still have no firm testing evidence in support of this patch but
believe it may help to resolve some bug reports. - agk
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Insert a missing KOBJ_CHANGE notification when a device is renamed.
Cc: Scott James Remnant <scott@ubuntu.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Fix BIO_UPTODATE test for write io.
Cc: stable@kernel.org
Cc: dm-crypt@saout.de
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
With CONFIG_SCSI=n __scsi_print_sense() is never linked in.
drivers/built-in.o: In function `hp_sw_end_io':
dm-mpath-hp-sw.c:(.text+0x914f8): undefined reference to `__scsi_print_sense'
Caught with a randconfig on current git.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch fixes a panic on shrinking a DM device if there is
outstanding I/O to the part of the device that is being removed.
(Normally this doesn't happen - a filesystem would be resized first,
for example.)
The bug is that __clone_and_map() assumes dm_table_find_target()
always returns a valid pointer. It may fail if a bio arrives from the
block layer but its target sector is no longer included in the DM
btree.
This patch appends an empty entry to table->targets[] which will
be returned by a lookup beyond the end of the device.
After calling dm_table_find_target(), __clone_and_map() and target_message()
check for this condition using
dm_target_is_valid().
Sample test script to trigger oops:
The problem was introduced by commit "mm: variable length argument
support" (b6a2fea393)
as it didn't update fs/binfmt_aout.c like other binfmt's.
I noticed that on alpha when accidentally launched old OSF/1
Acrobat Reader binary. Obviously, other architectures are affected
as well.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ollie Wild <aaw@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Right now it's nearly impossible for parsers that collect kernel crashes
from logs or emails (such as www.kerneloops.org) to detect the
end-of-oops condition. In addition, it's not currently possible to
detect whether or not 2 oopses that look alike are actually the same
oops reported twice, or are truly two unique oopses.
This patch adds an end-of-oops marker, and makes the end marker include
a very simple 64-bit random ID to be able to detect duplicate reports.
Normally, this ID is calculated as a late_initcall() (in the hope that
at that time there is enough entropy to get a unique enough ID); however
for early oopses the oops_exit() function needs to generate the ID on
the fly.
We do this all at the _end_ of an oops printout, so this does not impact
our ability to get the most important portions of a crash out to the
console first.
[ Sidenote: the already existing oopses-since-bootup counter we print
during crashes serves as the differentiator between multiple oopses
that trigger during the same bootup. ]
Tested on 32-bit and 64-bit x86. Artificially injected very early
crashes as well, as expected they result in this constant ID after
multiple bootups:
---[ end trace ca143223eefdc828 ]---
---[ end trace ca143223eefdc828 ]---
because the random pools are still all zero. But it all still works
fine and causes no additional problems (which is the main goal of
instrumentation code).
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Realtime tasks would not account their runtime during ticks. Which would lead
to:
struct sched_param param = { .sched_priority = 10 };
pthread_setschedparam(pthread_self(), SCHED_FIFO, ¶m);
while (1) ;
Not showing up in top.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
I included these operations vector cases for situations
where we never need to do anything, the entries aren't
filled in by any implementation, so we OOPS trying to
invoke NULL pointer functions.
Really make them NOPs, to fix the bug.
Signed-off-by: David S. Miller <davem@davemloft.net>
This operation helper abstracts:
skb->mac_header = skb->data;
but it was done in two more places which were actually:
skb->mac_header = skb->network_header;
and those are corrected here.
Signed-off-by: David S. Miller <davem@davemloft.net>
mac_header update in ipgre_recv() was incorrectly changed to
skb_reset_mac_header() when it was introduced.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
In several places the arguments to the xfrm_audit_start() function are
in the wrong order resulting in incorrect user information being
reported. This patch corrects this by pacing the arguments in the
correct order.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The aalgos/ealgos fields are only 32 bits wide. However, af_key tries
to test them with the expression 1 << id where id can be as large as
253. This produces different behaviour on different architectures.
The following patch explicitly checks whether ID is greater than 31
and fails the check if that's the case.
We cannot easily extend the mask to be longer than 32 bits due to
exposure to user-space. Besides, this whole interface is obsolete
anyway in favour of the xfrm_user interface which doesn't use this
bit mask in templates (well not within the kernel anyway).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In arp_process() (net/ipv4/arp.c), there is unused code: definition
and assignment of tha (target hw address ).
Signed-off-by: Mark Ryden <markryde@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3_nvram_write_block_unbuffered() is reading data from nvram into
allocated buffer before overwriting a part of it with user-supplied
data. Then it feeds the entire page back to nvram. It should be
storing the words it had read as little-endian, not as host-endian.
Note that tg3_set_eeprom() does exactly that for padding the same
data to full words before it gets passed down to tg3_nvram_write_block()
and then to tg3_nvram_write_block_unbuffered().
Moreover, when we get to sending the entire thing back to nvram, we
go through it word-by-word, doing essentially
writel(swab32(le32_to_cpu(word)), ...)
so if we want them to reach the card in host-independent endianness,
we'd better really have all that buffer filled with fixed-endian.
For user-supplied part we obviously do have that (it's an array of
octets memcpy'd in), ditto for padding of user-supplied part to word
boundaries (taken care of in tg3_set_eeprom()). The rest of the
buffer gets filled by tg3_nvram_write_block_unbuffered() and it would
damn better be consistent with that (and with tg3_get_eeprom(), while
we are at it - there we also convert the words read from nvram to
little-endian before returning the buffer to user).
The bug should get triggered on big-endian boxen when set_eeprom is done
for less than entire page. Then the words that should've been unaffected
at all will actually get byteswapped in place in nvram.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed misannotations, introduced a new helper - tg3_nvram_read_le().
It gets __le32 * instead of u32 * and puts there the value converted
to little-endian. A lot of callers of tg3_nvram_read() were doing
that; converted them to tg3_nvram_read_le().
At that point the driver is practically endian-clean; the only remaining
place is an actual bug, AFAICS; will be dealt with in the next patch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix inappropriate memory freeing in case of requested rate_control_ops was
not found. In this case the list head entity is going to be accidentally
wasted.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When using recvfrom() on a SOCK_DGRAM packet socket, I noticed that the MAC
address passed back for wireless frames was always completely wrong. The
reason for this is that the header parse function assigned to our virtual
interfaces is a function parsing an 802.11 rather than 802.3 header. This
patch fixes it by keeping the default ethernet header operations assigned.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is no point in staying in IEEE80211_ASSOCIATED if there is no
sta_info entry to receive frames with.
Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Adjust CMCI mask on CPU hotplug
[IA64] make flush_tlb_kernel_range() an inline function
[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
[IA64] Fix Altix BTE error return status
[IA64] Remove assembler warnings on head.S
[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
[IA64] set_thread_area fails in IA32 chroot
[IA64] print kernel release in OOPS to make kerneloops.org happy
[IA64] Two trivial spelling fixes
[IA64] Avoid unnecessary TLB flushes when allocating memory
[IA64] ia32 nopage
[IA64] signal: remove redundant code in setup_sigcontext()
IA64: Slim down __clear_bit_unlock