DEBUG_PAGEALLOC is not compatible with hugetlb page support. That debug
option turns off PSE. Once it is turned off in CR4, the cpu will ignore
pse bit in the pmd and causing infinite page-not- present faults.
So disable DEBUG_PAGEALLOC if the user selected hugetlbfs.
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make CONFIG_HOTPLUG_CPU depend on !X86_PC, so we need to turn on either
CONFIG_GENERICARCH, CONFIG_BIGSMP or any other subarch except X86_PC when
CONFIG_HOTPLUG_CPU=y
With 2.6.15+ kernels when CONFIG_HOTPLUG_CPU is turned on we switch to
bigsmp mode for sending IPI's and ioapic configurations that caused the
following error message.
>> More than 8 CPUs detected and CONFIG_X86_PC cannot handle it.
>> Use CONFIG_X86_GENERICARCH or CONFIG_X86_BIGSMP.
Originally bigsmp was added just to handle >8 cpus, but now with hotplug
cpu support we need to use bigsmp mode (why? see below), that cause the
above error message even if there were less than 8 cpus in the system.
The message is bogus, but we are cannot use logical flat mode due to issues
with broadcast IPI can confuse a CPU just comming up. We use flat physical
mode just like x86_64 case. More details on why bigsmp now uses flat
physical mode (vs. cluster mode) in following link.
http://marc.theaimsgroup.com/?l=linux-kernel&m=113261865814107&w=2
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In zone_pcp_init we print out all zones even if they are empty:
On node 0 totalpages: 245760
DMA zone: 245760 pages, LIFO batch:31
DMA32 zone: 0 pages, LIFO batch:0
Normal zone: 0 pages, LIFO batch:0
HighMem zone: 0 pages, LIFO batch:0
To conserve dmesg space why not print only the non zero zones.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The page migration code could function without NUMA but we currently have
no users for the non-NUMA case.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We have had this memory leak for a while now. The situation is complicated
by the use of alloc_kmemlist() as a function to resize various caches by
do_tune_cpucache().
What we do here is first of all make sure that we deallocate properly in
the loop over all the nodes.
If we are just resizing caches then we can simply return with -ENOMEM if an
allocation fails.
If the cache is new then we need to rollback and remove all earlier
allocations.
We detect that a cache is new by checking if the link to the global cache
chain has been setup. This is a bit hackish ....
(also fix up too overlong lines that I added in the last patch...)
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Inspired by Jesper Juhl's patch from today
1. Get rid of err
We do not set it to anything else but zero.
2. Drop the CONFIG_NUMA stuff.
There are definitions for alloc_alien_cache and free_alien_cache()
that do the right thing for the non NUMA case.
3. Better naming of variables.
4. Remove redundant cachep->nodelists[node] expressions.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
__drain_alien_cache() currently drains objects by freeing them to the
(remote) freelists of the original node. However, each node also has a
shared list containing objects to be used on any processor of that node.
We can avoid a number of remote node accesses by copying the pointers to
the free objects directly into the remote shared array.
And while we are at it: Skip alien draining if the alien cache spinlock is
already taken.
Kiran reported that this is a performance benefit.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
slabr_objects() can be used to transfer objects between various object
caches of the slab allocator. It is currently only used during
__cache_alloc() to retrieve elements from the shared array. We will be
using it soon to transfer elements from the alien caches to the remote
shared array.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert mm/ to use the new kmem_cache_zalloc allocator.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As suggested by Eric Dumazet, optimize kzalloc() calls that pass a
compile-time constant size. Please note that the patch increases kernel
text slightly (~200 bytes for defconfig on x86).
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce a memory-zeroing variant of kmem_cache_alloc. The allocator
already exits in XFS and there are potential users for it so this patch
makes the allocator available for the general public.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a driver for the on-chip watchdog on the cirrus ep93xx series of ARM
CPUs.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The attached patch optimises d_find_alias() to only take the spinlock if
there's anything in the the inode's alias list. If there isn't, it returns
NULL immediately.
With respect to the superblock sharing patch, this should reduce by one the
number of times the dcache_lock is taken by nfs_lookup() for ordinary
directory lookups.
Only in the case where there's already a dentry for particular directory inode
(such as might happen when another mountpoint is rooted at that dentry) will
the lock then be taken the extra time.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
According to the specification the timevals must be validated and an
errorcode -EINVAL returned in case the timevals are not in canonical form.
This check was never done in Linux.
The pre 2.6.16 code converted invalid timevals silently. Negative timeouts
were converted by the timeval_to_jiffies conversion to the maximum timeout.
hrtimers and the ktime_t operations expect timevals in canonical form.
Otherwise random results might happen on 32 bits machines due to the
optimized ktime_add/sub operations. Negative timeouts are treated as
already expired. This might break applications which work on pre 2.6.16.
To prevent random behaviour and API breakage the timevals are checked and
invalid timevals sanitized in a simliar way as the pre 2.6.16 code did.
Invalid timevals are reported with a per boot limited number of kernel
messages so applications which use this misfeature can be corrected.
After a grace period of one year the sanitizing should be replaced by a
correct validation check. This is also documented in
Documentation/feature-removal-schedule.txt
The validation and sanitizing is done inside do_setitimer so all callers
(sys_setitimer, compat_sys_setitimer, osf_setitimer) are catched.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
alarm() calls the kernel with an unsigend int timeout in seconds. The
value is stored in the tv_sec field of a struct timeval to setup the
itimer. The tv_sec field of struct timeval is of type long, which causes
the tv_sec value to be negative on 32 bit machines if seconds > INT_MAX.
Before the hrtimer merge (pre 2.6.16) such a negative value was converted
to the maximum jiffies timeout by the timeval_to_jiffies conversion. It's
not clear whether this was intended or just happened to be done by the
timeval_to_jiffies code.
hrtimers expect a timeval in canonical form and treat a negative timeout as
already expired. This breaks the legitimate usage of alarm() with a
timeout value > INT_MAX seconds.
For 32 bit machines it is therefor necessary to limit the internal seconds
value to avoid API breakage. Instead of doing this in all implementations
of sys_alarm the duplicated sys_alarm code is moved into a common function
in itimer.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I seem to have lost this hunk in yesterday's patch. It brings the
coming-online CPU's softlockup timer up to date so we don't get false-positive
tripups during CPU hot-add.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Attached patch fixes invalid pointer arithmetic in DMI code to make onboard
device discovery working again.
akpm: bug has been present since dmi_find_device() was added in 2.6.14.
Affects ipmi only (I think) - the symptoms weren't described.
akpm: changed to use pointer arithmetic rather than open-coded sizeof.
Signed-off-by: Andrey Panin <pazke@donpac.ru>
Cc: Corey Minyard <minyard@acm.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
drivers/media/video/v4l2-common.c: In function `v4l_printk_ioctl_arg':
drivers/media/video/v4l2-common.c:477: warning: `0' flag used with `%p' printf format
Someone went and "improved" my patch ;)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Because of historic reasons, there are two separate directories with
V4L stuff. Most drivers are located at driver/media/video. However, some
code for USB Webcams were inserted under drivers/usb/media.
This makes difficult for module authors to know were things should be.
Also, makes Kconfig menu confusing for normal users.
This patch moves all V4L content under drivers/usb/media to
drivers/media/video, and fixes Kconfig/Makefile entries.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we get an ICMP need-to-frag message, the original TOS value in the
ICMP payload cannot be used as a key to look up the routes to update.
This is because the TOS field may have been modified by routers on the
way. Similarly, ip_rt_redirect should also ignore the TOS as the router
that gave us the message may have modified the TOS value.
The patch achieves this objective by aggregating entries with different
TOS values (but are otherwise identical) into the same bucket. This
makes it easy to update them at the same time when an ICMP message is
received.
In future we should use a twin-hashing scheme where teh aggregation
occurs at the entry level. That is, the TOS goes back into the hash
for normal lookups while ICMP lookups will end up with a node that
gives us a list that contains all other route entries that differ
only by TOS.
Signed-off-by: Ilia Sotnikov <hostcc@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch sets the maximum TCP buffer sizes (available to automatic
buffer tuning, not to setsockopt) based on the TCP memory pool size.
The maximum sndbuf and rcvbuf each will be up to 4 MB, but no more
than 1/128 of the memory pressure threshold.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
I was working on the ipip/xfrm problem and as usual I get side-tracked by
other problems.
As part of an attempt to change the IPv4 protocol handler calling
convention I found that SCTP violated the existing convention.
It's returning non-zero values after freeing the skb. This is doubly bad
as 1) the skb gets resubmitted; 2) the return value is interpreted as a
protocol number.
This patch changes those return values to zero.
IPv6 doesn't suffer from this problem because it uses a positive return
value as an indication for resubmission. So the only effect of this patch
there is to increment the IPSTATS_MIB_INDELIVERS counter which IMHO is
the right thing to do.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netdev notifier call chain is currently unregistered without taking
any locks outside the notifier system. Because the notifier system itself
does not synchronise unregistration with respect to the calling of the
chain, we as its user need to do our own locking.
We are supposed to take the RTNL for all calls to netdev notifiers, so
taking the RTNL should be sufficient to protect it.
The registration path in dev.c already takes the RTNL so it's OK.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[description by AK]
Made a cut'n'paste error when adding the entry for the ALI M1695
AGP bridge and added a second entry for the 1689
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
This patch causes the network interface to respond to P_Key change
events correctly. As a result, you'll see a child interface in the
"RUNNING" state (netif_carrier_on()) only when the corresponding P_Key
is configured by the SM. When SM removes a P_Key, the "RUNNING" state
will be disabled for the corresponding network interface. To
implement this, I added IB_EVENT_PKEY_CHANGE event handling. To
prevent flushing the device before the device is open by the "delay
open" mechanism, I added an additional device flag called
IPOIB_FLAG_INITIALIZED.
This also prevents the child network interface from trying to join to
multicast groups until the PKEY is configured. We used to get error
messages like:
ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff
in this case. To fix this, I just check IPOIB_FLAG_OPER_UP flag in
ipoib_set_mcast_list().
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would
still do some things it shouldn't, such as store away attributes for
special QPs. Fix this, and simplify the code, by simply jumping to
the exit path if mthca_MODIFY_QP() fails.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
With the current IPoIB driver, the status of network interfaces stays
"RUNNING" even if the link goes down (for example because a cable is
unplugged). Fix this by flushing the IPoIB interface when the link
goes down.
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport
before calling mthca_set_qp_size(), since the value is used there.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When setting the shared receive queue (SRQ) watermark in a modify SRQ
operation, make sure that the supplied value is not larger than the
full size of the SRQ.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Guarantee the calculated work queue entry size does not exceed the max
allowable WQE size when creating an SRQ. This is a problem with Arbel
in Tavor-compatibility mode because the current WQE size computation
method rounds up to next power of 2.
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a check that the modify QP parameters sgid_index and path_mtu are
valid, since they might come from userspace.
Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Since the SCSI midlayer is moving towards entirely getting rid of
commands with use_sg == 0, we should treat this case as an exception.
Therefore, change the IB SRP initiator to create a fake scatterlist
for these commands with sg_init_one(). This simplifies the flow of
DMA mapping and unmapping, since SRP can just use dma_map_sg() and
dma_unmap_sg() unconditionally, rather than having to choose between
the dma_{map,unmap}_sg() and dma_{map,unmap}_single() variants.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
I accidentally ended up with a config that set NET_RADIO off,
and NET_WIRELESS_RTNETLINK on, which blew up thus..
net/built-in.o: In function `do_setlink':net/core/rtnetlink.c:479: undefined reference to `wireless_rtnetlink_set'
net/built-in.o: In function `do_getlink':net/core/rtnetlink.c:521: undefined reference to `wireless_rtnetlink_get'
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev.
Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an extern declaration for exported symbols to make the compiler warn
on symbols declared statically.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
I see lots of
kernel unaligned access to 0xa0000001009dbb6f, ip=0xa000000100811591
kernel unaligned access to 0xa0000001009dbb6b, ip=0xa0000001008115c1
kernel unaligned access to 0xa0000001009dbb6d, ip=0xa0000001008115f1
messages in my logs on IA64 when using the ethernet bridge with 2.6.16.
Appended is a patch to fix them.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
when starting lock mastery (excepting the recovery lock) wait on any nodes
needing recovery. fix one instance where lock resources were left attached
to the recovery list after recovery completed. ensure that the node_down
code is run uniformly regardless of which node found the dead node first.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>