This patch gets rid of a bogus __in_dev_put() in pktgen.c. This was
spotted by Suzanne Wood.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following patch renames __in_dev_get() to __in_dev_get_rtnl() and
introduces __in_dev_get_rcu() to cover the second case.
1) RCU with refcnt should use in_dev_get().
2) RCU without refcnt should use __in_dev_get_rcu().
3) All others must hold RTNL and use __in_dev_get_rtnl().
There is one exception in net/ipv4/route.c which is in fact a pre-existing
race condition. I've marked it as such so that we remember to fix it.
This patch is based on suggestions and prior work by Suzanne Wood and
Paul McKenney.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Meelis Roos <mroos@linux.ee> wrote:
> RK> My firewall setup relies on proxyarp working. However, with 2.6.14-rc3,
> RK> it appears to be completely broken. The firewall is 212.18.232.186,
>
> Same here with some kernel between 14-rc2 and 14-rc3 - no reposnse to
> ARP on a proxyarp gateway. Sorry, no exact revison and no more debugging
> yet since it'a a production gateway.
The breakage is caused by the change to use the CB area for flagging
whether a packet has been queued due to proxy_delay. This area gets
cleared every time arp_rcv gets called. Unfortunately packets delayed
due to proxy_delay also go through arp_rcv when they are reprocessed.
In fact, I can't think of a reason why delayed proxy packets should go
through netfilter again at all. So the easiest solution is to bypass
that and go straight to arp_process.
This is essentially what would've happened before netfilter support
was added to ARP.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
During the build for ARM machine type "fortunet", this error occurred:
CC net/sysctl_net.o
net/sysctl_net.c:36: error: 'core_table' undeclared here (not in a function)
It appears that the following configuration settings cause this error
due to a missing include:
CONFIG_SYSCTL=y
CONFIG_NET=y
# CONFIG_INET is not set
core_table appears to be declared in net/sock.h. if CONFIG_INET were
defined, net/sock.h would have been included via:
sysctl_net.c -> net/ip.h -> linux/ip.h -> net/sock.h
so include it directly.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)
(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)
1) First some performance data :
--------------------------------
tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()
The most time critical code is :
sk_for_each(sk, node, &head->chain) {
if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
goto hit; /* You sunk my battleship! */
}
The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.
As INET_MATCH first comparison uses inet_sk(__sk)->daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.
This can be problematic if some chains are very long.
2) The goal
-----------
The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.
3) Description of the patch
---------------------------
Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.
struct sock_common {
unsigned short skc_family;
volatile unsigned char skc_state;
unsigned char skc_reuse;
int skc_bound_dev_if;
struct hlist_node skc_node;
struct hlist_node skc_bind_node;
atomic_t skc_refcnt;
+ unsigned int skc_hash;
struct proto *skc_prot;
};
Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.
Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)
File include/net/inet_hashtables.h
64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
(((__sk)->sk_hash == (__hash))
((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie)) && \
((*((__u32 *)&(inet_sk(__sk)->dport))) == (__ports)) && \
(!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
(((__sk)->sk_hash == (__hash)) && \
(inet_sk(__sk)->daddr == (__saddr)) && \
(inet_sk(__sk)->rcv_saddr == (__daddr)) && \
(!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))
- Adds a prefetch(head->chain.first) in
__inet_lookup_established()/__tcp_v4_check_established() and
__inet6_lookup_established()/__tcp_v6_check_established() and
__dccp_v4_check_established() to bring into cache the first element of the
list, before the {read|write}_lock(&head->lock);
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test for VIA K8T800 north bridge instead of AMD K8 HyperTransport
bridge based on new information from Andi Kleen. The AMD
HyperTransport interface is not responsible for PCI transactions
and so the re-ordering is more likely done by the VIA north bridge.
This code is subject to change if we get more information from AMD
or VIA.
PCI Express devices are excluded from doing the read flush since all
chipsets in the write_reorder list are PCI chipsets.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I've found the problem in general. It affects any 64-bit
architecture. The problem occurs when you change the system time.
Suppose that when you boot your system clock is forward by a day.
This gets recorded down in skb_tv_base. You then wind the clock back
by a day. From that point onwards the offset will be negative which
essentially overflows the 32-bit variables they're stored in.
In fact, why don't we just store the real time stamp in those 32-bit
variables? After all, we're not going to overflow for quite a while
yet.
When we do overflow, we'll need a better solution of course.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2.6.14-rc2 does not assign cpus to proper nodeids on our em64t numa boxen.
Our boxes use acpi srat for parsing the numa information.
srat_detect_node() used phys_proc_id[] to get to the cpu's local apic id,
but phys_proc_id[] represents the cpu<->initial_apic_id mapping. The
following patch fixes this problem. Now apicid_to_node[] is properly
indexed with the local apic id.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Improve explanation of the Subject line fields in
Documentation/SubmittingPatches Canonical Patch Format.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some Legacy megaraid cards can't actually cope with the scatter/gather
version of the READ CAPACITY command (which is what we now send them
since altering all SCSI internal I/O to go via the block layer). Fix
this (and a few other broken megaraid driver assumptions) by sending
the non-sg version of the command if the sg list only has a single
element.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Document more details of patch format such as the "from" line
and the "---" marker line, and provide more references for
patch guidelines.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Catalin Marinas
Data abort caused by ldrex/strex can leave the exclusive monitor in an
unpredictable state. It is recommended that a clrex/strex is performed to
clear this state.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pass in the pointer to the on-stack registers rather than using them
directly as the arguments.
Ivan noticed that I missed a spot when purging the registers as first
stack parameter idiom.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In these drivers, scsi_remove_host() is called too late, at the point
it is called, the driver has already shut down too far to accept any
I/O that the shutdown might generate. Any generated I/O actually
triggers a panic.
Fix this by calling scsi_remove_host() as early as possible and not
calling scsi_host_put() until just before we kfree the ahc_softc.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There's a problem in our host release in that it calls
scsi_proc_hostdir_rm(). However, if you hold a reference to the host as
you remove the module, the host template (which proc uses) will be freed
and the system will panic when the host device is finally released.
Fix this by moving scsi_proc_hostdir_rm() to where it should be: in
scsi_remove_host().
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Arrange for the initialisation printks to happen after we've
registered the network interface, so we know what name the
device is. Also, check the link every 500ms (and use
msecs_to_jiffies.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds the new iBook G4 (manufactured after July 2005) to the
PowerMac models table. The model name (PowerBook6,7) is taken from a
12" iBook, I don't know if it also matches the 14" version. The patch
applies to a vanilla 2.6.13.2 kernel.
Signed-off-by: Sven Henkel <shenkel@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This adds suspend support for the Radeon M11 chip in 12" iBooks
manufactured after July 2005. I don't know if the new 14" iBooks also
have that chip, so they might also be supported.
The chip identifies itself as "RV350 NV" (pci id 0x4e56), revision 0x80.
Apple calls it "Snowy", xfree86 names it "ATI FireGL Mobility T2 (M11)
NV (AGP)". So, we seem to be lucky here: The suspend-code for the M10
(which also is a "RV350 NV") works flawless for that chip.
Signed-off-by: Sven Henkel <shenkel@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Vincent Sanders
When building the fortunet ARM platform it fails to compile because of
missing include.
Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Vincent Sanders
When building the mx1ads ARM platforms the serial driver fails to compile
with GCC 4.01 due to extern/static ambiguity.
Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We should always use bitmask ops, rather than depend on some ordering of
the different states. With the TASK_NONINTERACTIVE flag, the inequality
doesn't really work.
Oleg Nesterov argues (likely correctly) that this test is unnecessary in
the first place. However, the minimal fix for now is to at least make
it work in the presense of TASK_NONINTERACTIVE. Waiting for consensus
from Roland & co on potential bigger cleanups.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use '#ifdef' consistently on __KERNEL__. This was reported as bug #5340
(isn't easier to send a fix than report the bug?!)
Signed-off-by: Diego Calleja <diegocg@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The incorrect kprobe_mutex usage on x86_64 had percolated to ppc64 too.
First noticed by Yanmin Zhang.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Serial port only needs 32 bytes of resource space but we are currently
asking for 64K.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
[ diff went missing first time due to corrupted patch ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Tone down the r8169 driver
As an alternative, people can use the boot time 'debug' option and/or use
'ethtool -s ethX msglvl xyz'. The different messages are listed at:
http://www.zoreil.com/~romieu/r8169/doc/msglvl.txt
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The old 550Mhz titanium powerbook can switch to a lower frequency
(500Mhz). A user has been repeately reporting overtemp conditions on his
machine at high speed so this simple patch adds support to PowerMac
cpufreq for this machine. The difference in frequency isn't big but seem
enough to fix that user's problems. The patch has been around for some
time now and doesn't seem to cause any problem, so I suppose it could go
in now.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alain RICHARD <alain.richard@equation.fr>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Serial port only needs 32 bytes of resource space but we are currently
asking for 64K.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remember to free the multicast group context memory table.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is
not setup early enough by numa_init_array due to the x86_64 changes in
2.6.14-rc*, and unfortunately set wrongly by the work around code in
numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set
properly to 0 during identify_cpu() when all cpus are brought up, but
confusing the numa slab in the process.
Here is a quick fix for this. The right fix obviously is to have
cpu_to_node[bsp] setup early for numa_init_array(). The following patch
will fix the problem now, and the code can stay on even when
cpu_to_node{BP] gets fixed early correctly.
Thanks to Petr for access to his box.
Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the
cpu_to_node(0) is now not setup early enough for numa_init_array.
cpu_to_node[] is setup much later at srat_detect_node on acpi srat based
em64t machines. This seems like a problem on amd machines too, Tested on
em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after
this patch.
Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The up()/down() orders are incorrect in arch/x86_64/kprobes.c file.
kprobe_mutext is used to protect the free kprobe instruction slot list.
arch_prepare_kprobe applies for a slot from the free list, and
arch_remove_kprobe returns a slot to the free list. The incorrect up()/down()
orders to operate on kprobe_mutex fail to protect the free list. If 2 threads
try to get/return kprobe instruction slot at the same time, the free slot list
might be broken, or a free slot might be applied by 2 threads.
Signed-off-by: Zhang Yanmin <Yanmin.zhang@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
less noise in dmesg
Signed-off-by: Ben Collins <bcollins@debian.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
amdtp, dv1394, raw1394, video1394:
Delete legacy module aliases. The macros did not work and the aliases are not
needed nowadays.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@debian.org>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Work around limitation in rawiso routines. Required with 1394b cards on
architectures where PAGE_SIZE is 4096. Based on a previous patch by Ben
Collins.
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Cc: Ben Collins <bcollins@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
trivial edits of a few comments
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@debian.org>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use of time_before() macro, defined at linux/jiffies.h, which deal with
wrapping correctly and are nicer to read.
Signed-off-by: Marcelo Feitoza Parisi <marcelo@feitoza.com.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Ben Collins <bcollins@debian.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix debug code so it prints the correct speed (was defaulting to 100, so
anything > 400 showed only 100).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@debian.org>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Skip a superfluous pause that occured when the config ROM of a node was
scanned unsuccessfully. This also occurs if a node without link wrongly
enables its "link active" self ID flag. A GWCTech 6-port hub does this.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Cc: Ben Collins <bcollins@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Units were not detected if the local IRM performed a bus reset. ("The root
node is not cycle master capable; selecting a new root node and resetting...",
often seen with iPods and other SBP-2 devices). Rearrange the order of IRM
duties and node scanning. TODO: Audit the ROM caching and parsing code for
underlying issues.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Cc: Ben Collins <bcollins@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Set serialize_io=1 by default. This is safer and required by seemingly more
and more hardware. It causes little or no performance loss for S400 devices.
Performance of S800 1394b devices may drop by 25...30%. Therefore make the
parameter's description and dmesg message clearer about performance impact.
Update description of the max_speed parameter too. IEEE1394_SPEED_MAX is
currently S800.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Cc: Ben Collins <bcollins@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fixes for deadlocks of the ieee1394 and scsi subsystems and long delays in
futile error recovery attempts when SBP-2 devices are removed or drivers are
unloaded.
- Complete commands quickly with DID_NO_CONNECT if the 1394 node is gone or if
the 1394 low-level driver was unloaded.
- Skip unnecessary work in the eh_abort_handler and eh_device_reset_handler if
the node or 1394 low-level driver is gone.
- Let scsi's high-level shut down gracefully when sbp2 is being unloaded or
detached from the 1394 unit. A call to scsi_remove_device is added for this
purpose, which requires us to store a scsi_device pointer.
- scsi_device pointer is obtained from slave_alloc hook and cleared by
slave_destroy. This avoids usage of the pointer after the scsi device was
deleted e.g. by the user via scsi_mod's sysfs interface.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@steamballoon.com>
Cc: Ben Collins <bcollins@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>