Commit Graph

1055 Commits

Author SHA1 Message Date
Al Viro
7d877f3bda [PATCH] gfp_t: net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Trond Myklebust
434f1d10c1 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-27 22:13:32 -04:00
Trond Myklebust
6070fe6f82 RPC: Ensure that nobody can queue up new upcalls after rpc_close_pipes()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:46 -04:00
Jeff Garzik
b2ab040db8 Merge branch 'master' 2005-10-27 20:35:17 -04:00
Trond Myklebust
4c2cb58c55 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-27 19:12:49 -04:00
Trond Myklebust
6fa05b1736 Revert "RPC: stops the release_pipe() funtion from being called twice"
This reverts 747c5534c9 commit.
2005-10-27 19:08:18 -04:00
Herbert Xu
2ad41065d9 [TCP]: Clear stale pred_flags when snd_wnd changes
This bug is responsible for causing the infamous "Treason uncloaked"
messages that's been popping up everywhere since the printk was added.
It has usually been blamed on foreign operating systems.  However,
some of those reports implicate Linux as both systems are running
Linux or the TCP connection is going across the loopback interface.

In fact, there really is a bug in the Linux TCP header prediction code
that's been there since at least 2.1.8.  This bug was tracked down with
help from Dale Blount.

The effect of this bug ranges from harmless "Treason uncloaked"
messages to hung/aborted TCP connections.  The details of the bug
and fix is as follows.

When snd_wnd is updated, we only update pred_flags if
tcp_fast_path_check succeeds.  When it fails (for example,
when our rcvbuf is used up), we will leave pred_flags with
an out-of-date snd_wnd value.

When the out-of-date pred_flags happens to match the next incoming
packet we will again hit the fast path and use the current snd_wnd
which will be wrong.

In the case of the treason messages, it just happens that the snd_wnd
cached in pred_flags is zero while tp->snd_wnd is non-zero.  Therefore
when a zero-window packet comes in we incorrectly conclude that the
window is non-zero.

In fact if the peer continues to send us zero-window pure ACKs we
will continue making the same mistake.  It's only when the peer
transmits a zero-window packet with data attached that we get a
chance to snap out of it.  This is what triggers the treason
message at the next retransmit timeout.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-27 15:11:04 -02:00
Andrew Morton
4bcde03d41 [PATCH] svcsock timestamp fix
Convert nanoseconds to microseconds correctly.

Spotted by Steve Dickson <SteveD@redhat.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Jeff Garzik
35848e048f [PATCH] kill massive wireless-related log spam
Although this message is having the intended effect of causing wireless
driver maintainers to upgrade their code, I never should have merged this
patch in its present form.  Leading to tons of bug reports and unhappy
users.

Some wireless apps poll for statistics regularly, which leads to a printk()
every single time they ask for stats.  That's a little bit _too_ much of a
reminder that the driver is using an old API.

Change this to printing out the message once, per kernel boot.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-26 10:39:43 -07:00
Jeff Garzik
1f57389a38 Merge branch 'master' 2005-10-26 01:06:45 -04:00
James Ketrenos
077783f877 [PATCH] ieee80211 build fix
James Ketrenos wrote:
> [3/4] Use the tx_headroom and reserve requested space.

This patch introduced a compile problem; patch below corrects this.

Fixed compilation error due to not passing tx_headroom in
ieee80211_tx_frame.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-26 00:54:23 -04:00
David Engel
dcab5e1eec [IPV4]: Fix setting broadcast for SIOCSIFNETMASK
Fix setting of the broadcast address when the netmask is set via
SIOCSIFNETMASK in Linux 2.6.  The code wanted the old value of
ifa->ifa_mask but used it after it had already been overwritten with
the new value.

Signed-off-by: David Engel <gigem@comcast.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:20:21 -02:00
Ralf Baechle
95df1c04ab [AX.25]: Use constant instead of magic number
Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:14:09 -02:00
Randy Dunlap
c83c248618 [SK_BUFF] kernel-doc: fix skbuff warnings
Add kernel-doc to skbuff.h, skbuff.c to eliminate kernel-doc warnings.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 01:10:18 -02:00
Jayachandran C
0d0d2bba97 [IPV4]: Remove dead code from ip_output.c
skb_prev is assigned from skb, which cannot be NULL. This patch removes the
unnecessary NULL check.

Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:58:54 -02:00
Jayachandran C
ea7ce40649 [NETLINK]: Remove dead code in af_netlink.c
Remove the variable nlk & call to nlk_sk as it does not have any side effect.

Signed-off-by: Jayachandran C. <c.jayachandran at gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:54:46 -02:00
Herbert Xu
80b30c1023 [IPSEC]: Kill obsolete get_mss function
Now that we've switched over to storing MTUs in the xfrm_dst entries,
we no longer need the dst's get_mss methods.  This patch gets rid of
them.

It also documents the fact that our MTU calculation is not optimal
for ESP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:48:45 -02:00
Herbert Xu
1371e37da2 [IPV4]: Kill redundant rcu_dereference on fa_info
This patch kills a redundant rcu_dereference on fa->fa_info in fib_trie.c.
As this dereference directly follows a list_for_each_entry_rcu line, we
have already taken a read barrier with respect to getting an entry from
the list.

This read barrier guarantees that all values read out of fa are valid.
In particular, the contents of structure pointed to by fa->fa_info is
initialised before fa->fa_info is actually set (see fn_trie_insert);
the setting of fa->fa_info itself is further separated with a write
barrier from the insertion of fa into the list.

Therefore by taking a read barrier after obtaining fa from the list
(which is given by list_for_each_entry_rcu), we can be sure that
fa->fa_info contains a valid pointer, as well as the fact that the
data pointed to by fa->fa_info is itself valid.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:25:03 -02:00
Harald Welte
eed75f191d [NETFILTER] ip_conntrack: Make "hashsize" conntrack parameter writable
It's fairly simple to resize the hash table, but currently you need to
remove and reinsert the module.  That's bad (we lose connection
state).  Harald has even offered to write a daemon which sets this
based on load.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:19:27 -02:00
Stephen Hemminger
d50a6b56f0 [PKTGEN]: proc interface revision
The code to handle the /proc interface can be cleaned up in several places:
* use seq_file for read
* don't need to remember all the filenames separately
* use for_online_cpu's
* don't vmalloc a buffer for small command from user.

Committer note:
This patch clashed with John Hawkes's "[NET]: Wider use of for_each_*cpu()",
so I fixed it up manually.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:12:18 -02:00
Stephen Hemminger
b4099fab75 [PKTGEN]: Spelling and white space
Fix some cosmetic issues. Indentation, spelling errors, and some whitespace.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:08:10 -02:00
Stephen Hemminger
2845b63b50 [PKTGEN]: Use kzalloc
These are cleanup patches for pktgen that can go in 2.6.15
Can use kzalloc in a couple of places.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:05:32 -02:00
Stephen Hemminger
b7c8921bf1 [PKTGEN]: Sleeping function called under lock
pktgen is calling kmalloc GFP_KERNEL and vmalloc with lock held.
The simplest fix is to turn the lock into a semaphore, since the
thread lock is only used for admin control from user context.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-26 00:03:12 -02:00
John Hawkes
670c02c2bf [NET]: Wider use of for_each_*cpu()
In 'net' change the explicit use of for-loops and NR_CPUS into the
general for_each_cpu() or for_each_online_cpu() constructs, as
appropriate.  This widens the scope of potential future optimizations
of the general constructs, as well as takes advantage of the existing
optimizations of first_cpu() and next_cpu(), which is advantageous
when the true CPU count is much smaller than NR_CPUS.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 23:54:01 -02:00
Patrick Caulfield
900e0143a5 [DECNET]: Remove some redundant ifdeffed code
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 23:49:29 -02:00
Jochen Friedrich
5ac660ee13 [TR]: Preserve RIF flag even for 2 byte RIF fields.
Signed-off-by: Jochen Friedrich <jochen@scram.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 21:31:38 -02:00
Yan Zheng
4ea6a8046b [IPV6]: Fix refcnt of struct ip6_flowlabel
Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-25 21:17:52 -02:00
Herbert Xu
49636bb128 [NEIGH] Fix timer leak in neigh_changeaddr
neigh_changeaddr attempts to delete neighbour timers without setting
nud_state.  This doesn't work because the timer may have already fired
when we acquire the write lock in neigh_changeaddr.  The result is that
the timer may keep firing for quite a while until the entry reaches
NEIGH_FAILED.

It should be setting the nud_state straight away so that if the timer
has already fired it can simply exit once we relinquish the lock.

In fact, this whole function is simply duplicating the logic in
neigh_ifdown which in turn is already doing the right thing when
it comes to deleting timers and setting nud_state.

So all we have to do is take that code out and put it into a common
function and make both neigh_changeaddr and neigh_ifdown call it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 17:18:00 +10:00
Herbert Xu
6fb9974f49 [NEIGH] Fix add_timer race in neigh_add_timer
neigh_add_timer cannot use add_timer unconditionally.  The reason is that
by the time it has obtained the write lock someone else (e.g., neigh_update)
could have already added a new timer.

So it should only use mod_timer and deal with its return value accordingly.

This bug would have led to rare neighbour cache entry leaks.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:37:48 +10:00
Herbert Xu
203755029e [NEIGH] Print stack trace in neigh_add_timer
Stack traces are very helpful in determining the exact nature of a bug.
So let's print a stack trace when the timer is added twice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2005-10-23 16:11:39 +10:00
Julian Anastasov
c98d80edc8 [SK_BUFF]: ipvs_property field must be copied
IPVS used flag NFC_IPVS_PROPERTY in nfcache but as now nfcache was removed the
new flag 'ipvs_property' still needs to be copied. This patch should be
included in 2.6.14.

Further comments from Harald Welte:

Sorry, seems like the bug was introduced by me.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-22 17:06:01 -02:00
Michael Buesch
d3f7bf4fa9 ieee80211 subsystem:
* Use GFP mask on TX skb allocation.
* Use the tx_headroom and reserve requested space.

Signed-off-by: Michael Buesch <mbuesch@freenet.de>
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-21 13:00:28 -05:00
Herbert Xu
b2cc99f04c [TCP] Allow len == skb->len in tcp_fragment
It is legitimate to call tcp_fragment with len == skb->len since
that is done for FIN packets and the FIN flag counts as one byte.
So we should only check for the len > skb->len case.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 17:13:13 -02:00
Herbert Xu
49c5bfaffe [DCCP]: Clear the IPCB area
Turns out the problem has nothing to do with use-after-free or double-free.
It's just that we're not clearing the CB area and DCCP unlike TCP uses a CB
format that's incompatible with IP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:49:59 -02:00
Herbert Xu
ffa29347df [DCCP]: Make dccp_write_xmit always free the packet
icmp_send doesn't use skb->sk at all so even if skb->sk has already
been freed it can't cause crash there (it would've crashed somewhere
else first, e.g., ip_queue_xmit).

I found a double-free on an skb that could explain this though.
dccp_sendmsg and dccp_write_xmit are a little confused as to what
should free the packet when something goes wrong.  Sometimes they
both go for the ball and end up in each other's way.

This patch makes dccp_write_xmit always free the packet no matter
what.  This makes sense since dccp_transmit_skb which in turn comes
from the fact that ip_queue_xmit always frees the packet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:44:29 -02:00
Herbert Xu
fda0fd6c5b [DCCP]: Use skb_set_owner_w in dccp_transmit_skb when skb->sk is NULL
David S. Miller <davem@davemloft.net> wrote:
> One thing you can probably do for this bug is to mark data packets
> explicitly somehow, perhaps in the SKB control block DCCP already
> uses for other data.  Put some boolean in there, set it true for
> data packets.  Then change the test in dccp_transmit_skb() as
> appropriate to test the boolean flag instead of "skb_cloned(skb)".

I agree.  In fact we already have that flag, it's called skb->sk.
So here is patch to test that instead of skb_cloned().

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ian McDonald <imcdnzl@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2005-10-20 14:25:28 -02:00
Hong Liu
f0f15ab554 Fixed oops if an uninitialized key is used for encryption.
Without this patch, if you try and use a key that has not been
configured, for example:

% iwconfig eth1 key deadbeef00 [2]

without having configured key [1], then the active key will still be
[1], but privacy will now be enabled.  Transmission of a packet in this
situation will result in a kernel oops.

Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-20 11:06:36 -05:00
Hong Liu
5b74eda78d Fixed problem with not being able to decrypt/encrypt broadcast packets.
Signed-off-by: James Ketrenos <jketreno@linux.intel.com>
2005-10-19 16:49:03 -05:00
J. Bruce Fields
a0857d03b2 RPCSEC_GSS: krb5 cleanup
Remove some senseless wrappers.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:47 -07:00
J. Bruce Fields
00fd6e1425 RPCSEC_GSS remove all qop parameters
Not only are the qop parameters that are passed around throughout the gssapi
 unused by any currently implemented mechanism, but there appears to be some
 doubt as to whether they will ever be used.  Let's just kill them off for now.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:47 -07:00
J. Bruce Fields
14ae162c24 RPCSEC_GSS: Add support for privacy to krb5 rpcsec_gss mechanism.
Add support for privacy to the krb5 rpcsec_gss mechanism.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:46 -07:00
J. Bruce Fields
bfa91516b5 RPCSEC_GSS: krb5 pre-privacy cleanup
The code this was originally derived from processed wrap and mic tokens using
 the same functions.  This required some contortions, and more would be required
 with the addition of xdr_buf's, so it's better to separate out the two code
 paths.

 In preparation for adding privacy support, remove the last vestiges of the
 old wrap token code.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:45 -07:00
J. Bruce Fields
f7b3af64c6 RPCSEC_GSS: Simplify rpcsec_gss crypto code
Factor out some code that will be shared by privacy crypto routines

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:45 -07:00
J. Bruce Fields
2d2da60c63 RPCSEC_GSS: client-side privacy support
Add the code to the client side to handle privacy.  This is dead code until
 we actually add privacy support to krb5.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:44 -07:00
J. Bruce Fields
24b2605bec RPCSEC_GSS: cleanup au_rslack calculation
Various xdr encode routines use au_rslack to guess where the reply argument
 will end up, so we can set up the xdr_buf to recieve data into the right place
 for zero copy.

 Currently we calculate the au_rslack estimate when we check the verifier.
 Normally this only depends on the verifier size.  In the integrity case we add
 a few bytes to allow for a length and sequence number.

 It's a bit simpler to calculate only the verifier size when we check the
 verifier, and delay the full calculation till we unwrap.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:44 -07:00
J. Bruce Fields
f3680312a7 SUNRPC: Retry wrap in case of memory allocation failure.
For privacy we need to allocate extra pages to hold encrypted page data when
 wrapping requests.  This allocation may fail, and we handle that case by
 waiting and retrying.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:43 -07:00
J. Bruce Fields
ead5e1c26f SUNRPC: Provide a callback to allow free pages allocated during xdr encoding
For privacy, we need to allocate pages to store the encrypted data (passed
 in pages can't be used without the risk of corrupting data in the page cache).
 So we need a way to free that memory after the request has been transmitted.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:43 -07:00
J. Bruce Fields
293f1eb551 SUNRPC: Add support for privacy to generic gss-api code.
Add support for privacy to generic gss-api code.  This is dead code until we
 have both a mechanism that supports privacy and code in the client or server
 that uses it.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:42 -07:00
Steve Dickson
747c5534c9 RPC: stops the release_pipe() funtion from being called twice
This patch stops the release_pipe() funtion from being called
 twice by invalidating the ops pointer in the rpc_inode
 when rpc_pipe_release() is called.

 Signed-off-by: Steve Dickson <steved@redhat.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:40 -07:00
Jiri Benc
757d18faee [PATCH] ieee80211: division by zero fix
This fixes division by zero bug in ieee80211_wx_get_scan().

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-18 17:25:36 -04:00
Trond Myklebust
5e5ce5be6f RPC: allow call_encode() to delay transmission of an RPC call.
Currently, call_encode will cause the entire RPC call to abort if it returns
 an error. This is unnecessarily rigid, and gets in the way of attempts
 to allow the NFSv4 layer to order RPC calls that carry sequence ids.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:11 -07:00
Chuck Lever
ea635a517e SUNRPC: Retry rpcbind requests if the server's portmapper isn't up
After a server crash/reboot, rebinding should always retry, otherwise
 requests on "hard" mounts will fail when they shouldn't.

 Test plan:
 Run a lock-intensive workload against a server while rebooting the server
 repeatedly.

 Signed-off-by: Chuck Lever <cel@netapp.com>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:10 -07:00
Jeff Garzik
28af493cd7 Merge branch 'master' 2005-10-18 17:14:17 -04:00
Trond Myklebust
cff6bf9709 Merge /home/trondmy/scm/kernel/git/torvalds/linux-2.6 2005-10-18 13:50:52 -07:00
Andrew Morton
e6850cce8f [NETFILTER]: Fix ip6_table.c build with NETFILTER_DEBUG enabled.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-15 16:15:38 -07:00
Jeff Garzik
59aee3c2a1 Merge branch 'master' 2005-10-13 21:22:27 -04:00
Herbert Xu
046d20b739 [TCP]: Ratelimit debugging warning.
Better safe than sorry.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:42:24 -07:00
Andi Kleen
34cb711ba9 [NET]: Disable NET_SCH_CLK_CPU for SMP x86 hosts
Opterons with frequency scaling have fully unsynchronized TSCs
running at different frequencies, so using TSCs there is not a good idea. 
Also some other x86 boxes have this problem. gettimeofday should be good 
enough, so just disable it.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:41:44 -07:00
David S. Miller
c8923c6b85 [NETFILTER]: Fix OOPSes on machines with discontiguous cpu numbering.
Original patch by Harald Welte, with feedback from Herbert Xu
and testing by Sbastien Bernard.

EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus
are numbered linearly.  That is not necessarily true.

This patch fixes that up by calculating the largest possible
cpu number, and allocating enough per-cpu structure space given
that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 14:41:23 -07:00
Herbert Xu
9ff5c59ce2 [TCP]: Add code to help track down "BUG at net/ipv4/tcp_output.c:438!"
This is the second report of this bug.  Unfortunately the first
reporter hasn't been able to reproduce it since to provide more
debugging info.

So let's apply this patch for 2.6.14 to

1) Make this non-fatal.
2) Provide the info we need to track it down.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 15:59:39 -07:00
Stephen Hemminger
ab4060e858 [BRIDGE]: fix race on bridge del if
This fixes the RCU race on bridge delete interface.  Basically,
the network device has to be detached from the bridge in the first
step (pre-RCU), rather than later. At that point, no more bridge traffic
will come in, and the other code will not think that network device
is part of a bridge.

This should also fix the XEN test problems.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 15:10:01 -07:00
Arnaldo Carvalho de Melo
eeb2b85606 [TWSK]: Grab the module refcount for timewait sockets
This is required to avoid unloading a module that has active timewait
sockets, such as DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:25:23 -07:00
Arnaldo Carvalho de Melo
2a9bc9bb4d [DCCP]: Transition from PARTOPEN to OPEN when receiving DATA packets
Noticed by Andrea Bittau, that provided a patch that was modified to
not transition from RESPOND to OPEN when receiving DATA packets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:25:00 -07:00
Arnaldo Carvalho de Melo
777b25a2fe [CCID]: Check if ccid is NULL in the hc_[tr]x_exit functions
For consistency with ccid_exit and to fix a bug when
IP_DCCP_UNLOAD_HACK is enabled as the control sock is not associated
to any CCID.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:24:20 -07:00
Pablo Neira Ayuso
061cb4a0ec [NETFILTER] ctnetlink: add support to change protocol info
This patch add support to change the state of the private protocol
information via conntrack_netlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:23:46 -07:00
Pablo Neira Ayuso
3392315375 [NETFILTER] ctnetlink: allow userspace to change TCP state
This patch adds the ability of changing the state a TCP connection. I know
that this must be used with care but it's required to provide a complete
conntrack creation via conntrack_netlink. So I'll document this aspect on
the upcoming docs.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:23:28 -07:00
Harald Welte
a051a8f730 [NETFILTER]: Use only 32bit counters for CONNTRACK_ACCT
Initially we used 64bit counters for conntrack-based accounting, since we
had no event mechanism to tell userspace that our counters are about to
overflow.  With nfnetlink_conntrack, we now have such a event mechanism and
thus can save 16bytes per connection.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:21:10 -07:00
Herbert Xu
d4875b049b [IPSEC] Fix block size/MTU bugs in ESP
This patch fixes the following bugs in ESP:

* Fix transport mode MTU overestimate.  This means that the inner MTU
  is smaller than it needs be.  Worse yet, given an input MTU which
  is a multiple of 4 it will always produce an estimate which is not
  a multiple of 4.

  For example, given a standard ESP/3DES/MD5 transform and an MTU of
  1500, the resulting MTU for transport mode is 1462 when it should
  be 1464.

  The reason for this is because IP header lengths are always a multiple
  of 4 for IPv4 and 8 for IPv6.

* Ensure that the block size is at least 4.  This is required by RFC2406
  and corresponds to what the esp_output function does.  At the moment
  this only affects crypto_null as its block size is 1.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:11:34 -07:00
Herbert Xu
a02a64223e [IPSEC]: Use ALIGN macro in ESP
This patch uses the macro ALIGN in all the applicable spots for ESP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 21:11:08 -07:00
Pablo Neira Ayuso
e1c73b78e3 [NETFILTER] ctnetlink: add one nesting level for TCP state
To keep consistency, the TCP private protocol information is nested
attributes under CTA_PROTOINFO_TCP. This way the sequence of attributes to
access the TCP state information looks like here below:

CTA_PROTOINFO
CTA_PROTOINFO_TCP
CTA_PROTOINFO_TCP_STATE

instead of:

CTA_PROTOINFO
CTA_PROTOINFO_TCP_STATE

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:55:49 -07:00
Pablo Neira Ayuso
a1bcc3f268 [NETFILTER] ctnetlink: ICMP ID is not mandatory
The ID is only required by ICMP type 8 (echo), so it's not
mandatory for all sort of ICMP connections. This patch makes
mandatory only the type and the code for ICMP netlink messages.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:53:16 -07:00
Harald Welte
d000eaf772 [NETFILTER] conntrack_netlink: Fix endian issue with status from userspace
When we send "status" from userspace, we forget to convert the endianness.
This patch adds the reqired conversion.  Thanks to Pablo Neira for
discovering this.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:52:51 -07:00
Harald Welte
ebe0bbf06c [NETFILTER] nfnetlink: use highest bit of nfa_type to indicate nested TLV
As Henrik Nordstrom pointed out, all our efforts with "split endian" (i.e.
host byte order tags, net byte order values) are useless, unless a parser
can determine whether an attribute is nested or not.

This patch steals the highest bit of nfattr.nfa_type to indicate whether
the data payload contains a nested nfattr (1) or not (0).

This will break userspace compatibility, but luckily no kernel with
nfnetlink was released so far.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:52:19 -07:00
Harald Welte
f40863cec8 [NETFILTER] ipt_ULOG: Mark ipt_ULOG as OBSOLETE
Similar to nfnetlink_queue and ip_queue, we mark ipt_ULOG as obsolete.
This should have been part of the original nfnetlink_log merge, but
I somehow missed it.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:51:53 -07:00
Harald Welte
85d9b05d9b [NETFILTER] PPTP helper: Add missing Kconfig dependency
PPTP should not be selectable without conntrack enabled

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 20:47:42 -07:00
Al Viro
dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Jean-Denis Boyer
4f55cd105c [ATM]: [br2684] if we free the skb, we should return 0
From: "Jean-Denis Boyer" <jdboyer@mediatrix.com>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-07 13:44:35 -07:00
Eric Kinzie
0f21ba7cc3 [ATM]: add support for LECS addresses learned from network
From: Eric Kinzie <ekinzie@cmf.nrl.navy.mil>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-06 22:19:28 -07:00
Ivan Skytte Jrgensen
5fe467ee97 [SCTP] Fix sctp_get{pl}addrs() API to work with 32-bit apps on 64-bit kernels.
The old socket options are marked with a _OLD suffix so that the
existing 32-bit apps on 32-bit kernels do not break.

Signed-off-by: Ivan Skytte Jrgensen <isj-sctp@i1.dk>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-06 21:36:17 -07:00
Ralf Baechle
3a867b36c3 [AX.25]: Fix packet socket crash
Since changeset 98a82febb6 AX.25 is passing
received IP and ARP packets to the stack through netif_rx() but we don't
set the skb->mac.raw to right value which may result in a crash with
applications that use a packet socket.

Signed-off-by: Ralf Baechle DL5RB <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:16:04 -07:00
Herbert Xu
77d8d7a684 [IPSEC]: Document that policy direction is derived from the index.
Here is a patch that adds a helper called xfrm_policy_id2dir to
document the fact that the policy direction can be and is derived
from the index.

This is based on a patch by YOSHIFUJI Hideaki and 210313105@suda.edu.cn.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:15:12 -07:00
YOSHIFUJI Hideaki
140e26fcd5 [IPV6]: Fix NS handing for proxy/anycast address
Timer set up by pneigh_enqueue() ended up calling ndisc_rcv()
via pndisc_redo(), which clears LOCALLY_ENQUEUED flag in
NEIGH_CB(skb) and NS was queued again.
Let's call ndisc_recv_ns() directly to avoid the loop.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:11:41 -07:00
Stephen Hemminger
42a39450f8 [TCP]: BIC coding bug in Linux 2.6.13
Missing parenthesis in causes BIC to be slow in increasing congestion
window.

Spotted by Injong Rhee.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:09:31 -07:00
Yan Zheng
fab10fe37a [MCAST] ipv6: Fix address size in grec_size
Signed-Off-By: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-05 12:08:13 -07:00
Jeff Garzik
0d69ae5fb7 Merge branch 'master' 2005-10-05 02:11:33 -04:00
Randy Dunlap
83fa3400eb [XFRM]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in xfrm code:
net/xfrm/xfrm_policy.c:232:47: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:45:35 -07:00
Randy Dunlap
dd13a285b7 [RPC]: fix sparse gfp nocast warnings
Fix nocast sparse warnings:
net/rxrpc/call.c:2013:25: warning: implicit cast to nocast type
net/rxrpc/connection.c:538:46: warning: implicit cast to nocast type
net/sunrpc/sched.c:730:36: warning: implicit cast to nocast type
net/sunrpc/sched.c:734:56: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:44:45 -07:00
Randy Dunlap
00fa023345 [AF_KEY]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in net/key code:
net/key/af_key.c:195:27: warning: implicit cast to nocast type
net/key/af_key.c:1439:28: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:43:04 -07:00
Randy Dunlap
c6f4fafccf [NETFILTER]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in nfnetlink code:
net/netfilter/nfnetlink.c:204:43: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:42:42 -07:00
Randy Dunlap
8eea00a44d [IPVS]: fix sparse gfp nocast warnings
From: Randy Dunlap <rdunlap@xenotime.net>

Fix implicit nocast warnings in ip_vs code:
net/ipv4/ipvs/ip_vs_app.c:631:54: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:42:15 -07:00
Randy Dunlap
f4a19a56e3 [DECNET]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in decnet code:
net/decnet/af_decnet.c:458:40: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:125:35: warning: implicit cast to nocast type
net/decnet/dn_nsp_out.c:219:29: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:41:48 -07:00
Randy Dunlap
7b5b3f3d82 [ATM]: fix sparse gfp nocast warnings
Fix implicit nocast warnings in atm code:
net/atm/atm_misc.c:35:44: warning: implicit cast to nocast type
drivers/atm/fore200e.c:183:33: warning: implicit cast to nocast type

Also use kzalloc() instead of kmalloc().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 22:38:44 -07:00
Horst H. von Brand
a5181ab06d [NETFILTER]: Fix Kconfig typo
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 15:58:56 -07:00
Robert Olsson
e6308be85a [IPV4]: fib_trie root-node expansion
The patch below introduces special thresholds to keep root node in the trie 
large. This gives a flatter tree at the cost of a modest memory increase.
Overall it seems to be gain and this was also proposed by one the authors 
of the paper in recent a seminar.

Main table after loading 123 k routes.

	Aver depth:     3.30
	Max depth:      9
        Root-node size  12 bits
        Total size: 4044  kB

With the patch:
	Aver depth:     2.78
	Max depth:      8
        Root-node size  15 bits
        Total size: 4150  kB

An increase of 8-10% was seen in forwading performance for an rDoS attack. 

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 13:01:58 -07:00
YOSHIFUJI Hideaki
87bf9c97b4 [IPV6]: Fix infinite loop in udp_v6_get_port().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 13:00:39 -07:00
Jeff Garzik
13d1ef29bc Merge rsync://bughost.org/repos/ieee80211-delta/ 2005-10-04 08:22:13 -04:00
Jeff Garzik
d9e34325fd Merge branch 'upstream-fixes' 2005-10-04 05:30:02 -04:00
Randy Dunlap
f36a29d567 [PATCH] ieee80211: fix gfp flags type
Fix implicit nocast warnings in ieee80211 code, including __nocast:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-04 05:29:48 -04:00
Jeff Garzik
3c8c7b2f32 Merge branch 'upstream-fixes' 2005-10-03 22:06:19 -04:00
Randy Dunlap
8cb6108bae [PATCH] ieee80211: fix gfp flags type
Fix implicit nocast warnings in ieee80211 code:
net/ieee80211/ieee80211_tx.c:215:9: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2005-10-03 22:01:14 -04:00