Commit Graph

2599 Commits

Author SHA1 Message Date
Allan Stephens
5392d64688 [TIPC]: Fixed link switchover bugs
Incorporates several related fixes:
- switchover now occurs when switching from an active link to a standby link
- failure of a standby link no longer initiates switchover
- links now display correct # of received packets following reactivation

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:52:50 -07:00
Allan Stephens
a10bd924a4 [TIPC]: Enhanced & cleaned up system messages; fixed 2 obscure memory leaks.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:52:17 -07:00
Allan Stephens
f131072c3d [TIPC]: First phase of assert() cleanup
This also contains enhancements to simplify comparisons in name table
publication removal algorithm and to simplify name table sanity checking
when shutting down TIPC.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:51:37 -07:00
Allan Stephens
e100ae92a6 [TIPC]: Disallow config operations that aren't supported in certain modes.
This change provides user-friendly feedback when TIPC is unable to perform
certain configuration operations that don't work properly in certain modes.
(In particular, any reconfiguration request that would temporarily take TIPC
from network mode to standalone mode, or from standalone mode to not running
mode, is disallowed.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:51:08 -07:00
Allan Stephens
c33d53b235 [TIPC]: Fixed memory leak in tipc_link_send() when destination is unreachable
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:50:30 -07:00
Allan Stephens
a75bf87427 [TIPC]: Added missing warning for out-of-memory condition
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:50:01 -07:00
Allan Stephens
a7513528cd [TIPC]: Withdrawing all names from nameless port now returns success, not error
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:49:33 -07:00
Allan Stephens
51f9cc1ff8 [TIPC]: Optimized argument validation done by connect().
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:49:06 -07:00
Allan Stephens
a3b0a5a9d0 [TIPC]: Simplify code for returning partial success of stream send request.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:48:22 -07:00
Allan Stephens
4b087b28a6 [TIPC]: recvmsg() now returns TIPC ancillary data using correct level (SOL_TIPC)
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:47:44 -07:00
Allan Stephens
499786516f [TIPC]: Improved performance of error checking during socket creation.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:47:18 -07:00
Allan Stephens
1303e8f173 [TIPC]: Stream socket send indicates partial success if data partially sent.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:46:50 -07:00
Allan Stephens
bdd94789d2 [TIPC]: Connected send now checks socket state when retrying congested send.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:45:53 -07:00
Allan Stephens
3546c7508d [TIPC]: Can now return destination name of form {0,x,y} via ancillary data.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:45:24 -07:00
Allan Stephens
3388007bc4 [TIPC]: Implied connect now saves dest name for retrieval as ancillary data.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:44:57 -07:00
Allan Stephens
6b384de853 [TIPC]: Fixed connect() to detect a dest address that is missing or too short.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:44:27 -07:00
Allan Stephens
e9024f0f79 [TIPC]: Non-operation-affecting corrections to comments & function definitions.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:43:57 -07:00
Allan Stephens
687a25f1cd [TIPC]: Validate entire interface name when locating bearer to enable.
This fix prevents a bearer from being enabled using the wrong interface.
For example, specifying "eth:eth14" might enable "eth:eth1" by mistake.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
2006-06-25 23:43:21 -07:00
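The interface-name problem described in 687a25f1cd can be illustrated with a small user-space sketch. It is not the TIPC source: the function names and the table of bearer names are hypothetical, and the prefix comparison is only one plausible way such a bug can arise. It shows why comparing part of a name lets "eth:eth14" select "eth:eth1", while comparing the whole name does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: returns the index of the first table entry that
 * matches `wanted`, or -1 if nothing matches. */
int find_by_prefix(const char *const names[], size_t n, const char *wanted)
{
    assert(wanted != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Only strlen(names[i]) bytes are compared, so the entry
         * "eth:eth1" also matches the request "eth:eth14". */
        if (strncmp(names[i], wanted, strlen(names[i])) == 0)
            return (int)i;
    }
    return -1;
}

/* Same lookup, but the entire name must match. */
int find_exact(const char *const names[], size_t n, const char *wanted)
{
    assert(wanted != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], wanted) == 0)
            return (int)i;
    }
    return -1;
}
```

With a table holding "eth:eth1" followed by "eth:eth14", the prefix lookup returns the first entry for a request of "eth:eth14", and the exact lookup returns the second.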
Allan Stephens
a592ea6362 [TIPC]: Added support for MODULE_VERSION capability.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:42:47 -07:00
Allan Stephens
8b1f0a92e9 [TIPC]: Fix misleading comment in buf_discard() routine.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:42:19 -07:00
Allan Stephens
70cb234770 [TIPC]: Fixed privilege checking typo in dest_name_check().
This patch originated by Stephane Ouellette <ouellettes@videotron.ca>.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:41:47 -07:00
Eric Sesterhenn
3ac90216ab [TIPC] Fix for NULL pointer dereference
This fixes a bug spotted by the coverity checker, bug id #366. If
(mod(seqno - prev) != 1) we set buf to NULL, dereference it in the for
case, and set it to whatever value happens to be at address 0+next; if it
happens to be non-zero, we even stay in the loop. It seems that the author
intended to break there.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:41:15 -07:00
Allan Stephens
a4e0927902 [TIPC]: Allow compilation when CONFIG_TIPC_DEBUG is not set.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:40:35 -07:00
Allan Stephens
d356eeba8e [TIPC]: Multicast link failure now resets all links to "nacking" node.
This fix prevents node from crashing.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:40:01 -07:00
Allan Stephens
260082471e [TIPC]: Links now validate destination node specified by incoming messages.
This fix prevents link flopping and name table inconsistency problems arising
when a node is assigned a different <Z.C.N> value than it used previously.
(Changing the <Z.C.N> value causes other nodes to have two link endpoints
sending to the same MAC address using two different destination <Z.C.N> values,
requiring the receiving node to filter out the unwanted messages.)

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:39:31 -07:00
Allan Stephens
9688243b63 [TIPC]: Allow ports to receive multicast messages through native API.
This fix prevents a kernel panic if an application mistakenly sends a
multicast message to TIPC's topology service or configuration service.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:38:58 -07:00
Allan Stephens
4938450789 [TIPC]: Corrected potential misuse of tipc_media_addr structure.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:38:29 -07:00
Allan Stephens
2535ec50b7 [TIPC]: Use correct upper bound when validating network zone number.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:38:00 -07:00
Allan Stephens
9ab230f82f [TIPC]: Prevent name table corruption if no room for new publication
Now exits cleanly if attempt to allocate larger array of subsequences fails,
without losing track of pointer to existing array.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:37:24 -07:00
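The pattern described in 9ab230f82f, growing an array without losing the pointer to the existing one when allocation fails, can be sketched in user space as follows. This is an assumption-laden illustration, not the TIPC name table code: grow_array() and its doubling policy are hypothetical, and realloc() stands in for whatever kernel allocation the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: doubles the capacity of *items (starting at 4).
 * Returns 0 on success and -1 on failure. On failure *items and *cap
 * are left untouched, so the caller still owns the old array. */
int grow_array(int **items, size_t *cap)
{
    assert(items != NULL && cap != NULL);

    size_t new_cap = *cap ? *cap * 2 : 4;
    if (new_cap < *cap || new_cap > SIZE_MAX / sizeof **items)
        return -1;                      /* size computation would overflow */

    /* Keep the result in a temporary: assigning realloc() straight to
     * *items would overwrite the only pointer to the old array with
     * NULL when the allocation fails. */
    int *bigger = realloc(*items, new_cap * sizeof **items);
    if (bigger == NULL)
        return -1;

    *items = bigger;
    *cap = new_cap;
    return 0;
}
```

The caller checks the return value and, on failure, keeps using (or frees) the original array.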
Jon Maloy
5e3c8854c1 [TIPC] Improved tolerance to promiscuous mode interface
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:36:43 -07:00
Linus Torvalds
1d77062b14 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits)
  nfs: remove nfs_put_link()
  nfs-build-fix-99
  git-nfs-build-fixes
  Merge branch 'odirect'
  NFS: alloc nfs_read/write_data as direct I/O is scheduled
  NFS: Eliminate nfs_get_user_pages()
  NFS: refactor nfs_direct_free_user_pages
  NFS: remove user_addr, user_count, and pos from nfs_direct_req
  NFS: "open code" the NFS direct write rescheduler
  NFS: Separate functions for counting outstanding NFS direct I/Os
  NLM: Fix reclaim races
  NLM: sem to mutex conversion
  locks.c: add the fl_owner to nlm_compare_locks
  NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
  NFS: Split fs/nfs/inode.c
  NFS: Fix typo in nfs_do_clone_mount()
  NFS: Fix compile errors introduced by referrals patches
  NFSv4: Ensure that referral mounts bind to a reserved port
  NFSv4: A root pathname is sent as a zero component4
  NFSv4: Follow a referral
  ...
2006-06-25 10:54:14 -07:00
Paul Mackerras
bfe5d83419 [PATCH] Define __raw_get_cpu_var and use it
There are several instances of per_cpu(foo, raw_smp_processor_id()), which
is semantically equivalent to __get_cpu_var(foo) but without the warning
that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled.  For
those architectures with optimized per-cpu implementations, namely ia64,
powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
on those platforms.

This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
raw_smp_processor_id()) on architectures that use the generic per-cpu
implementation, and turns into __get_cpu_var(x) on the architectures that
have an optimized per-cpu implementation.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Andrew Morton
fb1bb34d45 [PATCH] remove for_each_cpu()
Convert a few stragglers over to for_each_possible_cpu(), remove
for_each_cpu().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:54 -07:00
Trond Myklebust
816724e65c Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	fs/nfs/inode.c
	fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'
2006-06-24 13:07:53 -04:00
Linus Torvalds
199f4c9f76 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Require CAP_NET_ADMIN to create tuntap devices.
  [NET]: fix net-core kernel-doc
  [TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h>
  [IPSEC]: Handle GSO packets
  [NET]: Added GSO toggle
  [NET]: Add software TSOv4
  [NET]: Add generic segmentation offload
  [NET]: Merge TSO/UFO fields in sk_buff
  [NET]: Prevent transmission after dev_deactivate
  [IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY
  [IPV6]: Fix source address selection.
  [NET]: Avoid allocating skb in skb_pad
2006-06-23 08:00:01 -07:00
Oleg Nesterov
626ab0e69d [PATCH] list: use list_replace_init() instead of list_splice_init()
list_splice_init(list, head) does unneeded work if it is known that
list_empty(head) == 1.  We can use list_replace_init() instead.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:07 -07:00
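To make the idea in 626ab0e69d concrete, here is a user-space imitation of a circular doubly linked list with a replace-and-reinitialize operation. It is a sketch modeled on the behaviour the commit message describes, not a copy of the kernel's list.h. All function names are hypothetical. When the destination head is known to be empty, the whole chain can be taken over with four pointer assignments and no splicing logic.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

int list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

void list_add_tail_node(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Let `repl` take over every entry of `old`, then leave `old` empty.
 * Whatever `repl` held before is overwritten, so this is only valid
 * when `repl` is empty or uninitialized. */
void list_replace_init_sketch(struct list_head *old, struct list_head *repl)
{
    assert(old != repl);
    repl->next = old->next;
    repl->next->prev = repl;
    repl->prev = old->prev;
    repl->prev->next = repl;
    init_list_head(old);
}

size_t list_length(const struct list_head *h)
{
    size_t n = 0;
    for (const struct list_head *p = h->next; p != h; p = p->next)
        n++;
    return n;
}
```

A typical use is moving a shared pending list onto a local head so the entries can be processed without touching the shared head again.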
Jean-Luc Leger
538c5902b8 [PATCH] clean up default value of IP_DCCP_ACKVEC
Default values for boolean and tristate options can only be 'y', 'm' or 'n'.
This patch removes wrong default for IP_DCCP_ACKVEC.

Signed-off-by: Jean-Luc Leger <jean-luc.leger@dspnet.fr.eu.org>
Cc: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:04 -07:00
David Howells
454e2398be [PATCH] VFS: Permit filesystem to override root dentry on mount
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience functions is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing is currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
Randy Dunlap
f4b8ea7849 [NET]: fix net-core kernel-doc
Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie'
Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:42 -07:00
Herbert Xu
09b8f7a93e [IPSEC]: Handle GSO packets
This patch segments GSO packets received by the IPsec stack.  This can
happen when a NIC driver injects GSO packets into the stack which are
then forwarded to another host.

The primary application of this is going to be Xen where its backend
driver may inject GSO packets into dom0.

Of course this also can be used by other virtualisation schemes such as
VMWare or UML since the tap device could be modified to inject GSO packets
received through splice.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:38 -07:00
Herbert Xu
37c3185a02 [NET]: Added GSO toggle
This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device.  For now it only supports TCPv4.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:36 -07:00
Herbert Xu
f4c50d990d [NET]: Add software TSOv4
This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:33 -07:00
Herbert Xu
f6a78bfcb1 [NET]: Add generic segmentation offload
This patch adds the infrastructure for generic segmentation offload.
The idea is to tap into the potential savings of TSO without hardware
support by postponing the allocation of segmented skb's until just
before the entry point into the NIC driver.

The same structure can be used to support software IPv6 TSO, as well as
UFO and segmentation offload for other relevant protocols, e.g., DCCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:31 -07:00
Herbert Xu
7967168cef [NET]: Merge TSO/UFO fields in sk_buff
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to AND the gso_type field with the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:29 -07:00
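The capability test described in 7967168cef can be sketched as a plain bit operation. The constant names are taken from the commit message, but their numeric values, the GSO_SHIFT name, and the dev_can_offload() helper are assumptions made for illustration. The only facts taken from the message are that the feature bits are the type bits shifted by 16, and that a device must provide every bit a packet requires.

```c
#include <assert.h>
#include <stdint.h>

#define GSO_SHIFT 16                    /* "shifted by 16 bits" */

/* Illustrative values only; the real definitions may differ. */
enum {
    SKB_GSO_TCPV4     = 1 << 0,
    SKB_GSO_TCPV4_ECN = 1 << 1
};

#define NETIF_F_TSO     ((uint32_t)SKB_GSO_TCPV4 << GSO_SHIFT)
#define NETIF_F_TSO_ECN ((uint32_t)SKB_GSO_TCPV4_ECN << GSO_SHIFT)

/* Returns nonzero when the device features cover every bit in gso_type. */
int dev_can_offload(uint32_t features, uint32_t gso_type)
{
    assert((gso_type >> (32 - GSO_SHIFT)) == 0);   /* must fit after shifting */
    uint32_t required = gso_type << GSO_SHIFT;
    return (features & required) == required;
}
```

A device declaring only NETIF_F_TSO accepts plain SKB_GSO_TCPV4 packets but rejects SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN, which is the case the message says must be emulated in software.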
Herbert Xu
d4828d85d1 [NET]: Prevent transmission after dev_deactivate
The dev_deactivate function has bit-rotted since the introduction of
lockless drivers.  In particular, the spin_unlock_wait call at the end
has no effect on the xmit routine of lockless drivers.

With a little bit of work, we can make it much more useful by providing
the guarantee that when it returns, no more calls to the xmit routine
of the underlying driver will be made.

The idea is simple.  There are two entry points in to the xmit routine.
The first comes from dev_queue_xmit.  That one is easily stopped by
using synchronize_rcu.  This works because we set the qdisc to noop_qdisc
before the synchronize_rcu call.  That in turn causes all subsequent
packets sent to dev_queue_xmit to be dropped.  The synchronize_rcu call
also ensures all outstanding calls leave their critical section.

The other entry point is from qdisc_run.  Since we now have a bit that
indicates whether it's running, all we have to do is to wait until the
bit is off.

I've removed the loop to wait for __LINK_STATE_SCHED to clear.  This is
useless because netif_wake_queue can cause it to be set again.  It is
also harmless because we've disarmed qdisc_run.

I've also removed the spin_unlock_wait on xmit_lock because its only
purpose of making sure that all outstanding xmit_lock holders have
exited is also given by dev_watchdog_down.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:26 -07:00
YOSHIFUJI Hideaki
5e2707fa3a [IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY
We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY,
because there is a further, less significant rule: longest match.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:24 -07:00
Łukasz Stelmach
102128e3a2 [IPV6]: Fix source address selection.
Two additional labels (RFC 3484, sec. 10.3) for IPv6 addresses
are defined to make a distinction between global unicast
addresses and Unique Local Addresses (fc00::/7, RFC 4193) and
Teredo (2001::/32, RFC 4380). It is necessary to avoid attempts
of connection that would either fail (eg. fec0:: to 2001:feed::)
or be sub-optimal (2001:0:: to 2001:feed::).

Signed-off-by: Łukasz Stelmach <stlman@poczta.fm>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:22 -07:00
Herbert Xu
5b057c6b1a [NET]: Avoid allocating skb in skb_pad
First of all it is unnecessary to allocate a new skb in skb_pad since
the existing one is not shared.  More importantly, our hard_start_xmit
interface does not allow a new skb to be allocated since that breaks
requeueing.

This patch uses pskb_expand_head to expand the existing skb and linearize
it if needed.  Actually, someone should sift through every instance of
skb_pad on a non-linear skb as they do not fit the reasons why this was
originally created.

Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
TCP, etc.).  As it is skb_pad will simply write over a cloned skb.  Because
of the position of the write it is unlikely to cause problems but still
it's best if we don't do it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:06:41 -07:00
Jeff Garzik
dbe1ab9514 Merge branch 'master' into upstream 2006-06-22 22:51:46 -04:00
Trond Myklebust
d59bf96cdd Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2006-06-20 08:59:45 -04:00
Al Viro
ff7512e1a2 [ATM]: fix broken uses of NIPQUAD in net/atm
NIPQUAD expects an l-value of type __be32, _NOT_ a pointer to __be32.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 03:27:27 -07:00
Al Viro
8ca84481b6 [SCTP]: sctp_unpack_cookie() fix
sizeof(pointer) != sizeof(array)...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 03:26:14 -07:00
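The one-line remark in 8ca84481b6, that sizeof(pointer) differs from sizeof(array), is a general C pitfall. The sketch below is a generic illustration rather than the SCTP code: struct cookie, DIGEST_LEN, and both helper functions are hypothetical. Once an array is reached through a pointer, sizeof yields the size of the pointer, not the number of bytes in the array.

```c
#include <assert.h>
#include <stddef.h>

#define DIGEST_LEN 32                   /* hypothetical digest size */

struct cookie {
    unsigned char signature[DIGEST_LEN];
};

/* Buggy pattern: `digest` is a pointer here, so sizeof gives the
 * pointer size (commonly 4 or 8), not DIGEST_LEN. */
size_t bytes_compared_buggy(const unsigned char *digest)
{
    assert(digest != NULL);
    return sizeof(digest);
}

/* Fixed pattern: sizeof is applied to the array member itself. */
size_t bytes_compared_fixed(const struct cookie *c)
{
    assert(c != NULL);
    return sizeof(c->signature);
}
```

Passing the buggy length to memcmp() or memcpy() would silently process only the first few bytes of the digest.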
Jeff Garzik
4b2d9cf009 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-06-20 04:46:02 -04:00
Herbert Xu
48d83325b6 [NET]: Prevent multiple qdisc runs
Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued.  It
appears that this is an unintended consequence of relinquishing the queue
lock while transmitting.  That in turn is needed for devices that spend a
lot of time in their transmit routine.

There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).

Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.

The solution here is to steal a bit from net_device to prevent this.

BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19 23:57:59 -07:00
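The "steal a bit" approach in 48d83325b6 can be approximated in user space with a C11 atomic flag. This is only a sketch of the idea: struct queue_dev, try_run_queue(), and count_run() are hypothetical, and the kernel uses its own bit operations on net_device state rather than stdatomic. Whoever sets the bit first runs the queue, and anyone else who finds it set backs off instead of running concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct queue_dev {
    atomic_flag running;                /* the "stolen" bit */
    int runs;                           /* how many times the queue ran */
};

/* Example queue runner used for demonstration. */
void count_run(struct queue_dev *dev)
{
    dev->runs++;
}

/* Runs the queue unless another context is already running it.
 * Returns true if this call performed the run. */
bool try_run_queue(struct queue_dev *dev, void (*run)(struct queue_dev *))
{
    assert(dev != NULL && run != NULL);
    if (atomic_flag_test_and_set(&dev->running))
        return false;                   /* someone else holds the bit */
    run(dev);
    atomic_flag_clear(&dev->running);
    return true;
}
```

Because the losing caller returns immediately, packets are dequeued by a single runner at a time, which preserves their order.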
Patrick McHardy
d3dcd4efe2 [NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk length
Fix endless loop in the SCTP match similar to those already fixed in
the SCTP conntrack helper (was CVE-2006-1527).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19 23:39:45 -07:00
Linus Torvalds
4c84a39c8a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits)
  IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
  IB/mthca: Make all device methods truly reentrant
  IB/mthca: Fix memory leak on modify_qp error paths
  IB/uverbs: Factor out common idr code
  IB/uverbs: Don't decrement usecnt on error paths
  IB/uverbs: Release lock on error path
  IB/cm: Use address handle helpers
  IB/sa: Add ib_init_ah_from_path()
  IB: Add ib_init_ah_from_wc()
  IB/ucm: Get rid of duplicate P_Key parameter
  IB/srp: Factor out common request reset code
  IB/srp: Support SRP rev. 10 targets
  [SCSI] srp.h: Add I/O Class values
  IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool.
  IB/mthca: Fill in max_map_per_fmr device attribute
  IB/ipath: Add client reregister event generation
  IB/mthca: Add client reregister event generation
  IB: Move struct port_info from ipath to <rdma/ib_smi.h>
  IPoIB: Handle client reregister events
  IB: Add client reregister event type
  ...
2006-06-19 19:01:59 -07:00
Linus Torvalds
d0b952a983 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
  [ETHTOOL]: Fix UFO typo
  [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
  [SCTP]: Send only 1 window update SACK per message.
  [SCTP]: Don't do CRC32C checksum over loopback.
  [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
  [SCTP]: Reject sctp packets with broadcast addresses.
  [SCTP]: Limit association max_retrans setting in setsockopt.
  [PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
  [IPV6]: Sum real space for RTAs.
  [IRDA]: Use put_unaligned() in irlmp_do_discovery().
  [BRIDGE]: Add support for NETIF_F_HW_CSUM devices
  [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
  [TG3]: Convert to non-LLTX
  [TG3]: Remove unnecessary tx_lock
  [TCP]: Add tcp_slow_start_after_idle sysctl.
  [BNX2]: Update version and reldate
  [BNX2]: Use CPU native page size
  [BNX2]: Use compressed firmware
  [BNX2]: Add firmware decompression
  [BNX2]: Allow WoL settings on new 5708 chips
  ...

Manual fixup for conflict in drivers/net/tulip/winbond-840.c
2006-06-19 18:55:56 -07:00
Herbert Xu
47552c4e55 [ETHTOOL]: Fix UFO typo
The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of
ETHTOOL_GUFO.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 23:00:20 -07:00
Neil Horman
d5b9f4c083 [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving socket's sk_receive_queue is empty (indicating that
the receive buffer is being blocked by a missing tsn).

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:59:03 -07:00
Tsutomu Fujii
d7c2c9e397 [SCTP]: Send only 1 window update SACK per message.
Right now, every time we increase our rwnd by more than MTU bytes, we
trigger a SACK.  When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.

Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:58:28 -07:00
Sridhar Samudrala
503b55fd77 [SCTP]: Don't do CRC32C checksum over loopback.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:57:28 -07:00
Vlad Yasevich
4c9f5d5305 [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:56:08 -07:00
Vlad Yasevich
5636bef732 [SCTP]: Reject sctp packets with broadcast addresses.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:55:35 -07:00
Vlad Yasevich
402d68c433 [SCTP]: Limit association max_retrans setting in setsockopt.
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:54:51 -07:00
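Note on 402d68c433: the rule described above can be sketched as a small
validity check. This is an illustrative user-space sketch only, not the
kernel code; struct path_sketch, its pathmaxrxt field and the function
name are hypothetical, and the commit message does not say whether the
kernel rejects or clamps an out-of-range value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-path state: only the retransmission limit is shown. */
struct path_sketch {
    unsigned int pathmaxrxt; /* max retransmissions allowed on this path */
};

/*
 * Return true if the requested association-wide retransmission limit is
 * acceptable. With more than one path, it must not exceed the sum of the
 * per-path limits. With a single path the check is skipped, mirroring
 * the exception described in the commit message.
 */
static bool assoc_max_retrans_ok(unsigned int requested,
                                 const struct path_sketch *paths,
                                 size_t npaths)
{
    unsigned long sum = 0;
    size_t i;

    if (npaths <= 1)
        return true;

    for (i = 0; i < npaths; i++)
        sum += paths[i].pathmaxrxt;

    return requested <= sum;
}
```

A caller would run this check before storing the new value and decide
separately how to report a failure.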
YOSHIFUJI Hideaki
c5396a31b2 [IPV6]: Sum real space for RTAs.
This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications.  Issue
pointed out by Patrick McHardy <kaber@trash.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:48:48 -07:00
David S. Miller
b293acfd31 [IRDA]: Use put_unaligned() in irlmp_do_discovery().
irda_device_info->hints[] is byte aligned but is being
accessed as a u16

Based upon a patch by Luke Yang <luke.adi@gmail.com>.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:16:13 -07:00
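Note on b293acfd31: the problem above is a 16-bit store through a
pointer that is only byte aligned. A minimal user-space sketch of the
same idea follows, using memcpy() in place of the kernel's
put_unaligned(); the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Store a 16-bit value at an address with no alignment guarantee.
 * memcpy() lets the compiler emit whatever access is safe on the
 * target, instead of a direct u16 store that may trap.
 */
static void store_u16_unaligned(uint8_t *dst, uint16_t val)
{
    memcpy(dst, &val, sizeof(val));
}

/* Matching load, for symmetry. */
static uint16_t load_u16_unaligned(const uint8_t *src)
{
    uint16_t val;

    memcpy(&val, src, sizeof(val));
    return val;
}
```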
Herbert Xu
2c6cc0d853 [BRIDGE]: Add support for NETIF_F_HW_CSUM devices
As it is the bridge will only ever declare NETIF_F_IP_CSUM even if all
its constituent devices support NETIF_F_HW_CSUM.  This patch fixes
this by supporting the first one out of NETIF_F_NO_CSUM,
NETIF_F_HW_CSUM, and NETIF_F_IP_CSUM that is supported by all
constituent devices.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:06:45 -07:00
Herbert Xu
8648b3053b [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places.  For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:06:05 -07:00
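Note on 8648b3053b: the two combined masks can be pictured roughly as
below. This is a sketch, not the kernel header; the SK_ names and bit
values are placeholders chosen for illustration, and only the grouping
follows the commit message.

```c
/* Placeholder feature bits (values are assumptions, not the kernel's). */
#define SK_F_IP_CSUM 0x02u /* can checksum TCP/UDP over IPv4 only */
#define SK_F_NO_CSUM 0x04u /* no checksum needed at all */
#define SK_F_HW_CSUM 0x08u /* can checksum any protocol */

/* The two features the stack treats identically. */
#define SK_F_GEN_CSUM (SK_F_NO_CSUM | SK_F_HW_CSUM)

/* Any of the three checksum features. */
#define SK_F_ALL_CSUM (SK_F_IP_CSUM | SK_F_GEN_CSUM)

/* True if the device handles checksums for any protocol. */
static int dev_has_gen_csum(unsigned int features)
{
    return (features & SK_F_GEN_CSUM) != 0;
}

/* True if the device offers any form of checksum offload. */
static int dev_has_any_csum(unsigned int features)
{
    return (features & SK_F_ALL_CSUM) != 0;
}
```

Testing against one mask replaces the repeated
"HW_CSUM or NO_CSUM" conditions mentioned in the commit message.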
David S. Miller
35089bb203 [TCP]: Add tcp_slow_start_after_idle sysctl.
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:53 -07:00
Luca De Cicco
bc726a71d2 [TCP] Westwood: reset RTT min after FRTO
RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:38 -07:00
Luca De Cicco
b3a92eabe5 [TCP] Westwood: bandwidth filter startup
The bandwidth estimate filter is now initialized with the first
sample in order to have better performance in the case of small
file transfers.

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:36 -07:00
Luca De Cicco
b7d7a9e3c9 [TCP] Westwood: comment fixes
Cleanup some comments and add more references

Signed-off-by: Luca De Cicco <ldecicco@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:34 -07:00
Stephen Hemminger
f61e29018a [TCP] Westwood: fix first sample
Need to update send sequence number tracking after first ack.
Rework of patch from Luca De Cicco.

Signed-off-by: Stephen Hemminger <shemminger@dxpl.pdx.osdl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:32 -07:00
Stephen Hemminger
bdeb04c6d9 [NET]: net.ipv4.ip_autoconfig sysctl removal
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:30 -07:00
Alexey Dobriyan
f8d5962112 [IPX]: Endian bug in ipxrtr_route_packet()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:24 -07:00
Herbert Xu
3cc0e87398 [NET]: Warn in __skb_trim if skb is paged
It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:22 -07:00
Herbert Xu
b38dfee3d6 [NET]: skb_trim audit
I found a few more spots where pskb_trim_rcsum could be used but were not.
This patch changes them to use it.

Also, sk_filter can get paged skb data.  Therefore we must use pskb_trim
instead of skb_trim.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:20 -07:00
Herbert Xu
364c6badde [NET]: Clean up skb_linearize
The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:16 -07:00
Herbert Xu
932ff279a4 [NET]: Add netif_tx_lock
Various drivers use xmit_lock internally to synchronise with their
transmission routines.  They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set.  This is because a printk that occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire.  So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner.  The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly.  I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond.  It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission.  This is
unsafe as an IRQ can potentially wake up the queue.  So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:14 -07:00
Patrick McHardy
bf0857ea32 [NETFILTER]: hashlimit match: fix random initialization
hashlimit does:

        if (!ht->rnd)
                get_random_bytes(&ht->rnd, 4);

ignoring that 0 is also a valid random number.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:11 -07:00
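Note on bf0857ea32: testing the seed itself for zero cannot distinguish
"not yet seeded" from "seeded with 0". One way to avoid that is a
separate flag, sketched below in user-space C. The struct, the field
names and the rand()-based helper are assumptions for illustration; the
commit message does not spell out the actual change.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the table state; only seeding is shown. */
struct htable_sketch {
    uint32_t rnd;             /* hash seed; 0 is a legitimate value */
    bool     rnd_initialized; /* tracks seeding independently of rnd */
};

/* Stand-in for get_random_bytes(); rand() is NOT a secure source. */
static uint32_t sketch_random_u32(void)
{
    return ((uint32_t)rand() << 16) ^ (uint32_t)rand();
}

/* Seed the table exactly once, even if the seed drawn happens to be 0. */
static uint32_t htable_seed(struct htable_sketch *ht)
{
    if (!ht->rnd_initialized) {
        ht->rnd = sketch_random_u32();
        ht->rnd_initialized = true;
    }
    return ht->rnd;
}
```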
Patrick McHardy
2b2283d030 [NETFILTER]: recent match: missing refcnt initialization
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:09 -07:00
Patrick McHardy
a0e889bb1b [NETFILTER]: recent match: fix "sleeping function called from invalid context"
create_proc_entry must not be called with locks held. Use a mutex
instead to protect data only changed in user context.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:07 -07:00
James Morris
100468e9c0 [SECMARK]: Add CONNSECMARK xtables target
Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copying security marks back from connections to packets.  This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.

A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNMARK, and then
restore the security mark from the connection to established and
related packets on that connection.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:03 -07:00
James Morris
7c9728c393 [SECMARK]: Add secmark support to conntrack
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:01 -07:00
James Morris
5e6874cdb8 [SECMARK]: Add xtables SECMARK target
Add a SECMARK target to xtables, allowing the admin to apply security
marks to packets via both iptables and ip6tables.

The target currently handles SELinux security marking, but can be
extended for other purposes as needed.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:59 -07:00
James Morris
984bc16cc9 [SECMARK]: Add secmark support to core networking.
Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets.  This is similar to the nfmark
field, except that it is intended for implementing security policy, rather than
networking policy.

This patch was already acked in principle by Dave Miller.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:57 -07:00
David S. Miller
6f68dc3775 [NET]: Fix warnings after LSM-IPSEC changes.
Assignment used as truth value in xfrm_del_sa()
and xfrm_get_policy().

Wrong argument type declared for security_xfrm_state_delete()
when SELINUX is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:49 -07:00
Dave Jones
9dadaa19cb [NET]: NET_TCPPROBE Kconfig fix
Just spotted this typo in a new option.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:47 -07:00
Catherine Zhang
c8c05a8eec [LSM-IPsec]: SELinux Authorize
This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations.  In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
associations with security contexts.  Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleting policies with security contexts.  To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally.  The hook is called on deletion when no context is
present, which we may want to change.  At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete.  The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts.  The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:45 -07:00
David S. Miller
f86502bfc1 [IPV4] icmp: Kill local 'ip' arg in icmp_redirect().
It is typed wrong, and it's only assigned and used once.
So just pass in iph->daddr directly which fixes both problems.

Based upon a patch by Alexey Dobriyan.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:41 -07:00
Alexey Dobriyan
6d74165350 [IPV4]: Right prototype of __raw_v4_lookup()
All users pass 32-bit values as addresses and internally they're
compared with 32-bit entities. So, change "laddr" and "raddr" types to
__be32.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:39 -07:00
Alexey Dobriyan
338fcf9886 [IPV4] igmp: Fixup struct ip_mc_list::multiaddr type
All users except two expect 32-bit big-endian value. One is of

	->multiaddr = ->multiaddr

variety. And last one is "%08lX".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:37 -07:00
David S. Miller
70df2311ee [TCP]: Fix compile warning in tcp_probe.c
The suseconds_t et al. are not necessarily any particular type on
every platform, so cast to unsigned long so that we can use one printf
format string and avoid warnings across the board

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:35 -07:00
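Note on 70df2311ee: the cast-to-unsigned-long approach can be shown with
a small user-space sketch. The function name and the "%lu.%06lu" layout
are assumptions for illustration and are not taken from tcp_probe.c.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/*
 * Format a timeval as "seconds.microseconds". time_t and suseconds_t
 * have platform-dependent widths, so both are cast to unsigned long and
 * printed with a single %lu-based format string.
 */
static int format_timeval(char *buf, size_t len, const struct timeval *tv)
{
    return snprintf(buf, len, "%lu.%06lu",
                    (unsigned long)tv->tv_sec,
                    (unsigned long)tv->tv_usec);
}
```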
Stephen Hemminger
738980ffa6 [TCP]: Limited slow start for Highspeed TCP
Implementation of RFC3742 limited slow start. Added as part
of the TCP highspeed congestion control module.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:33 -07:00
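Note on 738980ffa6: RFC 3742 bounds slow-start growth once cwnd exceeds
a max_ssthresh parameter. The sketch below follows the RFC's byte-based
formulation; the function name is hypothetical, and the kernel module
counts cwnd in packets, so its code will differ.

```c
#include <stdint.h>

/*
 * Per-ACK cwnd increment (in bytes) during slow start, per RFC 3742:
 *   cwnd <= max_ssthresh:  cwnd += MSS
 *   cwnd >  max_ssthresh:  K = int(cwnd / (0.5 * max_ssthresh));
 *                          cwnd += int(MSS / K)
 * The caller is assumed to have checked cwnd <= ssthresh already.
 */
static uint32_t limited_slow_start_incr(uint32_t cwnd,
                                        uint32_t max_ssthresh,
                                        uint32_t mss)
{
    uint64_t k;

    /* Standard slow start below the limit (or if the limit is unset). */
    if (max_ssthresh == 0 || cwnd <= max_ssthresh)
        return mss;

    /* cwnd / (max_ssthresh / 2), computed without losing the half. */
    k = (2 * (uint64_t)cwnd) / max_ssthresh;

    return (uint32_t)(mss / k);
}
```

Because K grows with cwnd, the per-ACK increment shrinks as the window
gets larger, which caps the growth per round trip.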
Stephen Hemminger
a42e9d6ce8 [TCP]: TCP Probe congestion window tracing
This adds a new module for tracking TCP state variables non-intrusively
using kprobes.  It has a simple /proc interface that outputs one line
for each packet received. A sample usage is to collect congestion
window and ssthresh over time graphs.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:31 -07:00
Stephen Hemminger
72dc5b9225 [TCP]: Minimum congestion window consolidation.
Many of the TCP congestion methods all just use ssthresh
as the minimum congestion window on decrease.  Rather than
duplicating the code, just have that be the default if that
handle in the ops structure is not set.

Minor behaviour change to TCP compound.  It probably wants
to use this (ssthresh) as lower bound, rather than ssthresh/2
because the latter causes undershoot on loss.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:29 -07:00
Stephen Hemminger
a4ed258495 [TCP]: TCP Compound quad root function
The original code did a 64 bit divide directly, which won't work on
32 bit platforms.  Rather than doing a 64 bit square root twice,
just implement a 4th root function in one pass using Newton's method.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:27 -07:00
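Note on a4ed258495: a one-pass integer fourth root with Newton's method
can look like the sketch below. This is not the kernel routine. It runs
in user space and freely uses 64-bit division, which is exactly what the
kernel version has to avoid on 32-bit platforms. The function name is
hypothetical.

```c
#include <stdint.h>

/*
 * Floor of the fourth root of a, via Newton's method on x^4 = a:
 *   x_next = (3*x + a / x^3) / 4
 * Starting from a guess at or above the true root, the iterates
 * decrease until they reach floor(a^(1/4)).
 */
static uint32_t fourth_root_u64(uint64_t a)
{
    unsigned int bits = 0;
    uint64_t t, x, y;

    if (a == 0)
        return 0;

    /* Bit length of a, used to pick a power-of-two starting guess. */
    for (t = a; t != 0; t >>= 1)
        bits++;

    /* 2^ceil(bits/4) >= a^(1/4); at most 2^16, so x^3 cannot overflow. */
    x = 1ULL << ((bits + 3) / 4);

    for (;;) {
        y = (3 * x + a / (x * x * x)) / 4;
        if (y >= x)
            return (uint32_t)x;
        x = y;
    }
}
```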
Angelo P. Castellani
f890f92104 [TCP]: TCP Compound congestion control
TCP Compound is a sender-side only change to TCP that uses
a mixed Reno/Vegas approach to calculate the cwnd.

For further details look here:
  ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf

Signed-off-by: Angelo P. Castellani <angelo.castellani@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:25 -07:00
Bin Zhou
76f1017757 [TCP]: TCP Veno congestion control
TCP Veno module is a new congestion control module to improve TCP
performance over wireless networks. The key innovation in TCP Veno is
the enhancement of TCP Reno/Sack congestion control algorithm by using
the estimated state of a connection based on TCP Vegas. This scheme
significantly reduces "blind" reduction of TCP window regardless of
the cause of packet loss.

This work is based on the research paper "TCP Veno: TCP Enhancement
for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew,
IEEE Journal on Selected Areas in Communication, Feb. 2003.

Original paper and many latest research works on veno:
 http://www.ntu.edu.sg/home/ascpfu/veno/veno.html

Signed-off-by: Bin Zhou <zhou0022@ntu.edu.sg>
	       Cheng Peng Fu <ascpfu@ntu.edu.sg>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:23 -07:00
Wong Hoi Sing Edison
7c106d7e78 [TCP]: TCP Low Priority congestion control
TCP Low Priority is a distributed algorithm whose goal is to utilize only
 the excess network bandwidth as compared to the ``fair share`` of
 bandwidth as targeted by TCP. Available from:
   http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

Original Author:
 Aleksandar Kuzmanovic <akuzma@northwestern.edu>

See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitation of the API, we make the following changes from
the original TCP-LP implementation:
 o We use newReno in most core CA handling. Only add some checking
   within cong_avoid.
 o Error correction for remote HZ: remote HZ is continually checked
   and updated.
 o Handle calculation of One-Way-Delay (OWD) within rtt_sample, since
   OWD has a similar meaning to RTT. Also correct the buggy formula.
 o Handle reaction for Early Congestion Indication (ECI) within
   pkts_acked, as mentioned within pseudo code.
 o OWD is handled in relative format, where the local time stamp will
   be in tcp_time_stamp format.

Port from 2.4.19 to 2.6.16 as module by:
 Wong Hoi Sing Edison <hswong3i@gmail.com>
 Hung Hing Lun <hlhung3i@gmail.com>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:21 -07:00
Andrew Morton
2f45c340e0 [LLC]: Fix double receive of SKB.
Oops fix from Stephen: remove duplicate rcv() calls.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:19 -07:00
Alexey Dobriyan
c45fb1089e [NETFILTER]: PPTP helper: fixup gre_keymap_lookup() return type
GRE keys are 16-bit wide.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:17 -07:00
Patrick McHardy
ae5b7d8ba2 [NETFILTER]: Add SIP connection tracking helper
Add SIP connection tracking helper. Originally written by
Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor
fixes and bidirectional SIP support added by myself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:15 -07:00
Patrick McHardy
e44ab66a75 [NETFILTER]: H.323 helper: replace internal_net_addr parameter by routing-based heuristic
Call Forwarding doesn't need to create an expectation if both peers can
reach each other without our help. The internal_net_addr parameter
lets the user explicitly specify a single network where this is true,
but is not very flexible and even fails in the common case that calls
will both be forwarded to outside parties and inside parties. Use an
optional heuristic based on routing instead, the assumption is that
if both the outgoing device and the gateway are equal, both peers can
reach each other directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:13 -07:00
Jing Min Zhao
c0d4cfd96d [NETFILTER]: H.323 helper: Add support for Call Forwarding
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:11 -07:00
Patrick McHardy
c952616934 [NETFILTER]: amanda helper: convert to textsearch infrastructure
When a port number within a packet is replaced by a differently sized
number only the packet is resized, but not the copy of the data.
Following port numbers are rewritten based on their offsets within
the copy, leading to packet corruption.

Convert the amanda helper to the textsearch infrastructure to avoid
the copy entirely.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:09 -07:00
Patrick McHardy
7d8c501817 [NETFILTER]: FTP helper: search optimization
Instead of skipping search entries for the wrong direction simply index
them by direction.

Based on patch by Pablo Neira <pablo@netfilter.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:07 -07:00
Patrick McHardy
695ecea329 [NETFILTER]: SNMP helper: fix debug module param type
debug is the debug level, not a bool.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:05 -07:00
Patrick McHardy
89f2e21883 [NETFILTER]: ctnetlink: change table dumping not to require an unique ID
Instead of using the ID to find out where to continue dumping, take a
reference to the last entry dumped and try to continue there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:03 -07:00
Patrick McHardy
3726add766 [NETFILTER]: ctnetlink: fix NAT configuration
The current configuration only allows configuring one manip and overloads
conntrack status flags with netlink semantics.

Signed-off-by: Patrick Mchardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:01 -07:00
Eric Leblond
997ae831ad [NETFILTER]: conntrack: add fixed timeout flag in connection tracking
Add a flag to the connection status that prevents the timeout from being
updated. This makes it possible to have connections that automatically
die at a given time.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:59 -07:00
Patrick McHardy
39a27a35c5 [NETFILTER]: conntrack: add sysctl to disable checksumming
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:57 -07:00
Patrick McHardy
6442f1cf89 [NETFILTER]: conntrack: don't call helpers for related ICMP messages
None of the existing helpers expects to get called for related ICMP
packets and some even drop them if they can't parse them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:55 -07:00
Patrick McHardy
404bdbfd24 [NETFILTER]: recent match: replace by rewritten version
Replace the unmaintainable ipt_recent match by a rewritten version that
should be fully compatible.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:53 -07:00
Patrick McHardy
f3389805e5 [NETFILTER]: x_tables: add statistic match
Add statistic match which is a combination of the nth and random matches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:51 -07:00
Patrick McHardy
62b7743483 [NETFILTER]: x_tables: add quota match
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:49 -07:00
Patrick McHardy
957dc80ac3 [NETFILTER]: x_tables: add SCTP/DCCP support where missing
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:47 -07:00
Patrick McHardy
3e72b2fe5b [NETFILTER]: x_tables: remove some unnecessary casts
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:45 -07:00
Herbert Xu
31a4ab9302 [IPSEC] proto: Move transport mode input path into xfrm_mode_transport
Now that we have xfrm_mode objects we can move the transport mode specific
input decapsulation code into xfrm_mode_transport.  This removes duplicate
code as well as unnecessary header movement in case of tunnel mode SAs
since we will discard the original IP header immediately.

This also fixes a minor bug for transport-mode ESP where the IP payload
length is set to the correct value minus the header length (with extension
headers for IPv6).

Of course the other neat thing is that we no longer have to allocate
temporary buffers to hold the IP headers for ESP and IPComp.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:41 -07:00
Herbert Xu
b59f45d0b2 [IPSEC] xfrm: Abstract out encapsulation modes
This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:39 -07:00
Herbert Xu
546be2405b [IPSEC] xfrm: Undo afinfo lock proliferation
The number of locks used to manage afinfo structures can easily be reduced
down to one each for policy and state respectively.  This is based on the
observation that the write locks are only held by module insertion/removal
which are very rare events so there is no need to further differentiate
between the insertion of modules like ipv6 versus esp6.

The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
suspicious at first.  However, after you realise that nobody ever takes
the corresponding write lock you'll feel better :)

As far as I can gather it's an attempt to guard against the removal of
the corresponding modules.  Since neither module can be unloaded at all
we can leave it to whoever fixes up IPv6 unloading :)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:37 -07:00
David S. Miller
15986e1aad [TCP]: tcp_rcv_rtt_measure_ts() call in pure-ACK path is superfluous
We only want to take receive RTT measurements for data
bearing frames, here in the header prediction fast path
for a pure-sender, we know that we have a pure-ACK and
thus the checks in tcp_rcv_rtt_measure_ts() will not pass.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:16 -07:00
Stephen Hemminger
11dc1f36a6 [BRIDGE]: netlink interface for link management
Add basic netlink support to the Ethernet bridge. Including:
 * dump interfaces in bridges
 * monitor link status changes
 * change state of bridge port

For some demo programs see:
	http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz

These are to allow building a daemon that does alternative
implementations of Spanning Tree Protocol.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:14 -07:00
Stephen Hemminger
c090971326 [BRIDGE]: fix module startup error handling
Return address in use, if some other kernel code has the SAP.
Propagate out error codes from netfilter registration and unwind.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:12 -07:00
Stephen Hemminger
9ef513bed6 [BRIDGE]: optimize conditional in forward path
Small optimizations of bridge forwarding path.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:10 -07:00
Stephen Hemminger
bc0e646796 [LLC]: add multicast support for datagrams
Allow multicast reception of datagrams (similar to UDP).
All sockets bound to the same SAP receive a clone.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:08 -07:00
Stephen Hemminger
8f182b494f [LLC]: allow applications to get copy of kernel datagrams
It is legal for an application to bind to a SAP that is also being
used by the kernel. This happens if the bridge module binds to the
STP SAP, and the user wants to have a daemon for STP as well.
It is possible to have kernel doing STP on one bridge, but
let application do RSTP on another bridge.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:06 -07:00
Stephen Hemminger
23dbe7912d [LLC]: use rcu_dereference on receive handler
The receive handler pointer might be modified during network changes
of protocol. So use rcu_dereference (only matters on alpha).

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:04 -07:00
Stephen Hemminger
29efcd2666 [LLC]: allow datagram recvmsg
LLC receive is broken for SOCK_DGRAM.
If an application does recv() on a datagram socket and there
is no data present, don't return "not connected". Instead, just
do normal datagram semantics.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:02 -07:00
Stephen Hemminger
aecbd4e45c [LLC]: use more efficient ether address routines
Use more cache efficient Ethernet address manipulation functions
in etherdevice.h.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-06-17 21:26:00 -07:00
Chris Leech
1a2449a87b [I/OAT]: TCP recv offload to I/OAT
Locks down user pages and sets up for DMA in tcp_recvmsg, then calls
dma_async_try_early_copy in tcp_v4_do_rcv

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:56 -07:00
Chris Leech
9593782585 [I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold
Any socket recv of less than this amount will not be offloaded

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:54 -07:00
Chris Leech
624d116473 [I/OAT]: Make sk_eat_skb I/OAT aware.
Add an extra argument to sk_eat_skb, and make it move early copied
packets to the async_wait_queue instead of freeing them.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:52 -07:00
Chris Leech
0e4b4992b8 [I/OAT]: Rename cleanup_rbuf to tcp_cleanup_rbuf and make non-static
Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:50 -07:00
Chris Leech
97fc2f0848 [I/OAT]: Structure changes for TCP recv offload to I/OAT
Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:48 -07:00
Chris Leech
de5506e155 [I/OAT]: Utility functions for offloading sk_buff to iovec copies
Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:46 -07:00
Chris Leech
db21733488 [I/OAT]: Setup the networking subsystem as a DMA client
Attempts to allocate per-CPU DMA channels

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:24:58 -07:00
Sean Hefty
a1e8733e55 [NET]: Export ip_dev_find()
Export ip_dev_find() to allow locating a net_device given an IP address.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:28 -07:00
Larry Finger
7bd6b91800 [PATCH] wireless: correct dump of WPA IE
In net/ieee80211/softmac/ieee80211softmac_wx.c, there is a bug that
prints extended sign information whenever the byte value exceeds
0x7f. The following patch changes the printk to use a u8 cast to limit
the output to 2 digits. This bug was first noticed by Dan Williams
<dcbw@redhat.com>. This patch applies to the current master branch
of the Linville tree.

Signed-Off-By: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-15 15:48:14 -04:00
Jeff Garzik
b5ed7639c9 Merge branch 'master' into upstream 2006-06-13 20:29:04 -04:00
John W. Linville
76df73ff90 Merge branch 'from-linus' into upstream 2006-06-13 15:38:11 -04:00
Weidong
42d1d52e69 [IPV4]: Increment ipInHdrErrors when TTL expires.
Signed-off-by: Weidong <weid@nanjing-fnst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-12 13:09:59 -07:00
Aki M Nyrhinen
79320d7e14 [TCP]: continued: reno sacked_out count fix
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi>

IMHO the current fix to the problem (in_flight underflow in reno)
is incorrect.  it treats the symptoms but ignores the problem. the
problem is timing out packets other than the head packet when we
don't have sack. i try to explain (sorry if explaining the obvious).

with sack, scanning the retransmit queue for timed out packets is
fine because we know which packets in our retransmit queue have been
acked by the receiver.

without sack, we know only how many packets in our retransmit queue the
receiver has acknowledged, but no idea which packets.

think of a "typical" slow-start overshoot case, where for example
every third packet in a window gets lost because a router buffer gets
full.

with sack, we check for timeouts on those every third packet (as the
rest have been sacked). the packet counting works out and if there
is no reordering, we'll retransmit exactly the packets that were 
lost.

without sack, however, we check for timeout on every packet and end up
retransmitting consecutive packets in the retransmit queue. in our
slow-start example, 2/3 of those retransmissions are unnecessary. these
unnecessary retransmissions eat the congestion window and eventually
prevent fast recovery from continuing, if enough packets were lost.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:18:56 -07:00
Andrea Bittau
afec35e3fe [DCCP] Ackvec: fix soft lockup in ackvec handling code
A soft lockup existed in the handling of ack vector records.
Specifically, when a tail of the list of ack vector records was
removed, it was possible to end up iterating infinitely on an element
of the tail.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:08:03 -07:00
Trond Myklebust
81039f1f20 NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:34 -04:00
Trond Myklebust
8b23ea7bed RPC: Allow struct xdr_stream to read the page section of an xdr_buf
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:21 -04:00
Trond Myklebust
1f5ce9e93a VFS: Unexport do_kern_mount() and clean up simple_pin_fs()
Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:16 -04:00
Chuck Lever
bf3fcf8955 SUNRPC: NFS_ROOT always uses the same XIDs
The XID generator uses get_random_bytes to generate an initial XID.
NFS_ROOT starts up before the random driver, though, so get_random_bytes
doesn't set a random XID for NFS_ROOT.  This causes NFS_ROOT mount points
to reuse XIDs every time the client is booted.  If the client boots often
enough, the server will start serving old replies out of its DRC.

Use net_random() instead.

Test plan:
I/O intensive workloads should perform well and generate no errors.  Traces
taken during client reboots should show that NFS_ROOT mounts use unique
XIDs after every reboot.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:06 -04:00
Chuck Lever
b85d880684 SUNRPC: select privileged port numbers at random
Make the RPC client select privileged ephemeral source ports at
random.  This improves DRC behavior on the server by using the
same port when reconnecting for the same mount point, but using
a different port for fresh mounts.

The Linux TCP implementation already does this for nonprivileged
ports.  Note that TCP sockets in TIME_WAIT will prevent quick reuse
of a random ephemeral port number by leaving the port INUSE until
the connection transitions out of TIME_WAIT.

Test plan:
Connectathon against every known server implementation using multiple
mount points.  Locking especially.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:05 -04:00
Jeff Garzik
ba9b28d19a Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-06-08 15:48:25 -04:00
Jeff Garzik
d15a88fc21 Merge branch 'master' into upstream 2006-06-08 15:24:46 -04:00
Jiri Benc
36485707bb [BRIDGE]: fix locking and memory leak in br_add_bridge
There are several bugs in error handling in br_add_bridge:
- when dev_alloc_name fails, allocated net_device is not freed
- unregister_netdev is called when rtnl lock is held
- free_netdev is called before netdev_run_todo has a chance to be run after
  unregistering net_device

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 16:39:34 -07:00
Florin Malita
8c893ff6ab [IRDA]: Missing allocation result check in irlap_change_speed().
The skb allocation may fail, which can result in a NULL pointer dereference
in irlap_queue_xmit().

Coverity CID: 434.

Signed-off-by: Florin Malita <fmalita@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:34:52 -07:00
Jes Sorensen
6569a351da [NET]: Eliminate unused /proc/sys/net/ethernet
The /proc/sys/net/ethernet directory has been sitting empty for more than
10 years!  Time to eliminate it!

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:34:11 -07:00
Herbert Xu 许志壬
f291196979 [TCP]: Avoid skb_pull if possible when trimming head
Trimming the head of an skb by calling skb_pull can cause the packet
to become unaligned if the length pulled is odd.  Since the length is
entirely arbitrary for a FIN packet carrying data, this is actually
quite common.

Unaligned data is not the end of the world, but we should avoid it if
it's easily done.  In this case it is trivial.  Since we're discarding
all of the head data it doesn't matter whether we move skb->data forward
or back.

However, it is still possible to have unaligned skb->data in general.
So network drivers should be prepared to handle it instead of crashing.

This patch also adds an unlikely marking on len < headlen since partial
ACKs on head data are extremely rare in the wild.  As the return value
of __pskb_trim_head is no longer ever NULL that has been removed.

Signed-off-by: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-05 15:03:37 -07:00
Joseph Jezak
c4b3d1bb32 [PATCH] softmac: unified capabilities computation
This patch moves the capabilities field computation to a function for clarity
and adds some previously unimplemented bits.

Signed-off-by: Joseph Jezak <josejx@gentoo.org>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-By: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:30 -04:00
Daniel Drake
6ae15df16e [PATCH] softmac: Fix handling of authentication failure
My router blew up earlier, but exhibited some interesting behaviour during
its dying moments. It was broadcasting beacons but wouldn't respond to
any authentication requests.

I noticed that softmac wasn't playing nice with this, as I couldn't make it try
to connect to other networks after it had timed out authenticating to my ill
router.

To resolve this, I modified the softmac event/notify API to pass the event
code to the callback, so that callbacks being notified from
IEEE80211SOFTMAC_EVENT_ANY masks can make some judgement. In this case, the
ieee80211softmac_assoc callback needs to make a decision based upon whether
the association passed or failed.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:30 -04:00
Daniel Drake
76ea4c7f4c [PATCH] softmac: complete shared key authentication
This patch finishes off the partially-complete shared key authentication
implementation in softmac.

The complication here is that we need to encrypt a management frame during
the authentication process. I don't think there are any other scenarios where
this would have to happen.

To get around this without causing too many headaches, we decided to just use
software encryption for this frame. The softmac config option now selects
IEEE80211_CRYPT_WEP so that we can ensure this is available. This also involved
a modification to some otherwise unused ieee80211 API.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:51:29 -04:00
Toralf Förster
47fbe1bf39 [PATCH] ieee80211softmac_io.c: fix warning "defined but not used"
Got this compiler warning and Johannes Berg <johannes@sipsolutions.net>
wrote:

Yeah, known 'bug', we have that code there but never use it. Feel free
to submit a patch (to John Linville, CC netdev and softmac-dev) to
remove it.

Signed-off-by: Toralf Foerster <toralf.foerster@gmx.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-06-05 15:48:31 -04:00
John W. Linville
dea58b80f2 Merge branch 'from-linus' into upstream 2006-06-05 14:42:27 -04:00
Stephen Hemminger
fb80a6e1a5 [TCP] tcp_highspeed: Fix problem observed by Xiaoliang (David) Wei
When snd_cwnd is smaller than 38 and the connection is in
congestion avoidance phase (snd_cwnd > snd_ssthresh), the snd_cwnd
seems to stop growing.

The additive increase was confused because C arrays are 0-based.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-02 17:51:08 -07:00
Alexey Dobriyan
7114b0bb6d [NETFILTER]: PPTP helper: fix sstate/cstate typo
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:51:05 -07:00
Patrick McHardy
ca3ba88d0c [NETFILTER]: mark H.323 helper experimental
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:50:40 -07:00
Marcel Holtmann
6c813c3fe9 [NETFILTER]: Fix small information leak in SO_ORIGINAL_DST (CVE-2006-1343)
It appears that sockaddr_in.sin_zero is not zeroed during
getsockopt(...SO_ORIGINAL_DST...) operation. This can lead
to an information leak (CVE-2006-1343).
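
A minimal user-space sketch of the general remedy, assuming a
hypothetical helper fill_addr() that is not taken from the patch: zero
the whole structure before filling in the fields, so that sin_zero and
any padding never carry stale bytes back to the caller.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build a sockaddr_in that is safe to copy out. */
static void fill_addr(struct sockaddr_in *sin, in_addr_t addr, in_port_t port)
{
    memset(sin, 0, sizeof(*sin));   /* clears sin_zero and any padding */
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = addr;    /* assumed already in network byte order */
    sin->sin_port = port;           /* assumed already in network byte order */
}
```

Whether the actual fix clears the whole structure or only sin_zero is
not stated in this message; either approach closes the leak.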

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-28 22:50:18 -07:00
Jeff Garzik
cbc696a5fa Merge branch 'upstream-fixes' into upstream 2006-05-26 21:26:34 -04:00
Stephen Hemminger
3041a06909 [NET]: dev.c comment fixes
Noticed that dev_alloc_name() comment was incorrect, and more spellung
errors.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26 13:25:24 -07:00
YOSHIFUJI Hideaki
4d0c591166 [IPV6] ROUTE: Don't try less preferred routes for on-link routes.
In addition to the real on-link routes, NONEXTHOP routes
should be considered on-link.

Problem reported by Meelis Roos <mroos@linux.ee>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-26 13:23:41 -07:00
John W. Linville
f587fb74b2 Merge branch 'from-linus' into upstream 2006-05-26 16:06:58 -04:00
Jeff Garzik
db21e578e5 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-05-24 00:29:57 -04:00
Jeff Garzik
d99ef36ed7 Merge branch 'master' into upstream 2006-05-24 00:27:05 -04:00
Stephen Hemminger
387e2b0439 [BRIDGE]: need to ref count the LLC sap
Bridge will OOPS on removal if other application has the SAP open.
The bridge SAP might be shared with other usages, so need
to do reference counting on module removal rather than explicit
close/delete.

Since packet might arrive after or during removal, need to clear
the receive function handle, so LLC only hands it to user (if any).

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 15:20:25 -07:00
Chris Wright
4a06373913 [NETFILTER]: SNMP NAT: fix memleak in snmp_object_decode
If kmalloc fails, error path leaks data allocated from asn1_oid_decode().

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 15:15:13 -07:00
Patrick McHardy
4d942d8b39 [NETFILTER]: H.323 helper: fix sequence extension parsing
When parsing unknown sequence extensions the "son"-pointer points behind
the last known extension for this type, don't try to interpret it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 15:15:10 -07:00
Patrick McHardy
7185989db4 [NETFILTER]: H.323 helper: fix parser error propagation
The condition "> H323_ERROR_STOP" can never be true since H323_ERROR_STOP
is positive and is the highest possible return code, while real errors are
negative. Fix the checks. Also only abort on real errors in some spots
that were just interpreting any return value != 0 as error.

Fixes crashes caused by use of stale data after a parsing error occurred:

BUG: unable to handle kernel paging request at virtual address bfffffff
 printing eip:
c01aa0f8
*pde = 1a801067
*pte = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: ip_nat_h323 ip_conntrack_h323 nfsd exportfs sch_sfq sch_red cls_fw sch_hfsc  xt_length ipt_owner xt_MARK iptable_mangle nfs lockd sunrpc pppoe pppoxx
CPU:    0
EIP:    0060:[<c01aa0f8>]    Not tainted VLI
EFLAGS: 00210646   (2.6.17-rc4 #8)
EIP is at memmove+0x19/0x22
eax: d77264e9   ebx: d77264e9   ecx: e88d9b17   edx: d77264e9
esi: bfffffff   edi: bfffffff   ebp: de6a7680   esp: c0349db8
ds: 007b   es: 007b   ss: 0068
Process asterisk (pid: 3765, threadinfo=c0349000 task=da068540)
Stack: <0>00000006 c0349e5e d77264e3 e09a2b4e e09a38a0 d7726052 d7726124 00000491
       00000006 00000006 00000006 00000491 de6a7680 d772601e d7726032 c0349f74
       e09a2dc2 00000006 c0349e5e 00000006 00000000 d76dda28 00000491 c0349f74
Call Trace:
 [<e09a2b4e>] mangle_contents+0x62/0xfe [ip_nat]
 [<e09a2dc2>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat]
 [<e0a2712d>] set_addr+0x74/0x14c [ip_nat_h323]
 [<e0ad531e>] process_setup+0x11b/0x29e [ip_conntrack_h323]
 [<e0ad534f>] process_setup+0x14c/0x29e [ip_conntrack_h323]
 [<e0ad57bd>] process_q931+0x3c/0x142 [ip_conntrack_h323]
 [<e0ad5dff>] q931_help+0xe0/0x144 [ip_conntrack_h323]
...

Found by the PROTOS c07-h2250v4 testsuite.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 15:15:08 -07:00
Jeff Garzik
9528454f9c Merge branch 'master' into upstream 2006-05-23 17:20:58 -04:00
Linus Torvalds
9cfe864842 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NETFILTER]: SNMP NAT: fix memory corruption
  [IRDA]: fixup type of ->lsap_state
  [IRDA]: fix 16/32 bit confusion
  [NET]: Fix "ntohl(ntohs" bugs
  [BNX2]: Use kmalloc instead of array
  [BNX2]: Fix bug in bnx2_nvram_write()
  [TG3]: Add some missing rx error counters
2006-05-23 10:40:19 -07:00
NeilBrown
f2d395865f [PATCH] knfsd: Fix two problems that can cause rmmod nfsd to die
Both cause the 'entries' count in the export cache to be non-zero at module
removal time, so unregistering that cache fails and results in an oops.

1/ exp_pseudoroot (used for NFSv4 only) leaks a reference to an export
   entry.
2/ sunrpc_cache_update doesn't increment the entries count when it adds
   an entry.

Thanks to "david m.  richter" <richterd@citi.umich.edu> for triggering the
problem and finding one of the bugs.

Cc: "david m. richter" <richterd@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-23 10:35:31 -07:00
Patrick McHardy
f41d5bb1d9 [NETFILTER]: SNMP NAT: fix memory corruption
Fix memory corruption caused by snmp_trap_decode:

- When snmp_trap_decode fails before the id and address are allocated,
  the pointers contain random memory, but are freed by the caller
  (snmp_parse_mangle).

- When snmp_trap_decode fails after allocating just the ID, it tries
  to free both address and ID, but the address pointer still contains
  random memory. The caller frees both ID and random memory again.

- When snmp_trap_decode fails after allocating both, it frees both,
  and the caller frees both again.
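
One conventional way to rule out all three cases can be sketched as
follows. This is an illustration under stated assumptions, not the code
from the patch: struct trap, trap_decode() and the fail_at parameter are
hypothetical names used only to simulate the failure points listed above.

```c
#include <stdlib.h>

/* Hypothetical decoded trap with two heap-allocated fields. */
struct trap {
    unsigned long *id;
    unsigned char *addr;
};

/*
 * Returns 1 on success. On failure returns 0 and leaves both pointers
 * NULL, so nothing is freed twice and nothing uninitialized is freed.
 * fail_at simulates an error: 0 = none, 1 = before id, 2 = after id.
 */
static int trap_decode(struct trap *t, int fail_at)
{
    t->id = NULL;                   /* never leave random memory behind */
    t->addr = NULL;
    if (fail_at == 1)
        return 0;
    t->id = malloc(sizeof(*t->id));
    if (!t->id || fail_at == 2)
        goto err;
    t->addr = malloc(4);
    if (!t->addr)
        goto err;
    return 1;
err:
    free(t->id);                    /* free(NULL) is a no-op */
    t->id = NULL;
    free(t->addr);
    t->addr = NULL;
    return 0;
}
```

With this ownership rule the caller frees the fields only after a
successful decode.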

The corruption can be triggered remotely when the ip_nat_snmp_basic
module is loaded and traffic on port 161 or 162 is NATed.

Found by multiple testcases of the trap-app and trap-enc groups of the
PROTOS c06-snmpv1 testsuite.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22 16:55:14 -07:00
Alexey Dobriyan
405a42c5c8 [IRDA]: fix 16/32 bit confusion
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22 16:54:08 -07:00
Alexey Dobriyan
4195f81453 [NET]: Fix "ntohl(ntohs" bugs
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-22 16:53:22 -07:00
John W. Linville
3b38f317e5 Merge branch 'from-linus' into upstream 2006-05-22 14:26:25 -04:00
Jeff Garzik
badc48e660 Merge branch 'master' into upstream 2006-05-20 00:03:38 -04:00
Vladislav Yasevich
b89498a1c2 [SCTP]: Allow linger to abort 1-N style sockets.
Enable SO_LINGER functionality for 1-N style sockets. The socket API
draft will be clarified to allow for this functionality. The linger
settings will apply to all associations on a given socket.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 14:32:06 -07:00
Vladislav Yasevich
a601266e4f [SCTP]: Validate the parameter length in HB-ACK chunk.
If SCTP receives a badly formatted HB-ACK chunk, it is possible
that we may access invalid memory and potentially have a buffer
overflow.  We should really make sure that the chunk format is
what we expect, before attempting to touch the data.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 14:25:53 -07:00
Vladislav Yasevich
61c9fed416 [SCTP]: A better solution to fix the race between sctp_peeloff() and
sctp_rcv().

The goal is to hold the ref on the association/endpoint throughout the
state-machine process.  We accomplish this as follows:

  /* ref on the assoc/ep is taken during lookup */

  if owned_by_user(sk)
 	sctp_add_backlog(skb, sk);
  else
 	inqueue_push(skb, sk);

  /* drop the ref on the assoc/ep */

However, in sctp_add_backlog() we take the ref on assoc/ep and hold it
while the skb is on the backlog queue.  This allows us to get rid of the
sock_hold/sock_put in the lookup routines.

Now sctp_backlog_rcv() needs to account for potential association move.
In the unlikely event that association moved, we need to retest if the
new socket is locked by user.  If we don't do this, we may have two packets
racing up the stack toward the same socket and we can't deal with it.
If the new socket is still locked, we'll just add the skb to its backlog
continuing to hold the ref on the association.  This gets rid of the
need to move packets from one backlog to another and it is also safe in
case new packets arrive on the same backlog queue.

The last step is to lock the new socket when we are moving the
association to it.  This is needed in case any new packets arrive on
the association when it moved.  We want these to go to the backlog since
we would like to avoid the race between this new packet and a packet
that may be sitting on the backlog queue of the old socket toward the
same association.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 11:01:18 -07:00
Sridhar Samudrala
8de8c87380 [SCTP]: Set sk_err so that poll wakes up after a non-blocking connect failure.
Also fix some other cases where sk_err is not set for 1-1 style sockets.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
2006-05-19 10:58:12 -07:00
Patrick McHardy
ee433530d9 [NETFILTER]: nfnetlink_log: fix byteorder confusion
flags is a u16, so use htons instead of htonl. Also avoid double
conversion.
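
A minimal user-space sketch of the point about field width, assuming a
hypothetical helper put_flags() that is not part of nfnetlink_log: a
16-bit value must be converted with htons(), exactly once, before it is
written to the wire.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 16-bit flags value in network byte order. */
static void put_flags(unsigned char *buf, uint16_t flags)
{
    uint16_t wire = htons(flags);   /* 16-bit field: htons, not htonl */

    memcpy(buf, &wire, sizeof(wire));
}
```

Using htonl() here and truncating the result to 16 bits would yield the
wrong value on little-endian hosts, and converting twice undoes the
first conversion.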

Noticed by Alexey Dobriyan <adobriyan@gmail.com>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:17:18 -07:00
Solar Designer
2c8ac66bb2 [NETFILTER]: Fix do_add_counters race, possible oops or info leak (CVE-2006-0039)
Solar Designer found a race condition in do_add_counters(). The beginning
of paddc is supposed to be the same as tmp which was sanity-checked
above, but it might not be the same in reality. In case the integer
overflow and/or the race condition are triggered, paddc->num_counters
might not match the allocation size for paddc. If the check below
(t->private->number != paddc->num_counters) nevertheless passes (perhaps
this requires the race condition to be triggered), IPT_ENTRY_ITERATE()
would read kernel memory beyond the allocation size, potentially causing
an oops or leaking sensitive data (e.g., passwords from host system or
from another VPS) via counter increments. This requires CAP_NET_ADMIN.

Signed-off-by: Solar Designer <solar@openwall.com>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:16:52 -07:00
Alexey Dobriyan
a467704dcb [NETFILTER]: GRE conntrack: fix htons/htonl confusion
GRE keys are 16 bit.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:16:29 -07:00
Philip Craig
5c170a09d9 [NETFILTER]: fix format specifier for netfilter log targets
The prefix argument for nf_log_packet is a format specifier,
so don't pass the user defined string directly to it.
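
The same class of bug can be sketched in user space with snprintf().
The helper log_prefix() below is hypothetical and only illustrates the
pattern: pass the untrusted string as an argument to a fixed "%s"
format instead of using it as the format itself.

```c
#include <stdio.h>

/* Hypothetical logger: 'prefix' is user supplied and may contain '%'. */
static int log_prefix(char *out, size_t len, const char *prefix)
{
    /*
     * Unsafe variant: snprintf(out, len, prefix);
     * That would interpret "%s", "%n", ... found inside the prefix.
     */
    return snprintf(out, len, "%s", prefix);
}
```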

Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:15:47 -07:00
Jesper Juhl
493e2428aa [NETFILTER]: Fix memory leak in ipt_recent
The Coverity checker spotted that we may leak 'hold' in
net/ipv4/netfilter/ipt_recent.c::checkentry() when the following
is true:
  if (!curr_table->status_proc) {
    ...
    if(!curr_table) {
    ...
      return 0;  <-- here we leak.
Simply moving an existing vfree(hold); up a bit avoids the possible leak.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-19 02:15:13 -07:00
John W. Linville
5dd8816aeb Merge branch 'from-linus' into upstream 2006-05-17 14:51:24 -04:00
Angelo P. Castellani
8872d8e1c4 [TCP]: reno sacked_out count fix
From: "Angelo P. Castellani" <angelo.castellani+lkml@gmail.com>

Using NewReno, if a sk_buff is timed out and is accounted as lost_out,
it should also be removed from the sacked_out.

This is necessary because recovery using NewReno fast retransmit could
take many RTTs, and the sk_buff RTO can expire without the sk_buff
actually being lost.

left_out = sacked_out + lost_out
in_flight = packets_out - left_out + retrans_out

Using NewReno without this patch, on very large network losses,
left_out becomes bigger than packets_out + retrans_out (!!).

For this reason the unsigned integer in_flight wraps around to 2^32 - something.
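
The arithmetic can be reproduced with a small sketch. The struct and
function below are hypothetical stand-ins for the counters named above,
assuming 32-bit unsigned fields; they are not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-connection packet counters. */
struct counters {
    uint32_t packets_out;
    uint32_t sacked_out;
    uint32_t lost_out;
    uint32_t retrans_out;
};

/* in_flight = packets_out - left_out + retrans_out, in unsigned math. */
static uint32_t in_flight(const struct counters *c)
{
    uint32_t left_out = c->sacked_out + c->lost_out;

    return c->packets_out - left_out + c->retrans_out;
}
```

With packets_out = 10, sacked_out = 8, lost_out = 5 and retrans_out = 1,
left_out is 13 and the result wraps to 2^32 - 2 instead of going
negative.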

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 21:42:11 -07:00
Alexey Dobriyan
d8fd0a7316 [IPV6]: Endian fix in net/ipv6/netfilter/ip6t_eui64.c:match().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:24:41 -07:00
Adrian Bunk
6599519e9c [TR]: Remove an unused export.
This patch removes the unused EXPORT_SYMBOL(tr_source_route).

(Note, the usage in net/llc/llc_output.c can't be modular.)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:23:40 -07:00
Alexey Dobriyan
4ac396c046 [IPX]: Correct return type of ipx_map_frame_type().
Casting BE16 to int and back may or may not work. Correct, to be sure.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:17:49 -07:00
Alexey Dobriyan
53d42f5412 [IPX]: Correct argument type of ipxrtr_delete().
A single caller passes __u32. Inside function "net" is compared with
__u32 (__be32 really, just wasn't annotated).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:07:28 -07:00
Stephen Hemminger
338f7566e5 [PKT_SCHED]: Potential jiffy wrap bug in dev_watchdog().
There is a potential jiffy wraparound bug in the transmit watchdog
that is easily avoided by using time_after().
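
A minimal sketch of why the comparison style matters, using a
hypothetical 32-bit tick counter in user space. The kernel's
time_after() macro is built on the same signed-difference idea, but the
names below are illustrative only.

```c
#include <stdint.h>

/* Hypothetical 32-bit tick counter standing in for jiffies. */
typedef uint32_t ticks_t;

/* Naive comparison: gives the wrong answer once the counter wraps. */
static int naive_after(ticks_t a, ticks_t b)
{
    return a > b;
}

/*
 * Wraparound-safe: true if a is after b, provided the two values are
 * less than 2^31 ticks apart. The unsigned difference is reinterpreted
 * as a signed value.
 */
static int wrap_safe_after(ticks_t a, ticks_t b)
{
    return (int32_t)(b - a) < 0;
}
```

For a deadline taken just before the wrap (0xFFFFFFF0) and a current
time just after it (5), the naive test reports that the deadline has
not passed, while the signed-difference test reports that it has.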

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-16 15:02:12 -07:00
Simon Kelley
bd89efc532 [NEIGH]: Fix IP-over-ATM and ARP interaction.
The classical IP over ATM code maintains its own IPv4 <-> <ATM stuff>
ARP table, using the standard neighbour-table code. The
neigh_table_init function adds this neighbour table to a linked list
of all neighbor tables which is used by the functions neigh_delete()
neigh_add() and neightbl_set(), all called by the netlink code.

Once the ATM neighbour table is added to the list, there are two
tables with family == AF_INET there, and ARP entries sent via netlink
go into the first table with matching family. This is indeterminate
and often wrong.

To see the bug, on a kernel with CLIP enabled, create a standard IPv4
ARP entry by pinging an unused address on a local subnet. Then attempt
to complete that entry by doing

ip neigh replace <ip address> lladdr <some mac address> nud reachable

Looking at the ARP tables by using 

ip neigh show

will reveal two ARP entries for the same address. One of these can be
found in /proc/net/arp, and the other in /proc/net/atm/arp.

This patch adds a new function, neigh_table_init_no_netlink() which
does everything the neigh_table_init() does, except add the table to
the netlink all-arp-tables chain. In addition neigh_table_init() has a
check that all tables on the chain have a distinct address family.
The init call in clip.c is changed to call
neigh_table_init_no_netlink().

Since ATM ARP tables are rather more complicated than can currently be
handled by the available rtattrs in the netlink protocol, no
functionality is lost by this patch, and non-ATM ARP manipulation via
netlink is rescued. A more complete solution would involve a rtattr
for ATM ARP entries and some way for the netlink code to give
neigh_add and friends more information than just address family with
which to find the correct ARP table.

[ I've changed the assertion checking in neigh_table_init() to not
  use BUG_ON() while holding neigh_tbl_lock.  Instead we remember that
  we found an existing tbl with the same family, and after dropping
  the lock we'll give a diagnostic kernel log message and a stack dump.
  -DaveM ]

Signed-off-by: Simon Kelley <simon@thekelleys.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-12 14:56:08 -07:00
Patrick McHardy
210525d65d [NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()
When deleting the last child the level of a class should drop to zero.

Noticed by Andreas Mueller <andreas@stapelspeicher.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-11 12:22:03 -07:00
Alexey Kuznetsov
b0013fd47b [IPV6]: skb leakage in inet6_csk_xmit
inet6_csk_xmit does not free skb when routing fails.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:24:38 -07:00
Stephen Hemminger
ac05202e8b [BRIDGE]: Do sysfs registration inside rtnl.
Now that netdevice sysfs registration is done as part of
register_netdevice, bridge code no longer has to be tricky when adding
its kobjects to bridges.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:21:53 -07:00
Stephen Hemminger
b17a7c179d [NET]: Do sysfs registration as part of register_netdevice.
The last step of netdevice registration was being done by a delayed
call, but because it was delayed, it was impossible to return any error
code if the class_device registration failed.

Side effects:
 * one state in registration process is unnecessary.
 * register_netdevice can sleep inside class_device registration/hotplug
 * code in netdev_run_todo only does unregistration so it is simpler.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:21:17 -07:00
Herbert Xu
8c1056839e [NET] linkwatch: Handle jiffies wrap-around
The test used in the linkwatch does not handle wrap-arounds correctly.
Since the intention of the code is to eliminate bursts of messages we
can afford to delay things up to a second.  Using that fact we can
easily handle wrap-arounds by making sure that we don't delay things
by more than one second.

This is based on diagnosis and a patch by Stefan Rompf.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stefan Rompf <stefan@loplof.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:27:54 -07:00
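
The wrap-around handling described in the linkwatch message above can be sketched with plain unsigned arithmetic. This is an illustration under assumptions, not the patch itself: `DEMO_HZ` (ticks per second) and `demo_linkwatch_delay` are hypothetical names. Unsigned subtraction is modular, so a deadline that lies just past the counter wrap still yields a small difference, while a deadline that has already passed yields a huge one, which is then treated as "fire now".

```c
#define DEMO_HZ 1000UL /* assumed tick rate, for illustration only */

/*
 * Ticks to wait before the next event. Because events are never meant
 * to be delayed by more than one second, any difference above DEMO_HZ
 * can only mean the deadline is already in the past.
 */
unsigned long demo_linkwatch_delay(unsigned long now, unsigned long next_event)
{
    unsigned long delay = next_event - now; /* modular, wrap-safe */

    if (delay > DEMO_HZ)
        delay = 0;
    return delay;
}
```

The bound of one second is what makes the comparison unambiguous: no ordering test on the raw counter values is needed.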
Adrian Bunk
11766199a0 [IRDA]: Removing unused EXPORT_SYMBOLs
This patch removes the following unused EXPORT_SYMBOL's:
- irias_find_attrib
- irias_new_string_value
- irias_new_octseq_value

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:25:25 -07:00
Alan Stern
f07d5b9465 [NET]: Make netdev_chain a raw notifier.
From: Alan Stern <stern@rowland.harvard.edu>

This chain does its own locking via the RTNL semaphore, and
can also run recursively so adding a new mutex here was causing
deadlocks.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:23:03 -07:00
Wei Yongjun
63cbd2fda3 [IPV4]: ip_options_fragment() has no effect on fragmentation
Fix the options pointer in ip_options_fragment(). optptr wrongly pointed
to the IPv4 header; it should point to the IPv4 options.

Signed-off-by: Wei Yongjun <weiyj@soft.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-09 15:18:50 -07:00
Stephen Hemminger
23aee82e75 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2006-05-08 16:01:20 -07:00
Hua Zhong
0182bd2b1e [IPV4]: Remove likely in ip_rcv_finish()
This is another result from my likely profiling tool
(dwalker@mvista.com just sent the patch of the profiling tool to
linux-kernel mailing list, which is similar to what I use).

On my system (not very busy, normal development machine within a
VMWare workstation), I see a 6/5 miss/hit ratio for this "likely".

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 18:11:39 -07:00
Stephen Hemminger
fe9925b551 [NET]: Create netdev attribute_groups with class_device_add
Atomically create attributes when class device is added. This avoids
the race between registering class_device (which generates hotplug
event), and the creation of attribute groups.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 17:56:03 -07:00
John Heffner
5528e568a7 [TCP]: Fix snd_cwnd adjustments in tcp_highspeed.c
Xiaoliang (David) Wei wrote:
> Hi gurus,
> 
>    I am reading the code of tcp_highspeed.c in the kernel and have a
> question on the hstcp_cong_avoid function, specifically the following
> AI part (line 136~143 in net/ipv4/tcp_highspeed.c ):
> 
>                /* Do additive increase */
>                if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
>                        tp->snd_cwnd_cnt += ca->ai;
>                        if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
>                                tp->snd_cwnd++;
>                                tp->snd_cwnd_cnt -= tp->snd_cwnd;
>                        }
>                }
> 
>    In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd),
> snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this
> small chance of getting -1 becomes a problem?
> Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -= 
> cwnd?

Absolutely correct.  Thanks.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:41:44 -07:00
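
The ordering problem raised in the quoted question above can be reproduced in isolation. The sketch below is a simplified illustration, not the kernel code: `demo_cwnd`, `demo_ai_buggy` and `demo_ai_fixed` are hypothetical names, the clamp check is omitted, and the only assumption carried over from the message is that the counter is a 16-bit unsigned value.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields involved. */
struct demo_cwnd {
    uint32_t snd_cwnd;
    uint16_t snd_cwnd_cnt;
};

/* Quoted order: the window is incremented before it is subtracted. */
void demo_ai_buggy(struct demo_cwnd *tp, uint16_t ai)
{
    tp->snd_cwnd_cnt += ai;
    if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
        tp->snd_cwnd++;
        /* If cnt == old cwnd, this computes cnt - (cnt + 1) and wraps. */
        tp->snd_cwnd_cnt -= tp->snd_cwnd;
    }
}

/* Reversed order: subtract the old window first, then grow it. */
void demo_ai_fixed(struct demo_cwnd *tp, uint16_t ai)
{
    tp->snd_cwnd_cnt += ai;
    if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
        tp->snd_cwnd_cnt -= tp->snd_cwnd;
        tp->snd_cwnd++;
    }
}
```

With a window of 10 and a counter that reaches exactly 10, the first variant leaves the 16-bit counter at 65535, while the second leaves it at 0.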
Ralf Baechle
f530937b2c [NETROM/ROSE]: Kill module init version kernel log messages.
These are out of date and don't tell the user anything useful.
The similar messages which IPV4 and the core networking used
to output were killed a long time ago.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:19:26 -07:00
Herbert Xu
134af34632 [DCCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
locks.  For example, the inet_diag code holds sk_callback_lock without
disabling BH.  If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock.  Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.

We can fix this by moving sock_orphan out of bh_lock_sock.

The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.

By moving sock_orphan before the release_sock we can solve this
problem.  This is because as long as we own the socket lock its
state cannot change.

So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to DCCP_CLOSED in the time being,
we know that the socket has been destroyed.  Otherwise the socket is
still ours to keep.

This problem was discovered by Ingo Molnar using his lock validator.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:09:13 -07:00
Stephen Hemminger
1c29fc4989 [BRIDGE]: keep track of received multicast packets
It makes sense to add this simple statistic to keep track of received
multicast packets.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:07:13 -07:00
Sridhar Samudrala
35d63edb1c [SCTP]: Fix state table entries for chunks received in CLOSED state.
Discard an unexpected chunk in CLOSED state rather than calling BUG().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:05:23 -07:00
Sridhar Samudrala
62b08083ec [SCTP]: Fix panic's when receiving fragmented SCTP control chunks.
Use pskb_pull() to handle incoming COOKIE_ECHO and HEARTBEAT chunks that
are received as skb's with fragment list.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:04:43 -07:00
Vladislav Yasevich
672e7cca17 [SCTP]: Prevent possible infinite recursion with multiple bundled DATA.
There is a rare situation that causes lksctp to go into infinite recursion
and crash the system.  The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller than the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself.  This causes
sctp_skb_pull() to potentially recurse indefinitely.

Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.

Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:03:49 -07:00
Neil Horman
7c3ceb4fb9 [SCTP]: Allow spillover of receive buffer to avoid deadlock.
This patch fixes a deadlock situation in the receive path by allowing
temporary spillover of the receive buffer.

- If the chunk we receive has a tsn that immediately follows the ctsn,
  accept it even if we run out of receive buffer space and renege data with
  higher TSNs.
- Once we accept one chunk in a packet, accept all the remaining chunks
  even if we run out of receive buffer space.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Mark Butler <butlerm@middle.net>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-05 17:02:09 -07:00
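
The two acceptance rules listed in the SCTP spillover message above can be written as a small decision function. This is a sketch of the decision only, under assumptions: `demo_rx_state` and `demo_accept_data_chunk` are hypothetical names, and the sketch neither advances the cumulative TSN nor reneges data, both of which the real receive path has to handle.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-association receive state (illustration only). */
struct demo_rx_state {
    uint32_t ctsn;           /* cumulative TSN received so far */
    bool rcvbuf_full;        /* receive buffer space exhausted */
    bool accepted_in_packet; /* reset at the start of each packet */
};

/* Decide whether a DATA chunk with the given TSN should be accepted. */
bool demo_accept_data_chunk(struct demo_rx_state *st, uint32_t tsn)
{
    bool accept = !st->rcvbuf_full          /* normal case: space left */
               || st->accepted_in_packet    /* rule 2: rest of packet */
               || tsn == st->ctsn + 1;      /* rule 1: next expected TSN */

    if (accept)
        st->accepted_in_packet = true;
    return accept;
}
```

Letting the next expected TSN through even when the buffer is full is what breaks the deadlock: without it, the one chunk that would let the receiver make progress could be dropped indefinitely.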
Daniel Drake
8462fe3cd9 [PATCH] softmac: suggest per-frame-type TX rate
This patch is the first step towards rate control inside softmac.

The txrates substructure has been extended to provide
different fields for different types of packets (management/data,
unicast/multicast). These fields are updated on association to values
compatible with the access point we are associating to.

Drivers can then use the new ieee80211softmac_suggest_txrate() function
call when deciding which rate to transmit each frame at. This is
immensely useful for ZD1211, and bcm can use it too.

The user can still specify a rate through iwconfig, which is matched
for all transmissions (assuming the rate they have specified is in
the rate set required by the AP).

At a later date, we can incorporate automatic rate management into
the ieee80211softmac_recalc_txrates() function.

This patch also removes the mcast_fallback field. Sam Leffler pointed
out that this field is meaningless, because no driver will ever be
retransmitting mcast frames (they are not acked).

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:41 -04:00
Adrian Bunk
6274115ce9 [PATCH] ieee80211_wx.c: remove dead code
Since sec->key_sizes[] is a u8, len can't be < 0.

Spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:40 -04:00
Daniel Drake
6d92f83ffa [PATCH] softmac: deauthentication implies deassociation
The 802.11 specs state that deauthenticating also implies
disassociating. This patch implements that, which improves the behaviour
of SIOCSIWMLME.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 17:10:39 -04:00
John W. Linville
fd5226a726 Merge branch 'upstream-fixes' into upstream 2006-05-05 16:56:24 -04:00
Daniel Drake
d57336e3f2 [PATCH] softmac: make non-operational after being stopped
zd1211 with softmac and wpa_supplicant revealed an issue with softmac
and the use of workqueues. Some of the work functions actually
reschedule themselves, so this meant that there could still be
pending work after flush_scheduled_work() had been called during
ieee80211softmac_stop().

This patch introduces a "running" flag which is used to ensure that
rescheduling does not happen in this situation.

I also used this flag to ensure that softmac's hooks into ieee80211 are
non-operational once the stop operation has been started. This simply
makes softmac a little more robust, because I could crash it easily
by receiving frames in the short timeframe after shutting down softmac
and before turning off the ZD1211 radio. (ZD1211 is now fixed as well!)

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:22 -04:00
Daniel Drake
995c99268e [PATCH] softmac: don't reassociate if user asked for deauthentication
When wpa_supplicant exits, it uses SIOCSIWMLME to request
deauthentication.  softmac then tries to reassociate without any user
intervention, which isn't the desired behaviour of this signal.

This change makes softmac only attempt reassociation if the remote
network itself deauthenticated us.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-05-05 16:55:22 -04:00
John W. Linville
aad61439e6 Merge branch 'from-linus' into upstream 2006-05-05 16:50:23 -04:00
Patrick Caulfield
d1a6498388 [DECNET]: Fix level1 router hello
This patch fixes hello messages sent when a node is a level 1
router. Slightly contrary to the spec (maybe) VMS ignores hello
messages that do not name level2 routers that it also knows about.

So, here we simply name all the routers that the node knows about
rather than just other level1 routers.  (I hope the patch is clearer than
the description. sorry).

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:36:23 -07:00
Herbert Xu
75c2d9077c [TCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in tcp_close can lead to dead
locks.  For example, the inet_diag code holds sk_callback_lock without
disabling BH.  If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock.  Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.

We can fix this by moving sock_orphan out of bh_lock_sock.

The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.

By moving sock_orphan before the release_sock we can solve this
problem.  This is because as long as we own the socket lock its
state cannot change.

So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to TCP_CLOSE in the time being,
we know that the socket has been destroyed.  Otherwise the socket is
still ours to keep.

Note that I've also moved the increment on the orphan count forward.
This may look like a problem as we're increasing it even if the socket
is just about to be destroyed where it'll be decreased again.  However,
this simply enlarges a window that already exists.  This also changes
the orphan count test by one.

Considering what the orphan count is meant to do this is no big deal.

This problem was discovered by Ingo Molnar using his lock validator.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:31:35 -07:00
Ralf Baechle
82e84249f0 [ROSE]: Eleminate HZ from ROSE kernel interfaces
Convert all ROSE sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:28:20 -07:00
Ralf Baechle
4d8937d0b1 [NETROM]: Eleminate HZ from NET/ROM kernel interfaces
Convert all NET/ROM sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:47 -07:00
Ralf Baechle
e1fdb5b396 [AX.25]: Eleminate HZ from AX.25 kernel interfaces
Convert all AX.25 sysctl time values from jiffies to ms as units.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:27:16 -07:00
Ralf Baechle
4cc7c2734e [ROSE]: Fix routing table locking in rose_remove_neigh.
The locking rule for rose_remove_neigh() is that the caller needs to
hold rose_neigh_list_lock, so we had better not take it again in
rose_remove_neigh().

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:26:20 -07:00
Ralf Baechle
70868eace5 [AX.25]: Move AX.25 symbol exports
Move AX.25 symbol exports to next to their definitions where they're
supposed to be these days.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:25:17 -07:00
Ralf Baechle
86cfcb95ec [AX25, ROSE]: Remove useless SET_MODULE_OWNER calls.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:23:48 -07:00
Ralf Baechle
3f072310d0 [AX.25]: Spelling fix
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:22:36 -07:00
Ralf Baechle
0cc5ae24af [ROSE]: Remove useless prototype for rose_remove_neigh().
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:22:01 -07:00
Patrick McHardy
7800007c1e [NETFILTER]: x_tables: don't use __copy_{from,to}_user on unchecked memory in compat layer
Noticed by Linus Torvalds <torvalds@osdl.org>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:20:27 -07:00
Jing Min Zhao
7582e9d17e [NETFILTER]: H.323 helper: Change author's email address
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:19:59 -07:00
Patrick McHardy
2354feaeb2 [NETFILTER]: NAT: silence unused variable warnings with CONFIG_XFRM=n
net/ipv4/netfilter/ip_nat_standalone.c: In function 'ip_nat_out':
net/ipv4/netfilter/ip_nat_standalone.c:223: warning: unused variable 'ctinfo'
net/ipv4/netfilter/ip_nat_standalone.c:222: warning: unused variable 'ct'

Surprisingly no complaints so far ..

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:19:26 -07:00
Patrick McHardy
4228e2a989 [NETFILTER]: H.323 helper: fix use of uninitialized data
When a Choice element contains an unsupported choice no error is returned
and parsing continues normally, but the choice value is not set and
contains data from the last parsed message. This may in turn lead to
parsing of more stale data and following crashes.

Fixes a crash triggered by testcase 0003243 from the PROTOS c07-h2250v4
testsuite following random other testcases:

CPU:    0
EIP:    0060:[<c01a9554>]    Not tainted VLI
EFLAGS: 00210646   (2.6.17-rc2 #3)
EIP is at memmove+0x19/0x22
eax: d7be0307   ebx: d7be0307   ecx: e841fcf9   edx: d7be0307
esi: bfffffff   edi: bfffffff   ebp: da5eb980   esp: c0347e2c
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 4, threadinfo=c0347000 task=dff86a90)
Stack: <0>00000006 c0347ea6 d7be0301 e09a6b2c 00000006 da5eb980 d7be003e d7be0052
       c0347f6c e09a6d9c 00000006 c0347ea6 00000006 00000000 d7b9a548 00000000
       c0347f6c d7b9a548 00000004 e0a1a119 0000028f 00000006 c0347ea6 00000006
Call Trace:
 [<e09a6b2c>] mangle_contents+0x40/0xd8 [ip_nat]
 [<e09a6d9c>] ip_nat_mangle_tcp_packet+0xa1/0x191 [ip_nat]
 [<e0a1a119>] set_addr+0x60/0x14d [ip_nat_h323]
 [<e0ab6e66>] q931_help+0x2da/0x71a [ip_conntrack_h323]
 [<e0ab6e98>] q931_help+0x30c/0x71a [ip_conntrack_h323]
 [<e09af242>] ip_conntrack_help+0x22/0x2f [ip_conntrack]
 [<c022934a>] nf_iterate+0x2e/0x5f
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c02294ce>] nf_hook_slow+0x42/0xb0
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c025d732>] xfrm4_output+0x3c/0x4e
 [<c025d357>] xfrm4_output_finish+0x0/0x39f
 [<c0230370>] ip_forward+0x1c2/0x1fa
 [<c022f417>] ip_rcv+0x388/0x3b5
 [<c02188f9>] netif_receive_skb+0x2bc/0x2ec
 [<c0218994>] process_backlog+0x6b/0xd0
 [<c021675a>] net_rx_action+0x4b/0xb7
 [<c0115606>] __do_softirq+0x35/0x7d
 [<c0104294>] do_softirq+0x38/0x3f

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:17:11 -07:00
Patrick McHardy
6fd737031e [NETFILTER]: H.323 helper: fix endless loop caused by invalid TPKT len
When the TPKT len included in the packet is below the lowest valid value
of 4 an underflow occurs which results in an endless loop.

Found by testcase 0000058 from the PROTOS c07-h2250v4 testsuite.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:16:29 -07:00
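
A minimal sketch of the length check implied by the TPKT fix above. It assumes the standard 4-byte TPKT header (version, reserved, 16-bit big-endian length that includes the header itself). `demo_tpkt_payload_len` and `DEMO_TPKT_HDR_LEN` are hypothetical names, not the helper's actual code.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_TPKT_HDR_LEN 4u

/*
 * Store the payload length of the TPKT at tpkt in *payload_len.
 * Returns 0 on success, -1 if the packet is truncated or malformed.
 */
int demo_tpkt_payload_len(const uint8_t *tpkt, size_t avail,
                          size_t *payload_len)
{
    size_t tpktlen;

    if (avail < DEMO_TPKT_HDR_LEN)
        return -1;

    tpktlen = ((size_t)tpkt[2] << 8) | tpkt[3];

    /* Without this check, tpktlen - 4 would wrap to a huge value. */
    if (tpktlen < DEMO_TPKT_HDR_LEN)
        return -1;
    if (tpktlen > avail)
        return -1;

    *payload_len = tpktlen - DEMO_TPKT_HDR_LEN;
    return 0;
}
```

Rejecting the length before the subtraction keeps every later computation in range, so nothing downstream has to detect the wrapped value.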
Patrick McHardy
e17df688f7 [NETFILTER] SCTP conntrack: fix infinite loop
fix infinite loop in the SCTP-netfilter code: check SCTP chunk size to
guarantee progress of for_each_sctp_chunk(). (all other uses of
for_each_sctp_chunk() are preceded by do_basic_checks(), so this fix
should be complete.)

Based on patch from Ingo Molnar <mingo@elte.hu>

CVE-2006-1527

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-02 17:26:39 -07:00
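
The progress guarantee mentioned in the SCTP conntrack message above can be illustrated with a standalone chunk walker. This is a sketch under assumptions, not the netfilter macro: `demo_count_chunks` is a hypothetical name, and it assumes the standard SCTP chunk header (type, flags, 16-bit big-endian length that includes the 4-byte header) with chunks padded to 4-byte boundaries.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CHUNK_HDR_LEN 4u

/*
 * Count the chunks in buf. Returns -1 on a malformed chunk.
 * Simplification: a chunk whose padded length runs past the end of
 * the buffer is treated as malformed.
 */
int demo_count_chunks(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (len - off >= DEMO_CHUNK_HDR_LEN) {
        size_t chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];
        size_t padded;

        /* A length of 0 would never advance off: reject it. */
        if (chunk_len < DEMO_CHUNK_HDR_LEN)
            return -1;

        padded = (chunk_len + 3) & ~(size_t)3;
        if (padded > len - off)
            return -1;

        off += padded; /* always at least 4, so the loop terminates */
        count++;
    }
    return count;
}
```

Because every accepted chunk advances the offset by at least the header size, the loop is bounded by the buffer length regardless of what the packet claims.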
Jeff Garzik
1fb5fef9b8 Merge branch 'master' into upstream 2006-05-02 14:33:57 -04:00
Linus Torvalds
532f57da40 Merge branch 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] Audit Filter Performance
  [PATCH] Rework of IPC auditing
  [PATCH] More user space subject labels
  [PATCH] Reworked patch for labels on user space messages
  [PATCH] change lspp ipc auditing
  [PATCH] audit inode patch
  [PATCH] support for context based audit filtering, part 2
  [PATCH] support for context based audit filtering
  [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit()
  [PATCH] drop task argument of audit_syscall_{entry,exit}
  [PATCH] drop gfp_mask in audit_log_exit()
  [PATCH] move call of audit_free() into do_exit()
  [PATCH] sockaddr patch
  [PATCH] deal with deadlocks in audit_free()
2006-05-01 21:43:05 -07:00
Patrick McHardy
46c5ea3c9a [NETFILTER] x_tables: fix compat related crash on non-x86
When iptables userspace adds an ipt_standard_target, it calculates the size
of the entire entry as:

sizeof(struct ipt_entry) + XT_ALIGN(sizeof(struct ipt_standard_target))

ipt_standard_target looks like this:

  struct xt_standard_target
  {
        struct xt_entry_target target;
        int verdict;
  };

xt_entry_target contains a pointer, so when compiled for 64 bit the
structure gets an extra 4 bytes of padding at the end. On 32 bit
architectures where iptables aligns to 8 bytes it will also have 4
bytes of padding at the end because it is only 36 bytes large.

The compat_ipt_standard_fn in the kernel adjusts the offsets by

  sizeof(struct ipt_standard_target) - sizeof(struct compat_ipt_standard_target),

which will always result in 4, even if the structure from userspace
was already padded to a multiple of 8. On x86 this works out by
accident because userspace only aligns to 4, on all other
architectures this is broken and causes incorrect adjustments to
the size and following offsets.

Thanks to Linus for lots of debugging help and testing.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 20:48:32 -07:00
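
The size arithmetic in the x_tables message above can be checked with a few lines of C. The sketch takes the 36-byte unpadded size from the message and assumes a 40-byte native 64-bit size (36 bytes plus the 4 bytes of tail padding the message describes). `demo_align` and `demo_compat_delta` are hypothetical helpers, not kernel or iptables functions.

```c
#include <stddef.h>

enum {
    DEMO_COMPAT_TARGET_SIZE = 36, /* unpadded size, per the message */
    DEMO_NATIVE_TARGET_SIZE = 40  /* assumed: 36 + 4 bytes tail padding */
};

/* Round len up to a multiple of align (align must be a power of two). */
size_t demo_align(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/*
 * Bytes the kernel would have to add to a standard target that 32-bit
 * userspace laid out with the given alignment.
 */
size_t demo_compat_delta(size_t user_align)
{
    return DEMO_NATIVE_TARGET_SIZE
         - demo_align(DEMO_COMPAT_TARGET_SIZE, user_align);
}
```

With 4-byte alignment the delta is 4, which matches the fixed adjustment; with 8-byte alignment the userspace structure is already 40 bytes and the delta is 0, so a fixed adjustment of 4 overshoots.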
Steve Grubb
e7c3497013 [PATCH] Reworked patch for labels on user space messages
The below patch should be applied after the inode and ipc sid patches.
This patch is a reworking of Tim's patch that has been updated to match
the inode and ipc patches since it's similar.

[updated:
>  Stephen Smalley also wanted to change a variable from isec to tsec in the
>  user sid patch. ]

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:09:58 -04:00
Steve Grubb
d6fe3945b4 [PATCH] sockaddr patch
On Thursday 23 March 2006 09:08, John D. Ramsdell wrote:
>  I noticed that a socketcall(bind) and socketcall(connect) event contain a
>  record of type=SOCKADDR, but I cannot see one for a system call event
>  associated with socketcall(accept).  Recording the sockaddr of an accepted
>  socket is important for cross platform information flow analys

Thanks for pointing this out. The following patch should address this.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:10 -04:00
YOSHIFUJI Hideaki
c302e6d54e [IPV6]: Fix race in route selection.
We eliminated rt6_dflt_lock (to protect default router pointer)
at 2.6.17-rc1, and introduced rt6_select() for general router selection.
The function is called in the context of rt6_lock read-lock held,
but this means, we have some race conditions when we do round-robin.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:22 -07:00
Ingo Molnar
e959d8121f [XFRM]: fix incorrect xfrm_policy_afinfo_lock use
xfrm_policy_afinfo_lock can be taken in bh context, at:

 [<c013fe1a>] lockdep_acquire_read+0x54/0x6d
 [<c0f6e024>] _read_lock+0x15/0x22
 [<c0e8fcdb>] xfrm_policy_get_afinfo+0x1a/0x3d
 [<c0e8fd10>] xfrm_decode_session+0x12/0x32
 [<c0e66094>] ip_route_me_harder+0x1c9/0x25b
 [<c0e770d3>] ip_nat_local_fn+0x94/0xad
 [<c0e2bbc8>] nf_iterate+0x2e/0x7a
 [<c0e2bc50>] nf_hook_slow+0x3c/0x9e
 [<c0e3a342>] ip_push_pending_frames+0x2de/0x3a7
 [<c0e53e19>] icmp_push_reply+0x136/0x141
 [<c0e543fb>] icmp_reply+0x118/0x1a0
 [<c0e54581>] icmp_echo+0x44/0x46
 [<c0e53fad>] icmp_rcv+0x111/0x138
 [<c0e36764>] ip_local_deliver+0x150/0x1f9
 [<c0e36be2>] ip_rcv+0x3d5/0x413
 [<c0df760f>] netif_receive_skb+0x337/0x356
 [<c0df76c3>] process_backlog+0x95/0x110
 [<c0df5fe2>] net_rx_action+0xa5/0x16d
 [<c012d8a7>] __do_softirq+0x6f/0xe6
 [<c0105ec2>] do_softirq+0x52/0xb1

this means that all write-locking of xfrm_policy_afinfo_lock must be
bh-safe. This patch fixes xfrm_policy_register_afinfo() and
xfrm_policy_unregister_afinfo().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:21 -07:00
Ingo Molnar
f3111502c0 [XFRM]: fix incorrect xfrm_state_afinfo_lock use
xfrm_state_afinfo_lock can be read-locked from bh context, so take it
in a bh-safe manner in xfrm_state_register_afinfo() and
xfrm_state_unregister_afinfo(). Found by the lock validator.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:20 -07:00
Hua Zhong
83de47cd0c [TCP]: Fix unlikely usage in tcp_transmit_skb()
The following unlikely should be replaced by likely because the
condition happens every time unless there is a hard error to transmit
a packet.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:19 -07:00
Ingo Molnar
8dff7c2970 [XFRM]: fix softirq-unsafe xfrm typemap->lock use
xfrm typemap->lock may be used in softirq context, so all write_lock()
uses must be softirq-safe.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:18 -07:00
Herbert Xu
a76e07acd0 [IPSEC]: Fix IP ID selection
I was looking through the xfrm input/output code in order to abstract
out the address family specific encapsulation/decapsulation code.  During
that process I found this bug in the IP ID selection code in xfrm4_output.c.

At that point dst is still the xfrm_dst for the current SA which
represents an internal flow as far as the IPsec tunnel is concerned.
Since the IP ID is going to sit on the outside of the encapsulated
packet, we obviously want the external flow which is just dst->child.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:16 -07:00
Heiko Carstens
a536e07787 [IPV4]: inet_init() -> fs_initcall
Convert inet_init to an fs_initcall to make sure it's called before any
device driver's initcall.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:14 -07:00
Soyoung Park
09493abfdb [NETLINK]: cleanup unused macro in net/netlink/af_netlink.c
1 line removal, of unused macro.
ran 'egrep -r' from linux-2.6.16/ for Nprintk and
didn't see it anywhere else but here, in #define...

Signed-off-by: Soyoung Park <speattle@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:13 -07:00
Stephen Hemminger
89bbb0a361 [PKT_SCHED] netem: fix loss
The following one line fix is needed to make loss function of
netem work right when doing loss on the local host.
Otherwise, higher layers just recover.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:12 -07:00
Shaun Pereira
43dff98b02 [X25]: fix for spinlock recurse and spinlock lockup with timer handler
When the sk_timer function x25_heartbeat_expiry() is called by the
kernel in a running/terminating process, spinlock recursion and
spinlock lockup lock up the kernel.  This has happened with testing
on some distros and the patch below fixed it.

Signed-off-by: Shaun Pereira <spereira@tusc.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-29 18:33:11 -07:00
Jeff Garzik
1a2e8a6f8e Merge branch 'master' into upstream 2006-04-27 04:52:44 -04:00
Linus Torvalds
07db8696f5 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  [PATCH] forcedeth: fix initialization
  [PATCH] sky2: version 1.2
  [PATCH] sky2: reset function can be devinit
  [PATCH] sky2: use ALIGN() macro
  [PATCH] sky2: add fake idle irq timer
  [PATCH] sky2: reschedule if irq still pending
  [PATCH] bcm43xx: make PIO mode usable
  [PATCH] bcm43xx: add to MAINTAINERS
  [PATCH] softmac: fix SIOCSIWAP
  [PATCH] Fix crash on big-endian systems during scan
  e1000: Update truesize with the length of the packet for packet split
  [PATCH] Fix locking in gianfar
2006-04-26 07:46:19 -07:00
Jeff Garzik
00355cd938 Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream 2006-04-26 06:18:15 -04:00
Jeff Garzik
3b908870b8 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes 2006-04-26 06:16:50 -04:00
Stephen Hemminger
85ca719e57 [BRIDGE]: allow full size vlan packets
Need to allow for VLAN header when bridging.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-26 02:39:19 -07:00
Patrick McHardy
18118cdbfd [NETFILTER]: ipt action: use xt_check_target for basic verification
The targets don't do the basic verification themselves anymore so
the ipt action needs to take care of it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:34 -07:00
Dmitry Mishin
91536b7ae6 [NETFILTER]: x_tables: move table->lock initialization
xt_table->lock should be initialized before xt_replace_table() call, which
uses it. This patch removes strict requirement that table should define
lock before registering.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:33 -07:00
Patrick McHardy
e4a79ef811 [NETFILTER]: ip6_tables: remove broken comefrom debugging
The introduction of x_tables broke comefrom debugging, remove it from
ip6_tables as well (ip_tables already got removed).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:32 -07:00
Yasuyuki Kozakai
2c16b774c7 [NETFILTER]: nf_conntrack: kill unused callback init_conntrack
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:31 -07:00
Thomas Voegtle
44adf28f4a [NETFILTER]: ULOG target is not obsolete
The backend part is obsoleted, but the target itself is still needed.

Signed-off-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:29 -07:00
Yasuyuki Kozakai
e1bbdebdba [NETFILTER]: nf_conntrack: Fix module refcount dropping too far
If nf_ct_l3proto_find_get() fails to get the refcount of
nf_ct_l3proto_generic, nf_ct_l3proto_put() will drop the refcount
too far.

This gets rid of '.me = THIS_MODULE' of nf_ct_l3proto_generic so that
nf_ct_l3proto_find_get() doesn't try to get refcount of it.
It's OK because its symbol is usable until nf_conntrack.ko is unloaded.

This also kills an unnecessary NULL pointer check.
__nf_ct_proto_find() always returns a non-NULL pointer.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-24 17:27:28 -07:00
Johannes Berg
921a91ef6a [PATCH] softmac: clean up event handling code
This patch cleans up the event handling code in ieee80211softmac_event.c and
makes the module slightly smaller by removing some strings that are not used
any more and consolidating some code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:58 -04:00
Johannes Berg
9a1771e867 [PATCH] softmac: add SIOCSIWMLME
This patch adds the SIOCSIWMLME wext to softmac, this functionality
appears to be used by wpa_supplicant and is softmac-specific.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Jouni Malinen <jkm@devicescape.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:58 -04:00
Zhu Yi
7736b5bd93 [PATCH] ieee80211: replace debug IEEE80211_WARNING with each own debug macro
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:54 -04:00
Zhu Yi
35c14b855f [PATCH] ieee80211: remove unnecessary CONFIG_WIRELESS_EXT checking
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
09593047d8 [PATCH] ieee80211: export list of bit rates with standard WEXT procedures
This patch changes how the list of bit rates in scan results is exported,
from IWEVCUSTOM to SIOCGIWRATE. It also removes the max_rate item exported
with SIOCGIWRATE since this should be done by userspace.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
73858062b6 [PATCH] ieee80211: Fix TX code doesn't enable QoS when using WPA + QoS
Fix ieee80211 TX code when using WPA+QOS. TKIP/CCMP will use
the TID field of qos_ctl in 802.11 frame header to do encryption. We
cannot ignore this field when doing host encryption and add the qos_ctl
field later.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Zhu Yi
ea2841521a [PATCH] ieee80211: Fix TKIP MIC calculation for QoS frames
Fix TKIP MIC verification failure when receiving QoS frames from AP.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 16:15:53 -04:00
Johannes Berg
818667f7c4 [PATCH] softmac: fix SIOCSIWAP
There are some bugs in the current implementation of the SIOCSIWAP wext,
for example that when you do it twice and it fails, it may still try
another access point for some reason. This patch fixes this by introducing
a new flag that tells the association code that the bssid that is in use
was fixed by the user and shouldn't be deviated from.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-24 15:20:23 -04:00
Linus Torvalds
f4ffaa452e Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (21 commits)
  [PATCH] wext: Fix RtNetlink ENCODE security permissions
  [PATCH] bcm43xx: iw_priv_args names should be <16 characters
  [PATCH] bcm43xx: sysfs code cleanup
  [PATCH] bcm43xx: fix pctl slowclock limit calculation
  [PATCH] bcm43xx: fix dyn tssi2dbm memleak
  [PATCH] bcm43xx: fix config menu alignment
  [PATCH] bcm43xx wireless: fix printk format warnings
  [PATCH] softmac: report when scanning has finished
  [PATCH] softmac: fix event sending
  [PATCH] softmac: handle iw_mode properly
  [PATCH] softmac: dont send out packets while scanning
  [PATCH] softmac: return -EAGAIN from getscan while scanning
  [PATCH] bcm43xx: set trans_start on TX to prevent bogus timeouts
  [PATCH] orinoco: fix truncating commsquality RID with the latest Symbol firmware
  [PATCH] softmac: fix spinlock recursion on reassoc
  [PATCH] Revert NET_RADIO Kconfig title change
  [PATCH] wext: Fix IWENCODEEXT security permissions
  [PATCH] wireless/atmel: send WEXT scan completion events
  [PATCH] wireless/airo: clean up WEXT association and scan events
  [PATCH] softmac uses Wireless Ext.
  ...
2006-04-20 15:26:25 -07:00
Jeff Garzik
f18b95c3e2 Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
2006-04-20 17:36:10 -04:00
Jayachandran C
18bc89aa25 [EBTABLES]: Clean up vmalloc usage in net/bridge/netfilter/ebtables.c
Make all the vmalloc calls in net/bridge/netfilter/ebtables.c follow
the standard convention.  Remove unnecessary casts, and use '*object'
instead of 'type'.

Signed-off-by: Jayachandran C. <c.jayachandran@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-20 00:14:49 -07:00
David S. Miller
dc6de33674 [NET]: Add skb->truesize assertion checking.
Add some sanity checking.  truesize should be at least sizeof(struct
sk_buff) plus the current packet length.  If not, then truesize is
seriously mangled and deserves a kernel log message.

Currently we'll do the check for release of stream socket buffers.

But we can add checks to more spots over time.

Incorporating ideas from Herbert Xu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-20 00:10:50 -07:00
Herbert Xu
b60b49ea6a [TCP]: Account skb overhead in tcp_fragment
Make sure that we get the full sizeof(struct sk_buff)
plus the data size accounted for in skb->truesize.

This will create invariants that will allow adding
assertion checks on skb->truesize.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 21:35:00 -07:00
David S. Miller
5185db09f4 [LLC]: Use pskb_trim_rcsum() in llc_fixup_skb().
Kernel Bugzilla #6409

If we use plain skb_trim(), that's wrong, because if
the SKB is cloned, and it can be because we unshared
it in the caller, we have to allow reallocation.  The
pskb_trim*() family of routines is therefore the most
appropriate here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 15:37:13 -07:00
Hua Zhong
3672558c61 [NET]: sockfd_lookup_light() returns random error for -EBADFD
This applies to 2.6.17-rc2.

There is a missing initialization of err in sockfd_lookup_light() that
could return random error for an invalid file handle.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-19 15:25:02 -07:00
Jean Tourrilhes
848ef85552 [PATCH] wext: Fix RtNetlink ENCODE security permissions
I've just realised that the RtNetlink code does not check the
permission for SIOCGIWENCODE and SIOCGIWENCODEEXT, which means that
any user can read the encryption keys. The fix is trivial and should
go in 2.6.17 alongside the two other patches I sent you last week.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:41 -04:00
Johannes Berg
6788a07f8f [PATCH] softmac: report when scanning has finished
Make softmac report a scan event when scanning has finished, that way
userspace can wait for the event to happen instead of polling for the
results.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:40 -04:00
Johannes Berg
feeeaa87e8 [PATCH] softmac: fix event sending
Softmac is sending custom events to userspace already, but it
should _really_ be sending the right WEXT events instead. This
patch fixes that.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
68970ce6ac [PATCH] softmac: handle iw_mode properly
Below patch allows using iw_mode auto with softmac. bcm43xx forces managed
so this bug wasn't noticed earlier, but this was one of the problems why
zd1211 didn't work earlier.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
fc242746ea [PATCH] softmac: dont send out packets while scanning
Seems we forgot to stop the queue while scanning. Better do that so we
don't transmit packets all the time during background scanning.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
johannes@sipsolutions.net
ba2f8c1875 [PATCH] softmac: return -EAGAIN from getscan while scanning
Below patch was developed after discussion with Daniel Drake who
mentioned to me that wireless tools expect an EAGAIN return from getscan
so that they can wait for the scan to finish before printing out the
results.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:39 -04:00
Michael Buesch
9b0b4d8ae8 [PATCH] softmac: fix spinlock recursion on reassoc
This fixes a spinlock recursion on receiving a reassoc request.

On reassoc, the softmac calls back into the driver. This results in a
driver lock recursion. This schedules the assoc workqueue, instead
of calling it directly.

Probably, we should defer the _whole_ management frame processing
to a tasklet or workqueue, because it does several callbacks into the driver.
That is dangerous.

This fix should go into Linus's tree before 2.6.17 is released, because it
is remotely exploitable (DoS by crash).

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:38 -04:00
Jean Tourrilhes
a417016d1a [PATCH] wext: Fix IWENCODEEXT security permissions
Check the permissions when user-space tries to read the
encryption parameters via SIOCGIWENCODEEXT. This is trivial and
probably should go in 2.6.17...
	Bug was found by Brian Eaton <eaton.lists@gmail.com>, thanks !

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:38 -04:00
Randy Dunlap
e4b5fae8b3 [PATCH] softmac uses Wireless Ext.
softmac uses wireless extensions, so let it SELECT that config option;
WARNING: "wireless_send_event" [net/ieee80211/softmac/ieee80211softmac.ko] undefined!

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-04-19 17:25:37 -04:00
Eric Sesterhenn
a5f9145bc9 SUNRPC: Dead code in net/sunrpc/auth_gss/auth_gss.c
Hi,

the coverity checker spotted that cred is always NULL
when we jump to out_err ( there is just one case, when
we fail to allocate the memory for cred )
This is Coverity ID #79

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 13:06:49 -04:00
Adrian Bunk
ec535ce154 NFS: make 2 functions static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:47 -04:00
J. Bruce Fields
d4a30e7e66 RPCSEC_GSS: fix leak in krb5 code caused by superfluous kmalloc
I was sloppy when generating a previous patch; I modified the callers of
krb5_make_checksum() to allocate memory for the buffer where the result is
returned, then forgot to modify krb5_make_checksum to stop allocating that
memory itself.  The result is a per-packet memory leak.  This fixes the
problem by removing the now-superfluous kmalloc().

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
Jesper Juhl
63903ca6af [NET]: Remove redundant NULL checks before [kv]free
Redundant NULL check before kfree removal
from net/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:55 -07:00
Dmitry Mishin
40daafc80b unaligned access in sk_run_filter()
This patch fixes unaligned access warnings noticed on IA64
in sk_run_filter(). 'ptr' can be unaligned.

Signed-off-By: Dmitry Mishin <dim@openvz.org>
Signed-off-By: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:54 -07:00
YOSHIFUJI Hideaki
b809739a1b [IPV6]: Clean up hop-by-hop options handler.
- Removed unused argument (nhoff) for ipv6_parse_hopopts().
- Make ipv6_parse_hopopts() align with other extension header
  handlers.
- Removed pointless assignment (hdr), which is not used afterwards.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:53 -07:00
YOSHIFUJI Hideaki
e5d25a9088 [IPV6] XFRM: Fix decoding session with preceding extension header(s).
We did not correctly decode session with preceding extension
header(s).  This was because we had already pulled preceding
headers, skb->nh.raw + 40 + 1 - skb->data was negative, and
pskb_may_pull() failed.

We now have IP6CB(skb)->nhoff and skb->h.raw, and we can
start parsing / decoding upper layer protocol from current
position.

Tracked down by Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
and tested by Kazunori Miyazawa <kazunori@miyazawa.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:52 -07:00
YOSHIFUJI Hideaki
e3cae904d7 [IPV6] XFRM: Don't use old copy of pointer after pskb_may_pull().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:51 -07:00
YOSHIFUJI Hideaki
ec6700958a [IPV6]: Ensure to have hop-by-hop options in our header of &sk_buff.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-18 15:57:50 -07:00