Commit Graph

6279 Commits

Author SHA1 Message Date
Pavel Emelyanov
6a9fb9479f [IPV4]: Clean the ip_sockglue.c from some ugly ifdefs
The #idfed CONFIG_IP_MROUTE is sometimes places inside the if-s,
which looks completely bad. Similar ifdefs inside the functions
looks a bit better, but they are also not recommended to be used.

Provide an ifdef-ed ip_mroute_opt() helper to cleanup the code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:55 -08:00
Alexey Dobriyan
4e058063f4 [DECNET]: "addr" module param can't be __initdata
sysfs keeps references to module parameters via /sys/module/*/parameters,
so marking them as __initdata can't work.

Steps to reproduce:

	modprobe decnet
	cat /sys/module/decnet/parameters/addr

BUG: unable to handle kernel paging request at virtual address f88cd410
printing eip: c043dfd1 *pdpt = 0000000000004001 *pde = 0000000004408067 *pte = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: decnet sunrpc af_packet ipv6 binfmt_misc dm_mirror dm_multipath dm_mod sbs sbshc fan dock battery backlight ac power_supply parport loop rtc_cmos serio_raw rtc_core rtc_lib button amd_rng sr_mod cdrom shpchp pci_hotplug ehci_hcd ohci_hcd uhci_hcd usbcore
Pid: 2099, comm: cat Not tainted (2.6.24-rc1-b1d08ac064268d0ae2281e98bf5e82627e0f0c56-bloat #6)
EIP: 0060:[<c043dfd1>] EFLAGS: 00210286 CPU: 1
EIP is at param_get_int+0x6/0x20
EAX: c5c87000 EBX: 00000000 ECX: 000080d0 EDX: f88cd410
ESI: f8a108f8 EDI: c5c87000 EBP: 00000000 ESP: c5c97f00
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 2099, ti=c5c97000 task=c641ee10 task.ti=c5c97000)
Stack: 00000000 f8a108f8 c5c87000 c043db6b f8a108f1 00000124 c043de1a c043db2f
       f88cd410 ffffffff c5c87000 f8a16bc8 f8a16bc8 c043dd69 c043dd54 c5dd5078
       c043dbc8 c5cc7580 c06ee64c c5d679f8 c04c431f c641f480 c641f484 00001000
Call Trace:
 [<c043db6b>] param_array_get+0x3c/0x62
 [<c043de1a>] param_array_set+0x0/0xdf
 [<c043db2f>] param_array_get+0x0/0x62
 [<c043dd69>] param_attr_show+0x15/0x2d
 [<c043dd54>] param_attr_show+0x0/0x2d
 [<c043dbc8>] module_attr_show+0x1a/0x1e
 [<c04c431f>] sysfs_read_file+0x7c/0xd9
 [<c04c42a3>] sysfs_read_file+0x0/0xd9
 [<c048d4b2>] vfs_read+0x88/0x134
 [<c042090b>] do_page_fault+0x0/0x7d5
 [<c048d920>] sys_read+0x41/0x67
 [<c04080fa>] sysenter_past_esp+0x6b/0xc1
 =======================
Code: 00 83 c4 0c c3 83 ec 0c 8b 52 10 8b 12 c7 44 24 04 27 dd 6c c0 89 04 24 89 54 24 08 e8 ea 01 0c 00 83 c4 0c c3 83 ec 0c 8b 52 10 <8b> 12 c7 44 24 04 58 8c 6a c0 89 04 24 89 54 24 08 e8 ca 01 0c
EIP: [<c043dfd1>] param_get_int+0x6/0x20 SS:ESP 0068:c5c97f00

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:55 -08:00
Mitsuru Chinen
7a0ff716c2 [IPv6] SNMP: Restore Udp6InErrors incrementation
As the checksum verification is postponed till user calls recv or poll,
the inrementation of Udp6InErrors counter should be also postponed.
Currently, it is postponed in non-blocking operation case. However it
should be postponed in all case like the IPv4 code.

Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:54 -08:00
Alexey Dobriyan
3f192b5c58 [NET]: Remove /proc/net/stat/*_arp_cache upon module removal
neigh_table_init_no_netlink() creates them, but they aren't removed anywhere.

Steps to reproduce:

	modprobe clip
	rmmod clip
	cat /proc/net/stat/clip_arp_cache

BUG: unable to handle kernel paging request at virtual address f89d7758
printing eip: c05a99da *pdpt = 0000000000004001 *pde = 0000000004408067 *pte = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: atm af_packet ipv6 binfmt_misc sbs sbshc fan dock battery backlight ac power_supply parport loop rtc_cmos rtc_core rtc_lib serio_raw button k8temp hwmon amd_rng sr_mod cdrom shpchp pci_hotplug ehci_hcd ohci_hcd uhci_hcd usbcore
Pid: 2082, comm: cat Not tainted (2.6.24-rc1-b1d08ac064268d0ae2281e98bf5e82627e0f0c56-bloat #4)
EIP: 0060:[<c05a99da>] EFLAGS: 00210256 CPU: 0
EIP is at neigh_stat_seq_next+0x26/0x3f
EAX: 00000001 EBX: f89d7600 ECX: c587bf40 EDX: 00000000
ESI: 00000000 EDI: 00000001 EBP: 00000400 ESP: c587bf1c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 2082, ti=c587b000 task=c5984e10 task.ti=c587b000)
Stack: c06228cc c5313790 c049e5c0 0804f000 c45a7b00 c53137b0 00000000 00000000
       00000082 00000001 00000000 00000000 00000000 fffffffb c58d6780 c049e437
       c45a7b00 c04b1f93 c587bfa0 00000400 0804f000 00000400 0804f000 c04b1f2f
Call Trace:
 [<c049e5c0>] seq_read+0x189/0x281
 [<c049e437>] seq_read+0x0/0x281
 [<c04b1f93>] proc_reg_read+0x64/0x77
 [<c04b1f2f>] proc_reg_read+0x0/0x77
 [<c048907e>] vfs_read+0x80/0xd1
 [<c0489491>] sys_read+0x41/0x67
 [<c04080fa>] sysenter_past_esp+0x6b/0xc1
 =======================
Code: e9 ec 8d 05 00 56 8b 11 53 8b 40 70 8b 58 3c eb 29 0f a3 15 80 91 7b c0 19 c0 85 c0 8d 42 01 74 17 89 c6 c1 fe 1f 89 01 89 71 04 <8b> 83 58 01 00 00 f7 d0 8b 04 90 eb 09 89 c2 83 fa 01 7e d2 31
EIP: [<c05a99da>] neigh_stat_seq_next+0x26/0x3f SS:ESP 0068:c587bf1c

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:53 -08:00
Pavel Emelyanov
bf138862b1 [IPV6]: Consolidate the ip cork destruction in ip6_output.c
The ip6_push_pending_frames and ip6_flush_pending_frames do the
same things to flush the sock's cork. Move this into a separate
function and save ~100 bytes from the .text

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:26 -08:00
Pavel Emelyanov
429f08e950 [IPV4]: Consolidate the ip cork destruction in ip_output.c
The ip_push_pending_frames and ip_flush_pending_frames do the
same things to flush the sock's cork. Move this into a separate
function and save ~80 bytes from the .text

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:25 -08:00
Bart De Schuymer
e011ff48ab [NETFILTER]: ebt_arp: fix --arp-gratuitous matching dependence on --arp-ip-{src,dst}
Fix --arp-gratuitous matching dependence on --arp-ip-{src,dst}

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Lutz Preßler <Lutz.Pressler@SerNet.DE>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:25 -08:00
Alexey Dobriyan
55d84acd36 [NETFILTER]: nf_sockopts list head cleanup
Code is using knowledge that nf_sockopt_ops::list list_head is first
field in structure by using casts. Switch to list_for_each_entry()
itetators while I am at it.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:24 -08:00
Patrick McHardy
d1332e0ab8 [NETFILTER]: remove unneeded rcu_dereference() calls
As noticed by Paul McKenney, the rcu_dereference calls in the init path
of NAT modules are unneeded, remove them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:23 -08:00
Jan Engelhardt
0795c65d9f [NETFILTER]: Clean up Makefile
Sort matches and targets in the NF makefiles.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:22 -08:00
Jan Engelhardt
ba5dc2756c [NETFILTER]: Copyright/Email update
Transfer all my copyright over to our company.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:20 -08:00
Alexey Dobriyan
7351a22a3a [NETFILTER]: ip{,6}_queue: convert to seq_file interface
I plan to kill ->get_info which means killing proc_net_create().

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-07 04:08:20 -08:00
Latchesar Ionkov
55762690e2 9p: add missing end-of-options record for trans_fd
The list of options that the fd transport accepts is missing end-of-options
marker. This patch adds it.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-11-06 08:02:53 -06:00
Latchesar Ionkov
dd1a458412 9p: return NULL when trans not found
v9fs_match_trans function returns arbitrary transport module instead of NULL
when the requested transport is not registered. This patch modifies the
function to return NULL in that case.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-11-06 08:02:53 -06:00
Jens Axboe
c46f2334c8 [SG] Get rid of __sg_mark_end()
sg_mark_end() overwrites the page_link information, but all users want
__sg_mark_end() behaviour where we just set the end bit. That is the most
natural way to use the sg list, since you'll fill it in and then mark the
end point.

So change sg_mark_end() to only set the termination bit. Add a sg_magic
debug check as well, and clear a chain pointer if it is set.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:06 +01:00
Adrian Bunk
87ae9afdca cleanup asm/scatterlist.h includes
Not architecture specific code should not #include <asm/scatterlist.h>.

This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-11-02 08:47:06 +01:00
David S. Miller
49259d34c5 [IRDA] IRNET: Fix build when TCGETS2 is defined.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 02:26:38 -07:00
Stephen Hemminger
3b582cc14c [NET]: docbook fixes for netif_ functions
Documentation updates for network interfaces.

1. Add doc for netif_napi_add
2. Remove doc for unused returns from netif_rx
3. Add doc for netif_receive_skb

[ Incorporated minor mods from Randy Dunlap -DaveM ]

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 02:21:47 -07:00
Pavel Emelyanov
d57a9212e0 [NET]: Hide the net_ns kmem cache
This cache is only required to create new namespaces,
but we won't have them in CONFIG_NET_NS=n case.

Hide it under the appropriate ifdef.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:46:50 -07:00
Pavel Emelyanov
1a2ee93d28 [NET]: Mark the setup_net as __net_init
The setup_net is called for the init net namespace
only (int the CONFIG_NET_NS=n of course) from the __init
function, so mark it as __net_init to disappear with the
caller after the boot.

Yet again, in the perfect world this has to be under
#ifdef CONFIG_NET_NS, but it isn't guaranteed that every
subsystem is registered *after* the init_net_ns is set
up. After we are sure, that we don't start registering
them before the init net setup, we'll be able to move
this code under the ifdef.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:45:59 -07:00
Pavel Emelyanov
6a1a3b9f68 [NET]: Hide the dead code in the net_namespace.c
The namespace creation/destruction code is never called
if the CONFIG_NET_NS is n, so it's OK to move it under
appropriate ifdef.

The copy_net_ns() in the "n" case checks for flags and
returns -EINVAL when new net ns is requested. In a perfect
world this stub must be in net_namespace.h, but this
function need to know the CLONE_NEWNET value and thus
requires sched.h. On the other hand this header is to be
injected into almost every .c file in the networking code,
and making all this code depend on the sched.h is a
suicidal attempt.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:44:50 -07:00
Pavel Emelyanov
1dba323b3f [NETNS]: Make the init/exit hooks checks outside the loop
When the new pernet something (subsys, device or operations) is
being registered, the init callback is to be called for each
namespace, that currently exitst in the system. During the
unregister, the same is to be done with the exit callback.

However, not every pernet something has both calls, but the
check for the appropriate pointer to be not NULL is performed
inside the for_each_net() loop.

This is (at least) strange, so tune this.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:42:43 -07:00
Pavel Emelyanov
6257ff2177 [NET]: Forget the zero_it argument of sk_alloc()
Finally, the zero_it argument can be completely removed from
the callers and from the function prototype.

Besides, fix the checkpatch.pl warnings about using the
assignments inside if-s.

This patch is rather big, and it is a part of the previous one.
I splitted it wishing to make the patches more readable. Hope 
this particular split helped.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:39:31 -07:00
Pavel Emelyanov
154adbc846 [NET]: Remove bogus zero_it argument from sk_alloc
At this point nobody calls the sk_alloc(() with zero_it == 0,
so remove unneeded checks from it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:38:43 -07:00
Pavel Emelyanov
8fd1d178a3 [NET]: Make the sk_clone() lighter
The sk_prot_alloc() already performs all the stuff needed by the
sk_clone(). Besides, the sk_prot_alloc() requires almost twice
less arguments than the sk_alloc() does, so call the sk_prot_alloc()
saving the stack a bit.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:37:32 -07:00
Pavel Emelyanov
2e4afe7b35 [NET]: Move some core sock setup into sk_prot_alloc
The security_sk_alloc() and the module_get is a part of the
object allocations - move it in the proper place.

Note, that since we do not reset the newly allocated sock
in the sk_alloc() (memset() is removed with the previous
patch) we can safely do this.

Also fix the error path in sk_prot_alloc() - release the security
context if needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:36:26 -07:00
Pavel Emelyanov
3f0666ee30 [NET]: Auto-zero the allocated sock object
We have a __GFP_ZERO flag that allocates a zeroed chunk of memory.
Use it in the sk_alloc() and avoid a hand-made memset().

This is a temporary patch that will help us in the nearest future :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:34:42 -07:00
Pavel Emelyanov
c308c1b20e [NET]: Cleanup the allocation/freeing of the sock object
The sock object is allocated either from the generic cache with
the kmalloc, or from the proc->slab cache.

Move this logic into an isolated set of helpers and make the
sk_alloc/sk_free look a bit nicer.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:33:50 -07:00
Pavel Emelyanov
1e2e6b89f1 [NET]: Move the get_net() from sock_copy()
The sock_copy() is supposed to just clone the socket. In a perfect
world it has to be just memcpy, but we have to handle the security
mark correctly. All the extra setup must be performed in sk_clone() 
call, so move the get_net() into more proper place.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:31:26 -07:00
Pavel Emelyanov
f1a6c4da14 [NET]: Move the sock_copy() from the header
The sock_copy() call is not used outside the sock.c file,
so just move it into a sock.c

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:29:45 -07:00
Ilpo Järvinen
261ab365fa [TCP]: Another TAGBITS -> SACKED_ACKED|LOST conversion
Similar to commit 3eec0047d9, point of this is to avoid
skipping R-bit skbs.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:10:18 -07:00
Ilpo Järvinen
e56d6cd605 [TCP]: Process DSACKs that reside within a SACK block
DSACK inside another SACK block were missed if start_seq of DSACK
was larger than SACK block's because sorting prioritizes full
processing of the SACK block before DSACK. After SACK block
sorting situation is like this:

             SSSSSSSSS
                  D
                        SSSSSS
                               SSSSSSS

Because write_queue is walked in-order, when the first SACK block
has been processed, TCP is already past the skb for which the
DSACK arrived and we haven't taught it to backtrack (nor should
we), so TCP just continues processing by going to the next SACK
block after the DSACK (if any).

Whenever such DSACK is present, do an embedded checking during
the previous SACK block.

If the DSACK is below snd_una, there won't be overlapping SACK
block, and thus no problem in that case. Also if start_seq of
the DSACK is equal to the actual block, it will be processed
first.

Tested this by using netem to duplicate 15% of packets, and
by printing SACK block when found_dup_sack is true and the 
selected skb in the dup_sack = 1 branch (if taken):

  SACK block 0: 4344-5792 (relative to snd_una 2019137317)
  SACK block 1: 4344-5792 (relative to snd_una 2019137317) 

equal start seqnos => next_dup = 0, dup_sack = 1 won't occur...

  SACK block 0: 5792-7240 (relative to snd_una 2019214061)
  SACK block 1: 2896-7240 (relative to snd_una 2019214061)
  DSACK skb match 5792-7240 (relative to snd_una)

...and next_dup = 1 case (after the not shown start_seq sort),
went to dup_sack = 1 branch.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-11-01 00:09:37 -07:00
Stephen Rothwell
298bb62175 [AF_KEY]: suppress a warning for 64k pages.
On PowerPC allmodconfig build we get this:

net/key/af_key.c:400: warning: comparison is always false due to limited range of data type

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 23:57:05 -07:00
David S. Miller
51c739d1f4 [NET]: Fix incorrect sg_mark_end() calls.
This fixes scatterlist corruptions added by

	commit 68e3f5dd4d
	[CRYPTO] users: Fix up scatterlist conversion errors

The issue is that the code calls sg_mark_end() which clobbers the
sg_page() pointer of the final scatterlist entry.

The first part fo the fix makes skb_to_sgvec() do __sg_mark_end().

After considering all skb_to_sgvec() call sites the most correct
solution is to call __sg_mark_end() in skb_to_sgvec() since that is
what all of the callers would end up doing anyways.

I suspect this might have fixed some problems in virtio_net which is
the sole non-crypto user of skb_to_sgvec().

Other similar sg_mark_end() cases were converted over to
__sg_mark_end() as well.

Arguably sg_mark_end() is a poorly named function because it doesn't
just "mark", it clears out the page pointer as a side effect, which is
what led to these bugs in the first place.

The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable()
and arguably it could be converted to __sg_mark_end() if only so that
we can delete this confusing interface from linux/scatterlist.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:29:29 -07:00
Alexey Dobriyan
07afa04025 [IPVS]: Remove /proc/net/ip_vs_lblcr
It's under CONFIG_IP_VS_LBLCR_DEBUG option which never existed.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:16:27 -07:00
Daniel Lezcano
1675c7b254 [IPV6]: remove duplicate call to proc_net_remove
The file /proc/net/if_inet6 is removed twice.
First time in:
        inet6_exit
             ->addrconf_cleanup
And followed a few lines after by:
        inet6_exit
             -> if6_proc_exit

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:16:24 -07:00
Daniel Lezcano
310928d963 [NETNS]: fix net released by rcu callback
When a network namespace reference is held by a network subsystem,
and when this reference is decremented in a rcu update callback, we
must ensure that there is no more outstanding rcu update before
trying to free the network namespace.

In the normal case, the rcu_barrier is called when the network namespace
is exiting in the cleanup_net function.

But when a network namespace creation fails, and the subsystems are
undone (like the cleanup), the rcu_barrier is missing.

This patch adds the missing rcu_barrier.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:16:21 -07:00
Daniel Lezcano
93ee31f14f [NET]: Fix free_netdev on register_netdev failure.
Point 1:
The unregistering of a network device schedule a netdev_run_todo.
This function calls dev->destructor when it is set and the
destructor calls free_netdev.

Point 2:
In the case of an initialization of a network device the usual code
is:
 * alloc_netdev
 * register_netdev
    -> if this one fails, call free_netdev and exit with error.

Point 3:
In the register_netdevice function at the later state, when the device
is at the registered state, a call to the netdevice_notifiers is made.
If one of the notification falls into an error, a rollback to the
registered state is done using unregister_netdevice.

Conclusion:
When a network device fails to register during initialization because
one network subsystem returned an error during a notification call
chain, the network device is freed twice because of fact 1 and fact 2.
The second free_netdev will be done with an invalid pointer.

Proposed solution:
The following patch move all the code of unregister_netdevice *except*
the call to net_set_todo, to a new function "rollback_registered".

The following functions are changed in this way:
 * register_netdevice: calls rollback_registered when a notification fails
 * unregister_netdevice: calls rollback_register + net_set_todo, the call
                         order to net_set_todo is changed because it is the
                         latest now. Since it justs add an element to a list
                         that should not break anything.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 21:16:18 -07:00
Dirk Hohndel
e403149c92 Kbuild/doc: fix links to Documentation files
Fix links to files in Documentation/* in various Kconfig files

Signed-off-by: Dirk Hohndel <hohndel@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-30 14:26:30 -07:00
J. Bruce Fields
521c2a43b2 [SUNRPC]: fix rpc debugging
Commit baa3a2a0d2, by removing initialization
of the ctl_name field, broke this conditional, preventing the display of
rpc_tasks that you previously got when turning on rpc debugging.

[akpm@linux-foundation.org: coding-style fixes]

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 01:07:15 -07:00
Jean Delvare
0ccfe61803 [TCP]: Saner thash_entries default with much memory.
On systems with a very large amount of memory, the heuristics in
alloc_large_system_hash() result in a very large TCP established hash
table: 16 millions of entries for a 128 GB ia64 system. This makes
reading from /proc/net/tcp pretty slow (well over a second) and as a
result netstat is slow on these machines. I know that /proc/net/tcp is
deprecated in favor of tcp_diag, however at the moment netstat only
knows of the former.

I am skeptical that such a large TCP established hash is often needed.
Just because a system has a lot of memory doesn't imply that it will
have several millions of concurrent TCP connections. Thus I believe
that we should put an arbitrary high limit to the size of the TCP
established hash by default. Users who really need a bigger hash can
always use the thash_entries boot parameter to get more.

I propose 2 millions of entries as the arbitrary high limit. This
makes /proc/net/tcp reasonably fast on the system in question (0.2 s)
while being still large enough for me to be confident that network
performance won't suffer.

This is just one way to limit the hash size, there are others; I am not
familiar enough with the TCP code to decide which is best. Thus, I
would welcome the proposals of alternatives.

[ 2 million is still too large, thus I've modified the limit in the
  change to be '512 * 1024'. -DaveM ]

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 00:59:25 -07:00
Stephen Rothwell
e08a132b0e [SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing
as some architectures have unsigned long for u64.

net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_create_chunks':
net/sunrpc/xprtrdma/rpc_rdma.c:222: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/sunrpc/xprtrdma/rpc_rdma.c:234: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_count_chunks':
net/sunrpc/xprtrdma/rpc_rdma.c:577: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64

Noticed on PowerPC pseries_defconfig build.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-30 00:44:32 -07:00
Mitsuru Chinen
064f3605be [IPv4] SNMP: Refer correct memory location to display ICMP out-going statistics
While displaying ICMP out-going statistics as Out<name> counters in
/proc/net/snmp, the memory location for ICMP in-coming statistics
was referred by mistake.

Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:36 -07:00
David S. Miller
bf3c23d171 [NET]: Fix error reporting in sys_socketpair().
If either of the two sock_alloc_fd() calls fail, we
forget to update 'err' and thus we'll erroneously
return zero in these cases.

Based upon a report and patch from Rich Paul, and
commentary from Chuck Ebbert.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:34 -07:00
Andrew Morton
29b67497f2 [NETFILTER]: nf_ct_alloc_hashtable(): use __GFP_NOWARN
This allocation is expected to fail and we handle it by fallback to vmalloc().

So don't scare people with nasty messages like
http://bugzilla.kernel.org/show_bug.cgi?id=9190

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:31 -07:00
David S. Miller
0a7606c121 [NET]: Fix race between poll_napi() and net_rx_action()
netpoll_poll_lock() synchronizes the ->poll() invocation
code paths, but once we have the lock we have to make
sure that NAPI_STATE_SCHED is still set.  Otherwise we
get:

	cpu 0			cpu 1

	net_rx_action()		poll_napi()
	netpoll_poll_lock()	... spin on ->poll_lock
	->poll()
	  netif_rx_complete
	netpoll_poll_unlock()	acquire ->poll_lock()
				->poll()
				 netif_rx_complete()
				 CRASH

Based upon a bug report from Tina Yang.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:28 -07:00
Matthias M. Dellweg
b0a713e9e6 [TCP] MD5: Remove some more unnecessary casting.
while reviewing the tcp_md5-related code further i came across with
another two of these casts which you probably have missed. I don't
actually think that they impose a problem by now, but as you said we
should remove them.

Signed-off-by: Matthias M. Dellweg <2500@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:27 -07:00
Xiaoliang (David) Wei
c940587bf6 [TCP] vegas: Fix a bug in disabling slow start by gamma parameter.
TCP Vegas implementation has a bug in the process of disabling
slow-start with gamma parameter. The bug may lead to extreme
unfairness in the presence of early packet loss. See details in:
http://www.cs.caltech.edu/~weixl/technical/ns2linux/known_linux/index.html#vegas

Switch the order of "if (tp->snd_cwnd <= tp->snd_ssthresh)" statement
and "if (diff > gamma)" statement to eliminate the problem.

Signed-off-by: Xiaoliang (David) Wei <davidwei79@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:25 -07:00
Andy Gospodarek
5c81833c2f [IPVS]: use proper timeout instead of fixed value
Instead of using the default timeout of 3 minutes, this uses the timeout
specific to the protocol used for the connection. The 3 minute timeout
seems somewhat arbitrary (though I know it is used other places in the
ipvs code) and when failing over it would be much nicer to use one of
the configured timeout values.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:23 -07:00
YOSHIFUJI Hideaki
ad02ac145d [IPV6] NDISC: Fix setting base_reachable_time_ms variable.
This bug was introduced by the commit
d12af679bc (sysctl: fix neighbour table
sysctls).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-29 22:37:22 -07:00
Al Viro
d06f608265 SCTP endianness annotations regression
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 07:41:32 -07:00
Al Viro
2d8a972661 SUNRPC endianness annotations
rpcrdma stuff lacks endianness annotations for on-the-wire data.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 07:41:32 -07:00
Herbert Xu
68e3f5dd4d [CRYPTO] users: Fix up scatterlist conversion errors
This patch fixes the errors made in the users of the crypto layer during
the sg_init_table conversion.  It also adds a few conversions that were
missing altogether.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-27 00:52:07 -07:00
Eric W. Biederman
ceaa79c434 [NETNS]: Fix get_net_ns_by_pid
The pid namespace patches changed the semantics of
find_task_by_pid without breaking the compile resulting
in get_net_ns_by_pid doing the wrong thing.

So switch to using the intended find_task_by_vpid.

Combined with Denis' earlier patch to make netlink traffic
fully synchronous the inadvertent race I introduced with
accessing current is actually removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 22:56:12 -07:00
Eric W. Biederman
2b008b0a8e [NET]: Marking struct pernet_operations __net_initdata was inappropriate
It is not safe to to place struct pernet_operations in a special section.
We need struct pernet_operations to last until we call unregister_pernet_subsys.
Which doesn't happen until module unload.

So marking struct pernet_operations is a disaster for modules in two ways.
- We discard it before we call the exit method it points to.
- Because I keep struct pernet_operations on a linked list discarding
  it for compiled in code removes elements in the middle of a linked
  list and does horrible things for linked insert.

So this looks safe assuming __exit_refok is not discarded
for modules.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 22:54:53 -07:00
Adrian Bunk
72998d8c84 [INET] ESP: Must #include <linux/scatterlist.h>
This patch fixes the following compile errors in some configurations:

<--  snip  -->

...
  CC      net/ipv4/esp4.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c: In function 'esp_output':
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c:113: error: implicit declaration of function 'sg_init_table'
make[3]: *** [net/ipv4/esp4.o] Error 1
...
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c: In function 'esp6_output':
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c:112: error: implicit declaration of function 'sg_init_table'
make[3]: *** [net/ipv6/esp6.o] Error 1


<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 22:53:58 -07:00
Jeff Garzik
18134bed02 [TCP] IPV6: fix softnet build breakage
net/ipv6/tcp_ipv6.c: In function 'tcp_v6_rcv':
net/ipv6/tcp_ipv6.c:1736: error: implicit declaration of function
'get_softnet_dma'
net/ipv6/tcp_ipv6.c:1736: warning: assignment makes pointer from integer
without a cast

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 22:53:14 -07:00
Paul Moore
4be2700fb7 [NetLabel]: correct usage of RCU locking
This fixes some awkward, and perhaps even problematic, RCU lock usage in the
NetLabel code as well as some other related trivial cleanups found when
looking through the RCU locking.  Most of the changes involve removing the
redundant RCU read locks wrapping spinlocks in the case of a RCU writer.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:29:08 -07:00
Ryousei Takano
94d3b1e586 [TCP]: fix D-SACK cwnd handling
In the current net-2.6 kernel, handling FLAG_DSACKING_ACK is broken.
The flag is cleared to 1 just after FLAG_DSACKING_ACK is set.

        if (found_dup_sack)
                flag |= FLAG_DSACKING_ACK;
	:
	flag = 1;

To fix it, this patch introduces a part of the tcp_sacktag_state patch:
	http://marc.info/?l=linux-netdev&m=119210560431519&w=2

Signed-off-by: Ryousei Takano <takano-ryousei@aist.go.jp>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:27:59 -07:00
Adrian Bunk
8ad7c62b75 [SCTP] net/sctp/auth.c: make 3 functions static
This patch makes three needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:21:23 -07:00
David S. Miller
b4caea8aa8 [TCP]: Add missing I/O AT code to ipv6 side.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:20:13 -07:00
Adrian Bunk
d84d64dcb3 [SCTP]: #if 0 sctp_update_copy_cksum()
sctp_update_copy_cksum() is no longer used.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:07:20 -07:00
Adrian Bunk
39296ed669 [INET]: Unexport icmpmsg_statistics
This patch removes the unused EXPORT_SYMBOL(icmpmsg_statistics).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 04:06:08 -07:00
Adrian Bunk
bbbb1a812d [NET]: Unexport sock_enable_timestamp().
sock_enable_timestamp() no longer has any modular users.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 03:59:45 -07:00
Adrian Bunk
0f79efdc23 [TCP]: Make tcp_match_skb_to_sack() static.
tcp_match_skb_to_sack() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 03:57:36 -07:00
Adrian Bunk
d76081f875 [IRDA]: Make ircomm_tty static.
ircomm_tty can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 03:56:43 -07:00
Stephen Hemminger
c8d90dca32 [NET] dev_change_name: ignore changes to same name
Prevent error/backtrace from dev_rename() when changing
name of network device to the same name. This is a common
situation with udev and other scripts that bind addr to device.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 03:53:42 -07:00
David S. Miller
8c56a347c1 Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2007-10-26 03:50:02 -07:00
Jamal Hadi Salim
a057ae3c10 [NET_CLS_ACT]: Use skb_act_clone
clean skb_clone of any signs of CONFIG_NET_CLS_ACT and
have mirred us skb_act_clone()

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 02:47:54 -07:00
David S. Miller
c7da57a183 [TCP]: Fix scatterlist handling in MD5 signature support.
Use sg_init_table() and sg_mark_end() as needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 00:41:21 -07:00
David S. Miller
0e0940d4bb [IPSEC]: Fix scatterlist handling in skb_icv_walk().
Use sg_init_one() and sg_init_table() as needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 00:39:27 -07:00
David S. Miller
ed0e7e0ca3 [IPSEC]: Add missing sg_init_table() calls to ESP.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-26 00:38:39 -07:00
Ryousei Takano
564262c1f0 [TCP]: Fix inconsistency of terms.
Fix inconsistency of terms:
1) D-SACK
2) F-RTO

Signed-off-by: Ryousei Takano <takano-ryousei@aist.go.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-25 23:03:52 -07:00
David S. Miller
4bc3e17cce Merge branch 'fixes-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2007-10-25 22:49:43 -07:00
Johannes Berg
ddd68587d0 [PATCH] mac80211: fix printk warning on 64-bit
My AID message patch introduced a warning on 64-bit machines because ~
extends to unsigned long:

| net/mac80211/ieee80211_sta.c: In function ‘ieee80211_rx_mgmt_assoc_resp’:
| net/mac80211/ieee80211_sta.c:1187: warning: format ‘%d’ expects type ‘int’, but argument 7 has type ‘long unsigned int’

This fixes it by explicitly casting the result to u16 (which 'aid' is).

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-10-26 00:14:29 -04:00
Michael Wu
48225709be [PATCH] mac80211: Fix SSID matching in AP selection
The length of the SSID desired should also be compared in addition to
the memcmp of the SSIDs.

Thanks to Andrea Merello <andreamrl@tiscali.it> for finding this issue.

Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-10-25 22:32:05 -04:00
Vlad Yasevich
fee9dee730 [UDP]: Make use of inet_iif() when doing socket lookups.
UDP currently uses skb->dev->ifindex which may provide the wrong
information when the socket bound to a specific interface.
This patch makes inet_iif() accessible to UDP and makes UDP use it.

The scenario we are trying to fix is when a client is running on
the same system and the server and both client and server bind to
a non-loopback device.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-25 18:54:46 -07:00
David S. Miller
8a6911b12f [IPV4]: Remove no longer used snmp4_icmp_list.
This was obsoleted by a previous change, but the removal was
forgotten.

Reported by David Howells and David Stevens.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-25 18:40:05 -07:00
Linus Torvalds
06dbbfef82 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV4]: Explicitly call fib_get_table() in fib_frontend.c
  [NET]: Use BUILD_BUG_ON in net/core/flowi.c
  [NET]: Remove in-code externs for some functions from net/core/dev.c
  [NET]: Don't declare extern variables in net/core/sysctl_net_core.c
  [TCP]: Remove unneeded implicit type cast when calling tcp_minshall_update()
  [NET]: Treat the sign of the result of skb_headroom() consistently
  [9P]: Fix missing unlock before return in p9_mux_poll_start
  [PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE
  [IPV4] ip_gre: sendto/recvfrom NBMA address
  [SCTP]: Consolidate sctp_ulpq_renege_xxx functions
  [NETLINK]: Fix ACK processing after netlink_dump_start
  [VLAN]: MAINTAINERS update
  [DCCP]: Implement SIOCINQ/FIONREAD
  [NET]: Validate device addr prior to interface-up
2007-10-25 15:50:32 -07:00
Gerrit Renker
24c667db59 [CCID2/3]: Initialisation assignments of 0 are redundant
Assigning initial values of `0' is redundant when loading a new CCID structure,
since in net/dccp/ccid.c the entire CCID structure is zeroed out prior to
initialisation in ccid_new():

    	struct ccid {
    		struct ccid_operations *ccid_ops;
    		char		       ccid_priv[0];
    	};

    	// ...
    	if (rx) {
    		memset(ccid + 1, 0, ccid_ops->ccid_hc_rx_obj_size);
    		if (ccid->ccid_ops->ccid_hc_rx_init != NULL &&
    		    ccid->ccid_ops->ccid_hc_rx_init(ccid, sk) != 0)
    			goto out_free_ccid;
    	} else {
    		memset(ccid + 1, 0, ccid_ops->ccid_hc_tx_obj_size);
    		/* analogous to the rx case */
    	}

This patch therefore removes the redundant assignments. Thanks to Arnaldo for
the inspiration.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:53:01 -02:00
Gerrit Renker
76fd1e87d9 [DCCP]: Unaligned pointer access
This fixes `unaligned (read) access' errors of the type

Kernel unaligned access at TPC[100f970c] dccp_parse_options+0x4f4/0x7e0 [dccp]
Kernel unaligned access at TPC[1011f2e4] ccid3_hc_tx_parse_options+0x1ac/0x380 [dccp_ccid3]
Kernel unaligned access at TPC[100f9898] dccp_parse_options+0x680/0x880 [dccp]

by using the get_unaligned macro for parsing options.

Commiter note: Preserved the sparse __be{16,32} annotations.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:46:58 -02:00
Gerrit Renker
d8ef2c29a0 [DCCP]: Convert Reset code into socket error number
This adds support for converting the 11 currently defined Reset codes into system
error numbers, which are stored in sk_err for further interpretation.

This makes the externally visible API behaviour similar to TCP, since a client
connecting to a non-existing port will experience ECONNREFUSED.

* Code 0, Unspecified, is interpreted as non-error (0);
* Code 1, Closed (normal termination), also maps into 0;
* Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET);
* Code 3, No Connection and
  Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED);
* Code 4, Packet Error, maps into "No message of desired type" (ENOMSG);
* Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ);
* Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP);
* Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC);
* Code 9, Too Busy, maps into "Too many users" (EUSERS);
* Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR);
* Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT)
  which makes sense in terms of using more than the `fair share' of bandwidth.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:27:48 -02:00
Gerrit Renker
1238d0873b [DCCP]: One more exemption from full sequence number checks
This fixes the following problem: client connects to peer which has no DCCP
enabled or loaded; ICMP error messages ("Protocol Unavailable") can be seen
on the wire, but the application hangs. Reason: ICMP packets don't get through
to dccp_v4_err.

When reporting errors, a sequence number check is made for the DCCP packet
that had caused an ICMP error to arrive.
Such checks can not be made if the socket is in state LISTEN, RESPOND (which
in the implementation is the same as LISTEN), or REQUEST, since update_gsr()
has not been called in these states, hence the sequence window is 0..0.

This patch fixes the problem by adding the REQUEST state as another exemption
to the window check. The error reporting now works as expected on connecting.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
2007-10-24 10:18:06 -02:00
Gerrit Renker
fde20105f3 [DCCP]: Retrieve packet sequence number for error reporting
This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err:
* dccp_hdr_seq currently takes an skb
* however, the transport headers in the skb are shifted, due to the
  preceding IPv4/v6 header.
Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as
argument. Verified that the correct sequence number is now reported in the
error handler.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-10-24 10:12:09 -02:00
Jens Axboe
642f149031 SG: Change sg_set_page() to take length and offset argument
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.

Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-24 11:20:47 +02:00
Geert Uytterhoeven
5a1cb47ff4 m68k: sg fallout
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>
2007-10-24 08:55:40 +02:00
Pavel Emelyanov
03cf786c4e [IPV4]: Explicitly call fib_get_table() in fib_frontend.c
In case the "multiple tables" config option is y, the ip_fib_local_table
is not a variable, but a macro, that calls fib_get_table(RT_TABLE_LOCAL).

Some code uses this "variable" *3* times in one place, thus implicitly
making 3 calls. Fix it.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:58 -07:00
Pavel Emelyanov
f0fe91ded3 [NET]: Use BUILD_BUG_ON in net/core/flowi.c
Instead of ugly extern not-existing function.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:57 -07:00
Pavel Emelyanov
342709efc7 [NET]: Remove in-code externs for some functions from net/core/dev.c
Inconsistent prototype and real type for functions may have worse
consequences, than those for variables, so move them into a header.

Since they are used privately in net/core, make this file reside in
the same place.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:56 -07:00
Pavel Emelyanov
a37ae4086e [NET]: Don't declare extern variables in net/core/sysctl_net_core.c
Some are already declared in include/linux/netdevice.h, while
some others (xfrm ones) need to be declared.

The driver/net/rrunner.c just uses same extern as well, so
cleanup it also.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:56 -07:00
Chuck Lever
c2636b4d9e [NET]: Treat the sign of the result of skb_headroom() consistently
In some places, the result of skb_headroom() is compared to an unsigned
integer, and in others, the result is compared to a signed integer.  Make
the comparisons consistent and correct.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:55 -07:00
Roel Kluin
0ffdd58149 [9P]: Fix missing unlock before return in p9_mux_poll_start
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:54 -07:00
Pavel Emelyanov
0034622693 [PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE
Fix one more user of netiff_subqueue_stopped. To check for the
queue id one must use the __netiff_subqueue_stoped call.

This run out of my sight when I made the:

668f895a85
[NET]: Hide the queue_mapping field inside netif_subqueue_stopped

commit :(

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:53 -07:00
Timo Teras
6a5f44d7a0 [IPV4] ip_gre: sendto/recvfrom NBMA address
When GRE tunnel is in NBMA mode, this patch allows an application to use
a PF_PACKET socket to:
- send a packet to specific NBMA address with sendto()
- use recvfrom() to receive packet and check which NBMA address it came from

This is required to implement properly NHRP over GRE tunnel.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:53 -07:00
Pavel Emelyanov
16d14ef9f2 [SCTP]: Consolidate sctp_ulpq_renege_xxx functions
Both are equal, except for the list to be traversed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:52 -07:00
Denis V. Lunev
5c58298c25 [NETLINK]: Fix ACK processing after netlink_dump_start
Revert to original netlink behavior. Do not reply with ACK if the
netlink dump has bees successfully started.

libnl has been broken by the cd40b7d398
The following command reproduce the problem:
   /nl-route-get 192.168.1.1

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:51 -07:00
Arnaldo Carvalho de Melo
6273172e17 [DCCP]: Implement SIOCINQ/FIONREAD
Just like UDP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Leandro Melo de Sales <leandroal@gmail.com>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:50 -07:00
Jeff Garzik
bada339ba2 [NET]: Validate device addr prior to interface-up
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-23 21:27:50 -07:00
Linus Torvalds
01e7ae8c13 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: v9fs_vfs_rename incorrect clunk order
  9p: fix memleak in fs/9p/v9fs.c
  9p: add virtio transport
2007-10-23 12:04:01 -07:00
Ralf Baechle
117636092a [PATCH] Fix breakage after SG cleanups
Commits

  58b053e4ce ("Update arch/ to use sg helpers")
  45711f1af6 ("[SG] Update drivers to use sg helpers")
  fa05f1286b ("Update net/ to use sg helpers")

converted many files to use the scatter gather helpers without ensuring
that the necessary headerfile <linux/scatterlist> is included.  This
happened to work for ia64, powerpc, sparc64 and x86 because they
happened to drag in that file via their <asm/dma-mapping.h>.

On most of the others this probably broke.

Instead of increasing the header file spider web I choose to include
<linux/scatterlist.h> directly into the affectes files.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-23 12:02:39 -07:00