Commit Graph

10672 Commits

Author SHA1 Message Date
Patrick McHardy
869f37d8e4 [NETFILTER]: nf_conntrack/nf_nat: add IRC helper port
Add nf_conntrack port of the IRC conntrack/NAT helper. Since DCC doesn't
support IPv6 yet, the helper is still IPv4 only.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:09:06 -08:00
Patrick McHardy
f587de0e2f [NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port
Add IPv4 and IPv6 capable nf_conntrack port of the H.323 conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:46 -08:00
Patrick McHardy
1695890057 [NETFILTER]: nf_conntrack/nf_nat: add amanda helper port
Add IPv4 and IPv6 capable nf_conntrack port of the Amanda conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:26 -08:00
Patrick McHardy
d6a9b6500a [NETFILTER]: nf_conntrack: add helper function for expectation initialization
Expectation address masks need to be differently initialized depending
on the address family, create helper function to avoid cluttering up
the code too much.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:01 -08:00
Jozsef Kadlecsik
55a733247d [NETFILTER]: nf_nat: add FTP NAT helper port
Add FTP NAT helper.

Split out from Jozsef's big nf_nat patch with a few small fixes by myself.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:07:44 -08:00
Jozsef Kadlecsik
5b1158e909 [NETFILTER]: Add NAT support for nf_conntrack
Add NAT support for nf_conntrack. Joint work of Jozsef Kadlecsik,
Yasuyuki Kozakai, Martin Josefsson and myself.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:07:13 -08:00
Patrick McHardy
9457d851fc [NETFILTER]: nf_conntrack: automatic helper assignment for expectations
Some helpers (namely H.323) manually assign further helpers to expected
connections. This is not possible with nf_conntrack anymore since we
need to know whether a helper is used at allocation time.

Handle the helper assignment centrally, which allows to perform the
correct allocation and as a nice side effect eliminates the need
for the H.323 helper to fiddle with nf_conntrack_lock.

Mid term the allocation scheme really needs to be redesigned since
we do both the helper and expectation lookup _twice_ for every new
connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:05:25 -08:00
Patrick McHardy
bff9a89bca [NETFILTER]: nf_conntrack: endian annotations
Resync with Al Viro's ip_conntrack annotations and fix a missed
spot in ip_nat_proto_icmp.c.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:05:08 -08:00
Patrick McHardy
f9aae95828 [NETFILTER]: nf_conntrack: fix helper structure alignment
Adding the alignment to the size doesn't make any sense, what it
should do is align the size of the conntrack structure to the
alignment requirements of the helper structure and return an
aligned pointer in nfct_help().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:04:50 -08:00
Patrick McHardy
9a7c9337a0 [NET]: Accept wildcard delimiters in in[46]_pton
Accept -1 as delimiter to abort parsing without an error at the first
unknown character. This is needed by the upcoming nf_conntrack SIP
helper, where addresses are delimited by either '\r' or '\n' characters.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:04:04 -08:00
Jamal Hadi Salim
a4d1366d50 [GENETLINK]: Add cmd dump completion.
Remove assumption that generic netlink commands cannot have dump
completion callbacks.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:09 -08:00
Al Viro
1e419cd995 [EBTABLES]: Split ebt_replace into user and kernel variants, annotate.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:05 -08:00
Miika Komu
76b3f055f3 [IPSEC]: Add encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:48 -08:00
Patrick McHardy
43effa1e57 [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters (part 1)
There are multiple problems related to qlen adjustment that can lead
to an upper qdisc getting out of sync with the real number of packets
queued, leading to endless dequeueing attempts by the upper layer code.

All qdiscs must maintain an accurate q.qlen counter. There are basically
two groups of operations affecting the qlen: operations that propagate
down the tree (enqueue, dequeue, requeue, drop, reset) beginning at the
root qdisc and operations only affecting a subtree or single qdisc
(change, graft, delete class). Since qlen changes during operations from
the second group don't propagate to ancestor qdiscs, their qlen values
become desynchronized.

This patch adds a function to propagate qlen changes up the qdisc tree,
optionally calling a callback function to perform qdisc-internal
maintenance when the child qdisc becomes empty. The follow-up patches
will convert all qdiscs to use this function where necessary.

Noticed by Timo Steinbach <tsteinbach@astaro.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:42 -08:00
Patrick McHardy
9f9afec482 [NET_SCHED]: Set parent classid in default qdiscs
Set parent classids in default qdiscs to allow walking up the tree
from outside the qdiscs. This is needed by the next patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:41 -08:00
Paul Moore
0275276035 NetLabel: convert to an extensibile/sparse category bitmap
The original NetLabel category bitmap was a straight char bitmap which worked
fine for the initial release as it only supported 240 bits due to limitations
in the CIPSO restricted bitmap tag (tag type 0x01).  This patch converts that
straight char bitmap into an extensibile/sparse bitmap in order to lay the
foundation for other CIPSO tag types and protocols.

This patch also has a nice side effect in that all of the security attributes
passed by NetLabel into the LSM are now in a format which is in the host's
native byte/bit ordering which makes the LSM specific code much simpler; look
at the changes in security/selinux/ss/ebitmap.c as an example.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:31:36 -08:00
Bart De Schuymer
d12cdc3ccf [NETFILTER]: ebtables: add --snap-arp option
The attached patch adds --snat-arp support, which makes it possible to
change the source mac address in both the mac header and the arp header
with one rule.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:32 -08:00
Patrick McHardy
baf7b1e112 [NETFILTER]: x_tables: add NFLOG target
Add new NFLOG target to allow use of nfnetlink_log for both IPv4 and IPv6.
Currently we have two (unsupported by userspace) hacks in the LOG and ULOG
targets to optionally call to the nflog API. They lack a few features,
namely the IPv4 and IPv6 LOG targets can not specify a number of arguments
related to nfnetlink_log, while the ULOG target is only available for IPv4.
Remove those hacks and add a clean way to use nfnetlink_log.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:31 -08:00
Patrick McHardy
39b46fc6f0 [NETFILTER]: x_tables: add port of hashlimit match for IPv4 and IPv6
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:31 -08:00
Patrick McHardy
d7a5c32442 [NETFILTER]: nfnetlink_log: remove useless prefix length limitation
There is no reason for limiting netlink attributes in size.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:30 -08:00
Eric Leblond
829e17a1a6 [NETFILTER]: nfnetlink_queue: allow changing queue length through netlink
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:29 -08:00
Patrick McHardy
1b683b5512 [NETFILTER]: sip conntrack: better NAT handling
The NAT handling of the SIP helper has a few problems:

- Request headers are only mangled in the reply direction, From/To headers
  not at all, which can lead to authentication failures with DNAT in case
  the authentication domain is the IP address

- Contact headers in responses are only mangled for REGISTER responses

- Headers may be mangled even though they contain addresses not
  participating in the connection, like alternative addresses

- Packets are droppen when domain names are used where the helper expects
  IP addresses

This patch takes a different approach, instead of fixed rules what field
to mangle to what content, it adds symetric mapping of From/To/Via/Contact
headers, which allows to deal properly with echoed addresses in responses
and foreign addresses not belonging to the connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:26 -08:00
Patrick McHardy
40883e8184 [NETFILTER]: sip conntrack: do case insensitive SIP header search
SIP headers are generally case-insensitive, only SDP headers are
case sensitive.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:24 -08:00
Patrick McHardy
9d5b8baa4e [NETFILTER]: sip conntrack: minor cleanup
- Use enum for header field enumeration
- Use numerical value instead of pointer to header info structure to
  identify headers, unexport ct_sip_hdrs
- group SIP and SDP entries in header info structure
- remove double forward declaration of ct_sip_get_info

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:23 -08:00
Yasuyuki Kozakai
468ec44bd5 [NETFILTER]: conntrack: add '_get' to {ip, nf}_conntrack_expect_find
We usually uses 'xxx_find_get' for function which increments
reference count.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:21 -08:00
Patrick McHardy
e4bd8bce3e [NETFILTER]: nf_conntrack: /proc compatibility with old connection tracking
This patch adds /proc/net/ip_conntrack, /proc/net/ip_conntrack_expect and
/proc/net/stat/ip_conntrack files to keep old programs using them working.

The /proc/net/ip_conntrack and /proc/net/ip_conntrack_expect files show only
IPv4 entries, the /proc/net/stat/ip_conntrack shows global statistics.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:20 -08:00
Patrick McHardy
a999e68376 [NETFILTER]: nf_conntrack: sysctl compatibility with old connection tracking
This patch adds an option to keep the connection tracking sysctls visible
under their old names.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:19 -08:00
Patrick McHardy
d62f9ed4a4 [NETFILTER]: nf_conntrack: automatic sysctl registation for conntrack protocols
Add helper functions for sysctl registration with optional instantiating
of common path elements (like net/netfilter) and use it for support for
automatic registation of conntrack protocol sysctls.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:17 -08:00
Patrick McHardy
f8eb24a89a [NETFILTER]: nf_conntrack: move extern declaration to header files
Using extern in a C file is a bad idea because the compiler can't
catch type errors.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:16 -08:00
Martin Josefsson
824621eddd [NETFILTER]: nf_conntrack: remove unused struct list_head from protocols
Remove unused struct list_head from struct nf_conntrack_l3proto and
nf_conntrack_l4proto as all protocols are kept in arrays, not linked
lists.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:13 -08:00
Martin Josefsson
ae5718fb3d [NETFILTER]: nf_conntrack: more sanity checks in protocol registration/unregistration
Add some more sanity checks when registering/unregistering l3/l4 protocols.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:10 -08:00
Martin Josefsson
605dcad6c8 [NETFILTER]: nf_conntrack: rename struct nf_conntrack_protocol
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:09 -08:00
Martin Josefsson
f61801218a [NETFILTER]: nf_conntrack: split out the event cache
This patch splits out the event cache into its own file
nf_conntrack_ecache.c

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:06 -08:00
Martin Josefsson
7e5d03bb9d [NETFILTER]: nf_conntrack: split out helper handling
This patch splits out handling of helpers into its own file
nf_conntrack_helper.c

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:05 -08:00
Martin Josefsson
77ab9cff0f [NETFILTER]: nf_conntrack: split out expectation handling
This patch splits out expectation handling into its own file
nf_conntrack_expect.c

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:04 -08:00
Gerrit Renker
5aed324369 [DCCP]: Tidy up unused structures
This removes and cleans up unused variables and structures which have become
unnecessary following the introduction of the EWMA patch to automatically track
the CCID 3 receiver/sender packet sizes `s'.

It deprecates the PACKET_SIZE socket option by returning an error code and
printing a deprecation warning if an application tries to read or write this
socket option.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:59 -08:00
Gerrit Renker
4384260443 [DCCP]: Remove allocation of sysctl numbers
This is in response to a request sent earlier by Eric W. Biederman
and replaces all sysctl numbers for net.dccp.default with CTL_UNNUMBERED.

It has been tested to compile and to work.

Commiter note: I've removed the use of CTL_UNNUMBERED, not setting .ctl_name
               sets it to 0, that is the what CTL_UNNUMBERED is, reason is
               to avoid unneeded source code cluttering.

Signed-off-by: Gerrit Renker  <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:56 -08:00
Arnaldo Carvalho de Melo
ee41e2dff1 [INET]: Change protocol field in struct inet_protosw to u16
[acme@newtoy net-2.6.20]$ pahole /tmp/tcp_ipv6.o inet_protosw
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/protocol.h:69 */
struct inet_protosw {
        struct list_head           list;                 /*     0     8 */
        short unsigned int         type;                 /*     8     2 */

        /* XXX 2 bytes hole, try to pack */

        int                        protocol;             /*    12     4 */
        struct proto *             prot;                 /*    16     4 */
        const struct proto_ops  *  ops;                  /*    20     4 */
        int                        capability;           /*    24     4 */
        char                       no_check;             /*    28     1 */
        unsigned char              flags;                /*    29     1 */
}; /* size: 32, sum members: 28, holes: 1, sum holes: 2, padding: 2 */

So that we can kill that hole, protocol can only go all the way to 255 (RAW).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:55 -08:00
Arnaldo Carvalho de Melo
3a137d2065 [TCP]: Renove the __ prefix on the struct tcp_sock members
As this struct is not userland visible at all.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:54 -08:00
Arnaldo Carvalho de Melo
2ff52f282c [TCP]: Change tcp_header_len member in tcp_sock to u16
With this we eliminate the last hole in struct tcp_sock.

End result:

[acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp.o.before net/ipv4/tcp.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
  struct tcp_sock |   -4
    tcp_header_len;
     from: int                   /*  1000(0)     4(0) */
     to:   u16                   /*  1000(0)     2(0) */
 1 struct changed
[acme@newtoy net-2.6.20]$

Now sizeof(tcp_sock) is just...

[acme@newtoy net-2.6.20]$ pahole --sizes ../OUTPUT/qemu/net-2.6.20/net/ipv4/tcp.o | grep -w tcp_sock
struct tcp_sock: 1500 0

1500 bytes ;-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:53 -08:00
Arnaldo Carvalho de Melo
46ca5f5dc4 [XFRM]: Pack struct xfrm_policy
[acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o xfrm_policy
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/security.h:67 */
struct xfrm_policy {
        struct xfrm_policy *       next;                 /*     0     4 */
        struct hlist_node          bydst;                /*     4     8 */
        struct hlist_node          byidx;                /*    12     8 */
        rwlock_t                   lock;                 /*    20    36 */
        atomic_t                   refcnt;               /*    56     4 */
        struct timer_list          timer;                /*    60    24 */
        u8                         type;                 /*    84     1 */

        /* XXX 3 bytes hole, try to pack */

        u32                        priority;             /*    88     4 */
        u32                        index;                /*    92     4 */
        struct xfrm_selector       selector;             /*    96    56 */
        struct xfrm_lifetime_cfg   lft;                  /*   152    64 */
        struct xfrm_lifetime_cur   curlft;               /*   216    32 */
        struct dst_entry *         bundles;              /*   248     4 */
        __u16                      family;               /*   252     2 */
        __u8                       action;               /*   254     1 */
        __u8                       flags;                /*   255     1 */
        __u8                       dead;                 /*   256     1 */
        __u8                       xfrm_nr;              /*   257     1 */

        /* XXX 2 bytes hole, try to pack */

        struct xfrm_sec_ctx *      security;             /*   260     4 */
        struct xfrm_tmpl           xfrm_vec[6];          /*   264   360 */
}; /* size: 624, sum members: 619, holes: 2, sum holes: 5 */

So lets have just one hole instead of two, by moving 'type' to just before 'action',
end result:

[acme@newtoy net-2.6.20]$ codiff -s /tmp/tcp.o.before net/ipv4/tcp.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
  struct xfrm_policy |   -4
 1 struct changed
[acme@newtoy net-2.6.20]$

[acme@newtoy net-2.6.20]$ pahole -c 64 net/ipv4/tcp.o xfrm_policy
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/security.h:67 */
struct xfrm_policy {
        struct xfrm_policy *       next;                 /*     0     4 */
        struct hlist_node          bydst;                /*     4     8 */
        struct hlist_node          byidx;                /*    12     8 */
        rwlock_t                   lock;                 /*    20    36 */
        atomic_t                   refcnt;               /*    56     4 */
        struct timer_list          timer;                /*    60    24 */
        u32                        priority;             /*    84     4 */
        u32                        index;                /*    88     4 */
        struct xfrm_selector       selector;             /*    92    56 */
        struct xfrm_lifetime_cfg   lft;                  /*   148    64 */
        struct xfrm_lifetime_cur   curlft;               /*   212    32 */
        struct dst_entry *         bundles;              /*   244     4 */
        u16                        family;               /*   248     2 */
        u8                         type;                 /*   250     1 */
        u8                         action;               /*   251     1 */
        u8                         flags;                /*   252     1 */
        u8                         dead;                 /*   253     1 */
        u8                         xfrm_nr;              /*   254     1 */

        /* XXX 1 byte hole, try to pack */

        struct xfrm_sec_ctx *      security;             /*   256     4 */
        struct xfrm_tmpl           xfrm_vec[6];          /*   260   360 */
}; /* size: 620, sum members: 619, holes: 1, sum holes: 1 */

Are there any fugly data dependencies here? None that I know.

In the process changed the removed the __ prefixed types, that are just for
userspace visible headers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:48 -08:00
Arnaldo Carvalho de Melo
d5c42c0ec4 [NET]: Pack struct hh_cache
[acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o hh_cache
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/netdevice.h:190 */
struct hh_cache {
        struct hh_cache *          hh_next;              /*     0     4 */
        atomic_t                   hh_refcnt;            /*     4     4 */
        __be16                     hh_type;              /*     8     2 */

        /* XXX 2 bytes hole, try to pack */

        int                        hh_len;               /*    12     4 */
        int                        (*hh_output)();       /*    16     4 */
        rwlock_t                   hh_lock;              /*    20    36 */
        long unsigned int          hh_data[24];          /*    56    96 */
}; /* size: 152, sum members: 150, holes: 1, sum holes: 2 */

[acme@newtoy net-2.6.20]$ find net -name "*.[ch]" | xargs grep 'hh_len.\+=' | sort -u
net/atm/br2684.c:               hh->hh_len = PADLEN + ETH_HLEN;
net/ethernet/eth.c:     hh->hh_len = ETH_HLEN;
net/ipv4/ipconfig.c:    int hh_len = LL_RESERVED_SPACE(dev);
net/ipv4/ip_output.c:   hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
net/ipv4/ip_output.c:   int hh_len = LL_RESERVED_SPACE(dev);
net/ipv4/netfilter.c:   hh_len = (*pskb)->dst->dev->hard_header_len;
net/ipv4/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
net/ipv6/ip6_output.c:  hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
net/ipv6/netfilter/ip6t_REJECT.c:       hh_len = (dst->dev->hard_header_len + 15)&~15;
net/ipv6/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
[acme@newtoy net-2.6.20]$

[acme@newtoy net-2.6.20]$ find include -name "*.h" | xargs grep 'define ETH_HLEN'
include/linux/if_ether.h:#define ETH_HLEN       14              /* Total octets in header.       */

        (((dev)->hard_header_len&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

[acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o net_device | grep hard_header_len
        short unsigned int         hard_header_len;      /*   106     2 */
[acme@newtoy net-2.6.20]$

So I think we're safe in turning hh_len an u16, end result:

[acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp.o.before net/ipv4/tcp.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
  struct hh_cache |   -4
    hh_len;
     from: int                   /*    12(0)     4(0) */
     to:   u16                   /*    10(0)     2(0) */
 1 struct changed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:47 -08:00
Arnaldo Carvalho de Melo
850db6b8c5 [INET_CONNECTION_SOCK]: Pack struct inet_connection_sock_af_ops
We have a hole in:

[acme@newtoy net-2.6.20]$ pahole net/ipv6/tcp_ipv6.o inet_connection_sock_af_ops
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/inet_connection_sock.h:38 */
struct inet_connection_sock_af_ops {
        int                        (*queue_xmit)();      /*     0     4 */
        void                       (*send_check)();      /*     4     4 */
        int                        (*rebuild_header)();  /*     8     4 */
        int                        (*conn_request)();    /*    12     4 */
        struct sock *              (*syn_recv_sock)();   /*    16     4 */
        int                        (*remember_stamp)();  /*    20     4 */
        __u16                      net_header_len;       /*    24     2 */

        /* XXX 2 bytes hole, try to pack */

        int                        (*setsockopt)();      /*    28     4 */
        int                        (*getsockopt)();      /*    32     4 */
        int                        (*compat_setsockopt)(); /*    36     4 */
        int                        (*compat_getsockopt)(); /*    40     4 */
        void                       (*addr2sockaddr)();   /*    44     4 */
        int                        sockaddr_len;         /*    48     4 */
}; /* size: 52, sum members: 50, holes: 1, sum holes: 2 */

But we don't need sockaddr_len to be an int:

[acme@newtoy net-2.6.20]$ find net -name "*.[ch]" | xargs grep '\.sockaddr_len.\+=' | sort -u
net/dccp/ipv4.c:        .sockaddr_len      = sizeof(struct sockaddr_in),
net/dccp/ipv6.c:        .sockaddr_len      = sizeof(struct sockaddr_in6),
net/ipv4/tcp_ipv4.c:    .sockaddr_len      = sizeof(struct sockaddr_in),
net/ipv6/tcp_ipv6.c:    .sockaddr_len      = sizeof(struct sockaddr_in6),
net/sctp/ipv6.c:        .sockaddr_len      = sizeof(struct sockaddr_in6),
net/sctp/protocol.c:    .sockaddr_len      = sizeof(struct sockaddr_in),

[acme@newtoy net-2.6.20]$ pahole --sizes net/ipv6/tcp_ipv6.o | grep sockaddr_in
struct sockaddr_in: 16 0
struct sockaddr_in6: 28 0
[acme@newtoy net-2.6.20]$

So I turned sockaddr_len a 'u16', and now:

[acme@newtoy net-2.6.20]$ pahole net/ipv6/tcp_ipv6.o inet_connection_sock_af_ops
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/inet_connection_sock.h:38 */
struct inet_connection_sock_af_ops {
        int            (*queue_xmit)();        /*     0   4 */
        void           (*send_check)();        /*     4   4 */
        int            (*rebuild_header)();    /*     8   4 */
        int            (*conn_request)();      /*    12   4 */
        struct sock *  (*syn_recv_sock)();     /*    16   4 */
        int            (*remember_stamp)();    /*    20   4 */
        u16            net_header_len;         /*    24   2 */
        u16            sockaddr_len;           /*    26   2 */
        int            (*setsockopt)();        /*    28   4 */
        int            (*getsockopt)();        /*    32   4 */
        int            (*compat_setsockopt)(); /*    36   4 */
        int            (*compat_getsockopt)(); /*    40   4 */
        void           (*addr2sockaddr)();     /*    44   4 */
}; /* size: 48 */

So we've saved 4 bytes:

[acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp_ipv6.o.before net/ipv6/tcp_ipv6.o
/pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv6/tcp_ipv6.c:
  struct inet_connection_sock_af_ops |   -4
    net_header_len;
     from: __u16                 /*    24(0)     2(0) */
     to:   u16                   /*    24(0)     2(0) */
    sockaddr_len;
     from: int                   /*    48(0)     4(0) */
     to:   u16                   /*    26(0)     2(0) */
 1 struct changed
[acme@newtoy net-2.6.20]$

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:46 -08:00
Gerrit Renker
4c0a6cb0db [UDP(-Lite)]: consolidate v4 and v6 get|setsockopt code
This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The
justification is that UDP(-Lite) is a transport-layer protocol and therefore
the socket option code (at least in theory) should be AF-independent.

Furthermore, there is the following code reduplication:
 * do_udp{,v6}_getsockopt is 100% identical between v4 and v6
 * do_udp{,v6}_setsockopt is identical up to the following differerence
	--v4 in contrast to v4 additionally allows the experimental encapsulation
          types  UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE
	--the remainder is identical between v4 and v6
   I believe that this difference is of little relevance.

The advantages in not duplicating twice almost completely identical code.

The patch further simplifies the interface of udp{,v6}_push_pending_frames,
since for the second argument (struct udp_sock *up) it always holds that
up = udp_sk(sk); where sk is the first function argument.

Signed-off-by: Gerrit Renker  <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:45 -08:00
Thomas Graf
e3703b3de1 [RTNETLINK]: Add rtnl_put_cacheinfo() to unify some code
IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar
way, therefore rtnl_put_cacheinfo() is added to reuse code.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:44 -08:00
Thomas Graf
4e9b826935 [NETLINK]: Remove unused dst_pid field in netlink_skb_parms
The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:43 -08:00
Gerrit Renker
d61c167dd0 [NET]: Add documentation for TFRC structures
This adds documentation for the TFRC structure fields.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:30:42 -08:00
Thomas Graf
4a89c2562c [DECNET] address: Convert to new netlink interface
Extends the netlink interface to support the __le16 type and
converts address addition, deletion and, dumping to use the
new netlink interface.

Fixes multiple occasions of possible illegal memory references
due to not validated netlink attributes.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:30:30 -08:00
Al Viro
66c6f529c3 [NET]: net/sched annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:19 -08:00
Al Viro
ff1dcadb1b [NET]: Split skb->csum
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:18 -08:00
Al Viro
8e5200f540 [NET]: Fix assorted misannotations (from md5 and udplite merges).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:16 -08:00
Al Viro
2178eda826 [SCTP]: SCTP_CMD_PROCESS_CTSN annotations.
argument passed as __be32

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:14 -08:00
Al Viro
962c837275 [SCTP]: Netfilter sctp annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:12 -08:00
Al Viro
3dbe86566e [SCTP]: Annotate ->supported_addrs().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:11 -08:00
Al Viro
e1857ea28d [SCTP]: sctp_association ->peer.i is a host-endian analog of sctp_inthdr.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:10 -08:00
Al Viro
6fbfa9f951 [SCTP]: Annotate ->inaddr_any().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:08 -08:00
Al Viro
c9c938cb05 [SCTP]: flip_to_{h,n}() are not needed anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:07 -08:00
Al Viro
516b20ee2d [SCTP]: ->a_h is gone now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:05 -08:00
Al Viro
74af924ab6 [SCTP]: ->a_h is gone now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:27:00 -08:00
Al Viro
80f15d6241 [SCTP]: ->source_h is not used anymore.
kill it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:57 -08:00
Al Viro
a926626893 [SCTP]: Switch all remaining users of ->saddr_h to ->saddr.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:56 -08:00
Al Viro
dd86d136f9 [SCTP]: Switch ->from_addr_param() to net-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:48 -08:00
Al Viro
854d43a465 [SCTP]: Annotate ->dst_saddr()
switched to taking a pointer to net-endian sctp_addr
and a net-endian port number.  Instances and callers
adjusted; interestingly enough, the only calls are
direct calls of specific instances - the method is not
used at all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:35 -08:00
Al Viro
2a6fd78ade [SCTP] embedded sctp_addr: net-endian mirrors
Add sctp_chunk->source, sctp_sockaddr_entry->a, sctp_transport->ipaddr
and sctp_transport->saddr, maintain them as net-endian mirrors of
their host-endian counterparts.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:30 -08:00
Al Viro
09ef7fecea [SCTP]: Beginning of conversion to net-endian for embedded sctp_addr.
Part 1: rename sctp_chunk->source, sctp_sockaddr_entry->a,
sctp_transport->ipaddr and sctp_transport->saddr (to ..._h)

The next patch will reintroduce these fields and keep them as
net-endian mirrors of the original (renamed) ones.  Split in
two patches to make sure that we hadn't forgotten any instanes.

Later in the series we'll eliminate uses of host-endian variants
(basically switching users to net-endian counterparts as we
progress through that mess).  Then host-endian ones will die.

Other embedded host-endian sctp_addr will be easier to switch
directly, so we leave them alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:29 -08:00
Al Viro
04afd8b282 [SCTP]: Beginning of sin_port fixes.
That's going to be a long series.  Introduced temporary helpers
doing copy-and-convert for sctp_addr; they are used to kill
flip-in-place in global data structures and will be used
to gradually push host-endian uses of sctp_addr out of existence.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:24 -08:00
Al Viro
dbc16db1e5 [SCTP]: Trivial sctp endianness annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:23 -08:00
Al Viro
72f17e1c09 [SCTP]: Annotate tsn_dups.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:22 -08:00
Al Viro
dc251b2b1c [SCTP]: SCTP_CMD_INIT_FAILED annotations.
argument stored for SCTP_CMD_INIT_FAILED is always __be16
(protocol error).  Introduced new field and accessor for
it (SCTP_PERR()); switched to their use (from SCTP_U32() and
.u32)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:20 -08:00
Al Viro
63706c5c6f [SCTP]: sctp_make_op_error() annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:18 -08:00
Al Viro
5bf2db0390 [SCTP]: Annotate sctp_init_cause().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:17 -08:00
Al Viro
f3ffaf1468 [SCTP]: Annotate SCTP headers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:16 -08:00
Adrian Bunk
89c8945815 [IPV6] net/ipv6/sit.c: make 2 functions static
This patch makes two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:26:15 -08:00
Ian McDonald
82e3ab9dbe [DCCP]: Adds the tx buffer sysctls
This one got lost on the way from Ian to Gerrit to me, fix it.

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:24:42 -08:00
Michael Chan
bac0dff6cd [BNX2]: Add 5709 PCI ID.
Add PCI ID and detection for 5709 copper and SerDes chips.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:24:33 -08:00
Paul Moore
c6b1677a54 NetLabel: use the correct CIPSOv4 MLS label limits
The CIPSOv4 engine currently has MLS label limits which are slightly larger
than what the draft allows.  This is not a major problem due to the current
implementation but we should fix this so it doesn't bite us later.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:12 -08:00
Paul Moore
701a90bad9 NetLabel: make netlbl_lsm_secattr struct easier/quicker to understand
The existing netlbl_lsm_secattr struct required the LSM to check all of the
fields to determine if any security attributes were present resulting in a lot
of work in the common case of no attributes.  This patch adds a 'flags' field
which is used to indicate which attributes are present in the structure; this
should allow the LSM to do a quick comparison to determine if the structure
holds any security attributes.

Example:

 if (netlbl_lsm_secattr->flags)
	/* security attributes present */
 else
	/* NO security attributes present */

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:07 -08:00
Paul Moore
c6fa82a9dd NetLabel: change netlbl_secattr_init() to return void
The netlbl_secattr_init() function would always return 0 making it pointless
to have a return value.  This patch changes the function to return void.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:06 -08:00
Paul Moore
1f758d9354 NetLabel: use gfp_t instead of int where it makes sense
There were a few places in the NetLabel code where the int type was being used
instead of the gfp_t type, this patch corrects this mistake.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:24:04 -08:00
Arnaldo Carvalho de Melo
58a5a7b955 [NET]: Conditionally use bh_lock_sock_nested in sk_receive_skb
Spotted by Ian McDonald, tentatively fixed by Gerrit Renker:

http://www.mail-archive.com/dccp%40vger.kernel.org/msg00599.html

Rewritten not to unroll sk_receive_skb, in the common case, i.e. no lock
debugging, its optimized away.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:23:51 -08:00
David S. Miller
6bb100b9fc [UDPLite]: udplite.h needs ip6_checksum.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:48 -08:00
Al Viro
1f61ab5ca5 [NET]: Preliminaty annotation of skb->csum.
It's still not completely right; we need to split it into anon unions
of __wsum and unsigned - for cases when we use it for partial checksum
and for offset of checksum in skb

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:44 -08:00
Al Viro
43bc0ca7ea [NET]: netfilter checksum annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:42 -08:00
Al Viro
f9214b2627 [NET]: ipvs checksum annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:41 -08:00
Al Viro
5c78f275e6 [NET]: IP header modifier helpers annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:40 -08:00
Al Viro
f6ab028804 [NET]: Make mangling a checksum (0 -> 0xffff on the wire) explicit.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:39 -08:00
Al Viro
b51655b958 [NET]: Annotate __skb_checksum_complete() and friends.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:38 -08:00
Al Viro
b1550f2212 [NET]: Annotate ip_vs_checksum_complete() and callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:37 -08:00
Al Viro
81d7766276 [NET]: Annotate skb_copy_and_csum_bits() and callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:36 -08:00
Al Viro
2bbbc86890 [NET]: Annotate skb_checksum() and callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:35 -08:00
Al Viro
5084205faf [NET]: Annotate callers of csum_partial_copy_...() and csum_and_copy...() in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:33 -08:00
Al Viro
44bb93633f [NET]: Annotate csum_partial() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:32 -08:00
Al Viro
868c86bcb5 [NET]: annotate csum_ipv6_magic() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:31 -08:00
Al Viro
6b11687ef0 [NET]: Annotate csum_tcpudp_magic() callers in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:29 -08:00
Al Viro
d6f5493c1a [NET]: Annotate callers of csum_tcpudp_nofold() in net/*
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:28 -08:00
Al Viro
9981a0e36a [NET]: Annotate checksums in on-the-wire packets.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:26 -08:00
Al Viro
56649d5d3c [NET]: Generic checksum annotations and cleanups.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:25 -08:00
Al Viro
b8e4e01dd5 [NET]: XTENSA checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill csum_partial_copy_fromuser
* kill csum_partial_copy
* kill useless shifts
* usual ntohs->shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:24 -08:00
Al Viro
d5c6393641 [NET]: SPARC64 checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill useless shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:23 -08:00
Al Viro
16ec7e168d [NET]: SPARC checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill bogus access_ok() in csum_partial_copy_from_user (the only caller checks)
* kill useless shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:22 -08:00
Al Viro
7c73a746ba [NET]: SH checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill useless shifts
* usual ntohs->shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:22 -08:00
Al Viro
f994aae1bd [NET]: S390 checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill useless shifts

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:21 -08:00
Al Viro
879178cfbe [NET]: POWERPC checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill useless shifts

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:20 -08:00
Al Viro
72685fcd28 [NET]: I386 checksum annotations and cleanups.
* sanitize prototypes, annotate
* usual ntohs->shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:19 -08:00
Al Viro
475b8f311b [NET]: AVR32 checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill useless shifts
* kill useless ntohs (it's big-endian, for fsck sake!)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:18 -08:00
Al Viro
0ffb7122bc [NET]: ARM26 checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill csum_partial_copy

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:16 -08:00
Al Viro
eb5a965865 [NET]: ARM checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill csum_partial_copy
* usual ntohs->shift, this time in assembler part

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:15 -08:00
Al Viro
a4f89fb7c0 [NET]: X86_64 checksum annotations and cleanups.
* sanitize prototypes, annotate
* usual ntohs->shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:14 -08:00
Al Viro
9d3d419558 [NET]: V850 checksum annotations and cleanups.
* sanitize prototypes, annotate
* collapse csum_partial_copy
* usual ntohs->shift

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:13 -08:00
Al Viro
c459dd90f0 [NET]: SH64 checksum annotations and cleanups.
* sanitize prototypes, annotate
* collapse csum_partial_copy
* kill csum_partial_copy_fromuser
* ntohs->shift in checksum calculation

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:11 -08:00
Al Viro
7814e4b6d6 [NET]: PARISC checksum annotations and cleanups.
* sanitized prototypes, annotated
* kill shift-by-16 in checksum calculation

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:10 -08:00
Al Viro
8e3d8433d8 [NET]: MIPS checksum annotations and cleanups.
* sanitize prototypes, annotate
* kill shift-by-16 in checksum calculations
* htons->shift in l-e checksum calculations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:09 -08:00
Al Viro
59ed05a7e8 [NET]: M68Knommu checksum annotations and cleanups.
* sanitize prototypes, annotated
* collapsed csum_partial_copy()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:08 -08:00
Al Viro
2061acaaae [NET]: M68K checksum annotations and cleanups.
* sanitize prototypes, annotate

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:07 -08:00
Al Viro
85d20dee20 [NET]: M32R checksum annotations and cleanups.
* sanitize prototypes, annotate
* ntohs -> shift in checksum calculations in l-e case
* kill shift-by-16 in checksum calculations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:06 -08:00
Al Viro
322529961e [NET]: IA64 checksum annotations and cleanups.
* sanitize prototypes, annotate
* ntohs -> shift in checksum calculations
* kill access_ok() in csum_partial_copy_from_user
* collapse do_csum_partial_copy_from_user

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:05 -08:00
Al Viro
db521083bc [NET]: H8300 checksum annotations and cleanups.
* sanitize prototypes and annotate
* collapse csum_partial_copy

NB: csum_partial() is almost certainly still buggy.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:04 -08:00
Al Viro
8042c44b8a [NET]: FRV checksum annotations.
* sanitize prototypes and annotate
* collapse csum_partial_copy

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:03 -08:00
Al Viro
3532010bcf [NET]: Cris checksum annotations and cleanups.
* sanitize prototypes and annotate
* kill cast-as-lvalue abuses in csum_partial()
* usual ntohs-equals-shift for checksum purposes

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:02 -08:00
Al Viro
9be259aae5 [NET]: Alpha checksum annotations and cleanups.
* sanitize prototypes and annotate
* kill useless access_ok() in csum_partial_copy_from_user() (the only
caller checks it already).
* do_csum_partial_copy_from_user() is not needed now
* replace htons(len) with len << 8 - they are the same wrt checksums
on little-endian.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:01 -08:00
Al Viro
2bc357987a [NET]: Introduce types for checksums.
New types - for 16bit checksums and "unfolded" 32bit variant.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:23:00 -08:00
Al Viro
a64b78a077 [NET]: Annotate net_srandom().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:57 -08:00
Al Viro
47c183fa5e [BRIDGE]: Annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:56 -08:00
Al Viro
30d492da73 [ATM]: Annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:55 -08:00
Al Viro
42d224aa17 [NETFILTER]: More trivial annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:54 -08:00
Al Viro
ef296f56f8 [IPV6]: __ipv6_addr_diff() annotations and cleanup.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:53 -08:00
Al Viro
e69a4adc66 [IPV6]: Misc endianness annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:52 -08:00
Al Viro
b09b845ca6 [RANDOM]: Annotate random.h IP helpers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:51 -08:00
Al Viro
714e85be35 [IPV6]: Assorted trivial endianness annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:50 -08:00
Al Viro
448c31aa34 [IRDA]: Trivial annotations.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:48 -08:00
Gerrit Renker
ba4e58eca8 [NET]: Supporting UDP-Lite (RFC 3828) in Linux
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:

	* UDP and UDP-Lite now use separate object files
	* source file dependencies resolved via header files
	  net/ipv{4,6}/udp_impl.h
	* order of inclusion files in udp.c/udplite.c adapted
	  accordingly

[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)

This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
        * generic routines are all located in net/ipv4/udp.c
        * UDP-Lite specific routines are in net/ipv4/udplite.c
        * MIB/statistics support in /proc/net/snmp and /proc/net/udplite
        * shared API with extensions for partial checksum coverage

[NET/IPv6]: Extension for UDP-Lite over IPv6

It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
        * UDPv6 generic and shared code is in net/ipv6/udp.c
        * UDP-Litev6 specific extensions are in net/ipv6/udplite.c
        * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
        * support for IPV6_ADDRFORM
        * aligned the coding style of protocol initialisation with af_inet6.c
        * made the error handling in udpv6_queue_rcv_skb consistent;
          to return `-1' on error on all error cases
        * consolidation of shared code

[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support

The UDP-Lite patch further provides
        * API documentation for UDP-Lite
        * basic xfrm support
        * basic netfilter support for IPv4 and IPv6 (LOG target)

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:46 -08:00
Thomas Graf
6051e2f4fb [IPv6] prefix: Convert RTM_NEWPREFIX notifications to use the new netlink api
RTM_GETPREFIX is completely unused and is thus removed.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:45 -08:00
Thomas Graf
17c157c889 [GENL]: Add genlmsg_put_reply() to simplify building reply headers
By modyfing genlmsg_put() to take a genl_family and by adding
genlmsg_put_reply() the process of constructing the netlink
and generic netlink headers is simplified.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:42 -08:00
Thomas Graf
81878d27fd [GENL]: Add genlmsg_reply() to simply unicast replies to requests
A generic netlink user has no interest in knowing how to
address the source of the original request.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:41 -08:00
Thomas Graf
3dabc71578 [GENL]: Add genlmsg_new() to allocate generic netlink messages
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:40 -08:00
YOSHIFUJI Hideaki
cfb6eeb4c8 [TCP]: MD5 Signature Option (RFC2385) support.
Based on implementation by Rick Payne.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:39 -08:00
Stephen Hemminger
bf6bce71ea netpoll header cleanup
As Steve left netpoll beast, hopefully not to return soon.
He noticed that the header was messy. He straightened it
up and polished it a little, then waved goodbye.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-12-02 21:22:38 -08:00
Stephen Hemminger
5de4a473bd netpoll queue cleanup
The beast had a long and not very happy history. At one
point, a friend (netdump) had asked that he open up a little.
Well, the friend was long gone now, and the beast had
this dangling piece hanging (netpoll_queue).

It wasn't hard to stitch the netpoll_queue back in
where it belonged and make everything tidy.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-12-02 21:22:37 -08:00
Stephen Hemminger
2bdfe0baec netpoll retry cleanup
The netpoll beast was still not happy. If the beast got
clogged pipes, it tended to stare blankly off in space
for a long time.

The problem couldn't be completely fixed because the
beast talked with irq's disabled. But it could be made
less painful and shorter.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-12-02 21:22:36 -08:00
Stephen Hemminger
b6cd27ed33 netpoll per device txq
When the netpoll beast got really busy, it tended to clog
things, so it stored them for later. But the beast was putting
all it's skb's in one basket. This was bad because maybe some
pipes were clogged and others were not.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-12-02 21:22:33 -08:00
Stephen Hemminger
93ec2c723e netpoll info leak
After looking harder, Steve noticed that the netpoll
beast leaked a little every time it shutdown for a nap.
Not a big leak, but a nuisance kind of thing.

He took out his refcount duct tape and patched the
leak. It was overkill since there was already other
locking in that area, but it looked clean and wouldn't
attract fleas.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-12-02 21:22:32 -08:00
Gerrit Renker
09dbc3895e [DCCP]: Miscellaneous code tidy-ups
This patch does not change code; it performs some trivial clean/tidy-ups:

  * removal of a `debug_prefix' string in favour of the
    already existing dccp_role(sk)

  * add documentation of structures and constants

  * separated out the cases for invalid packets (step 1
    of the packet validation)

  * removing duplicate statements

  * combining declaration & initialisation

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:30 -08:00
Gerrit Renker
c02fdc0e81 [DCCP]: Make feature negotiation more readable
This patch replaces cryptic feature negotiation messages of type

Oct 31 15:42:20 kernel: dccp_feat_change: feat change type=32 feat=1
Oct 31 15:42:21 kernel: dccp_feat_change: feat change type=34 feat=1
Oct 31 15:42:21 kernel: dccp_feat_change: feat change type=32 feat=5

into ones of type:

Nov  2 13:54:45 kernel: dccp_feat_change: ChangeL(CCID (1), 3)
Nov  2 13:54:45 kernel: dccp_feat_change: ChangeR(CCID (1), 3)
Nov  2 13:54:45 kernel: dccp_feat_change: ChangeL(Ack Ratio (5), 2)

Also,
	* completed the feature number list wrt RFC 4340 sec. 6.4
	* annotating which ones have been implemented so far
	* implemented rudimentary sanity checking in feat.c (FIXMEs)
	* some minor fixes

Commiter note: uninlined dccp_feat_name and dccp_feat_typename, for
               consistency with dccp_{state,packet}_name, that, BTW,
               should be compiled only if CONFIG_IP_DCCP_DEBUG is
               selected, leaving this to another cset tho. Also
               shortened dccp_feat_negotiation_debug to dccp_feat_debug.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:29 -08:00
Gerrit Renker
b9df3cb8cf [TCP/DCCP]: Introduce net_xmit_eval
Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.

This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:27 -08:00
Adrian Bunk
90833aa4f4 [NET]: The scheduled removal of the frame diverter.
This patch contains the scheduled removal of the frame diverter.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:23 -08:00
Gerrit Renker
2e2e9e92bd [DCCP]: Add sysctls to control retransmission behaviour
This adds 3 sysctls which govern the retransmission behaviour of DCCP control
packets (3way handshake, feature negotiation).

It removes 4 FIXMEs from the code.

The close resemblance of sysctl variables to their TCP analogues is emphasised
not only by their name, but also by giving them the same initial values.
This is useful since there is not much practical experience with DCCP yet.

Furthermore, with regard to the previous patch, it is now possible to limit
the number of keepalive-Responses by setting net.dccp.default.request_retries
(also a bit like in TCP).

Lastly, added documentation of all existing DCCP sysctls.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:18 -08:00
Thomas Graf
339bf98ffc [NETLINK]: Do precise netlink message allocations where possible
Account for the netlink message header size directly in nlmsg_new()
instead of relying on the caller calculate it correctly.

Replaces error handling of message construction functions when
constructing notifications with bug traps since a failure implies
a bug in calculating the size of the skb.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:22:11 -08:00
Gerrit Renker
6f4e5fff1e [DCCP]: Support for partial checksums (RFC 4340, sec. 9.2)
This patch does the following:
  a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2]
  b) provides necessary socket options and documentation as to how to use them
  c) basic support and infrastructure for the Minimum Checksum Coverage feature
     [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user
     interface

In addition, it

 (1) fixes two bugs in the DCCPv4 checksum computation:
 	* pseudo-header used checksum_len instead of skb->len
	* incorrect checksum coverage calculation based on dccph_x
 (2) removes dccp_v4_verify_checksum() since it reduplicates code of the
     checksum computation; code calling this function is updated accordingly.
 (3) now uses skb_checksum(), which is safer than checksum_partial() if the
     sk_buff has is a non-linear buffer (has pages attached to it).
 (4) fixes an outstanding TODO item:
        * If P.CsCov is too large for the packet size, drop packet and return.

The code has been tested with applications, the latest version of tcpdump now
comes with support for partial DCCP checksums.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:22:09 -08:00
YOSHIFUJI Hideaki
a11d206d0f [IPV6]: Per-interface statistics support.
For IP MIB (RFC4293).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-12-02 21:22:08 -08:00
YOSHIFUJI Hideaki
7a3025b1b3 [IPV6]: Introduce ip6_dst_idev() to get inet6_dev{} stored in dst_entry{}.
Otherwise, we will see a lot of casts...

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2006-12-02 21:22:07 -08:00
Gerrit Renker
9b42078ed6 [DCCP]: Combine allocating & zeroing header space on skb
This is a code simplification:
it combines three often recurring operations into one inline function,

        * allocate `len' bytes header space in skb
        * fill these `len' bytes with zeroes
        * cast the start of this header space as dccp_hdr

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-02 21:21:55 -08:00
David S. Miller
931731123a [TCP]: Don't set SKB owner in tcp_transmit_skb().
The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.

Lmbench improvements on lat_tcp are minimal:

before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds

after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:52 -08:00
Stephen Hemminger
ce7bc3bf15 [TCP]: Restrict congestion control choices.
Allow normal users to only choose among a restricted set of congestion
control choices.  The default is reno and what ever has been configured
as default. But the policy can be changed by administrator at any time.

For example, to allow any choice:
    cp /proc/sys/net/ipv4/tcp_available_congestion_control \
       /proc/sys/net/ipv4/tcp_allowed_congestion_control

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:49 -08:00
Stephen Hemminger
3ff825b28d [TCP]: Add tcp_available_congestion_control sysctl.
Create /proc/sys/net/ipv4/tcp_available_congestion_control
that reflects currently available TCP choices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:48 -08:00
Vlad Yasevich
b68dbcab1d [SCTP]: Fix warning
An alternate solution would be to make the digest a pointer, allocate
it in sctp_endpoint_init() and free it in sctp_endpoint_destroy().

I guess I should have originally done it this way...

  CC [M]  net/sctp/sm_make_chunk.o
net/sctp/sm_make_chunk.c: In function 'sctp_unpack_cookie':
net/sctp/sm_make_chunk.c:1358: warning: initialization discards qualifiers from pointer target type

The reason is that sctp_unpack_cookie() takes a const struct
sctp_endpoint and modifies the digest in it (digest being embedded in
the struct, not a pointer).  Make digest a pointer to fix this
warning.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:47 -08:00
Eric Dumazet
72a3effaf6 [NET]: Size listen hash tables using backlog hint
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)

On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.

This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application  (2nd parameter of listen())

For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().

We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:44 -08:00
Thomas Graf
3dfbcc411e [NET] rules: Add support to invert selectors
Introduces a new flag FIB_RULE_INVERT causing rules to apply
if the specified selector doesn't match.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:42 -08:00
Thomas Graf
1f6c9557e8 [NET] rules: Share common attribute validation policy
Move the attribute policy for the non-specific attributes into
net/fib_rules.h and include it in the respective protocols.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:41 -08:00
Thomas Graf
b8964ed9fa [NET] rules: Protocol independant mark selector
Move mark selector currently implemented per protocol into
the protocol independant part.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:41 -08:00
Thomas Graf
5f300893fd [IPV4] nl_fib_lookup: Rename fl_fwmark to fl_mark
For the sake of consistency.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:40 -08:00
Thomas Graf
47dcf0cb10 [NET]: Rethink mark field in struct flowi
Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.

The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:39 -08:00
Thomas Graf
82e91ffef6 [NET]: Turn nfmark into generic mark
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:38 -08:00
Andrew Morton
776810217a [XFRM]: uninline xfrm_selector_match()
Six callsites, huge.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:36 -08:00
Peter Zijlstra
fcc70d5fdc [BLUETOOTH] lockdep: annotate sk_lock nesting in AF_BLUETOOTH
=============================================
[ INFO: possible recursive locking detected ]
2.6.18-1.2726.fc6 #1
2006-12-02 21:21:35 -08:00
Venkat Yekkirala
67f83cbf08 SELinux: Fix SA selection semantics
Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow. This eliminates the SELinux
policy's ability to use/sendto SAs with contexts other than the socket's.

With this patch applied, the SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:

1. To enable a socket to communicate without using labeled-IPSec SAs:

allow socket_t unlabeled_t:association { sendto recvfrom }

2. To enable a socket to communicate with labeled-IPSec SAs:

allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:21:34 -08:00
Venkat Yekkirala
6b877699c6 SELinux: Return correct context for SO_PEERSEC
Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:21:33 -08:00
Venkat Yekkirala
c1a856c964 SELinux: Various xfrm labeling fixes
Since the upstreaming of the mlsxfrm modification a few months back,
testing has resulted in the identification of the following issues/bugs that
are resolved in this patch set.

1. Fix the security context used in the IKE negotiation to be the context
   of the socket as opposed to the context of the SPD rule.

2. Fix SO_PEERSEC for tcp sockets to return the security context of
   the peer as opposed to the source.

3. Fix the selection of an SA for an outgoing packet to be at the same
   context as the originating socket/flow.

The following would be the result of applying this patchset:

- SO_PEERSEC will now correctly return the peer's context.

- IKE deamons will receive the context of the source socket/flow
  as opposed to the SPD rule's context so that the negotiated SA
  will be at the same context as the source socket/flow.

- The SELinux policy will require one or more of the
  following for a socket to be able to communicate with/without SAs:

  1. To enable a socket to communicate without using labeled-IPSec SAs:

     allow socket_t unlabeled_t:association { sendto recvfrom }

  2. To enable a socket to communicate with labeled-IPSec SAs:

     allow socket_t self:association { sendto };
     allow socket_t peer_sa_t:association { recvfrom };

This Patch: Pass correct security context to IKE for use in negotiation

Fix the security context passed to IKE for use in negotiation to be the
context of the socket as opposed to the context of the SPD rule so that
the SA carries the label of the originating socket/flow.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:21:31 -08:00
Al Viro
6ba9c755e5 [BLUETOOTH]: rfcomm endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:29 -08:00
Al Viro
ae08e1f092 [IPV6]: ip6_output annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:26 -08:00
Al Viro
98a4a86128 [NETFILTER]: trivial annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:25 -08:00
Al Viro
0e11c91e1e [AF_PACKET]: annotate
Weirdness: the third argument of socket() is net-endian
here.  Oh, well - it's documented in packet(7).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:24 -08:00
Al Viro
3fbd418acc [LLC]: anotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:23 -08:00
Al Viro
fede70b986 [IPV6]: annotate inet6_csk_search_req()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:22 -08:00
Al Viro
90bcaf7b4a [IPV6]: flowlabels are net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:21 -08:00
Al Viro
92d9ece7af [INET]: annotate inet_ecn.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:20 -08:00
Al Viro
8a9ae2110b [NET]: annotate dsfield.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:19 -08:00
Al Viro
5d36b1803d [XFRM]: annotate ->new_mapping()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:18 -08:00
Al Viro
d29ef86b0a [AF_KEY]: annotate
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:18 -08:00
Al Viro
d5a0a1e310 [IPV4]: encapsulation annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:17 -08:00
Al Viro
44473a6b27 [IPV6]: annotate struct frag_hdr
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:14 -08:00
Al Viro
a27ee7a4dd [IPV6]: annotate icmpv6 headers
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:13 -08:00
Al Viro
04ce69093f [IPV6]: 'info' argument of ipv6 ->err_handler() is net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:12 -08:00
Al Viro
8c689a6eae [XFRM]: misc annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:11 -08:00
Al Viro
d2ecd9ccd0 [IPV6]: annotate inet6_hashtables
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:10 -08:00
Al Viro
5a874db4d9 [NET]: ipconfig and nfsroot annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:09 -08:00
Al Viro
3e6c8cd566 [TIPC]: endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:21:08 -08:00
Linus Torvalds
97be852f81 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (118 commits)
  [netdrvr] skge: build fix
  [PATCH] NetXen: driver cleanup, removed unnecessary __iomem type casts
  [PATCH] PHY: Add support for configuring the PHY connection interface
  [PATCH] chelesio: transmit locking (plus bug fix).
  [PATCH] chelsio: statistics improvement
  [PATCH] chelsio: add MSI support
  [PATCH] chelsio: use standard CRC routines
  [PATCH] chelsio: cleanup pm3393 code
  [PATCH] chelsio: add 1G swcixw aupport
  [PATCH] chelsio: add support for other 10G boards
  [PATCH] chelsio: remove unused mutex
  [PATCH] chelsio: use kzalloc
  [PATCH] chelsio: whitespace fixes
  [PATCH] amd8111e use standard CRC lib
  [PATCH] sky2: msi enhancements.
  [PATCH] sky2: kfree_skb_any needed
  [PATCH] sky2: fixes for Yukon EC_U chip revisions
  [PATCH] sky2: add Dlink 560SX id
  [PATCH] sky2: receive error handling fix
  [PATCH] skge: don't clear MC state on link down
  ...
2006-12-02 15:08:32 -08:00
Linus Torvalds
cdb54fac35 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/drzeus/mmc:
  mmc: correct request error handling
  mmc: Flush block queue when removing card
  mmc: sdhci high speed support
  mmc: Support for high speed SD cards
  mmc: Fix mmc_delay() function
  mmc: Add support for mmc v4 wide-bus modes
  [PATCH] mmc: Add support for mmc v4 high speed mode
  trivial change for mmc/Kconfig: MMC_PXA does not mean only PXA255
  Make general code cleanups
  Add MMC_CAP_{MULTIWRITE,BYTEBLOCK} flags
  Platform device error handling cleanup
  Move register definitions away from the header file
  Change OMAP_MMC_{READ,WRITE} macros to use the host pointer
  Replace base with virt_base and phys_base
  mmc: constify mmc_host_ops vectors
  mmc: remove kernel_thread()
2006-12-02 08:29:04 -08:00
Andy Fleming
e8a2b6a420 [PATCH] PHY: Add support for configuring the PHY connection interface
Most PHYs connect to an ethernet controller over a GMII or MII
interface.  However, a growing number are connected over
different interfaces, such as RGMII or SGMII.

The ethernet driver will tell the PHY what type of connection it
is by setting it manually, or passing it in through phy_connect
(or phy_attach).

Changes include:
* Updates to documentation
* Updates to PHY Lib consumers
* Changes to PHY Lib to add interface support
* Some minor changes to whitespace in phy.h
* gianfar driver now detects interface and passes appropriate
  value to PHY Lib
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-02 00:33:11 -05:00
Mariusz Kozlowski
f789dfdc44 [PATCH] mv643xx_eth: fix unbalanced parentheses in macros
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-02 00:22:29 -05:00
Ayaz Abdulla
52d78d6331 [PATCH] forcedeth: add new NVIDIA pci ids
Add pci device ids for the NVIDIA MCP67 chip.

Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-02 00:12:01 -05:00
Christian Lamparter
fe75f7471b [PATCH] wext: extend MLME support
This patch adds two new defines for the SIOCSIWMLME to cover all
kinds MLMEs (well, except REASSOC) through a ioctl.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-12-02 00:11:58 -05:00
Larry Finger
837925df02 [PATCH] ieee80211: Drop and count duplicate data frames to remove 'replay detected' log messages
In the SoftMAC version of the IEEE 802.11 stack, not all duplicate messages are
detected. For the most part, there is no difficulty; however for TKIP and CCMP
encryption, the duplicates result in a "replay detected" log message where the
received and previous values of the TSC are identical. This change adds a new
variable to the ieee80211_device structure that holds the 'seq_ctl' value for
the previous frame. When a new frame repeats the value, the frame is dropped and
the appropriate counter is incremented.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-12-02 00:11:57 -05:00
Daniel Drake
c9308b06c0 [PATCH] ieee80211: Move IV/ICV stripping into ieee80211_rx
This patch adds a host_strip_iv_icv flag to ieee80211 which indicates that
ieee80211_rx should strip the IV/ICV/other security features from the payload.
This saves on some memmove() calls in the driver and seems like something that
belongs in the stack as it can be used by bcm43xx, ipw2200, and zd1211rw

I will submit the ipw2200 patch separately as it needs testing.

This patch also adds some sensible variable reuse (idx vs keyidx) in
ieee80211_rx

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-12-02 00:11:56 -05:00
Maciej W. Rozycki
13df29f697 [PATCH] 2.6.18: sb1250-mac: Missing inclusions from <linux/phy.h>
The <linux/phy.h> uses some types and macros defined in
<linux/ethtool.h>, <linux/mii.h>, <linux/timer.h> and <linux/workqueue.h>,
but fails to include these headers.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>

patch-mips-2.6.18-20060920-include-phy-16
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-02 00:11:55 -05:00
Tejun Heo
55a8e2c83c [PATCH] libata: implement presence detection via polling IDENTIFY
On some controllers (ICHs in piix mode), there is *NO* reliable way to
determine device presence other than issuing IDENTIFY and see how the
transaction proceeds by watching the TF status register.

libata acted this way before irq-pio and phantom devices caused very
little problem but now that IDENTIFY is performed using IRQ drive PIO,
such phantom devices now result in multiple 30sec timeouts during
boot.

This patch implements ATA_FLAG_DETECT_POLLING.  If a LLD sets this
flag, libata core issues the initial IDENTIFY in polling mode and if
the initial data transfer fails w/ HSM violation, the port is
considered to be empty thus replicating the old libata and IDE
behavior.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:45:55 -05:00
Tejun Heo
6fc49adb94 [PATCH] libata: use FLUSH_EXT only when driver is larger than LBA28 limit
Many drives support LBA48 even when its capacity is smaller than
1<<28, as LBA48 is required for many functionalities.  FLUSH_EXT is
mandatory for drives w/ LBA48 support.

Interestingly, at least one of such drives (ST960812A) has problems
dealing with FLUSH_EXT.  It eventually completes the command but takes
around 7 seconds to finish in many cases thus drastically slowing down
IO transactions.  This seems to be a firmware bug which sneaked into
production probably because no other ATA driver including linux IDE
issues FLUSH_EXT to drives which report support for LBA48 & FLUSH_EXT
but is smaller than 1<<28 blocks.

This patch adds ATA_DFLAG_FLUSH_EXT which is set iff the drive
supports LBA48 & FLUSH_EXT and is larger than LBA28 limit.  Both cache
flush paths are updated to issue FLUSH_EXT only when the flag is set.
Note that the changed behavior is more inline with the rest of libata.
libata prefers shorter commands whenever possible.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Danny Kukawka <dkukawka@novell.com>
Cc: Stefan Seyfried <seife@novell.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:45:54 -05:00
Alessandro Zummo
0df0d0a0ea [libata] ARM: add ixp4xx PATA driver
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:42:51 -05:00
Tejun Heo
baa1e78a83 [PATCH] libata: implement ATA_EHI_SETMODE and ATA_EHI_POST_SETMODE
libata EH used to perform ata_set_mode() iff the EH session performed
reset as indicated by ATA_EHI_DID_RESET.  This is incorrect because
->dev_config() called by revalidation is allowed to modify transfer
mode which ata_set_mode() should take care of.  This patch implements
the following two flags.

* ATA_EHI_SETMODE: set during EH to schedule ata_set_mode().  Both new
  device attachment and revalidation set this flag.

* ATA_EHI_POST_SETMODE: set while the device is revalidated after
  ata_set_mode().  Post-setmode revalidation is different from initial
  configuaration and EH revalidation in that ->dev_config() is not
  allowed tune transfer mode.  LLD can use this flag to determine
  whether it's allowed to tune transfer mode.  Note that POST_SETMODE
  ->dev_config() is guaranteed to be preceded by non-POST_SETMODE
  ->dev_config().

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:41:31 -05:00
Tejun Heo
efdaedc443 [PATCH] libata: implement ATA_EHI_PRINTINFO
Implement ehi flag ATA_EHI_PRINTINFO.  This flag is set when device
configuration needs to print out device info.  This used to be handled
by @print_info argument to ata_dev_configure() but LLDs also need to
know about it in ->dev_config() callback.

This patch replaces @print_info w/ ATA_EHI_PRINTINFO and make sata_sil
print workaround messages only on the initial configuration.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-12-01 22:41:31 -05:00