Add helper functions for sysctl registration with optional instantiating
of common path elements (like net/netfilter) and use it for support for
automatic registation of conntrack protocol sysctls.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Remove unused struct list_head from struct nf_conntrack_l3proto and
nf_conntrack_l4proto as all protocols are kept in arrays, not linked
lists.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add some more sanity checks when registering/unregistering l3/l4 protocols.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out the event cache into its own file
nf_conntrack_ecache.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out handling of helpers into its own file
nf_conntrack_helper.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out expectation handling into its own file
nf_conntrack_expect.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This removes and cleans up unused variables and structures which have become
unnecessary following the introduction of the EWMA patch to automatically track
the CCID 3 receiver/sender packet sizes `s'.
It deprecates the PACKET_SIZE socket option by returning an error code and
printing a deprecation warning if an application tries to read or write this
socket option.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is in response to a request sent earlier by Eric W. Biederman
and replaces all sysctl numbers for net.dccp.default with CTL_UNNUMBERED.
It has been tested to compile and to work.
Commiter note: I've removed the use of CTL_UNNUMBERED, not setting .ctl_name
sets it to 0, that is the what CTL_UNNUMBERED is, reason is
to avoid unneeded source code cluttering.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The
justification is that UDP(-Lite) is a transport-layer protocol and therefore
the socket option code (at least in theory) should be AF-independent.
Furthermore, there is the following code reduplication:
* do_udp{,v6}_getsockopt is 100% identical between v4 and v6
* do_udp{,v6}_setsockopt is identical up to the following differerence
--v4 in contrast to v4 additionally allows the experimental encapsulation
types UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE
--the remainder is identical between v4 and v6
I believe that this difference is of little relevance.
The advantages in not duplicating twice almost completely identical code.
The patch further simplifies the interface of udp{,v6}_push_pending_frames,
since for the second argument (struct udp_sock *up) it always holds that
up = udp_sk(sk); where sk is the first function argument.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar
way, therefore rtnl_put_cacheinfo() is added to reuse code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds documentation for the TFRC structure fields.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Extends the netlink interface to support the __le16 type and
converts address addition, deletion and, dumping to use the
new netlink interface.
Fixes multiple occasions of possible illegal memory references
due to not validated netlink attributes.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
switched to taking a pointer to net-endian sctp_addr
and a net-endian port number. Instances and callers
adjusted; interestingly enough, the only calls are
direct calls of specific instances - the method is not
used at all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add sctp_chunk->source, sctp_sockaddr_entry->a, sctp_transport->ipaddr
and sctp_transport->saddr, maintain them as net-endian mirrors of
their host-endian counterparts.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Part 1: rename sctp_chunk->source, sctp_sockaddr_entry->a,
sctp_transport->ipaddr and sctp_transport->saddr (to ..._h)
The next patch will reintroduce these fields and keep them as
net-endian mirrors of the original (renamed) ones. Split in
two patches to make sure that we hadn't forgotten any instanes.
Later in the series we'll eliminate uses of host-endian variants
(basically switching users to net-endian counterparts as we
progress through that mess). Then host-endian ones will die.
Other embedded host-endian sctp_addr will be easier to switch
directly, so we leave them alone for now.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
That's going to be a long series. Introduced temporary helpers
doing copy-and-convert for sctp_addr; they are used to kill
flip-in-place in global data structures and will be used
to gradually push host-endian uses of sctp_addr out of existence.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
argument stored for SCTP_CMD_INIT_FAILED is always __be16
(protocol error). Introduced new field and accessor for
it (SCTP_PERR()); switched to their use (from SCTP_U32() and
.u32)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes two needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one got lost on the way from Ian to Gerrit to me, fix it.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Add PCI ID and detection for 5709 copper and SerDes chips.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CIPSOv4 engine currently has MLS label limits which are slightly larger
than what the draft allows. This is not a major problem due to the current
implementation but we should fix this so it doesn't bite us later.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The existing netlbl_lsm_secattr struct required the LSM to check all of the
fields to determine if any security attributes were present resulting in a lot
of work in the common case of no attributes. This patch adds a 'flags' field
which is used to indicate which attributes are present in the structure; this
should allow the LSM to do a quick comparison to determine if the structure
holds any security attributes.
Example:
if (netlbl_lsm_secattr->flags)
/* security attributes present */
else
/* NO security attributes present */
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
The netlbl_secattr_init() function would always return 0 making it pointless
to have a return value. This patch changes the function to return void.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
There were a few places in the NetLabel code where the int type was being used
instead of the gfp_t type, this patch corrects this mistake.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Spotted by Ian McDonald, tentatively fixed by Gerrit Renker:
http://www.mail-archive.com/dccp%40vger.kernel.org/msg00599.html
Rewritten not to unroll sk_receive_skb, in the common case, i.e. no lock
debugging, its optimized away.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
It's still not completely right; we need to split it into anon unions
of __wsum and unsigned - for cases when we use it for partial checksum
and for offset of checksum in skb
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill useless shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill bogus access_ok() in csum_partial_copy_from_user (the only caller checks)
* kill useless shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill useless shifts
* usual ntohs->shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill useless shifts
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill useless shifts
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* usual ntohs->shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill csum_partial_copy
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill csum_partial_copy
* usual ntohs->shift, this time in assembler part
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* usual ntohs->shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* collapse csum_partial_copy
* usual ntohs->shift
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* collapse csum_partial_copy
* kill csum_partial_copy_fromuser
* ntohs->shift in checksum calculation
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitized prototypes, annotated
* kill shift-by-16 in checksum calculation
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* kill shift-by-16 in checksum calculations
* htons->shift in l-e checksum calculations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotated
* collapsed csum_partial_copy()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* ntohs -> shift in checksum calculations in l-e case
* kill shift-by-16 in checksum calculations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes, annotate
* ntohs -> shift in checksum calculations
* kill access_ok() in csum_partial_copy_from_user
* collapse do_csum_partial_copy_from_user
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes and annotate
* collapse csum_partial_copy
NB: csum_partial() is almost certainly still buggy.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes and annotate
* collapse csum_partial_copy
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes and annotate
* kill cast-as-lvalue abuses in csum_partial()
* usual ntohs-equals-shift for checksum purposes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* sanitize prototypes and annotate
* kill useless access_ok() in csum_partial_copy_from_user() (the only
caller checks it already).
* do_csum_partial_copy_from_user() is not needed now
* replace htons(len) with len << 8 - they are the same wrt checksums
on little-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
New types - for 16bit checksums and "unfolded" 32bit variant.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a revision of the previously submitted patch, which alters
the way files are organized and compiled in the following manner:
* UDP and UDP-Lite now use separate object files
* source file dependencies resolved via header files
net/ipv{4,6}/udp_impl.h
* order of inclusion files in udp.c/udplite.c adapted
accordingly
[NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)
This patch adds support for UDP-Lite to the IPv4 stack, provided as an
extension to the existing UDPv4 code:
* generic routines are all located in net/ipv4/udp.c
* UDP-Lite specific routines are in net/ipv4/udplite.c
* MIB/statistics support in /proc/net/snmp and /proc/net/udplite
* shared API with extensions for partial checksum coverage
[NET/IPv6]: Extension for UDP-Lite over IPv6
It extends the existing UDPv6 code base with support for UDP-Lite
in the same manner as per UDPv4. In particular,
* UDPv6 generic and shared code is in net/ipv6/udp.c
* UDP-Litev6 specific extensions are in net/ipv6/udplite.c
* MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
* support for IPV6_ADDRFORM
* aligned the coding style of protocol initialisation with af_inet6.c
* made the error handling in udpv6_queue_rcv_skb consistent;
to return `-1' on error on all error cases
* consolidation of shared code
[NET]: UDP-Lite Documentation and basic XFRM/Netfilter support
The UDP-Lite patch further provides
* API documentation for UDP-Lite
* basic xfrm support
* basic netfilter support for IPv4 and IPv6 (LOG target)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
RTM_GETPREFIX is completely unused and is thus removed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
By modyfing genlmsg_put() to take a genl_family and by adding
genlmsg_put_reply() the process of constructing the netlink
and generic netlink headers is simplified.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A generic netlink user has no interest in knowing how to
address the source of the original request.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Steve left netpoll beast, hopefully not to return soon.
He noticed that the header was messy. He straightened it
up and polished it a little, then waved goodbye.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The beast had a long and not very happy history. At one
point, a friend (netdump) had asked that he open up a little.
Well, the friend was long gone now, and the beast had
this dangling piece hanging (netpoll_queue).
It wasn't hard to stitch the netpoll_queue back in
where it belonged and make everything tidy.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The netpoll beast was still not happy. If the beast got
clogged pipes, it tended to stare blankly off in space
for a long time.
The problem couldn't be completely fixed because the
beast talked with irq's disabled. But it could be made
less painful and shorter.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
When the netpoll beast got really busy, it tended to clog
things, so it stored them for later. But the beast was putting
all it's skb's in one basket. This was bad because maybe some
pipes were clogged and others were not.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
After looking harder, Steve noticed that the netpoll
beast leaked a little every time it shutdown for a nap.
Not a big leak, but a nuisance kind of thing.
He took out his refcount duct tape and patched the
leak. It was overkill since there was already other
locking in that area, but it looked clean and wouldn't
attract fleas.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
This patch does not change code; it performs some trivial clean/tidy-ups:
* removal of a `debug_prefix' string in favour of the
already existing dccp_role(sk)
* add documentation of structures and constants
* separated out the cases for invalid packets (step 1
of the packet validation)
* removing duplicate statements
* combining declaration & initialisation
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch replaces cryptic feature negotiation messages of type
Oct 31 15:42:20 kernel: dccp_feat_change: feat change type=32 feat=1
Oct 31 15:42:21 kernel: dccp_feat_change: feat change type=34 feat=1
Oct 31 15:42:21 kernel: dccp_feat_change: feat change type=32 feat=5
into ones of type:
Nov 2 13:54:45 kernel: dccp_feat_change: ChangeL(CCID (1), 3)
Nov 2 13:54:45 kernel: dccp_feat_change: ChangeR(CCID (1), 3)
Nov 2 13:54:45 kernel: dccp_feat_change: ChangeL(Ack Ratio (5), 2)
Also,
* completed the feature number list wrt RFC 4340 sec. 6.4
* annotating which ones have been implemented so far
* implemented rudimentary sanity checking in feat.c (FIXMEs)
* some minor fixes
Commiter note: uninlined dccp_feat_name and dccp_feat_typename, for
consistency with dccp_{state,packet}_name, that, BTW,
should be compiled only if CONFIG_IP_DCCP_DEBUG is
selected, leaving this to another cset tho. Also
shortened dccp_feat_negotiation_debug to dccp_feat_debug.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
return code of a transmit function needs to be tested against NET_XMIT_CN
which is a value that does not indicate a strict error condition.
This patch uses a macro for these recurring situations which is consistent
with the already existing macro net_xmit_errno, saving on duplicated code.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch contains the scheduled removal of the frame diverter.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds 3 sysctls which govern the retransmission behaviour of DCCP control
packets (3way handshake, feature negotiation).
It removes 4 FIXMEs from the code.
The close resemblance of sysctl variables to their TCP analogues is emphasised
not only by their name, but also by giving them the same initial values.
This is useful since there is not much practical experience with DCCP yet.
Furthermore, with regard to the previous patch, it is now possible to limit
the number of keepalive-Responses by setting net.dccp.default.request_retries
(also a bit like in TCP).
Lastly, added documentation of all existing DCCP sysctls.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Account for the netlink message header size directly in nlmsg_new()
instead of relying on the caller calculate it correctly.
Replaces error handling of message construction functions when
constructing notifications with bug traps since a failure implies
a bug in calculating the size of the skb.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch does the following:
a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2]
b) provides necessary socket options and documentation as to how to use them
c) basic support and infrastructure for the Minimum Checksum Coverage feature
[RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user
interface
In addition, it
(1) fixes two bugs in the DCCPv4 checksum computation:
* pseudo-header used checksum_len instead of skb->len
* incorrect checksum coverage calculation based on dccph_x
(2) removes dccp_v4_verify_checksum() since it reduplicates code of the
checksum computation; code calling this function is updated accordingly.
(3) now uses skb_checksum(), which is safer than checksum_partial() if the
sk_buff has is a non-linear buffer (has pages attached to it).
(4) fixes an outstanding TODO item:
* If P.CsCov is too large for the packet size, drop packet and return.
The code has been tested with applications, the latest version of tcpdump now
comes with support for partial DCCP checksums.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is a code simplification:
it combines three often recurring operations into one inline function,
* allocate `len' bytes header space in skb
* fill these `len' bytes with zeroes
* cast the start of this header space as dccp_hdr
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
The data itself is already charged to the SKB, doing
the skb_set_owner_w() just generates a lot of noise and
extra atomics we don't really need.
Lmbench improvements on lat_tcp are minimal:
before:
TCP latency using localhost: 23.2701 microseconds
TCP latency using localhost: 23.1994 microseconds
TCP latency using localhost: 23.2257 microseconds
after:
TCP latency using localhost: 22.8380 microseconds
TCP latency using localhost: 22.9465 microseconds
TCP latency using localhost: 22.8462 microseconds
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow normal users to only choose among a restricted set of congestion
control choices. The default is reno and what ever has been configured
as default. But the policy can be changed by administrator at any time.
For example, to allow any choice:
cp /proc/sys/net/ipv4/tcp_available_congestion_control \
/proc/sys/net/ipv4/tcp_allowed_congestion_control
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create /proc/sys/net/ipv4/tcp_available_congestion_control
that reflects currently available TCP choices.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
An alternate solution would be to make the digest a pointer, allocate
it in sctp_endpoint_init() and free it in sctp_endpoint_destroy().
I guess I should have originally done it this way...
CC [M] net/sctp/sm_make_chunk.o
net/sctp/sm_make_chunk.c: In function 'sctp_unpack_cookie':
net/sctp/sm_make_chunk.c:1358: warning: initialization discards qualifiers from pointer target type
The reason is that sctp_unpack_cookie() takes a const struct
sctp_endpoint and modifies the digest in it (digest being embedded in
the struct, not a pointer). Make digest a pointer to fix this
warning.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for
each LISTEN socket, regardless of various parameters (listen backlog for
example)
On x86_64, this means order-1 allocations (might fail), even for 'small'
sockets, expecting few connections. On the contrary, a huge server wanting a
backlog of 50000 is slowed down a bit because of this fixed limit.
This patch makes the sizing of listen hash table a dynamic parameter,
depending of :
- net.core.somaxconn tunable (default is 128)
- net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128)
- backlog value given by user application (2nd parameter of listen())
For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
kmalloc().
We still limit memory allocation with the two existing tunables (somaxconn &
tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM
usage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduces a new flag FIB_RULE_INVERT causing rules to apply
if the specified selector doesn't match.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the attribute policy for the non-specific attributes into
net/fib_rules.h and include it in the respective protocols.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move mark selector currently implemented per protocol into
the protocol independant part.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.
The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow. This eliminates the SELinux
policy's ability to use/sendto SAs with contexts other than the socket's.
With this patch applied, the SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:
1. To enable a socket to communicate without using labeled-IPSec SAs:
allow socket_t unlabeled_t:association { sendto recvfrom }
2. To enable a socket to communicate with labeled-IPSec SAs:
allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Since the upstreaming of the mlsxfrm modification a few months back,
testing has resulted in the identification of the following issues/bugs that
are resolved in this patch set.
1. Fix the security context used in the IKE negotiation to be the context
of the socket as opposed to the context of the SPD rule.
2. Fix SO_PEERSEC for tcp sockets to return the security context of
the peer as opposed to the source.
3. Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow.
The following would be the result of applying this patchset:
- SO_PEERSEC will now correctly return the peer's context.
- IKE deamons will receive the context of the source socket/flow
as opposed to the SPD rule's context so that the negotiated SA
will be at the same context as the source socket/flow.
- The SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:
1. To enable a socket to communicate without using labeled-IPSec SAs:
allow socket_t unlabeled_t:association { sendto recvfrom }
2. To enable a socket to communicate with labeled-IPSec SAs:
allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };
This Patch: Pass correct security context to IKE for use in negotiation
Fix the security context passed to IKE for use in negotiation to be the
context of the socket as opposed to the context of the SPD rule so that
the SA carries the label of the originating socket/flow.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Weirdness: the third argument of socket() is net-endian
here. Oh, well - it's documented in packet(7).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: correct request error handling
mmc: Flush block queue when removing card
mmc: sdhci high speed support
mmc: Support for high speed SD cards
mmc: Fix mmc_delay() function
mmc: Add support for mmc v4 wide-bus modes
[PATCH] mmc: Add support for mmc v4 high speed mode
trivial change for mmc/Kconfig: MMC_PXA does not mean only PXA255
Make general code cleanups
Add MMC_CAP_{MULTIWRITE,BYTEBLOCK} flags
Platform device error handling cleanup
Move register definitions away from the header file
Change OMAP_MMC_{READ,WRITE} macros to use the host pointer
Replace base with virt_base and phys_base
mmc: constify mmc_host_ops vectors
mmc: remove kernel_thread()
Most PHYs connect to an ethernet controller over a GMII or MII
interface. However, a growing number are connected over
different interfaces, such as RGMII or SGMII.
The ethernet driver will tell the PHY what type of connection it
is by setting it manually, or passing it in through phy_connect
(or phy_attach).
Changes include:
* Updates to documentation
* Updates to PHY Lib consumers
* Changes to PHY Lib to add interface support
* Some minor changes to whitespace in phy.h
* gianfar driver now detects interface and passes appropriate
value to PHY Lib
Signed-off-by: Andrew Fleming <afleming@freescale.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add pci device ids for the NVIDIA MCP67 chip.
Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch adds two new defines for the SIOCSIWMLME to cover all
kinds MLMEs (well, except REASSOC) through a ioctl.
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In the SoftMAC version of the IEEE 802.11 stack, not all duplicate messages are
detected. For the most part, there is no difficulty; however for TKIP and CCMP
encryption, the duplicates result in a "replay detected" log message where the
received and previous values of the TSC are identical. This change adds a new
variable to the ieee80211_device structure that holds the 'seq_ctl' value for
the previous frame. When a new frame repeats the value, the frame is dropped and
the appropriate counter is incremented.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds a host_strip_iv_icv flag to ieee80211 which indicates that
ieee80211_rx should strip the IV/ICV/other security features from the payload.
This saves on some memmove() calls in the driver and seems like something that
belongs in the stack as it can be used by bcm43xx, ipw2200, and zd1211rw
I will submit the ipw2200 patch separately as it needs testing.
This patch also adds some sensible variable reuse (idx vs keyidx) in
ieee80211_rx
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The <linux/phy.h> uses some types and macros defined in
<linux/ethtool.h>, <linux/mii.h>, <linux/timer.h> and <linux/workqueue.h>,
but fails to include these headers.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
patch-mips-2.6.18-20060920-include-phy-16
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On some controllers (ICHs in piix mode), there is *NO* reliable way to
determine device presence other than issuing IDENTIFY and see how the
transaction proceeds by watching the TF status register.
libata acted this way before irq-pio and phantom devices caused very
little problem but now that IDENTIFY is performed using IRQ drive PIO,
such phantom devices now result in multiple 30sec timeouts during
boot.
This patch implements ATA_FLAG_DETECT_POLLING. If a LLD sets this
flag, libata core issues the initial IDENTIFY in polling mode and if
the initial data transfer fails w/ HSM violation, the port is
considered to be empty thus replicating the old libata and IDE
behavior.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Many drives support LBA48 even when its capacity is smaller than
1<<28, as LBA48 is required for many functionalities. FLUSH_EXT is
mandatory for drives w/ LBA48 support.
Interestingly, at least one of such drives (ST960812A) has problems
dealing with FLUSH_EXT. It eventually completes the command but takes
around 7 seconds to finish in many cases thus drastically slowing down
IO transactions. This seems to be a firmware bug which sneaked into
production probably because no other ATA driver including linux IDE
issues FLUSH_EXT to drives which report support for LBA48 & FLUSH_EXT
but is smaller than 1<<28 blocks.
This patch adds ATA_DFLAG_FLUSH_EXT which is set iff the drive
supports LBA48 & FLUSH_EXT and is larger than LBA28 limit. Both cache
flush paths are updated to issue FLUSH_EXT only when the flag is set.
Note that the changed behavior is more inline with the rest of libata.
libata prefers shorter commands whenever possible.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Danny Kukawka <dkukawka@novell.com>
Cc: Stefan Seyfried <seife@novell.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata EH used to perform ata_set_mode() iff the EH session performed
reset as indicated by ATA_EHI_DID_RESET. This is incorrect because
->dev_config() called by revalidation is allowed to modify transfer
mode which ata_set_mode() should take care of. This patch implements
the following two flags.
* ATA_EHI_SETMODE: set during EH to schedule ata_set_mode(). Both new
device attachment and revalidation set this flag.
* ATA_EHI_POST_SETMODE: set while the device is revalidated after
ata_set_mode(). Post-setmode revalidation is different from initial
configuaration and EH revalidation in that ->dev_config() is not
allowed tune transfer mode. LLD can use this flag to determine
whether it's allowed to tune transfer mode. Note that POST_SETMODE
->dev_config() is guaranteed to be preceded by non-POST_SETMODE
->dev_config().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ehi flag ATA_EHI_PRINTINFO. This flag is set when device
configuration needs to print out device info. This used to be handled
by @print_info argument to ata_dev_configure() but LLDs also need to
know about it in ->dev_config() callback.
This patch replaces @print_info w/ ATA_EHI_PRINTINFO and make sata_sil
print workaround messages only on the initial configuration.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out sata_port_hardreset() from sata_std_hardreset(). This
will be used by LLD hardreset implementation and later by PMP.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_irq_on() isn't used outside of libata core layer. The function is
TF/SFF interface specific but currently used by core path with some
hack too. Move it from include/linux/libata.h to
drivers/ata/libata-sff.c.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
libata waits for !BSY even when the status register reports 0xff.
This causes long boot delays when D8 isn't pulled down properly. This
patch does the followings.
* don't wait if status register is 0xff in all wait functions
* make ata_busy_sleep() return 0 on success and -errno on failure.
-ENODEV is returned on 0xff status and -EBUSY on other failures.
* make ata_bus_softreset() succeed on 0xff status. 0xff status is not
reset failure. It indicates no device. This removes unnecessary
retries on such ports. Note that the code change assumes unoccupied
port reporting 0xff status does not produce valid device signature.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Joe Jin <lkmaillist@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
needs a changelog
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (31 commits)
[MIPS] Remove duplicate ISA DMA code for 0 DMA channel case.
[MIPS] Remove unused definition of cpu_to_lelongp()
[MIPS] Remove userspace proofing from <asm/bitops.h>.
[MIPS] Remove old junk left from old atomic_lock.
[MIPS] Use conditional traps for BUG_ON on MIPS II and better.
[MIPS] mips HPT cleanup: make clocksource_mips public
[MIPS] do_IRQ cleanup
[MIPS] Avoid dupliate D-cache flush on R400C / R4400 SC and MC variants.
[MIPS] Remove redundant r4k_blast_icache() calls
[MIPS] Work around bogus gcc warnings.
[MIPS] Fix double inclusions
[MIPS] use generic_handle_irq, handle_level_irq, handle_percpu_irq
[MIPS] IRQ cleanups
[MIPS] mips hpt cleanup: get rid of mips_hpt_init
[MIPS] PB1200: Remove duplicate definitions
[MIPS] Fix alignment hole in struct cache_desc; shrink struct.
[MIPS] Oprofile: kernel support for the R10000.
[MIPS] Remove unused R10000 performance counter definitions.
[MIPS] Add support for kexec
[MIPS] Don't print presence of WAIT instruction on bootup.
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/usb-2.6: (103 commits)
usbcore: remove unused argument in autosuspend
USB: keep count of unsuspended children
USB hub: simplify remote-wakeup handling
USB: struct usb_device: change flag to bitflag
OHCI: make autostop conditional on CONFIG_PM
USB: Add autosuspend support to the hub driver
EHCI: Fix root-hub and port suspend/resume problems
USB: create a new thread for every USB device found during the probe sequence
USB: add driver for the USB debug devices
USB: added dynamic major number for USB endpoints
USB: pegasus error path not resetting task's state
USB: endianness fix for asix.c
USB: build the appledisplay driver
USB serial: replace kmalloc+memset with kzalloc
USB: hid-core: canonical defines for Apple USB device IDs
USB: idmouse cleanup
USB: make drivers/usb/core/driver.c:usb_device_match() static
USB: lh7a40x_udc remove double declaration
USB: pxa2xx_udc recognizes ixp425 rev b0 chip
usbtouchscreen: add support for DMC TSC-10/25 devices
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (36 commits)
Driver core: show drivers in /sys/module/
Documentation/driver-model/platform.txt update/rewrite
Driver core: platform_driver_probe(), can save codespace
driver core: Use klist_remove() in device_move()
driver core: Introduce device_move(): move a device to a new parent.
Driver core: make drivers/base/core.c:setup_parent() static
driver core: Introduce device_find_child().
sysfs: sysfs_write_file() writes zero terminated data
cpu topology: consider sysfs_create_group return value
Driver core: Call platform_notify_remove later
ACPI: Change ACPI to use dev_archdata instead of firmware_data
Driver core: add dev_archdata to struct device
Driver core: convert sound core to use struct device
Driver core: change mem class_devices to be real devices
Driver core: convert fb code to use struct device
Driver core: convert firmware code to use struct device
Driver core: convert mmc code to use struct device
Driver core: convert ppdev code to use struct device
Driver core: convert PPP code to use struct device
Driver core: convert cpuid code to use struct device
...
Show the drivers, which belong to the module:
$ ls -l /sys/module/usbcore/drivers/
hub -> ../../../bus/usb/drivers/hub
usb -> ../../../bus/usb/drivers/usb
usbfs -> ../../../bus/usb/drivers/usbfs
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This defines a new platform_driver_probe() method allowing the driver's
probe() method, and its support code+data, to safely live in __init
sections for typical system configurations.
Many system-on-chip processors could benefit from this API, to the tune
of recovering hundreds to thousands of bytes per driver. That's memory
which is currently wasted holding code which can never be called after
system startup, yet can not be removed. It can't be removed because of
the linkage requirement that pointers to init section code (like, ideally,
probe support) must not live in other sections (like driver method tables)
after those pointers would be invalid.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Provide a function device_move() to move a device to a new parent device. Add
auxilliary functions kobject_move() and sysfs_move_dir().
kobject_move() generates a new uevent of type KOBJ_MOVE, containing the
previous path (DEVPATH_OLD) in addition to the usual values. For this, a new
interface kobject_uevent_env() is created that allows to add further
environmental data to the uevent at the kobject layer.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Change ACPI to use dev_archdata instead of firmware_data
This patch changes ACPI to use the new dev_archdata on i386, x86_64
and ia64 (is there any other arch using ACPI ?) to store it's
acpi_handle.
It also removes the firmware_data field from struct device as this
was the only user.
Only build-tested on x86
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add arch specific dev_archdata to struct device
Adds an arch specific struct dev_arch to struct device. This enables
architecture to add specific fields to every device in the system, like
DMA operation pointers, NUMA node ID, firmware specific data, etc...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Andi Kleen <ak@suse.de>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.
It also makes the struct sound_card to show up as a "real" device
where all the different sound class devices are placed as childs
and different card attribute files can hang off of. /sys/class/sound is
still a flat directory, but the symlink targets of all devices belonging
to the same card, point the the /sys/devices tree below the new card
device object.
Thanks to Kay for the updates to this patch.
Signed-off-by: Kay Sievers <kay.sievers@novell.com>
Acked-by: Jaroslav Kysela <perex@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Converts from using struct "class_device" to "struct device" making
everything show up properly in /sys/devices/ with symlinks from the
/sys/class directory.
Also fixes up the isdn drivers that were putting something in the class
device's directory.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This also ment that some of the misc drivers had to also be fixed
up as they were assuming the device was a class_device.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I finally did as you suggested and added the notifier to the struct
bus_type itself. There are still problems to be expected is something
attaches to a bus type where the code can hook in different struct
device sub-classes (which is imho a big bogosity but I won't even try to
argue that case now) but it will solve nicely a number of issues I've
had so far.
That also means that clients interested in registering for such
notifications have to do it before devices are added and after bus types
are registered. Fortunately, most bus types that matter for the various
usage scenarios I have in mind are registerd at postcore_initcall time,
which means I have a really nice spot at arch_initcall time to add my
notifiers.
There are 4 notifications provided. Device being added (before hooked to
the bus) and removed (failure of previous case or after being unhooked
from the bus), along with driver being bound to a device and about to be
unbound.
The usage I have for these are:
- The 2 first ones are used to maintain a struct device_ext that is
hooked to struct device.firmware_data. This structure contains for now a
pointer to the Open Firmware node related to the device (if any), the
NUMA node ID (for quick access to it) and the DMA operations pointers &
iommu table instance for DMA to/from this device. For bus types I own
(like IBM VIO or EBUS), I just maintain that structure directly from the
bus code when creating the devices. But for bus types managed by generic
code like PCI or platform (actually, of_platform which is a variation of
platform linked to Open Firmware device-tree), I need this notifier.
- The other two ones have a completely different usage scenario. I have
cases where multiple devices and their drivers depend on each other. For
example, the IBM EMAC network driver needs to attach to a MAL DMA engine
which is a separate device, and a PHY interface which is also a separate
device. They are all of_platform_device's (well, about to be with my
upcoming patches) but there is no say in what precise order the core
will "probe" them and instanciate the various modules. The solution I
found for that is to have the drivers for emac to use multithread_probe,
and wait for a driver to be bound to the target MAL and PHY control
devices (the device-tree contains reference to the MAL and PHY interface
nodes, which I can then match to of_platform_devices). Right now, I've
been polling, but with that notifier, I can more cleanly wait (with a
timeout of course).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>