Remove the usage of ASSERT_READ_LOCK/ASSERT_WRITE_LOCK in nf_conntrack,
it didn't do anything, it was just an empty define and it uglified the code.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add some more sanity checks when registering/unregistering l3/l4 protocols.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Place rarely written variables in the read-mostly section by using
__read_mostly
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out L3/L4 protocol handling into its own file
nf_conntrack_proto.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out the event cache into its own file
nf_conntrack_ecache.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out handling of helpers into its own file
nf_conntrack_helper.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch splits out expectation handling into its own file
nf_conntrack_expect.c
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch implements a suggestion by Ian McDonald and
1) Avoids tests against negative packet lengths by using unsigned int
for packet payload lengths in the CCID send_packet()/packet_sent() routines
2) As a consequence, it removes an now unnecessary test with regard to `len > 0'
in ccid3_hc_tx_packet_sent: that condition is always true, since
* negative packet lengths are avoided
* ccid3_hc_tx_send_packet flags an error whenever the payload length is 0.
As a consequence, ccid3_hc_tx_packet_sent is never called as all errors
returned by ccid_hc_tx_send_packet are caught in dccp_write_xmit
3) Removes the third argument of ccid_hc_tx_send_packet (the `len' parameter),
since it is currently always set to skb->len. The code is updated with regard
to this parameter change.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This implements the larger-initial-windows feature for CCID 3, as described in
section 5 of RFC 4342. When the first feedback packet arrives, the sender can
send up to 2..4 packets per RTT, instead of just one.
The patch further
* reduces the number of timestamping calls by passing the timestamp value
(which is computed in one of the calling functions anyway) as argument
* renames one constant with a very long name into one which is shorter and
resembles the one in RFC 3448 (t_mbi)
* simplifies some of the min_t/max_t cases where both `x', `y' have the same
type
Commiter note: renamed TFRC_t_mbi to TFRC_T_MBI, to follow Linux coding style.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
To reflect the fact that this now is of no effect, not making apps
stop working, just be warned in the system log.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This removes and cleans up unused variables and structures which have become
unnecessary following the introduction of the EWMA patch to automatically track
the CCID 3 receiver/sender packet sizes `s'.
It deprecates the PACKET_SIZE socket option by returning an error code and
printing a deprecation warning if an application tries to read or write this
socket option.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This corrects the setting of the nofeedback timer with regard to RFC
3448 - previously it was not set to max(4*R, 2*s/X) as specified. Using
the maximum of 1 second as upper bound (as it was done before) can have
detrimental effects, especially if R is small.
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This is in response to a request sent earlier by Eric W. Biederman
and replaces all sysctl numbers for net.dccp.default with CTL_UNNUMBERED.
It has been tested to compile and to work.
Commiter note: I've removed the use of CTL_UNNUMBERED, not setting .ctl_name
sets it to 0, that is the what CTL_UNNUMBERED is, reason is
to avoid unneeded source code cluttering.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch
* removes setting t_RTO in ccid3_hc_tx_init (per [RFC 3448, 4.2], t_RTO is
undefined until feedback has been received);
* makes some trivial changes (updates of comments);
* performs a small optimisation by exploiting that the feedback timeout
uses the value of t_ipi. The way it is done is safe, because the timeouts
appear after the changes to t_ipi, ensuring that up-to-date values are used;
* in ccid3_hc_tx_packet_recv, moves the t_rto statement closer to the calculation
of the next_tmout. This makes the code clearer to read and is also safe, since
t_rto is not updated until the next call of ccid3_hc_tx_packet_recv, and is not
read by the functions called via ccid_wait_for_ccid();
* removes a `max' statement in sk_reset_timer, this is not needed since the timeout
value is always greater than 1E6 microseconds.
* adds `XXX'es to highlight that currently the nofeedback timer is set
in a non-standard way
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch:
* consolidates updating of parameters (t_nom, t_ipi, t_delta) which
need to be updated at the same time, since they are inter-dependent
* removes two inline functions which are no longer needed as a result of
the above consolidation
* resolves a FIXME regarding the re-calculation of t_ipi within the nofeedback
timer, in the state where no feedback has previously been received
* ties updating these parameters to updating the sending rate X, exploiting
that all three parameters in turn depend on X; and using a small optimisation
which can reduce the number of required instructions: only update the three
parameters when X really changes
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch concerns updating the value of the nofeedback timer when no feedback
has been received so far.
Since in this case the value of R is still undefined according to [RFC 3448,
4.2], we can not perform step (3) of [RFC 3448, 4.3]. A clarification is
provided in [RFC 4342, sec. 5], which states that in these cases the nofeedback
timer (still) expires "after two seconds".
Many thanks to Ian McDonald for pointing this out and providing the
clarification.
The patch
* implements [RFC 4342, sec. 5] with regard to the above case
* consolidates handling timer restart by
- adding an appropriate jump label and
- initialising the timeout value
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Might as well make flush notifier prettier when subpolicy used
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch consolidates set/getsockopt code between UDP(-Lite) v4 and 6. The
justification is that UDP(-Lite) is a transport-layer protocol and therefore
the socket option code (at least in theory) should be AF-independent.
Furthermore, there is the following code reduplication:
* do_udp{,v6}_getsockopt is 100% identical between v4 and v6
* do_udp{,v6}_setsockopt is identical up to the following differerence
--v4 in contrast to v4 additionally allows the experimental encapsulation
types UDP_ENCAP_ESPINUDP and UDP_ENCAP_ESPINUDP_NON_IKE
--the remainder is identical between v4 and v6
I believe that this difference is of little relevance.
The advantages in not duplicating twice almost completely identical code.
The patch further simplifies the interface of udp{,v6}_push_pending_frames,
since for the second argument (struct udp_sock *up) it always holds that
up = udp_sk(sk); where sk is the first function argument.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv4, IPv6, and DECNet all use struct rta_cacheinfo in a similiar
way, therefore rtnl_put_cacheinfo() is added to reuse code.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds documentation for the TFRC structure fields.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This considers the case - ACK received while no packet has been sent
so far. Resolved by printing a (rate-limited) warning message.
Further removes an unnecessary BUG_ON in ccid3_hc_tx_packet_recv,
received feedback on a terminating connection is simply ignored.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch removes a switch statement which is redundant since,
* nothing is done in states TFRC_SSTATE_NO_SENT/TFRC_SSTATE_NO_FBACK
* it is impossible that the function is called in the state TFRC_SSTATE_TERM, since
--the function is called, in dccp_write_xmit, after ccid3_hc_tx_send_packet
--if ccid3_hc_tx_send_packet is called in state TFRC_SSTATE_TERM, it returns
-EINVAL, which means that ccid3_hc_tx_packet_sent will not be called
(compare dccp_write_xmit)
--> therefore, this case is logically impossible
* the remaining state is TFRC_SSTATE_FBACK which conditionally updates t_ipi, t_nom,
and t_delta. This is a no-op, since
--t_ipi only changes when feedback is received
--however, when feedback arrives via ccid3_hc_tx_packet_recv, there is an identical
code block which performs the same set of operations
--performing the same set of operations again in ccid3_hc_tx_packet_sent therefore
does not change anything, since between the time of receiving the last feedback
(and therefore update of t_ipi, t_nom, and t_delta), the value of t_ipi has not
changed
--since t_ipi has not changed, the values of t_delta and t_nom also do not change,
they depend fully on t_ipi
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This resolves an `XXX' in ccid3_hc_tx_send_packet().
The function is only called on Data and DataAck packets and returns a negative
result on zero-sized messages. This is a reasonable policy since CCID 3 is a
congestion-control module and congestion control on zero-sized Data(Ack)
packets is in a way pathological.
The patch uses a more suitable error code for this case, it returns the Posix.1
code `EBADMSG' ("Not a data message") instead of `ENOTCONN'.
As a result of ignoring zero-sized packets, a the condition for a warning
"First packet is data" in ccid3_hc_tx_packet_sent is always satisfied; this
message has been removed since it will always be printed.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This makes some logically equivalent simplifications, by replacing
rc - values plus goto's with direct return statements.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
This patch performs a simplifying (performance) optimisation:
In each call of the inline function ccid3_calc_new_t_ipi(), the state is
tested against TFRC_SSTATE_NO_FBACK. This is expensive when the function
is called very often. A simpler solution, implemented by this patch, is
to adapt the control flow.
Background:
Now that we can stuff bigger ack vectors into options.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Ack vectors grow proportional to the window size. If an ack vector does not fit
into a single option, it must be spread across multiple options. This patch
will allow for windows to grow larger.
Committer note: Simplified the patch a bit, original algorithm kept.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Commiter note:
This was split from Andrea's original patch, in the process I changed the type
of the ackvec index fields to u16 instead of to int and haven't folded
dccp_ackvec_parse with dccp_ackvec_check_rcv_ackno.
Next patch will actually do the insertion of more than one ackvec per packet,
using, initially, up to a max of 2 ackvecs as per Andrea's original patch, then
I'll work on support for larger ackvecs, be it using a sysctl or using
setsockopt.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Commiter note: original patch was splitted.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Extends the netlink interface to support the __le16 type and
converts address addition, deletion and, dumping to use the
new netlink interface.
Fixes multiple occasions of possible illegal memory references
due to not validated netlink attributes.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Log an error if the remote tunnel endpoint is unable to handle
tunneled packets.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow link-local tunnel endpoints if the underlying link is defined.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doing the mandatory tunnel endpoint checks when the tunnel is set up
isn't enough as interfaces can go up or down and addresses can be
added or deleted after this. The checks need to be done realtime when
the tunnel is processing a packet.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>