Commit Graph

3895 Commits

Author SHA1 Message Date
Jamal Hadi Salim
94b9bb5480 [XFRM] Optimize SA dumping
Same comments as in "[XFRM] Optimize policy dumping"

The numbers are (20K SAs):
2006-12-06 18:38:45 -08:00
Jamal Hadi Salim
baf5d743d1 [XFRM] Optimize policy dumping
This change optimizes the dumping of Security policies.

1) Before this change ..
speedopolis:~# time ./ip xf pol

real    0m22.274s
user    0m0.000s
sys     0m22.269s

2) Turn off sub-policies

speedopolis:~# ./ip xf pol

real    0m13.496s
user    0m0.000s
sys     0m13.493s

i suppose the above is to be expected

3) With this change ..
speedopolis:~# time ./ip x policy

real    0m7.901s
user    0m0.008s
sys     0m7.896s
2006-12-06 18:38:44 -08:00
Patrick McHardy
1b6651f1bf [XFRM]: Use output device disable_xfrm for forwarded packets
Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.

Always use the output device when checking disable_xfrm.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-06 18:38:43 -08:00
Jamal Hadi Salim
334c29a645 [GENETLINK]: Move command capabilities to flags.
This patch moves command capabilities to command flags. Other than
being cleaner, saves several bytes.
We increment the nlctrl version so as to signal to user space that
to not expect the attributes. We will try to be careful
not to do this too often ;->

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-06 18:38:41 -08:00
Chuck Lever
5847e1f4d0 SUNRPC: Remove pprintk() from net/sunrpc/xprt.c
These appear to be deprecated.  Removing them also gets rid of some sparse
noise.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:55 -05:00
Chuck Lever
fbf76683ff SUNRPC: relocate the creation of socket-specific tunables
Clean-up:

The RPC client currently creates some sysctls that are specific to the
socket transport.  Move those entirely into xprtsock.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:53 -05:00
Chuck Lever
282b32e17f SUNRPC: create stubs for xprtsock init and cleanup
Over time we will want to add some specific init and cleanup logic for the
xprtsock implementation.  Add stub routines for initialization and exit
processing.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:53 -05:00
Chuck Lever
dd4564715e SUNRPC: Rename skb_reader_t and friends
Clean-up:  hch suggested that the RPC client shouldn't pollute the name
space used by the generic skb manipulation routines in net/core/skbuff.c.

Rename a couple of types in xdr.h to adhere to this convention.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:52 -05:00
Chuck Lever
9d29231690 SUNRPC: skb_read_bits is the same as xs_tcp_copy_data
Clean-up: eliminate xs_tcp_copy_data -- it's exactly the same logic as the
common routine skb_read_bits.  The UDP and TCP socket read code now share
the same routine for copying data into an xdr_buf.

Now that skb_read_bits() is exported, rename it to avoid confusing it with
a generic skb_* function.  As these functions are XDR-specific, they should
not have names that suggest they are of generic use.  Also rename
skb_read_and_csum_bits() to be consistent.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:52 -05:00
Chuck Lever
7559c7a28f SUNRPC: Make address format buffers more generic
For now we will assume that all transports will use the address format
buffers in the rpc_xprt struct to store their addresses.  Change
rpc_peer2str() to be a generic routine to handle this, and get rid of the
print_address() op in the rpc_xprt_ops vector.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:51 -05:00
Chuck Lever
314dfd7987 SUNRPC: move saved socket callback functions to a private data structure
Move the three fields for saving socket callback functions out of the
rpc_xprt structure and into a private data structure maintained in
net/sunrpc/xprtsock.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:51 -05:00
Chuck Lever
7c6e066ec2 SUNRPC: Move the UDP socket bufsize parameters to a private data structure
Move the socket-specific buffer size parameters for UDP sockets to a
private data structure maintained in net/sunrpc/xprtsock.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:50 -05:00
Chuck Lever
c847546182 SUNRPC: Move rpc_xprt socket connect fields into private data structure
Move the socket-specific connection management fields out of the generic
rpc_xprt structure into a private data structure maintained in
net/sunrpc/xprtsock.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:50 -05:00
Chuck Lever
e136d0926e SUNRPC: Move TCP state flags into xprtsock.c
Move "XPRT_LAST_FRAG" and friends from xprt.h into xprtsock.c, and rename
them to use the naming scheme in use in xprtsock.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:50 -05:00
Chuck Lever
51971139b2 SUNRPC: Move TCP receive state variables into private data structure
Move the TCP receive state variables from the generic rpc_xprt structure to
a private structure maintained inside net/sunrpc/xprtsock.c.

Also rename a function/variable pair to refer to RPC fragment headers
instead of record markers, to be consistent with types defined in
sunrpc/*.h.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:49 -05:00
Chuck Lever
ee0ac0c227 SUNRPC: Remove sock and inet fields from rpc_xprt
The "sock" and "inet" fields are socket-specific.  Move them to a private
data structure maintained entirely within net/sunrpc/xprtsock.c

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:49 -05:00
Chuck Lever
ffc2e518c9 SUNRPC: Allocate a private data area for socket-specific rpc_xprt fields
When setting up a new transport instance, allocate enough memory for an
rpc_xprt and a private area.  As part of the same memory allocation, it
will be easy to find one, given a pointer to the other.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:48 -05:00
J. Bruce Fields
94efa93435 rpcgss: krb5: miscellaneous cleanup
Miscellaneous cosmetic fixes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:48 -05:00
J. Bruce Fields
717757ad10 rpcgss: krb5: ignore seed
We're currently not actually using seed or seed_init.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:47 -05:00
J. Bruce Fields
d922a84a8b rpcgss: krb5: sanity check sealalg value in the downcall
The sealalg is checked in several places, giving the impression it could be
either SEAL_ALG_NONE or SEAL_ALG_DES.  But in fact SEAL_ALG_NONE seems to
be sufficient only for making mic's, and all the contexts we get must be
capable of wrapping as well.  So the sealalg must be SEAL_ALG_DES.  As
with signalg, just check for the right value on the downcall and ignore it
otherwise.  Similarly, tighten expectations for the sealalg on incoming
tokens, in case we do support other values eventually.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:47 -05:00
J. Bruce Fields
39a21dd1b0 rpcgss: krb5: clean up some goto's, etc.
Remove some unnecessary goto labels; clean up some return values; etc.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:46 -05:00
J. Bruce Fields
ca54f89645 rpcgss: simplify make_checksum
We're doing some pointless translation between krb5 constants and kernel
crypto string names.

Also clean up some related spkm3 code as necessary.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:46 -05:00
J. Bruce Fields
2818bf81a8 rpcgss: krb5: kill checksum_type, miscellaneous small cleanup
Previous changes reveal some obvious cruft.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:45 -05:00
J. Bruce Fields
5eb064f939 rpcgss: krb5: expect a constant signalg value
We also only ever receive one value of the signalg, so let's not pretend
otherwise

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:45 -05:00
J. Bruce Fields
e678e06bf8 gss: krb5: remove signalg and sealalg
We designed the krb5 context import without completely understanding the
context.  Now it's clear that there are a number of fields that we ignore,
or that we depend on having one single value.

In particular, we only support one value of signalg currently; so let's
check the signalg field in the downcall (in case we decide there's
something else we could support here eventually), but ignore it otherwise.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:44 -05:00
Olga Kornievskaia
adeb8133dd rpc: spkm3 update
This updates the spkm3 code to bring it up to date with our current
understanding of the spkm3 spec.

In doing so, we're changing the downcall format used by gssd in the spkm3 case,
which will cause an incompatilibity with old userland spkm3 support.  Since the
old code a) didn't implement the protocol correctly, and b) was never
distributed except in the form of some experimental patches from the citi web
site, we're assuming this is OK.

We do detect the old downcall format and print warning (and fail).  We also
include a version number in the new downcall format, to be used in the
future in case any further change is required.

In some more detail:

	- fix integrity support
	- removed dependency on NIDs. instead OIDs are used
	- known OID values for algorithms added.
	- fixed some context fields and types

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:44 -05:00
Olga Kornievskaia
37a4e6cb03 rpc: move process_xdr_buf
Since process_xdr_buf() is useful outside of the kerberos-specific code, we
move it to net/sunrpc/xdr.c, export it, and rename it in keeping with xdr_*
naming convention of xdr.c.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:44 -05:00
J. Bruce Fields
87d918d667 rpc: gss: fix a kmap_atomic race in krb5 code
This code is never called from interrupt context; it's always run by either
a user thread or rpciod.  So KM_SKB_SUNRPC_DATA is inappropriate here.

Thanks to Aimé Le Rouzic for capturing an oops which showed the kernel
taking an interrupt while we were in this piece of code, resulting in a
nested kmap_atomic(.,KM_SKB_SUNRPC_DATA) call from
xdr_partial_copy_from_skb().

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:43 -05:00
J. Bruce Fields
8fc7500bb8 rpc: gss: eliminate print_hexl()'s
Dumping all this data to the logs is wasteful (even when debugging is turned
off), and creates too much output to be useful when it's turned on.

Fix a minor style bug or two while we're at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:43 -05:00
Chuck Lever
2b577f1f14 SUNRPC: another pmap wakeup fix
Don't wake up bind waiters if a task finds that another task is already
trying to bind.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:42 -05:00
Chuck Lever
c8541ecdd5 SUNRPC: Make the transport-specific setup routine allocate rpc_xprt
Change the location where the rpc_xprt structure is allocated so each
transport implementation can allocate a private area from the same
chunk of memory.

Note also that xprt->ops->destroy, rather than xprt_destroy, is now
responsible for freeing rpc_xprt when the transport is destroyed.

Test plan:
Connectathon.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:34 -05:00
Trond Myklebust
24c5684b65 SUNRPC: Clean up xs_send_pages()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:33 -05:00
Trond Myklebust
bee57c99c3 SUNRPC: Ensure xdr_buf_read_netobj() checks for memory overruns
Also clean up the code...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:33 -05:00
Trond Myklebust
4e3e43ad14 SUNRPC: Add __(read|write)_bytes_from_xdr_buf
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:32 -05:00
Trond Myklebust
1e78957e0a SUNRPC: Clean up argument types in xdr.c
Converts various integer buffer offsets and sizes to unsigned integer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:32 -05:00
Trond Myklebust
6d5fcb5a52 SUNRPC: Remove BKL around the RPC socket operations etc.
All internal RPC client operations should no longer depend on the BKL,
however lockd and NFS callbacks may still require it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:30 -05:00
Trond Myklebust
bbd5a1f9fc SUNRPC: Fix up missing BKL in asynchronous RPC callback functions
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:29 -05:00
Trond Myklebust
3e32a5d99a SUNRPC: Give cloned RPC clients their own rpc_pipefs directory
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:29 -05:00
Trond Myklebust
23bf85ba43 SUNRPC: Handle the cases where rpc_alloc_iostats() fails
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:28 -05:00
Trond Myklebust
8aca67f0ae SUNRPC: Fix a potential race in rpc_wake_up_task()
Use RCU to ensure that we can safely call rpc_finish_wakeup after we've
called __rpc_do_wake_up_task. If not, there is a theoretical race, in which
the rpc_task finishes executing, and gets freed first.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:26 -05:00
Trond Myklebust
e6b3c4db6f Fix a second potential rpc_wakeup race...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:25 -05:00
Christophe Saout
cc4dc59e55 Subject: Re: [PATCH] Fix SUNRPC wakeup/execute race condition
The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:

First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.

Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.

Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.

Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>
2006-12-06 10:46:24 -05:00
Maxime Austruy
cc8ce997d2 [PATCH] softmac: fix unbalanced mutex_lock/unlock in ieee80211softmac_wx_set_mlme
Routine ieee80211softmac_wx_set_mlme has one return that fails
to release a mutex acquired at entry.

Signed-off-by: Maxime Austruy <maxime@tralhalla.org>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-12-05 19:31:33 -05:00
Ulrich Kunitz
2b50c24554 [PATCH] softmac: Fixed handling of deassociation from AP
In 2.6.19 a deauthentication from the AP doesn't start a
reassociation by the softmac code. It appears that
mac->associnfo.associating must be set and the
ieee80211softmac_assoc_work function must be scheduled. This patch
fixes that.

Signed-off-by: Ulrich Kunitz <kune@deine-taler.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2006-12-05 19:31:33 -05:00
David Howells
9db7372445 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/ata/libata-scsi.c
	include/linux/libata.h

Futher merge of Linus's head and compilation fixups.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 17:01:28 +00:00
David Howells
4c1ac1b491 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	drivers/infiniband/core/iwcm.c
	drivers/net/chelsio/cxgb2.c
	drivers/net/wireless/bcm43xx/bcm43xx_main.c
	drivers/net/wireless/prism54/islpci_eth.c
	drivers/usb/core/hub.h
	drivers/usb/input/hid-core.c
	net/core/netpoll.c

Fix up merge failures with Linus's head and fix new compilation failures.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-12-05 14:37:56 +00:00
Linus Torvalds
9b8ab9f6c3 Merge branch 'for-linus4' of master.kernel.org:/pub/scm/linux/kernel/git/viro/bird
* 'for-linus4' of master.kernel.org:/pub/scm/linux/kernel/git/viro/bird:
  [PATCH] severing poll.h -> mm.h
  [PATCH] severing skbuff.h -> mm.h
  [PATCH] severing skbuff.h -> poll.h
  [PATCH] severing skbuff.h -> highmem.h
  [PATCH] severing uaccess.h -> sched.h
  [PATCH] severing fs.h, radix-tree.h -> sched.h
  [PATCH] severing module.h->sched.h
2006-12-04 10:37:06 -08:00
Al Viro
d7fe0f241d [PATCH] severing skbuff.h -> mm.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-12-04 02:00:34 -05:00
Al Viro
a1f8e7f7fb [PATCH] severing skbuff.h -> highmem.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-12-04 02:00:29 -05:00
David S. Miller
83ac58ba0a Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6 2006-12-03 19:24:40 -08:00
David S. Miller
b4ad86bf52 [XFRM] xfrm_user: Better validation of user templates.
Since we never checked the ->family value of templates
before, many applications simply leave it at zero.
Detect this and fix it up to be the pol->family value.

Also, do not clobber xp->family while reading in templates,
that is not necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-03 19:19:26 -08:00
Gerrit Renker
2bbf29acd8 [DCCP] tfrc: Binary search for reverse TFRC lookup
This replaces the linear search algorithm for reverse lookup with
binary search.

It has the advantage of better scalability: O(log2(N)) instead of O(N).
This means that the average number of iterations is reduced from 250
(linear search if each value appears equally likely) down to at most 9.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:53:27 -02:00
Gerrit Renker
44158306d7 [DCCP] ccid3: Deprecate TFRC_SMALLEST_P
This patch deprecates the existing use of an arbitrary value TFRC_SMALLEST_P
 for low-threshold values of p. This avoids masking low-resolution errors.
 Instead, the code now checks against real boundaries (implemented by preceding
 patch) and provides warnings whenever a real value falls below the threshold.

 If such messages are observed, it is a better solution to take this as an
 indication that the lookup table needs to be re-engineered.

Changelog:
----------
 This patch
   * makes handling all TFRC resolution errors local to the TFRC library

   * removes unnecessary test whether X_calc is 'infinity' due to p==0 -- this
     condition is already caught by tfrc_calc_x()

   * removes setting ccid3hctx_p = TFRC_SMALLEST_P in ccid3_hc_tx_packet_recv
     since this is now done by the TFRC library

   * updates BUG_ON test in ccid3_hc_tx_no_feedback_timer to take into account
     that p now is either 0 (and then X_calc is irrelevant), or it is > 0; since
     the handling of TFRC_SMALLEST_P is now taken care of in the tfrc library

Justification:
--------------
 The TFRC code uses a lookup table which has a bounded resolution.
 The lowest possible value of the loss event rate `p' which can be
 resolved is currently 0.0001.  Substituting this lower threshold for
 p when p is less than 0.0001 results in a huge, exponentially-growing
 error.  The error can be computed by the following formula:

    (f(0.0001) - f(p))/f(p) * 100      for p < 0.0001

 Currently the solution is to use an (arbitrary) value
     TFRC_SMALLEST_P  =   40 * 1E-6   =   0.00004
 and to consider all values below this value as `virtually zero'.  Due to
 the exponentially growing resolution error, this is not a good idea, since
 it hides the fact that the table can not resolve practically occurring cases.
 Already at p == TFRC_SMALLEST_P, the error is as high as 58.19%!

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:53:07 -02:00
Gerrit Renker
006042d7e1 [DCCP] tfrc: Identify TFRC table limits and simplify code
This
 * adds documentation about the lowest resolution that is possible within
   the bounds of the current lookup table
 * defines a constant TFRC_SMALLEST_P which defines this resolution
 * issues a warning if a given value of p is below resolution
 * combines two previously adjacent if-blocks of nearly identical
   structure into one

This patch does not change the algorithm as such.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:52:41 -02:00
Gerrit Renker
8d0086adac [DCCP] tfrc: Add protection against invalid parameters to TFRC routines
1) For the forward X_calc lookup, it
    * protects effectively against RTT=0 (this case is possible), by
      returning the maximal lookup value instead of just setting it to 1
    * reformulates the array-bounds exceeded condition: this only happens
      if p is greater than 1E6 (due to the scaling)
    * the case of negative indices can now with certainty be excluded,
      since documentation shows that the formulas are within bounds
    * additional protection against p = 0 (would give divide-by-zero)

 2) For the reverse lookup, it warns against
    * protects against exceeding array bounds
    * now returns 0 if f(p) = 0, due to function definition
    * warns about minimal resolution error and returns the smallest table
      value instead of p=0 [this would mask congestion conditions]

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:52:26 -02:00
Gerrit Renker
90fb0e60dd [DCCP] tfrc: Fix small error in reverse lookup of p for given f(p)
This fixes the following small error in tfrc_calc_x_reverse_lookup.

 1) The table is generated by the following equations:
	lookup[index][0] = g((index+1) * 1000000/TFRC_CALC_X_ARRSIZE);
	lookup[index][1] = g((index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE);
    where g(q) is 1E6 * f(q/1E6)

 2) The reverse lookup assigns an entry in lookup[index][small]

 3) This index needs to match the above, i.e.
    * if small=0 then

      		p  = (index + 1) * 1000000/TFRC_CALC_X_ARRSIZE

    * if small=1 then

		p = (index+1) * TFRC_CALC_X_SPLIT/TFRC_CALC_X_ARRSIZE

These are exactly the changes that the patch makes; previously the code did
not conform to the way the lookup table was generated (this difference resulted
in a mean error of about 1.12%).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:52:01 -02:00
Gerrit Renker
50ab46c790 [DCCP] tfrc: Document boundaries and limits of the TFRC lookup table
This adds documentation for the TCP Reno throughput equation which is at
the heart of the TFRC sending rate / loss rate calculations.

It spells out precisely how the values were determined and what they mean.
The equations were derived through reverse engineering and found to be
fully accurate (verified using test programs).

This patch does not change any code.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:51:29 -02:00
Gerrit Renker
26af3072b0 [DCCP] ccid3: Fix warning message about illegal ACK
This avoids a (harmless) warning message being printed at the DCCP server
(the receiver of a DCCP half connection).

Incoming packets are both directed to

 * ccid_hc_rx_packet_recv() for the server half
 * ccid_hc_tx_packet_recv() for the client half

The message gets printed since on a server the client half is currently not
sending data packets.
This is resolved for the moment by checking the DCCP-role first. In future
times (bidirectional DCCP connections), this test may have to be more
sophisticated.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:51:14 -02:00
Gerrit Renker
5c3fbb6acf [DCCP] ccid3: Fix bug in calculation of send rate
The main object of this patch is the following bug:
 ==> In ccid3_hc_tx_packet_recv, the parameters p and X_recv were updated
     _after_ the send rate was calculated. This is clearly an error and is
     resolved by re-ordering statements.

In addition,
  * r_sample is converted from u32 to long to check whether the time difference
    was negative (it would otherwise be converted to a large u32 value)
  * protection against RTT=0 (this is possible) is provided in a further patch
  * t_elapsed is also converted to long, to match the type of r_sample
  * adds a a more debugging information regarding current send rates
  * various trivial comment/documentation updates

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:50:56 -02:00
Gerrit Renker
76d127779e [DCCP]: Fix BUG in retransmission delay calculation
This bug resulted in ccid3_hc_tx_send_packet returning negative
delay values, which in turn triggered silently dequeueing packets in
dccp_write_xmit. As a result, only a few out of the submitted packets made
it at all onto the network.  Occasionally, when dccp_wait_for_ccid was
involved, this also triggered a bug warning since ccid3_hc_tx_send_packet
returned a negative value (which in reality was a negative delay value).

The cause for this bug lies in the comparison

 if (delay >= hctx->ccid3hctx_delta)
	return delay / 1000L;

The type of `delay' is `long', that of ccid3hctx_delta is `u32'. When comparing
negative long values against u32 values, the test returned `true' whenever delay
was smaller than 0 (meaning the packet was overdue to send).

The fix is by casting, subtracting, and then testing the difference with
regard to 0.

This has been tested and shown to work.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:50:42 -02:00
Gerrit Renker
8a508ac26e [DCCP]: Use higher RTO default for CCID3
The TFRC nofeedback timer normally expires after the maximum of 4
RTTs and twice the current send interval (RFC 3448, 4.3). On LANs
with a small RTT this can mean a high processing load and reduced
performance, since then the nofeedback timer is triggered very
frequently.

This patch provides a configuration option to set the bound for the
nofeedback timer, using as default 100 milliseconds.

By setting the configuration option to 0, strict RFC 3448 behaviour
can be enforced for the nofeedback timer.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
2006-12-03 14:50:23 -02:00
Jamal Hadi Salim
2b5f6dcce5 [XFRM]: Fix aevent structuring to be more complete.
aevents can not uniquely identify an SA. We break the ABI with this
patch, but consensus is that since it is not yet utilized by any
(known) application then it is fine (better do it now than later).

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:22:25 -08:00
Yasuyuki Kozakai
02dba025b0 [NETFILTER]: xtables: fixes warning on compilation of hashlimit
To use ipv6_find_hdr(), IP6_NF_IPTABLES is necessary.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:19:01 -08:00
Alexey Dobriyan
0506d4068b [ROSE] rose_add_loopback_node: propagate -E
David Binderman's icc logs:
net/rose/rose_route.c(399): remark #593: variable "err" was set but never used

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:17:48 -08:00
Yasuyuki Kozakai
1863f0965e [NETFILTER]: nf_conntrack: fix header inclusions for helpers
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:12:54 -08:00
Patrick McHardy
13b1833910 [NETFILTER]: nf_conntrack: EXPORT_SYMBOL cleanup
- move EXPORT_SYMBOL next to exported symbol
- use EXPORT_SYMBOL_GPL since this is what the original code used

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:11:25 -08:00
Patrick McHardy
a3c479772c [NETFILTER]: Mark old IPv4-only connection tracking scheduled for removal
Also remove the references to "new connection tracking" from Kconfig.
After some short stabilization period of the new connection tracking
helpers/NAT code the old one will be removed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:11:01 -08:00
Patrick McHardy
807467c22a [NETFILTER]: nf_nat: add SNMP NAT helper port
Add nf_conntrack port of the SNMP NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:10:34 -08:00
Patrick McHardy
a536df35b3 [NETFILTER]: nf_conntrack/nf_nat: add TFTP helper port
Add IPv4 and IPv6 capable nf_conntrack port of the TFTP conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:10:18 -08:00
Patrick McHardy
9fafcd7b20 [NETFILTER]: nf_conntrack/nf_nat: add SIP helper port
Add IPv4 and IPv6 capable nf_conntrack port of the SIP conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:09:57 -08:00
Patrick McHardy
f09943fefe [NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port
Add nf_conntrack port of the PPtP conntrack/NAT helper. Since there seems
to be no IPv6-capable PPtP implementation the helper only support IPv4.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:09:41 -08:00
Patrick McHardy
92703eee4c [NETFILTER]: nf_conntrack: add NetBIOS name service helper port
Add nf_conntrack port of the NetBIOS name service conntrack helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:09:24 -08:00
Patrick McHardy
869f37d8e4 [NETFILTER]: nf_conntrack/nf_nat: add IRC helper port
Add nf_conntrack port of the IRC conntrack/NAT helper. Since DCC doesn't
support IPv6 yet, the helper is still IPv4 only.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:09:06 -08:00
Patrick McHardy
f587de0e2f [NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port
Add IPv4 and IPv6 capable nf_conntrack port of the H.323 conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:46 -08:00
Patrick McHardy
1695890057 [NETFILTER]: nf_conntrack/nf_nat: add amanda helper port
Add IPv4 and IPv6 capable nf_conntrack port of the Amanda conntrack/NAT helper.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:26 -08:00
Patrick McHardy
d6a9b6500a [NETFILTER]: nf_conntrack: add helper function for expectation initialization
Expectation address masks need to be differently initialized depending
on the address family, create helper function to avoid cluttering up
the code too much.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:08:01 -08:00
Jozsef Kadlecsik
55a733247d [NETFILTER]: nf_nat: add FTP NAT helper port
Add FTP NAT helper.

Split out from Jozsef's big nf_nat patch with a few small fixes by myself.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:07:44 -08:00
Jozsef Kadlecsik
5b1158e909 [NETFILTER]: Add NAT support for nf_conntrack
Add NAT support for nf_conntrack. Joint work of Jozsef Kadlecsik,
Yasuyuki Kozakai, Martin Josefsson and myself.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:07:13 -08:00
Patrick McHardy
d2483ddefd [NETFILTER]: nf_conntrack: add module aliases to IPv4 conntrack names
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:06:05 -08:00
Patrick McHardy
b321e14425 [NETFILTER]: Kconfig: improve conntrack selection
Improve the connection tracking selection (well, the user experience,
not really the aesthetics) by offering one option to enable connection
tracking and a choice between the implementations.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:05:46 -08:00
Patrick McHardy
9457d851fc [NETFILTER]: nf_conntrack: automatic helper assignment for expectations
Some helpers (namely H.323) manually assign further helpers to expected
connections. This is not possible with nf_conntrack anymore since we
need to know whether a helper is used at allocation time.

Handle the helper assignment centrally, which allows to perform the
correct allocation and as a nice side effect eliminates the need
for the H.323 helper to fiddle with nf_conntrack_lock.

Mid term the allocation scheme really needs to be redesigned since
we do both the helper and expectation lookup _twice_ for every new
connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:05:25 -08:00
Patrick McHardy
bff9a89bca [NETFILTER]: nf_conntrack: endian annotations
Resync with Al Viro's ip_conntrack annotations and fix a missed
spot in ip_nat_proto_icmp.c.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:05:08 -08:00
Patrick McHardy
f9aae95828 [NETFILTER]: nf_conntrack: fix helper structure alignment
Adding the alignment to the size doesn't make any sense, what it
should do is align the size of the conntrack structure to the
alignment requirements of the helper structure and return an
aligned pointer in nfct_help().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:04:50 -08:00
Patrick McHardy
0c4ca1bd86 [NETFILTER]: nf_conntrack: fix NF_CONNTRACK_PROC_COMPAT dependency
NF_CONNTRACK_PROC_COMPAT depends on NF_CONNTRACK_IPV4, not NF_CONNTRACK.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:04:24 -08:00
Patrick McHardy
9a7c9337a0 [NET]: Accept wildcard delimiters in in[46]_pton
Accept -1 as delimiter to abort parsing without an error at the first
unknown character. This is needed by the upcoming nf_conntrack SIP
helper, where addresses are delimited by either '\r' or '\n' characters.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 22:04:04 -08:00
Kim Nordlund
a163148c1b [PKT_SCHED] act_gact: division by zero
Not returning -EINVAL, because someone might want to use the value
zero in some future gact_prob algorithm?

Signed-off-by: Kim Nordlund <kim.nordlund@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:11 -08:00
Jamal Hadi Salim
a4d1366d50 [GENETLINK]: Add cmd dump completion.
Remove assumption that generic netlink commands cannot have dump
completion callbacks.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:09 -08:00
David S. Miller
c40a27f48c [ATM]: Kill ipcommon.[ch]
All that remained was skb_migrate() and that was overkill
for what the two call sites were trying to do.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:08 -08:00
Patrick McHardy
1e9b3d5339 [NET_SCHED]: policer: restore compatibility with old iproute binaries
The tc actions increased the size of struct tc_police, which broke
compatibility with old iproute binaries since both the act_police
and the old NET_CLS_POLICE code check for an exact size match.

Since the new members are not even used, the simple fix is to also
accept the size of the old structure. Dumping is not affected since
old userspace will receive a bigger structure, which is handled fine.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:07 -08:00
Adrian Bunk
5f68e4c07c [PKT_SCHED]: Remove unused exports.
This patch removes the following unused EXPORT_SYMBOL's:
- sch_api.c: qdisc_lookup
- sch_generic.c: __netdev_watchdog_up
- sch_generic.c: noop_qdisc_ops
- sch_generic.c: qdisc_alloc

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:06 -08:00
Al Viro
1e419cd995 [EBTABLES]: Split ebt_replace into user and kernel variants, annotate.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:05 -08:00
Al Viro
df07a81e93 [EBTABLES]: Clean ebt_register_table() up.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:04 -08:00
Al Viro
1bc2326cbe [EBTABLES]: Move calls of ebt_verify_pointers() upstream.
... and pass just repl->name to translate_table()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:03 -08:00
Al Viro
f7da79d998 [EBTABLES]: ebt_check_entry() doesn't need valid_hooks
We can check newinfo->hook_entry[...] instead.
Kill unused argument.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:02 -08:00
Al Viro
177abc348a [EBTABLES]: Clean ebt_get_udc_positions() up.
Check for valid_hooks is redundant (newinfo->hook_entry[i] will
be NULL if bit i is not set).  Kill it, kill unused arguments.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:01 -08:00
Al Viro
0e795531c5 [EBTABLES]: Switch ebt_check_entry_size_and_hooks() to use of newinfo->hook_entry[]
kill unused arguments

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:01 -08:00
Al Viro
1f072c96fd [EBTABLES]: translate_table(): switch direct uses of repl->hook_info to newinfo
Since newinfo->hook_table[] already has been set up, we can switch to using
it instead of repl->{hook_info,valid_hooks}.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:32:00 -08:00
Al Viro
e4fd77deac [EBTABLES]: Move more stuff into ebt_verify_pointers().
Take intialization of ->hook_entry[...], ->entries_size and ->nentries
over there, pull the check for empty chains into the end of that sucker.

Now it's self-contained, so we can move it up in the very beginning of
translate_table() *and* we can rely on ->hook_entry[] being properly
transliterated after it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:59 -08:00
Al Viro
70fe9af47e [EBTABLES]: Pull the loop doing __ebt_verify_pointers() into a separate function.
It's easier to expand the iterator here *and* we'll be able to move all
uses of ebt_replace from translate_table() into this one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:57 -08:00
Al Viro
22b440bf9e [EBTABLES]: Split ebt_check_entry_size_and_hooks
Split ebt_check_entry_size_and_hooks() in two parts - one that does
sanity checks on pointers (basically, checks that we can safely
use iterator from now on) and the rest of it (looking into details
of entry).

The loop applying ebt_check_entry_size_and_hooks() is split in two.

Populating newinfo->hook_entry[] is done in the first part.

Unused arguments killed.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:56 -08:00
Al Viro
14197d5447 [EBTABLES]: Prevent wraparounds in checks for entry components' sizes.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:56 -08:00
Al Viro
98a0824a0f [EBTABLES]: Deal with the worst-case behaviour in loop checks.
No need to revisit a chain we'd already finished with during
the check for current hook.  It's either instant loop (which
we'd just detected) or a duplicate work.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:55 -08:00
Al Viro
40642f95f5 [EBTABLES]: Verify that ebt_entries have zero ->distinguisher.
We need that for iterator to work; existing check had been too weak.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:54 -08:00
Al Viro
bb2ef25c2c [EBTABLES]: Fix wraparounds in ebt_entries verification.
We need to verify that
	a) we are not too close to the end of buffer to dereference
	b) next entry we'll be checking won't be _before_ our

While we are at it, don't subtract unrelated pointers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:53 -08:00
Andrew Morton
b6332e6cf9 [TCP]: Fix warnings with TCP_MD5SIG disabled.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:52 -08:00
Adrian Bunk
f5b99bcddd [NET]: Possible cleanups.
This patch contains the following possible cleanups:
- make the following needlessly global functions statis:
  - ipv4/tcp.c: __tcp_alloc_md5sig_pool()
  - ipv4/tcp_ipv4.c: tcp_v4_reqsk_md5_lookup()
  - ipv4/udplite.c: udplite_rcv()
  - ipv4/udplite.c: udplite_err()
- make the following needlessly global structs static:
  - ipv4/tcp_ipv4.c: tcp_request_sock_ipv4_ops
  - ipv4/tcp_ipv4.c: tcp_sock_ipv4_specific
  - ipv6/tcp_ipv6.c: tcp_request_sock_ipv6_ops
- net/ipv{4,6}/udplite.c: remove inline's from static functions
                          (gcc should know best when to inline them)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:51 -08:00
Miika Komu
2718aa7c55 [IPSEC]: Add AF_KEY interface for encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
2006-12-02 21:31:50 -08:00
Miika Komu
8511d01d7c [IPSEC]: Add netlink interface for the encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:49 -08:00
Miika Komu
76b3f055f3 [IPSEC]: Add encapsulation family.
Signed-off-by: Miika Komu <miika@iki.fi>
Signed-off-by: Diego Beltrami <Diego.Beltrami@hiit.fi>
Signed-off-by: Kazunori Miyazawa <miyazawa@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:48 -08:00
David S. Miller
08dd1a506b [TCP] MD5SIG: Kill CONFIG_TCP_MD5SIG_DEBUG.
It just obfuscates the code and adds limited value.  And as Adrian
Bunk noticed, it lacked Kconfig help text too, so just kill it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:47 -08:00
Patrick McHardy
e488eafcc5 [NET_SCHED]: Fix endless loops (part 5): netem/tbf/hfsc ->requeue failures
When peeking at the next packet in a child qdisc by calling dequeue/requeue,
the upper qdisc qlen counter may get out of sync in case the requeue fails.
The qdisc and the child qdisc both have their counter decremented, but since
no packet is given to the upper qdisc it won't decrement its counter itself.

requeue should not fail, so this is mostly for "correctness".

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:46 -08:00
Patrick McHardy
256d61b87b [NET_SCHED]: Fix endless loops (part 4): HTB
Convert HTB to use qdisc_tree_decrease_len() and add a callback
for deactivating a class when its child queue becomes empty.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:45 -08:00
Patrick McHardy
f973b913e1 [NET_SCHED]: Fix endless loops (part 3): HFSC
Convert HFSC to use qdisc_tree_decrease_len() and add a callback
for deactivating a class when its child queue becomes empty.

All queue purging goes through hfsc_purge_queue(), which is used in
three cases: grafting, class creation (when a leaf class is turned
into an intermediate class by attaching a new class) and class
deletion. In all cases qdisc_tree_decrease_len() is needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:44 -08:00
Patrick McHardy
5e50da01d0 [NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs
Convert the "simple" qdiscs to use qdisc_tree_decrease_qlen() where
necessary:

- all graft operations
- destruction of old child qdiscs in prio, red and tbf change operation
- purging of queue in sfq change operation

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:43 -08:00
Patrick McHardy
43effa1e57 [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters (part 1)
There are multiple problems related to qlen adjustment that can lead
to an upper qdisc getting out of sync with the real number of packets
queued, leading to endless dequeueing attempts by the upper layer code.

All qdiscs must maintain an accurate q.qlen counter. There are basically
two groups of operations affecting the qlen: operations that propagate
down the tree (enqueue, dequeue, requeue, drop, reset) beginning at the
root qdisc and operations only affecting a subtree or single qdisc
(change, graft, delete class). Since qlen changes during operations from
the second group don't propagate to ancestor qdiscs, their qlen values
become desynchronized.

This patch adds a function to propagate qlen changes up the qdisc tree,
optionally calling a callback function to perform qdisc-internal
maintenance when the child qdisc becomes empty. The follow-up patches
will convert all qdiscs to use this function where necessary.

Noticed by Timo Steinbach <tsteinbach@astaro.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:42 -08:00
Patrick McHardy
9f9afec482 [NET_SCHED]: Set parent classid in default qdiscs
Set parent classids in default qdiscs to allow walking up the tree
from outside the qdiscs. This is needed by the next patch.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:41 -08:00
Patrick McHardy
814a175e7b [NET_SCHED]: sch_htb: perform qlen adjustment immediately in ->delete
qlen adjustment should happen immediately in ->delete and not in the
class destroy function because the reference count will not hit zero in
->delete (sch_api holds a reference) but in ->put. Since the qdisc
lock is released between deletion of the class and final destruction
this creates an externally visible error in the qlen counter.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-02 21:31:40 -08:00
Paul Moore
484b366932 NetLabel: add the ranged tag to the CIPSOv4 protocol
Add support for the ranged tag (tag type #5) to the CIPSOv4 protocol.

The ranged tag allows for seven, or eight if zero is the lowest category,
category ranges to be specified in a CIPSO option.  Each range is specified by
two unsigned 16 bit fields, each with a maximum value of 65534.  The two values
specify the start and end of the category range; if the start of the category
range is zero then it is omitted.

See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:31:38 -08:00
Paul Moore
654bbc2a2b NetLabel: add the enumerated tag to the CIPSOv4 protocol
Add support for the enumerated tag (tag type #2) to the CIPSOv4 protocol.

The enumerated tag allows for 15 categories to be specified in a CIPSO option,
where each category is an unsigned 16 bit field with a maximum value of 65534.

See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:31:37 -08:00
Paul Moore
0275276035 NetLabel: convert to an extensibile/sparse category bitmap
The original NetLabel category bitmap was a straight char bitmap which worked
fine for the initial release as it only supported 240 bits due to limitations
in the CIPSO restricted bitmap tag (tag type 0x01).  This patch converts that
straight char bitmap into an extensibile/sparse bitmap in order to lay the
foundation for other CIPSO tag types and protocols.

This patch also has a nice side effect in that all of the security attributes
passed by NetLabel into the LSM are now in a format which is in the host's
native byte/bit ordering which makes the LSM specific code much simpler; look
at the changes in security/selinux/ss/ebitmap.c as an example.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-12-02 21:31:36 -08:00
Pablo Neira Ayuso
ef91fd522b [NETFILTER]: remove the reference to ipchains from Kconfig
It is time to move on :-)

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:35 -08:00
Patrick McHardy
76592584be [NETFILTER]: Fix PROC_FS=n warnings
Fix some unused function/variable warnings.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:34 -08:00
Patrick McHardy
65195686ff [NETFILTER]: remove remaining ASSERT_{READ,WRITE}_LOCK
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:33 -08:00
Bart De Schuymer
d12cdc3ccf [NETFILTER]: ebtables: add --snap-arp option
The attached patch adds --snat-arp support, which makes it possible to
change the source mac address in both the mac header and the arp header
with one rule.

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:32 -08:00
Patrick McHardy
baf7b1e112 [NETFILTER]: x_tables: add NFLOG target
Add new NFLOG target to allow use of nfnetlink_log for both IPv4 and IPv6.
Currently we have two (unsupported by userspace) hacks in the LOG and ULOG
targets to optionally call to the nflog API. They lack a few features,
namely the IPv4 and IPv6 LOG targets can not specify a number of arguments
related to nfnetlink_log, while the ULOG target is only available for IPv4.
Remove those hacks and add a clean way to use nfnetlink_log.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:31 -08:00
Patrick McHardy
39b46fc6f0 [NETFILTER]: x_tables: add port of hashlimit match for IPv4 and IPv6
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:31 -08:00
Patrick McHardy
d7a5c32442 [NETFILTER]: nfnetlink_log: remove useless prefix length limitation
There is no reason for limiting netlink attributes in size.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:30 -08:00
Eric Leblond
829e17a1a6 [NETFILTER]: nfnetlink_queue: allow changing queue length through netlink
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:29 -08:00
Pablo Neira Ayuso
7b621c1ea6 [NETFILTER]: ctnetlink: rework conntrack fields dumping logic on events
|   NEW   | UPDATE  | DESTROY |
     ----------------------------------------|
     tuples    |    Y    |    Y    |    Y    |
     status    |    Y    |    Y    |    N    |
     timeout   |    Y    |    Y    |    N    |
     protoinfo |    S    |    S    |    N    |
     helper    |    S    |    S    |    N    |
     mark      |    S    |    S    |    N    |
     counters  |    F    |    F    |    Y    |

 Leyend:
         Y: yes
         N: no
         S: iif the field is set
	 F: iif overflow

This patch also replace IPCT_HELPINFO by IPCT_HELPER since we want to
track the helper assignation process, not the changes in the private
information held by the helper.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:28 -08:00
Pablo Neira Ayuso
bbb3357d14 [NETFILTER]: ctnetlink: check for status attribute existence on conntrack creation
Check that status flags are available in the netlink message received
to create a new conntrack.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:27 -08:00
Patrick McHardy
1b683b5512 [NETFILTER]: sip conntrack: better NAT handling
The NAT handling of the SIP helper has a few problems:

- Request headers are only mangled in the reply direction, From/To headers
  not at all, which can lead to authentication failures with DNAT in case
  the authentication domain is the IP address

- Contact headers in responses are only mangled for REGISTER responses

- Headers may be mangled even though they contain addresses not
  participating in the connection, like alternative addresses

- Packets are droppen when domain names are used where the helper expects
  IP addresses

This patch takes a different approach, instead of fixed rules what field
to mangle to what content, it adds symetric mapping of From/To/Via/Contact
headers, which allows to deal properly with echoed addresses in responses
and foreign addresses not belonging to the connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:26 -08:00
Patrick McHardy
77a78dec48 [NETFILTER]: sip conntrack: make header shortcuts optional
Not every header has a shortcut, so make them optional instead
of searching for the same string twice.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:25 -08:00
Patrick McHardy
40883e8184 [NETFILTER]: sip conntrack: do case insensitive SIP header search
SIP headers are generally case-insensitive, only SDP headers are
case sensitive.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:24 -08:00
Patrick McHardy
9d5b8baa4e [NETFILTER]: sip conntrack: minor cleanup
- Use enum for header field enumeration
- Use numerical value instead of pointer to header info structure to
  identify headers, unexport ct_sip_hdrs
- group SIP and SDP entries in header info structure
- remove double forward declaration of ct_sip_get_info

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:23 -08:00
Patrick McHardy
337fbc4166 [NETFILTER]: ip_conntrack: fix NAT helper unload races
The NAT helpr hooks are protected by RCU, but all of the
conntrack helpers test and use the global pointers instead
of copying them first using rcu_dereference()

Also replace synchronize_net() by synchronize_rcu() for clarity
since sychronizing only with packet receive processing is
insufficient to prevent races.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:22 -08:00
Yasuyuki Kozakai
468ec44bd5 [NETFILTER]: conntrack: add '_get' to {ip, nf}_conntrack_expect_find
We usually uses 'xxx_find_get' for function which increments
reference count.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:21 -08:00
Patrick McHardy
e4bd8bce3e [NETFILTER]: nf_conntrack: /proc compatibility with old connection tracking
This patch adds /proc/net/ip_conntrack, /proc/net/ip_conntrack_expect and
/proc/net/stat/ip_conntrack files to keep old programs using them working.

The /proc/net/ip_conntrack and /proc/net/ip_conntrack_expect files show only
IPv4 entries, the /proc/net/stat/ip_conntrack shows global statistics.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:20 -08:00
Patrick McHardy
a999e68376 [NETFILTER]: nf_conntrack: sysctl compatibility with old connection tracking
This patch adds an option to keep the connection tracking sysctls visible
under their old names.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:19 -08:00
Patrick McHardy
933a41e7e1 [NETFILTER]: nf_conntrack: move conntrack protocol sysctls to individual modules
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:18 -08:00
Patrick McHardy
d62f9ed4a4 [NETFILTER]: nf_conntrack: automatic sysctl registation for conntrack protocols
Add helper functions for sysctl registration with optional instantiating
of common path elements (like net/netfilter) and use it for support for
automatic registation of conntrack protocol sysctls.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:17 -08:00
Patrick McHardy
f8eb24a89a [NETFILTER]: nf_conntrack: move extern declaration to header files
Using extern in a C file is a bad idea because the compiler can't
catch type errors.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:16 -08:00
Patrick McHardy
d734685334 [NETFILTER]: nf_conntrack_ftp: fix missing helper mask initilization
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:15 -08:00
Martin Josefsson
be00c8e489 [NETFILTER]: nf_conntrack: reduce timer updates in __nf_ct_refresh_acct()
Only update the conntrack timer if there's been at least HZ jiffies since
the last update. Reduces the number of del_timer/add_timer cycles from one
per packet to one per connection per second (plus once for each state change
of a connection)

Should handle timer wraparounds and connection timeout changes.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:14 -08:00
Martin Josefsson
824621eddd [NETFILTER]: nf_conntrack: remove unused struct list_head from protocols
Remove unused struct list_head from struct nf_conntrack_l3proto and
nf_conntrack_l4proto as all protocols are kept in arrays, not linked
lists.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:13 -08:00
Martin Josefsson
3ffd5eeb1a [NETFILTER]: nf_conntrack: minor __nf_ct_refresh_acct() whitespace cleanup
Minor whitespace cleanup.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:12 -08:00
Martin Josefsson
951d36cace [NETFILTER]: nf_conntrack: remove ASSERT_{READ,WRITE}_LOCK
Remove the usage of ASSERT_READ_LOCK/ASSERT_WRITE_LOCK in nf_conntrack,
it didn't do anything, it was just an empty define and it uglified the code.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:11 -08:00
Martin Josefsson
ae5718fb3d [NETFILTER]: nf_conntrack: more sanity checks in protocol registration/unregistration
Add some more sanity checks when registering/unregistering l3/l4 protocols.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:10 -08:00
Martin Josefsson
605dcad6c8 [NETFILTER]: nf_conntrack: rename struct nf_conntrack_protocol
Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:09 -08:00
Martin Josefsson
e2b7606cdb [NETFILTER]: More __read_mostly annotations
Place rarely written variables in the read-mostly section by using
__read_mostly

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:08 -08:00
Martin Josefsson
8f03dea52b [NETFILTER]: nf_conntrack: split out protocol handling
This patch splits out L3/L4 protocol handling into its own file
nf_conntrack_proto.c

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2006-12-02 21:31:07 -08:00