Commit Graph

14315 Commits

Author SHA1 Message Date
Luis R. Rodriguez
24feda0084 mac80211: fix propagation of failed hardware reconfigurations
mac80211 does not propagate failed hardware reconfiguration
requests. For suspend and resume this is important due to all
the possible issues that can come out of the suspend <-> resume
cycle. Not propagating the error means cfg80211 will assume
the resume for the device went through fine and mac80211 will
continue on trying to poke at the hardware, enable timers,
queue work, and so on for a device which is completley
unfunctional.

The least we can do is to propagate device start issues and
warn when this occurs upon resume. A side effect of this patch
is we also now propagate the start errors upon harware
reconfigurations (non-suspend), but this should also be desirable
anyway, there is not point in continuing to reconfigure a
device if mac80211 was unable to start the device.

For further details refer to the thread:

http://marc.info/?t=126151038700001&r=1&w=2

Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-28 16:20:05 -05:00
Luis R. Rodriguez
b98c06b6de mac80211: fix race with suspend and dynamic_ps_disable_work
When mac80211 suspends it calls a driver's suspend callback
as a last step and after that the driver assumes no calls will
be made to it until we resume and its start callback is kicked.
If such calls are made, however, suspend can end up throwing
hardware in an unexpected state and making the device unusable
upon resume.

Fix this by preventing mac80211 to schedule dynamic_ps_disable_work
by checking for when mac80211 starts to suspend and starts
quiescing. Frames should be allowed to go through though as
that is part of the quiescing steps and we do not flush the
mac80211 workqueue since it was already done towards the
beginning of suspend cycle.

The other mac80211 issue will be hanled in the next patch.

For further details see refer to the thread:

http://marc.info/?t=126144866100001&r=1&w=2

Cc: stable@kernel.org
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-28 16:20:04 -05:00
Johannes Berg
65486c8b30 cfg80211: fix error path in cfg80211_wext_siwscan
If there's an invalid channel or SSID, the code leaks
the scan request. Always free the scan request, unless
it was successfully given to the driver.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-28 16:19:58 -05:00
Johannes Berg
3bdb2d48c5 cfg80211: fix race between deauth and assoc response
Joseph Nahmias reported, in http://bugs.debian.org/562016,
that he was getting the following warning (with some log
around the issue):

  ath0: direct probe to AP 00:11:95:77:e0:b0 (try 1)
  ath0: direct probe responded
  ath0: authenticate with AP 00:11:95:77:e0:b0 (try 1)
  ath0: authenticated
  ath0: associate with AP 00:11:95:77:e0:b0 (try 1)
  ath0: deauthenticating from 00:11:95:77:e0:b0 by local choice (reason=3)
  ath0: direct probe to AP 00:11:95:77:e0:b0 (try 1)
  ath0: RX AssocResp from 00:11:95:77:e0:b0 (capab=0x421 status=0 aid=2)
  ath0: associated
  ------------[ cut here ]------------
  WARNING: at net/wireless/mlme.c:97 cfg80211_send_rx_assoc+0x14d/0x152 [cfg80211]()
  Hardware name: 7658CTO
  ...
  Pid: 761, comm: phy0 Not tainted 2.6.32-trunk-686 #1
  Call Trace:
   [<c1030a5d>] ? warn_slowpath_common+0x5e/0x8a
   [<c1030a93>] ? warn_slowpath_null+0xa/0xc
   [<f86cafc7>] ? cfg80211_send_rx_assoc+0x14d/0x152
  ...
  ath0: link becomes ready
  ath0: deauthenticating from 00:11:95:77:e0:b0 by local choice (reason=3)
  ath0: no IPv6 routers present
  ath0: link is not ready
  ath0: direct probe to AP 00:11:95:77:e0:b0 (try 1)
  ath0: direct probe responded
  ath0: authenticate with AP 00:11:95:77:e0:b0 (try 1)
  ath0: authenticated
  ath0: associate with AP 00:11:95:77:e0:b0 (try 1)
  ath0: RX ReassocResp from 00:11:95:77:e0:b0 (capab=0x421 status=0 aid=2)
  ath0: associated

It is not clear to me how the first "direct probe" here
happens, but this seems to be a race condition, if the
user requests to deauth after requesting assoc, but before
the assoc response is received. In that case, it may
happen that mac80211 tries to report the assoc success to
cfg80211, but gets blocked on the wdev lock that is held
because the user is requesting the deauth.

The result is that we run into a warning. This is mostly
harmless, but maybe cause an unexpected event to be sent
to userspace; we'd send an assoc success event although
userspace was no longer expecting that.

To fix this, remove the warning and check whether the
race happened and in that case abort processing.

Reported-by: Joseph Nahmias <joe@nahmias.net>
Cc: stable@kernel.org
Cc: 562016-quiet@bugs.debian.org
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-28 16:19:54 -05:00
Felix Fietkau
2e10d330f8 mac80211: fix ibss join with fixed-bssid
When fixed bssid is requested when joining an ibss network, incoming
beacons that match the configured bssid cause mac80211 to create new
sta entries, even before the ibss interface is in joined state.
When that happens, it fails to bring up the interface entirely, because
it checks for existing sta entries before joining.
This patch fixes this bug by refusing to create sta info entries before
the interface is fully operational.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-28 15:56:35 -05:00
John W. Linville
ea1e4b8420 Merge git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-12-28 15:09:11 -05:00
Octavian Purdila
3100aa9d74 llc: fix SAP reference counting w.r.t. socket handling
The SAP ref counter gets decremented twice when deleting a socket,
although for all but the first socket of a SAP the SAP ref counter was
incremented only once.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:47:23 -08:00
Octavian Purdila
8beb9ab6c2 llc: convert llc_sap_list to RCU
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:46:28 -08:00
Octavian Purdila
52d58aef5e llc: replace the socket list with a local address based hash
For the cases where a lot of interfaces are used in conjunction with a
lot of LLC sockets bound to the same SAP, the iteration of the socket
list becomes prohibitively expensive.

Replacing the list with a a local address based hash significantly
improves the bind and listener lookup operations as well as the
datagram delivery.

Connected sockets delivery is also improved, but this patch does not
address the case where we have lots of sockets with the same local
address connected to different remote addresses.

In order to keep the socket sanity checks alive and fast a socket
counter was added to the SAP structure.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:45:32 -08:00
Octavian Purdila
6d2e3ea284 llc: use a device based hash table to speed up multicast delivery
This patch adds a per SAP device based hash table to solve the
multicast delivery scalability issue when we have large number of
interfaces and a large number of sockets bound to the same SAP.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:43:57 -08:00
Octavian Purdila
0f7b67dd9e llc: optimize multicast delivery
Optimize multicast delivery by doing the actual delivery without
holding the lock. Based on the same approach used in UDP code.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:42:29 -08:00
Octavian Purdila
b76f5a8427 llc: convert the socket list to RCU locking
For the reclamation phase we use the SLAB_DESTROY_BY_RCU mechanism,
which require some extra checks in the lookup code:

a) If the current socket was released, reallocated & inserted in
another list it will short circuit the iteration for the current list,
thus we need to restart the lookup.

b) If the current socket was released, reallocated & inserted in the
same list we just need to recheck it matches the look-up criteria and
if not we can skip to the next element.

In this case there is no need to restart the lookup, since sockets are
inserted at the start of the list and the worst that will happen is
that we will iterate throught some of the list elements more then
once.

Note that the /proc and multicast delivery was not yet converted to
RCU, it still uses spinlocks for protection.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:41:43 -08:00
Octavian Purdila
abf9d537fe llc: add support for SO_BINDTODEVICE
Using bind(MAC address) with LLC sockets has O(n) complexity, where n
is the number of interfaces. To overcome this, we add support for
SO_BINDTODEVICE which drops the complexity to O(1).

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:41:12 -08:00
Octavian Purdila
e5cd6fe391 llc: add support for LLC_OPT_PKTINFO
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:40:34 -08:00
Octavian Purdila
bf9ae5386b llc: use dev_hard_header
Using dev_hard_header allows us to use LLC with VLANs and potentially
other Ethernet/TokernRing specific encapsulations. It also removes code
duplication between LLC and Ethernet/TokenRing core code.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:38:23 -08:00
Ralf Baechle
96c5340147 NET: XFRM: Fix spelling of neighbour.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

 net/xfrm/xfrm_policy.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-26 20:24:46 -08:00
Jamal Hadi Salim
28f6aeea3f net: restore ip source validation
when using policy routing and the skb mark:
there are cases where a back path validation requires us
to use a different routing table for src ip validation than
the one used for mapping ingress dst ip.
One such a case is transparent proxying where we pretend to be
the destination system and therefore the local table
is used for incoming packets but possibly a main table would
be used on outbound.
Make the default behavior to allow the above and if users
need to turn on the symmetry via sysctl src_valid_mark

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-25 17:30:22 -08:00
John W. Linville
f83d664eef wireless: fix comments in genregdb.awk
Apparently some awk versions choke on C-style comments -- who knew? :-)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-25 16:35:43 -08:00
David S. Miller
d346f49d0b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-12-25 16:34:56 -08:00
John Fastabend
f466dba183 pktgen: ndo_start_xmit can return NET_XMIT_xxx values
This updates pktgen so that it does not decrement skb->users
when it receives valid NET_XMIT_xxx values.  These are now
valid return values from ndo_start_xmit in net-next-2.6.
They also indicate the skb has been consumed.

This fixes pktgen to work correctly with vlan devices.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 22:02:57 -08:00
laurent chavey
31d12926e3 net: Add rtnetlink init_rcvwnd to set the TCP initial receive window
Add rtnetlink init_rcvwnd to set the TCP initial receive window size
advertised by passive and active TCP connections.
The current Linux TCP implementation limits the advertised TCP initial
receive window to the one prescribed by slow start. For short lived
TCP connections used for transaction type of traffic (i.e. http
requests), bounding the advertised TCP initial receive window results
in increased latency to complete the transaction.
Support for setting initial congestion window is already supported
using rtnetlink init_cwnd, but the feature is useless without the
ability to set a larger TCP initial receive window.
The rtnetlink init_rcvwnd allows increasing the TCP initial receive
window, allowing TCP connection to advertise larger TCP receive window
than the ones bounded by slow start.

Signed-off-by: Laurent Chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 14:13:30 -08:00
Krishna Kumar
068a2de57d net: release dst entry while cache-hot for GSO case too
Non-GSO code drops dst entry for performance reasons, but
the same is missing for GSO code. Drop dst while cache-hot
for GSO case too.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 14:13:30 -08:00
Krishna Kumar
def87cf420 tcp: Slightly optimize tcp_sendmsg
Slightly optimize tcp_sendmsg since NETIF_F_SG is used many
times iteratively in the loop. The only other modification is
to change:
			} else if (i == MAX_SKB_FRAGS ||
				   (!i &&
				   !(sk->sk_route_caps & NETIF_F_SG))) {
	to:
			} else if (i == MAX_SKB_FRAGS || !sg) {

The reason why this change is correct: this code (other than
the MAX_SKB_FRAGS case) executes only due to the else part
of: "if (skb_tailroom(skb) > 0) {" - i.e. there was no space
in the skb to put the data inline. Hence SG is false is a
sufficient condition, and there is no way a fragment can be
added to the skb.

Changelog:
	- Added the above explanation for the change

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 14:13:29 -08:00
Krishna Kumar
afeca340c0 tcp: Remove unrequired operations in tcp_push()
Remove unrequired operations in tcp_push()

Changelog:
	Removed a temporary skb variable from tcp_push()

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 14:13:28 -08:00
Krishna Kumar
12d50c46dc tcp: Remove check in __tcp_push_pending_frames
tcp_push checks tcp_send_head and calls __tcp_push_pending_frames,
which again checks tcp_send_head, and this unnecessary check is
done for every other caller of __tcp_push_pending_frames.

Remove tcp_send_head check in __tcp_push_pending_frames and add
the check to tcp_push_pending_frames. Other functions call
__tcp_push_pending_frames only when tcp_send_head would evaluate
to true.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-23 14:13:28 -08:00
David S. Miller
b4de921ae6 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-12-23 14:09:17 -08:00
Stefani Seibold
7acd72eb85 kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out...
rename kfifo_put...  into kfifo_in...  to prevent miss use of old non in
kernel-tree drivers

ditto for kfifo_get...  -> kfifo_out...

Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc
annotations more readable.

Add mini "howto porting to the new API" in kfifo.h

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold
e64c026dd0 kfifo: cleanup namespace
change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo
should be reserved for internal functions only.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold
c1e13f2567 kfifo: move out spinlock
Move the pointer to the spinlock out of struct kfifo.  Most users in
tree do not actually use a spinlock, so the few exceptions now have to
call kfifo_{get,put}_locked, which takes an extra argument to a
spinlock.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:56 -08:00
Stefani Seibold
4546548789 kfifo: move struct kfifo in place
This is a new generic kernel FIFO implementation.

The current kernel fifo API is not very widely used, because it has to
many constrains.  Only 17 files in the current 2.6.31-rc5 used it.
FIFO's are like list's a very basic thing and a kfifo API which handles
the most use case would save a lot of development time and memory
resources.

I think this are the reasons why kfifo is not in use:

 - The API is to simple, important functions are missing
 - A fifo can be only allocated dynamically
 - There is a requirement of a spinlock whether you need it or not
 - There is no support for data records inside a fifo

So I decided to extend the kfifo in a more generic way without blowing up
the API to much.  The new API has the following benefits:

 - Generic usage: For kernel internal use and/or device driver.
 - Provide an API for the most use case.
 - Slim API: The whole API provides 25 functions.
 - Linux style habit.
 - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros
 - Direct copy_to_user from the fifo and copy_from_user into the fifo.
 - The kfifo itself is an in place member of the using data structure, this save an
   indirection access and does not waste the kernel allocator.
 - Lockless access: if only one reader and one writer is active on the fifo,
   which is the common use case, no additional locking is necessary.
 - Remove spinlock - give the user the freedom of choice what kind of locking to use if
   one is required.
 - Ability to handle records. Three type of records are supported:
   - Variable length records between 0-255 bytes, with a record size
     field of 1 bytes.
   - Variable length records between 0-65535 bytes, with a record size
     field of 2 bytes.
   - Fixed size records, which no record size field.
 - Preserve memory resource.
 - Performance!
 - Easy to use!

This patch:

Since most users want to have the kfifo as part of another object,
reorganize the code to allow including struct kfifo in another data
structure.  This requires changing the kfifo_alloc and kfifo_init
prototypes so that we pass an existing kfifo pointer into them.  This
patch changes the implementation and all existing users.

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 14:17:55 -08:00
Luis R. Rodriguez
9da3e06814 mac80211: only bother printing highest data rate on debugfs if its set
IEEE-802.11n spec says the RX highest data rate field does
not specify the highest supported RX data rate if its not set.
Ignore it if not set then. Refer to section 7.3.56.4

Cc: johannes@sipsolutions.net
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:56:02 -05:00
Luis R. Rodriguez
7db94e2103 mac80211: parse the HT capabilities info through debugfs
When debugging you want to be lazy and not have to parse
bits yourself so let mac80211 debugfs do the parsing for you.

This is what I get against my WRT610N:

root@tux:~# cat /sys/kernel/debug/ieee80211/phy0/stations/00\:22\:6b\:aa\:bb\:01/ht_capa
ht supported
cap: 0x000e
	HT20/HT40
	SM Power Save disabled
	No RX STBC
	Max AMSDU length: 7935 bytes
	No DSSS/CCK HT40
ampdu factor/density: 2/6
MCS mask: ff ff 00 00 00 00 00 00 00 00
MCS rx highest: 0
MCS tx params: 0

Cc: johannes@sipsolutions.net
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:56:00 -05:00
Luis R. Rodriguez
cb136f54ee mac80211: make debugfs mcs set entry reflect 16 bits
The MCS set is 16 bits so when debugging ensure the full 16 bits
are represented. Current reading would make you think its only
8 bits.

Cc: johannes@sipsolutions.net
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:55:58 -05:00
Johannes Berg
2c7e6bc9ac mac80211: disallow fixing bitrates with hw rate control
When hw rate control is used, these parameters have
no meaning because the hardware cannot get at them
right now, so disallow setting them. Also clean up
the function a bit.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:46:28 -05:00
Johannes Berg
5d1ec85f00 mac80211: dont try to use existing sta for AP
Clean out some cruft that could use an already existing
sta_info struct -- that case cannot happen. Also, there's
a bug there -- if allocation/insertion fails then it is
possible that we are left in a lingering state where
mac80211 waits for the AP, cfg80211 waits for mac80211,
but the AP has already replied. Since there's no way to
indicate an internal error, pretend there was a timeout,
i.e. that the AP never responded.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:18 -05:00
Johannes Berg
5fba4af32c cfg80211: avoid sending spurious deauth to userspace
Before
  commit ca9034592823e8179511e48a78731f95bfdd766c
  Author: Holger Schurig <hs4233@mail.mn-solutions.de>
  Date:   Tue Oct 13 13:45:28 2009 +0200

      cfg80211: remove warning in deauth case

we assumed that drivers never give us spurious deauth
frames because they filter them out based on the auth
state they keep track of. This turned out to be racy,
because userspace might deauth while the AP is also
sending a deauth frame, so the warning was removed.

However, in that case we should not tell userspace
about the AP's frame if it requested deauth "first",
where "first" means it came to cfg80211 first.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:17 -05:00
Johannes Berg
f38fd12fa7 mac80211: allow disabling 40MHz on 2.4GHz
In some situations it is required that a system be
configured with no support for 40 MHz channels in
the 2.4 GHz band. Rather than imposing any such
restrictions on everybody, allow configuration a
system like that with a module parameter. It is
writable at runtime but only takes effect at the
time of the next association.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:17 -05:00
Johannes Berg
0f78231bff mac80211: enable spatial multiplexing powersave
Enable spatial multiplexing in mac80211 by telling the
driver what to do and, where necessary, sending action
frames to the AP to update the requested SMPS mode.

Also includes a trivial implementation for hwsim that
just logs the requested mode.

For now, the userspace interface is in debugfs only,
and let you toggle the requested mode at any time.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:16 -05:00
Zhu Yi
eaf85ca7fe wireless: add ieee80211_amsdu_to_8023s
Move the A-MSDU handling code from mac80211 to cfg80211 so that more
drivers can use it. The new created function ieee80211_amsdu_to_8023s
converts an A-MSDU frame to a list of 802.3 frames.

Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:15 -05:00
gregor kowski
ca99861d54 mac80211 : fix a race with update_tkip_key
The mac80211 tkip code won't call update_tkip_key, if rx packets
are received without KEY_FLAG_UPLOADED_TO_HARDWARE. This can happen on
first packet because the hardware key stuff is called asynchronously with
todo workqueue.

This patch workaround that by tracking if we sent the key to the driver.

Signed-off-by: Gregor Kowski <gregor.kowski@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-22 13:31:14 -05:00
Florian Fainelli
ae24e578de ipvs: ip_vs_wrr.c: use lib/gcd.c
Remove the private version of the greatest common divider to use
lib/gcd.c, the latter also implementing the a < b case.

[akpm@linux-foundation.org: repair neighboring whitespace because the diff looked odd]
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Simon Horman <horms@verge.net.au>
Cc: Julius Volz <juliusv@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-22 09:42:06 +01:00
John W. Linville
3b377ea9d4 wireless: support internal statically compiled regulatory database
This patch provides infrastructure for machine translation of the
regulatory rules database used by CRDA into a C data structure.
It includes code for searching that database as an alternative
to dynamic regulatory rules updates via CRDA.  Most people should
use CRDA instead of this infrastructure, but it provides a better
alternative than the WIRELESS_OLD_REGULATORY infrastructure (which
can now be removed).

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 18:56:10 -05:00
Kalle Valo
59d9cb071d mac80211: remove payload alignment warning
The payload alignment warning enabled by MAC80211_DEBUG_PACKET_ALIGNMENT is
difficult. To fix it, a firmware change is needed but in most cases that's
very difficult. So the benefit from the warning is low and most probably
it just creates more confusion for people who just enable all warnings
(like it did for me).

Remove the unaligned IP payload warning and the kconfig option. But
leave the unaligned packet warning, it will be enabled with
MAC80211_VERBOSE_DEBUG.

Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 18:56:09 -05:00
Johannes Berg
12375ef933 mac80211: trace interface name
It's not all that useful to have the vif/sdata pointer,
we'd rather refer to the interfaces by their name.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 18:38:54 -05:00
Johannes Berg
47846c9b0c mac80211: reduce reliance on netdev
For bluetooth 3, we will most likely not have
a netdev for a virtual interface (sdata), so
prepare for that by reducing the reliance on
having a netdev. This patch moves the name
and address fields into the sdata struct and
uses them from there all over. Some work is
needed to keep them sync'ed, but that's not
a lot of work and in slow paths anyway.

In doing so, this also reduces the number of
pointer dereferences in many places, because
of things like sdata->dev->dev_addr becoming
sdata->vif.addr.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 18:38:52 -05:00
Johannes Berg
abe60632f3 mac80211: make station management completely depend on vif
The station management currently uses the virtual
interface, but you cannot add the same station to
multiple virtual interfaces if you're communicating
with it in multiple ways.

This restriction should be lifted so that in the
future we can, for instance, support bluetooth 3
with an access point that mac80211 is already
associated to.

We can do that by requiring all sta_info_get users
to provide the virtual interface and making the RX
code aware that an address may match more than one
station struct. Thanks to the previous patches this
one isn't all that large and except for the RX and
TX status paths changes has low complexity.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 18:38:51 -05:00
David S. Miller
ed4b2019a6 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2009-12-21 11:54:49 -08:00
Linus Torvalds
292be57e15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  bnx2: Fix bnx2_netif_stop() merge error.
  gianfar: Fix bit definitions of IMASK_GRSC and IMASK_GTSC
  gianfar: Fix stats support
  gianfar: Fix a filer bug
  bnx2: fixing a timout error due not refreshing TX timers correctly
  can/at91: don't check platform_get_irq's return value against zero
  mISDN: use DECLARE_COMPLETION_ONSTACK for non-constant completion
  bnx2: reset_task is crashing the kernel. Fixing it.
  ipv6: fix an oops when force unload ipv6 module
  TI DaVinci EMAC: Fix MDIO bus frequency configuration
  e100: Fix broken cbs accounting due to missing memset.
  broadcom: bcm54xx_shadow_read() errors ignored in bcm54xx_adjust_rxrefclk()
  e1000e: LED settings in EEPROM ignored on 82571 and 82572
  netxen: use module parameter correctly
  netns: fix net.ipv6.route.gc_min_interval_ms in netns
  Bluetooth: Prevent ill-timed autosuspend in USB driver
  Bluetooth: Fix L2CAP locking scheme regression
  Bluetooth: Ack L2CAP I-frames before retransmit missing packet
  Bluetooth: Fix unset of RemoteBusy flag for L2CAP
  Bluetooth: Fix PTR_ERR return of wrong pointer in hidp_setup_hid()
2009-12-21 10:12:25 -08:00
Johannes Berg
0183826b58 mac80211: fix WMM AP settings application
My
  commit 77fdaa12ce
  Author: Johannes Berg <johannes@sipsolutions.net>
  Date:   Tue Jul 7 03:45:17 2009 +0200

      mac80211: rework MLME for multiple authentications

inadvertedly broke WMM because it removed, along with
a bunch of other now useless initialisations, the line
initialising sdata->u.mgd.wmm_last_param_set to -1
which would make it adopt any WMM parameter set. If,
as is usually the case, the AP uses WMM parameter set
sequence number zero, we'd never update it until the
AP changes the sequence number.

Add the missing initialisation back to get the WMM
settings from the AP applied locally.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.31+]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 11:32:28 -05:00
Johannes Berg
9a418af5df mac80211: fix peer HT capabilities
I noticed yesterday, because Jeff had noticed
a speed regression, cf. bug
http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2138
that the SM PS settings for peers were wrong.
Instead of overwriting the SM PS settings with
the local bits, we need to keep the remote bits.

The bug was part of the original HT code from
over two years ago, but unfortunately nobody
noticed that it makes no sense -- we shouldn't
be overwriting the peer's setting with our own
but rather keep it intact when masking the peer
capabilities with our own.

While fixing that, I noticed that the masking of
capabilities is completely useless for most of
the bits, so also fix those other bits.

Finally, I also noticed that PSMP_SUPPORT no
longer exists in the final 802.11n version, so
also remove that.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 11:32:27 -05:00
John W. Linville
254416aae7 wireless: report reasonable bitrate for MCS rates through wext
Previously, cfg80211 had reported "0" for MCS (i.e. 802.11n) bitrates
through the wireless extensions interface.  However, nl80211 was
converting MCS rates into a reasonable bitrate number.  This patch moves
the nl80211 code to cfg80211 where it is now shared between both the
nl80211 interface and the wireless extensions interface.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-21 11:27:31 -05:00
Yang Hongyang
3705e11a21 ipv6: fix an oops when force unload ipv6 module
When I do an ipv6 module force unload,I got the following oops:
#rmmod -f ipv6
------------[ cut here ]------------
kernel BUG at mm/slub.c:2969!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:03.0/net/eth2/ifindex
Modules linked in: ipv6(-) dm_multipath uinput ppdev tpm_tis tpm tpm_bios pcspkr pcnet32 mii parport_pc i2c_piix4 parport i2c_core floppy mptspi mptscsih mptbase scsi_transport_spi

Pid: 2530, comm: rmmod Tainted: G  R        2.6.32 #2 440BX Desktop Reference Platform/VMware Virtual Platform
EIP: 0060:[<c04b73f2>] EFLAGS: 00010246 CPU: 0
EIP is at kfree+0x6a/0xdd
EAX: 00000000 EBX: c09e86bc ECX: c043e4dd EDX: c14293e0
ESI: e141f1d8 EDI: e140fc31 EBP: dec58ef0 ESP: dec58ed0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmmod (pid: 2530, ti=dec58000 task=decb1940 task.ti=dec58000)
Stack:
 c14293e0 00000282 df624240 c0897d08 c09e86bc c09e86bc e141f1d8 dec58f1c
<0> dec58f00 e140fc31 c09e84c4 e141f1bc dec58f14 c0689d21 dec58f1c e141f1bc
<0> 00000000 dec58f2c c0689eff c09e84d8 c09e84d8 e141f1bc bff33a90 dec58f38
Call Trace:
 [<e140fc31>] ? ipv6_frags_exit_net+0x22/0x32 [ipv6]
 [<c0689d21>] ? ops_exit_list+0x19/0x3d
 [<c0689eff>] ? unregister_pernet_operations+0x2a/0x51
 [<c0689f70>] ? unregister_pernet_subsys+0x17/0x24
 [<e140fbfe>] ? ipv6_frag_exit+0x21/0x32 [ipv6]
 [<e141a361>] ? inet6_exit+0x47/0x122 [ipv6]
 [<c045f5de>] ? sys_delete_module+0x198/0x1f6
 [<c04a8acf>] ? remove_vma+0x57/0x5d
 [<c070f63f>] ? do_page_fault+0x2e7/0x315
 [<c0403218>] ? sysenter_do_call+0x12/0x28
Code: 86 00 00 00 40 c1 e8 0c c1 e0 05 01 d0 89 45 e0 66 83 38 00 79 06 8b 40 0c 89 45 e0 8b 55 e0 8b 02 84 c0 78 14 66 a9 00 c0 75 04 <0f> 0b eb fe 8b 45 e0 e8 35 15 fe ff eb 5d 8b 45 04 8b 55 e0 89
EIP: [<c04b73f2>] kfree+0x6a/0xdd SS:ESP 0068:dec58ed0
---[ end trace 4475d1a5b0afa7e5 ]---

It's because in ip6_frags_ns_sysctl_register,
"table" only alloced when "net" is not equals
to "init_net".So when we free "table" in 
ip6_frags_ns_sysctl_unregister,we should check
this first.

This patch fix the problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-18 20:25:13 -08:00
Alexey Dobriyan
9c69fabe78 netns: fix net.ipv6.route.gc_min_interval_ms in netns
sysctl table was copied, all right, but ->data for net.ipv6.route.gc_min_interval_ms
was not reinitialized for "!= &init_net" case.

In init_net everthing works by accident due to correct ->data initialization
in source table.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-18 20:11:03 -08:00
Andrei Emeltchenko
b13f586044 Bluetooth: Fix L2CAP locking scheme regression
When locking was introduced the error path branch was not taken
into account. Error was found in sparse code checking. Kudos to
Jani Nikula.

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-12-17 12:07:25 -08:00
Gustavo F. Padovan
186ee8cf01 Bluetooth: Ack L2CAP I-frames before retransmit missing packet
Moving the Ack to before l2cap_retransmit_frame() we can avoid the
case where txWindow is full and the packet can't be retransmited.

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-12-17 12:06:23 -08:00
Gustavo F. Padovan
186de9a338 Bluetooth: Fix unset of RemoteBusy flag for L2CAP
RemoteBusy flag need to be unset before l2cap_ertm_send(), otherwise
l2cap_ertm_send() will return without sending packets because it checks
that flag before start sending.

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-12-17 12:04:08 -08:00
Roel Kluin
971beb83ae Bluetooth: Fix PTR_ERR return of wrong pointer in hidp_setup_hid()
Return the PTR_ERR of the correct pointer.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2009-12-17 11:47:20 -08:00
Nick Piggin
a3a065e3f1 fs: no games with DCACHE_UNHASHED
Filesystems outside the regular namespace do not have to clear DCACHE_UNHASHED
in order to have a working /proc/$pid/fd/XXX. Nothing in proc prevents the
fd link from being used if its dentry is not in the hash.

Also, it does not get put into the dcache hash if DCACHE_UNHASHED is clear;
that depends on the filesystem calling d_add or d_rehash.

So delete the misleading comments and needless code.

Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-17 10:51:40 -05:00
Linus Torvalds
bac5e54c29 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
  direct I/O fallback sync simplification
  ocfs: stop using do_sync_mapping_range
  cleanup blockdev_direct_IO locking
  make generic_acl slightly more generic
  sanitize xattr handler prototypes
  libfs: move EXPORT_SYMBOL for d_alloc_name
  vfs: force reval of target when following LAST_BIND symlinks (try #7)
  ima: limit imbalance msg
  Untangling ima mess, part 3: kill dead code in ima
  Untangling ima mess, part 2: deal with counters
  Untangling ima mess, part 1: alloc_file()
  O_TRUNC open shouldn't fail after file truncation
  ima: call ima_inode_free ima_inode_free
  IMA: clean up the IMA counts updating code
  ima: only insert at inode creation time
  ima: valid return code from ima_inode_alloc
  fs: move get_empty_filp() deffinition to internal.h
  Sanitize exec_permission_lite()
  Kill cached_lookup() and real_lookup()
  Kill path_lookup_open()
  ...

Trivial conflicts in fs/direct-io.c
2009-12-16 12:04:02 -08:00
Linus Torvalds
e4bdda1bc3 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFSv4: Fix a regression in the NFSv4 state manager
  NFSv4: Release the sequence id before restarting a CLOSE rpc call
  nfs41: fix session fore channel negotiation
  nfs41: do not zero seqid portion of stateid on close
  nfs: run state manager in privileged mode
  nfs: make recovery state manager operations privileged
  nfs: enforce FIFO ordering of operations trying to acquire slot
  rpc: add a new priority in RPC task
  nfs: remove rpc_task argument from nfs4_find_slot
  rpc: add rpc_queue_empty function
  nfs: change nfs4_do_setlk params to identify recovery type
  nfs: do not do a LOOKUP after open
  nfs: minor cleanup of session draining
2009-12-16 10:47:44 -08:00
Linus Torvalds
37c24b37fb Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: (42 commits)
  nfsd: remove pointless paths in file headers
  nfsd: move most of nfsfh.h to fs/nfsd
  nfsd: remove unused field rq_reffh
  nfsd: enable V4ROOT exports
  nfsd: make V4ROOT exports read-only
  nfsd: restrict filehandles accepted in V4ROOT case
  nfsd: allow exports of symlinks
  nfsd: filter readdir results in V4ROOT case
  nfsd: filter lookup results in V4ROOT case
  nfsd4: don't continue "under" mounts in V4ROOT case
  nfsd: introduce export flag for v4 pseudoroot
  nfsd: let "insecure" flag vary by pseudoflavor
  nfsd: new interface to advertise export features
  nfsd: Move private headers to source directory
  vfs: nfsctl.c un-used nfsd #includes
  lockd: Remove un-used nfsd headers #includes
  s390: remove un-used nfsd #includes
  sparc: remove un-used nfsd #includes
  parsic: remove un-used nfsd #includes
  compat.c: Remove dependence on nfsd private headers
  ...
2009-12-16 10:43:34 -08:00
Linus Torvalds
59be2e04e5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  net: sh_eth alignment fix for sh7724 using NET_IP_ALIGN V2
  ixgbe: allow tx of pre-formatted vlan tagged packets
  ixgbe: Fix 82598 premature copper PHY link indicatation
  ixgbe: Fix tx_restart_queue/non_eop_desc statistics counters
  bcm63xx_enet: fix compilation failure after get_stats_count removal
  packet: dont call sleeping functions while holding rcu_read_lock()
  tcp: Revert per-route SACK/DSACK/TIMESTAMP changes.
  ipvs: zero usvc and udest
  netfilter: fix crashes in bridge netfilter caused by fragment jumps
  ipv6: reassembly: use seperate reassembly queues for conntrack and local delivery
  sky2: leave PCI config space writeable
  sky2: print Optima chip name
  x25: Update maintainer.
  ipvs: fix synchronization on connection close
  netfilter: xtables: document minimal required version
  drivers/net/bonding/: : use pr_fmt
  can: CAN_MCP251X should depend on HAS_DMA
  drivers/net/usb: Correct code taking the size of a pointer
  drivers/net/cpmac.c: Correct code taking the size of a pointer
  drivers/net/sfc: Correct code taking the size of a pointer
  ...
2009-12-16 10:33:18 -08:00
Linus Torvalds
e69381b417 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits)
  RDMA/cxgb3: Fix error paths in post_send and post_recv
  RDMA/nes: Fix stale ARP issue
  RDMA/nes: FIN during MPA startup causes timeout
  RDMA/nes: Free kmap() resources
  RDMA/nes: Check for zero STag
  RDMA/nes: Fix Xansation test crash on cm_node ref_count
  RDMA/nes: Abnormal listener exit causes loopback node crash
  RDMA/nes: Fix crash in nes_accept()
  RDMA/nes: Resource not freed for REJECTed connections
  RDMA/nes: MPA request/response error checking
  RDMA/nes: Fix query of ORD values
  RDMA/nes: Fix MAX_CM_BUFFER define
  RDMA/nes: Pass correct size to ioremap_nocache()
  RDMA/nes: Update copyright and branding string
  RDMA/nes: Add max_cqe check to nes_create_cq()
  RDMA/nes: Clean up struct nes_qp
  RDMA/nes: Implement IB_SIGNAL_ALL_WR as an iWARP extension
  RDMA/nes: Add additional SFP+ PHY uC status check and PHY reset
  RDMA/nes: Correct fast memory registration implementation
  IB/ehca: Fix error paths in post_send and post_recv
  ...
2009-12-16 10:32:31 -08:00
Al Viro
2c48b9c455 switch alloc_file() to passing struct path
... and have the caller grab both mnt and dentry; kill
leak in infiniband, while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
cc3808f8c3 switch sock_alloc_file() to alloc_file()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:42 -05:00
Al Viro
6b18662e23 9p connect fixes
* if we fail in p9_conn_create(), we shouldn't leak references to struct file.
  Logics in ->close() doesn't help - ->trans is already gone by the time it's
  called.
* sock_create_kern() can fail.
* use of sock_map_fd() is all fscked up; I'd fixed most of that, but the
  rest will have to wait for a bit more work in net/socket.c (we still are
  violating the basic rule of working with descriptor table: "once the reference
  is installed there, don't rely on finding it there again").

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:41 -05:00
Al Viro
7cbe66b6b5 merge sock_alloc_fd/sock_attach_fd into a new helper
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:41 -05:00
Al Viro
198de4d7ac reorder alloc_fd/attach_fd in socketpair()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:41 -05:00
Alexey Dobriyan
28dfef8feb const: constify remaining pipe_buf_operations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:05 -08:00
Eric Dumazet
1a35ca80c1 packet: dont call sleeping functions while holding rcu_read_lock()
commit 654d1f8a01 (packet: less dev_put() calls)
introduced a problem, calling potentially sleeping functions from a
rcu_read_lock() protected section.

Fix this by releasing lock before the sock_wmalloc()/memcpy_fromiovec() calls.

After skb allocation and copy from user space, we redo device
lookup and appropriate tests.

Reported-and-tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 21:12:21 -08:00
David S. Miller
81e839efc2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-12-15 21:08:53 -08:00
David S. Miller
bb5b7c1126 tcp: Revert per-route SACK/DSACK/TIMESTAMP changes.
It creates a regression, triggering badness for SYN_RECV
sockets, for example:

[19148.022102] Badness at net/ipv4/inet_connection_sock.c:293
[19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000
[19148.023035] REGS: eeecbd30 TRAP: 0700   Not tainted  (2.6.32)
[19148.023496] MSR: 00029032 <EE,ME,CE,IR,DR>  CR: 24002442  XER: 00000000
[19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000

This is likely caused by the change in the 'estab' parameter
passed to tcp_parse_options() when invoked by the functions
in net/ipv4/tcp_minisocks.c

But even if that is fixed, the ->conn_request() changes made in
this patch series is fundamentally wrong.  They try to use the
listening socket's 'dst' to probe the route settings.  The
listening socket doesn't even have a route, and you can't
get the right route (the child request one) until much later
after we setup all of the state, and it must be done by hand.

This stuff really isn't ready, so the best thing to do is a
full revert.  This reverts the following commits:

f55017a93f
022c3f7d82
1aba721eba
cda42ebd67
345cda2fd6
dc343475ed
05eaade278
6a2a2d6bf8

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-15 20:56:42 -08:00
Alexandros Batsakis
689cf5c15b nfs: enforce FIFO ordering of operations trying to acquire slot
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:55:18 -05:00
Alexandros Batsakis
48f1861242 rpc: add rpc_queue_empty function
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:51:17 -05:00
André Goddard Rosa
e7d2860b69 tree-wide: convert open calls to remove spaces to skip_spaces() lib function
Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str &&  isspace(*str)) { \(str++;\|++str;\) }
|
- *str &&
isspace(*str)
)

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Julia Lawall <julia@diku.dk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:32 -08:00
Alexey Dobriyan
471452104b const: constify remaining dev_pm_ops
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:25 -08:00
Simon Horman
258c889362 ipvs: zero usvc and udest
Make sure that any otherwise uninitialised fields of usvc are zero.

This has been obvserved to cause a problem whereby the port of
fwmark services may end up as a non-zero value which causes
scheduling of a destination server to fail for persisitent services.

As observed by Deon van der Merwe <dvdm@truteq.co.za>.
This fix suggested by Julian Anastasov <ja@ssi.bg>.

For good measure also zero udest.

Cc: Deon van der Merwe <dvdm@truteq.co.za>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-15 17:01:25 +01:00
Patrick McHardy
8fa9ff6849 netfilter: fix crashes in bridge netfilter caused by fragment jumps
When fragments from bridge netfilter are passed to IPv4 or IPv6 conntrack
and a reassembly queue with the same fragment key already exists from
reassembling a similar packet received on a different device (f.i. with
multicasted fragments), the reassembled packet might continue on a different
codepath than where the head fragment originated. This can cause crashes
in bridge netfilter when a fragment received on a non-bridge device (and
thus with skb->nf_bridge == NULL) continues through the bridge netfilter
code.

Add a new reassembly identifier for packets originating from bridge
netfilter and use it to put those packets in insolated queues.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=14805

Reported-and-Tested-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-15 16:59:59 +01:00
Patrick McHardy
0b5ccb2ee2 ipv6: reassembly: use seperate reassembly queues for conntrack and local delivery
Currently the same reassembly queue might be used for packets reassembled
by conntrack in different positions in the stack (PREROUTING/LOCAL_OUT),
as well as local delivery. This can cause "packet jumps" when the fragment
completing a reassembled packet is queued from a different position in the
stack than the previous ones.

Add a "user" identifier to the reassembly queue key to seperate the queues
of each caller, similar to what we do for IPv4.

Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-15 16:59:18 +01:00
Gertjan van Wingerde
d24deb2580 mac80211: Add define for TX headroom reserved by mac80211 itself.
Add a definition of the amount of TX headroom reserved by mac80211 itself
for its own purposes. Also add BUILD_BUG_ON to validate the value.
This define can then be used by drivers to request additional TX headroom
in the most efficient manner.

Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-14 14:22:31 -05:00
Xiaotian Feng
9abfe315de ipvs: fix synchronization on connection close
commit 9d3a0de makes slaves expire as they would do on the master
with much shorter timeouts. But it introduces another problem:
When we close a connection, on master server the connection became
CLOSE_WAIT/TIME_WAIT, it was synced to slaves, but if master is
finished within it's timeouts (CLOSE), it will not be synced to
slaves. Then slaves will be kept on CLOSE_WAIT/TIME_WAIT until
timeout reaches. Thus we should also sync with CLOSE.

Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-12-14 16:38:21 +01:00
Eric W. Biederman
d90a909e1f net: Fix userspace RTM_NEWLINK notifications.
I received some bug reports about userspace programs having problems
because after RTM_NEWLINK was received they could not immediate access
files under /proc/sys/net/ because they had not been registered yet.

The original problem was trivially fixed by moving the userspace
notification from rtnetlink_event() to the end of
register_netdevice().

When testing that change I discovered I was still getting RTM_NEWLINK
events before I could access proc and I was also getting RTM_NEWLINK
events after I was seeing RTM_DELLINK.  Things practically guaranteed
to confuse userspace.

After a little more investigation these extra notifications proved to
be from the new notifiers NETDEV_POST_INIT and NETDEV_UNREGISTER_BATCH
hitting the default case in rtnetlink_event, and triggering
unnecessary RTM_NEWLINK messages.

rtnetlink_event now explicitly handles NETDEV_UNREGISTER_BATCH and
NETDEV_POST_INIT to avoid sending the incorrect userspace
notifications.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13 19:45:22 -08:00
Eric Dumazet
5781b2356c udp: udp_lib_get_port() fix
Now we can have a large udp hash table, udp_lib_get_port() loop
should be converted to a do {} while (cond) form,
or we dont enter it at all if hash table size is exactly 65536.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-13 19:32:39 -08:00
Trond Myklebust
52c9948b1f Merge branch 'nfs-for-2.6.33' 2009-12-13 13:56:27 -05:00
David S. Miller
501706565b Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	include/net/tcp.h
2009-12-11 17:12:17 -08:00
Krishna Kumar
e93737b0f0 net: Handle NETREG_UNINITIALIZED devices correctly
Fix two problems:

1. If unregister_netdevice_many() is called with both registered
   and unregistered devices, rollback_registered_many() bails out
   when it reaches the first unregistered device. The processing
   of the prior registered devices is unfinished, and the
   remaining devices are skipped, and possible registered netdev's
   are leaked/unregistered.

2. System hangs or panics depending on how the devices are passed,
   since when netdev_run_todo() runs, some devices were not fully
   processed.

Tested by passing intermingled unregistered and registered vlan
devices to unregister_netdevice_many() as follows:
	1. dev, fake_dev1, fake_dev2: hangs in run_todo
	   ("unregister_netdevice: waiting for eth1.100 to become
	    free. Usage count = 1")
	2. fake_dev1, dev, fake_dev2: failure during de-registration
	   and next registration, followed by a vlan driver Oops
	   during subsequent registration.

Confirmed that the patch fixes both cases.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-11 15:11:45 -08:00
Martin Willi
c20a66f474 xfrm: Fix truncation length of authentication algorithms installed via PF_KEY
Commit 4447bb33f0 ("xfrm: Store aalg in
xfrm_state with a user specified truncation length") breaks
installation of authentication algorithms via PF_KEY, as the state
specific truncation length is not installed with the algorithms
default truncation length.  This patch initializes state properly to
the default if installed via PF_KEY.

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-11 15:07:57 -08:00
Heiko Carstens
de039f02d8 net: use compat helper functions in compat_sys_recvmmsg
Use (get|put)_compat_timespec helper functions to simplify the code.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-11 15:07:57 -08:00
Heiko Carstens
60c2ffd3d2 net: fix compat_sys_recvmmsg parameter type
compat_sys_recvmmsg has a compat_timespec parameter and not a
timespec parameter. This way we also get rid of an odd cast.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-11 15:07:56 -08:00
John W. Linville
65182b9fb0 wireless: update old static regulatory domain rules
Update "US" and "JP" for current rules, and replace "EU" rules with the
world roaming domain (since it was only a pseudo-domain anyway).

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-10 16:21:51 -05:00
Javier Cardona
7b324d28a9 mac80211: Revert 'Use correct sign for mesh active path refresh'
The patch ("mac80211: Use correct sign for mesh active path
refresh.") was actually a bug.  Reverted it and improved the
explanation of how mesh path refresh works.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-10 16:12:52 -05:00
Javier Cardona
5d618cb81a mac80211: Fixed bug in mesh portal paths
Paths to mesh portals were being timed out immediately after each use in
intermediate forwarding nodes.  mppath->exp_time is set to the expiration time
so assigning it to jiffies was marking the path as expired.

Signed-off-by: Javier Cardona <javier@cozybit.com>
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-10 16:09:52 -05:00
Julia Lawall
0c3cee72a4 net/mac80211: Correct size given to memset
Memset should be given the size of the structure, not the size of the pointer.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
type T;
T *x;
expression E;
@@

memset(x, E, sizeof(
+ *
 x))
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-10 16:09:52 -05:00
Linus Torvalds
4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Suresh Jayaraman
053e324f67 rpc: remove unneeded function parameter in gss_add_msg()
The pointer to struct gss_auth parameter in gss_add_msg is not really needed
after commit 5b7ddd4a. Zap it.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-09 16:23:18 -05:00
John W. Linville
19deffbeba wireless: correctly report signal value for IEEE80211_HW_SIGNAL_UNSPEC
This part was missed in "cfg80211: implement get_wireless_stats",
probably because sta_set_sinfo already existed and was only handling
dBm signals.

Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-09 15:10:08 -05:00
Vivek Natarajan
d55fb891f9 cfg80211: Clear encryption privacy when key off is done.
When the current_bss is not set, 'iwconfig <iface> key off' does not
clear the private flag. Hence after we connect with WEP to an AP and
then try to connect with another non-WEP AP, it does not work.
This issue will not be seen if supplicant is used.

Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-09 15:10:08 -05:00
Ilpo Järvinen
77722b177a tcp: fix retrans_stamp advancing in error cases
It can happen, that tcp_retransmit_skb fails due to some error.
In such cases we might end up into a state where tp->retrans_out
is zero but that's only because we removed the TCPCB_SACKED_RETRANS
bit from a segment but couldn't retransmit it because of the error
that happened. Therefore some assumptions that retrans_out checks
are based do not necessarily hold, as there still can be an old
retransmission but that is only visible in TCPCB_EVER_RETRANS bit.
As retransmission happen in sequential order (except for some very
rare corner cases), it's enough to check the head skb for that bit.

Main reason for all this complexity is the fact that connection dying
time now depends on the validity of the retrans_stamp, in particular,
that successive retransmissions of a segment must not advance
retrans_stamp under any conditions. It seems after quick thinking that
this has relatively low impact as eventually TCP will go into CA_Loss
and either use the existing check for !retrans_stamp case or send a
retransmission successfully, setting a new base time for the dying
timer (can happen only once). At worst, the dying time will be
approximately the double of the intented time. In addition,
tcp_packet_delayed() will return wrong result (has some cc aspects
but due to rarity of these errors, it's hardly an issue).

One of retrans_stamp clearing happens indirectly through first going
into CA_Open state and then a later ACK lets the clearing to happen.
Thus tcp_try_keep_open has to be modified too.

Thanks to Damian Lukowski <damian@tvk.rwth-aachen.de> for hinting
that this possibility exists (though the particular case discussed
didn't after all have it happening but was just a debug patch
artifact).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:56:12 -08:00
Damian Lukowski
2f7de5710a tcp: Stalling connections: Move timeout calculation routine
This patch moves retransmits_timed_out() from include/net/tcp.h
to tcp_timer.c, where it is used.

Reported-by: Frederic Leroy <fredo@starox.org>
Signed-off-by: Damian Lukowski <damian@tvk.rwth-aachen.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:56:11 -08:00
chas williams - CONTRACTOR
2e302ebfea atm: [br2684] allow routed mode operation again
in routed mode, we don't have a hardware address so netdev_ops doesnt
need to validate our hardware address via .ndo_validate_addr

Reported-by: Manuel Fuentes <mfuentes@agenciaefe.com>
Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:22:31 -08:00
chas williams - CONTRACTOR
eb0445887a atm: [lec] initialize .netdev_ops before calling register_netdev()
fix oops when initializing lane interfaces. lec should probably be
changed to use alloc_netdev() instead.

Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:22:30 -08:00
Eric Dumazet
2a8875e73f [PATCH] tcp: documents timewait refcnt tricks
Adds kerneldoc for inet_twsk_unhash() & inet_twsk_bind_unhash().

With help from Randy Dunlap.

Suggested-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:19:53 -08:00
Eric Dumazet
3cdaedae63 tcp: Fix a connect() race with timewait sockets
When we find a timewait connection in __inet_hash_connect() and reuse
it for a new connection request, we have a race window, releasing bind
list lock and reacquiring it in __inet_twsk_kill() to remove timewait
socket from list.

Another thread might find the timewait socket we already chose, leading to
list corruption and crashes.

Fix is to remove timewait socket from bind list before releasing the bind lock.

Note: This problem happens if sysctl_tcp_tw_reuse is set.

Reported-by: kapil dakhane <kdakhane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:17:51 -08:00
Eric Dumazet
9327f7053e tcp: Fix a connect() race with timewait sockets
First patch changes __inet_hash_nolisten() and __inet6_hash()
to get a timewait parameter to be able to unhash it from ehash
at same time the new socket is inserted in hash.

This makes sure timewait socket wont be found by a concurrent
writer in __inet_check_established()

Reported-by: kapil dakhane <kdakhane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:17:51 -08:00
David S. Miller
3dc789320e tcp: Remove runtime check that can never be true.
GCC even warns about it, as reported by Andrew Morton:

net/ipv4/tcp.c: In function 'do_tcp_getsockopt':
net/ipv4/tcp.c:2544: warning: comparison is always false due to limited range of data type

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-08 20:07:54 -08:00
David S. Miller
e61444d920 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-12-08 13:44:44 -08:00
Linus Torvalds
a252e749f1 sctp: fix compile error due to sysctl mismerge
I messed up the merge in d7fc02c7ba, where
the conflict in question wasn't just about CTL_UNNUMBERED being removed,
but the 'strategy' field is too (sysctl handling is now done through the
/proc interface, with no duplicate protocols for reading the data).

Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-08 12:51:22 -08:00
Trond Myklebust
e4ee5d4dfd Merge branch 'bugfixes' into nfs-for-next 2009-12-08 14:36:53 -05:00
Roel Kluin
480e3243df SUNRPC: IS_ERR/PTR_ERR confusion
IS_ERR returns 1 or 0, PTR_ERR returns the error value.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-08 13:13:03 -05:00
Linus Torvalds
d7fc02c7ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
  mac80211: fix reorder buffer release
  iwmc3200wifi: Enable wimax core through module parameter
  iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
  iwmc3200wifi: Coex table command does not expect a response
  iwmc3200wifi: Update wiwi priority table
  iwlwifi: driver version track kernel version
  iwlwifi: indicate uCode type when fail dump error/event log
  iwl3945: remove duplicated event logging code
  b43: fix two warnings
  ipw2100: fix rebooting hang with driver loaded
  cfg80211: indent regulatory messages with spaces
  iwmc3200wifi: fix NULL pointer dereference in pmkid update
  mac80211: Fix TX status reporting for injected data frames
  ath9k: enable 2GHz band only if the device supports it
  airo: Fix integer overflow warning
  rt2x00: Fix padding bug on L2PAD devices.
  WE: Fix set events not propagated
  b43legacy: avoid PPC fault during resume
  b43: avoid PPC fault during resume
  tcp: fix a timewait refcnt race
  ...

Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
	kernel/sysctl_check.c
	net/ipv4/sysctl_net_ipv4.c
	net/ipv6/addrconf.c
	net/sctp/sysctl.c
2009-12-08 07:55:01 -08:00
Linus Torvalds
1557d33007 Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
  security/tomoyo: Remove now unnecessary handling of security_sysctl.
  security/tomoyo: Add a special case to handle accesses through the internal proc mount.
  sysctl: Drop & in front of every proc_handler.
  sysctl: Remove CTL_NONE and CTL_UNNUMBERED
  sysctl: kill dead ctl_handler definitions.
  sysctl: Remove the last of the generic binary sysctl support
  sysctl net: Remove unused binary sysctl code
  sysctl security/tomoyo: Don't look at ctl_name
  sysctl arm: Remove binary sysctl support
  sysctl x86: Remove dead binary sysctl support
  sysctl sh: Remove dead binary sysctl support
  sysctl powerpc: Remove dead binary sysctl support
  sysctl ia64: Remove dead binary sysctl support
  sysctl s390: Remove dead sysctl binary support
  sysctl frv: Remove dead binary sysctl support
  sysctl mips/lasat: Remove dead binary sysctl support
  sysctl drivers: Remove dead binary sysctl support
  sysctl crypto: Remove dead binary sysctl support
  sysctl security/keys: Remove dead binary sysctl support
  sysctl kernel: Remove binary sysctl logic
  ...
2009-12-08 07:38:50 -08:00
Vasanthakumar Thiagarajan
1814077fd1 mac80211: Fix bug in computing crc over dynamic IEs in beacon
On a 32-bit machine, BIT() macro does not give the required
bit value if the bit is mroe than 31. In ieee802_11_parse_elems_crc(),
BIT() is suppossed to get the bit value more than 31 (42 (id of ERP_INFO_IE),
37 (CHANNEL_SWITCH_IE), (42), 32 (POWER_CONSTRAINT_IE), 45 (HT_CAP_IE),
61 (HT_INFO_IE)). As we do not get the required bit value for the above
IEs, crc over these IEs are never calculated, so any dynamic change in these
IEs after the association is not really handled on 32-bit platforms.
This patch fixes this issue.

Cc: stable@kernel.org
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-07 16:51:24 -05:00
Andrew Morton
02f7f17930 net/rfkill/core.c: work around gcc-4.0.2 silliness
net/rfkill/core.c: In function 'rfkill_type_show':
net/rfkill/core.c:610: warning: control may reach end of non-void function 'rfkill_get_type_str' being inlined

A gcc bug, but simple enough to squish.

Cc: John W. Linville <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-07 16:51:23 -05:00
Vivek Natarajan
7c3f4bbedc mac80211: Fix dynamic power save for scanning.
Not only ps_sdata but also IEEE80211_CONF_PS is to be considered
before restoring PS in scan_ps_disable(). For instance, when ps_sdata
is set but CONF_PS is not set just because the dynamic timer is still
running, a sw scan leads to setting of CONF_PS in scan_ps_disable
instead of restarting the dynamic PS timer.
Also for the above case, a null data frame is to be sent after
returning to operating channel which was not happening with the
current implementation. This patch fixes this too.

Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Reviewed-by: Kalle Valo <kalle.valo@nokia.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-07 16:51:18 -05:00
Johannes Berg
bc83b68192 mac80211: recalculate idle later in MLME
hwsim testing has revealed that when the MLME
recalculates the idle state of the device, it
sometimes does so before sending the final
deauthentication or disassociation frame. This
patch changes the place where the idle state
is recalculated, but of course driver transmit
is typically asynchronous while configuration
is expected to be synchronous, so it doesn't
fix all possible cases yet.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-07 16:51:18 -05:00
Jiri Kosina
d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
David S. Miller
28b4d5cc17 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Conflicts:
	drivers/net/pcmcia/fmvj18x_cs.c
	drivers/net/pcmcia/nmclan_cs.c
	drivers/net/pcmcia/xirc2ps_cs.c
	drivers/net/wireless/ray_cs.c
2009-12-05 15:22:26 -08:00
Linus Torvalds
d0b093a8b5 Merge branch 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ratelimit: Make suppressed output messages more useful
  printk: Remove ratelimit.h from kernel.h
  ratelimit: Fix/allow use in atomic contexts
  ratelimit: Use per ratelimit context locking
2009-12-05 09:50:22 -08:00
Johannes Berg
d29cecda03 mac80211: fix reorder buffer release
My patch "mac80211: correctly place aMPDU RX reorder code"
uses an skb queue for MPDUs that were released from the
buffer. I intentially didn't initialise and use the skb
queue's spinlock, but in this place forgot that the code
variant that doesn't touch the spinlock is needed.

Thanks to Christian Lamparter for quickly spotting the
bug in the backtrace Reinette reported.

Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Bug-identified-by: Christian Lamparter <chunkeey@googlemail.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-04 14:25:43 -08:00
David S. Miller
8f56874bd7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2009-12-04 13:25:15 -08:00
Kalle Valo
269ac5fd2d cfg80211: indent regulatory messages with spaces
The regulatory messages in syslog look weird:

kernel: cfg80211: Regulatory domain: US
kernel: ^I(start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
kernel: ^I(2402000 KHz - 2472000 KHz @ 40000 KHz), (600 mBi, 2700 mBm)
kernel: ^I(5170000 KHz - 5190000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5190000 KHz - 5210000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5210000 KHz - 5230000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5230000 KHz - 5330000 KHz @ 40000 KHz), (600 mBi, 2300 mBm)
kernel: ^I(5735000 KHz - 5835000 KHz @ 40000 KHz), (600 mBi, 3000 mBm)

Indent them with four spaces instead of the tab character to get prettier
output.

Signed-off-by: Kalle Valo <kalle.valo@nokia.com>
Acked: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-04 14:16:24 -05:00
Jouni Malinen
914828fad0 mac80211: Fix TX status reporting for injected data frames
An earlier optimization on removing unnecessary traffic on cooked
monitor interfaces ("mac80211: reduce the amount of unnecessary traffic
on cooked monitor interfaces ") ended up removing quite a bit more
than just unnecessary traffic. It was not supposed to remove TX status
reporting for injected frames, but ended up doing it by checking the
injected flag in skb->cb only after that field had been cleared with
memset.. Fix this by taking a local copy of the injected flag before
skb->cb is cleared.

This broke user space applications that depend on getting TX status
notifications for injected data frames. For example, STA inactivity
poll from hostapd did not work and ended up kicking out stations even
if they were still present.

Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-04 14:16:23 -05:00
Jean Tourrilhes
1014eb6ec9 WE: Fix set events not propagated
I've just noticed that some events are no longer propagated
for some wireless drivers. Basically, SET request with a extra payload
for driver without commit handler. The fix is pretty simple, see
attached.
	Actually, a few lines below this line, you will see that the
event generation for simple SET (iwpoint-less ?) is done properly,
and this other event generation does not need fixing.

Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2009-12-04 13:30:39 -05:00
André Goddard Rosa
af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
Thadeu Lima de Souza Cascardo
94e2bd6888 tree-wide: fix some typos and punctuation in comments
fix some typos and punctuation in comments

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:48 +01:00
Trond Myklebust
7285f2d2ff Merge branch 'devel' into linux-next 2009-12-03 21:27:36 -05:00
Eric Dumazet
47e1c32306 tcp: fix a timewait refcnt race
After TCP RCU conversion, tw->tw_refcnt should not be set to 1 in
inet_twsk_alloc(). It allows a RCU reader to get this timewait socket,
while we not yet stabilized it.

Only choice we have is to set tw_refcnt to 0 in inet_twsk_alloc(),
then atomic_add() it later, once everything is done.

Location of this atomic_add() is tricky, because we dont want another
writer to find this timewait in ehash, while tw_refcnt is still zero !

Thanks to Kapil Dakhane tests and reports.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 16:17:44 -08:00
Eric Dumazet
13475a30b6 tcp: connect() race with timewait reuse
Its currently possible that several threads issuing a connect() find
the same timewait socket and try to reuse it, leading to list
corruptions.

Condition for bug is that these threads bound their socket on same
address/port of to-be-find timewait socket, and connected to same
target. (SO_REUSEADDR needed)

To fix this problem, we could unhash timewait socket while holding
ehash lock, to make sure lookups/changes will be serialized. Only
first thread finds the timewait socket, other ones find the
established socket and return an EADDRNOTAVAIL error.

This second version takes into account Evgeniy's review and makes sure
inet_twsk_put() is called outside of locked sections.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 16:17:43 -08:00
Eric Dumazet
49d0900787 tcp: diag: Dont report negative values for rx queue
Both netlink and /proc/net/tcp interfaces can report transient
negative values for rx queue.

ss ->
State   Recv-Q Send-Q  Local Address:Port  Peer Address:Port
ESTAB   -6     6       127.0.0.1:45956     127.0.0.1:3333 

netstat ->
tcp   4294967290      6 127.0.0.1:37784  127.0.0.1:3333 ESTABLISHED

This is because we dont lock socket while computing 
tp->rcv_nxt - tp->copied_seq,
and another CPU can update copied_seq before rcv_next in RX path.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 16:06:13 -08:00
Patrick Mullaney
fc4a748966 netdevice: provide common routine for macvlan and vlan operstate management
Provide common routine for the transition of operational state for a leaf
device during a root device transition.

Signed-off-by: Patrick Mullaney <pmullaney@novell.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 15:59:22 -08:00
David S. Miller
a7fca0ccec Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6 2009-12-03 13:51:02 -08:00
David S. Miller
424eff9751 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-12-03 13:23:12 -08:00
Chuck Lever
3a28becc35 SUNRPC: soft connect semantics for UDP
Introduce soft connect behavior for UDP transports.  In this case, a
major timeout returns ETIMEDOUT instead of EIO.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
caabea8a56 SUNRPC: Use soft connect semantics when performing RPC ping
Currently, if a remote RPC service is unreachable, an RPC ping will
hang until the underlying transport connect attempt times out.  A more
desirable behavior might be to have the ping fail immediately so upper
layers can recover appropriately.

In the case of an NFS mount, for instance, this would mean the
mount(2) system call could fail immediately if the server isn't
listening, rather than hanging uninterruptibly for more than 3
minutes.

Change rpc_ping() so that it fails immediately for connection-oriented
transports.  rpc_create() will then fail immediately for such
transports if an RPC ping was requested.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
012da158f6 SUNRPC: Use soft connects for autobinding over TCP
Autobinding is handled by the rpciod process, not in user processes
that are generating regular RPC requests.  Thus autobinding is usually
not affected by signals targetting user processes, such as KILL or
timer expiration events.

In addition, an RPC request generated by a user process that has
RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if
the remote rpcbind service is not available.

For rpcbind queries on connection-oriented transports, let's use the
new soft connect semantic to return control to the user's process
quickly, if the kernel's rpcbind client can't connect to the remote
rpcbind service.

Logic is introduced in call_bind_status() to handle connection errors
that occurred during an asynchronous rpcbind query.  The logic
abandons the rpcbind query if the RPC request has SOFTCONN set, and
retries after a few seconds in the normal case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
2a76b3bfa2 SUNRPC: Use TCP for local rpcbind upcalls
Use TCP with the soft connect semantic for local rpcbind upcalls so
the kernel can detect immediately if the local rpcbind daemon is not
running.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
c526611dd6 SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
The kernel's rpcbind client creates and deletes an rpc_clnt and its
underlying transport socket for every upcall to the local rpcbind
daemon.

When starting a typical NFS server on IPv4 and IPv6, the NFS service
itself does three upcalls (one per version) times two upcalls (one
per transport) times two upcalls (one per address family), making 12,
plus another one for the initial call to unregister previous NFS
services.  Starting the NLM service adds an additional 13 upcalls,
for similar reasons.

(Currently the NFS service doesn't start IPv6 listeners, but it will
soon enough).

Instead, let's create an rpc_clnt for rpcbind upcalls during the
first local rpcbind query, and cache it.  This saves the overhead of
creating and destroying an rpc_clnt and a socket for every upcall.

The new logic also prevents the kernel from attempting an RPCB_SET or
RPCB_UNSET if it knows from the start that the local portmapper does
not support rpcbind protocol version 4.  This will cut down on the
number of rpcbind upcalls in legacy environments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
5a46211540 SUNRPC: Simplify synopsis of rpcb_local_clnt()
Clean up: At one point, rpcb_local_clnt() handled IPv6 loopback
addresses too, but it doesn't any more; only IPv4 loopback is used
now.  Get rid of the @addr and @addrlen arguments to
rpcb_local_clnt().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
09a21c4102 SUNRPC: Allow RPCs to fail quickly if the server is unreachable
The kernel sometimes makes RPC calls to services that aren't running.
Because the kernel's RPC client always assumes the hard retry semantic
when reconnecting a connection-oriented RPC transport, the underlying
reconnect logic takes a long while to time out, even though the remote
may have responded immediately with ECONNREFUSED.

In certain cases, like upcalls to our local rpcbind daemon, or for NFS
mount requests, we'd like the kernel to fail immediately if the remote
service isn't reachable.  This allows another transport to be tried
immediately, or the pending request can be abandoned quickly.

Introduce a per-request flag which controls how call_transmit_status()
behaves when request transmission fails because the server cannot be
reached.

We don't want soft connection semantics to apply to other errors.  The
default case of the switch statement in call_transmit_status() no
longer falls through; the fall through code is copied to the default
case, and a "break;" is added.

The transport's connection re-establishment timeout is also ignored for
such requests.  We want the request to fail immediately, so the
reconnect delay is skipped.  Additionally, we don't want a connect
failure here to further increase the reconnect timeout value, since
this request will not be retried.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
206a134b4d SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status()
The success case, where task->tk_status == 0, is by far the most
frequent case in call_transmit_status().

The default: arm of the switch statement in call_transmit_status()
handles the 0 case.  default: was moved close to the top of the switch
statement in call_transmit_status() under the theory that the compiler
places object code for the earliest arms of a switch statement first,
making the CPU do less work.

The default: arm of a switch statement, however, is executed only
after all the other cases have been checked.  Even if the compiler
rearranges the object code, the default: arm is the "last resort",
meaning all of the other cases have been explicitly exhausted.  That
makes the current arrangement about as inefficient as it gets for the
common case.

To fix this, add an explicit check for zero before the switch
statement.  That forces the compiler to do the zero check first, no
matter what optimizations it might try to do to the switch statement.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Chuck Lever
dd1fd90fe6 SUNRPC: Display compressed (shorthand) IPv6 presentation addresses
Recent changes to snprintf() introduced the %pI6c formatter, which can
display an IPv6 address with standard shorthanding.  Using a
shorthanded address can save us a few bytes of memory for each stored
presentation address, or a few bytes on the wire when sending these in
a universal address.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:58:56 -05:00
Eric W. Biederman
b099ce2602 net: Batch inet_twsk_purge
This function walks the whole hashtable so there is no point in
passing it a network namespace.  Instead I purge all timewait
sockets from dead network namespaces that I find.  If the namespace
is one of the once I am trying to purge I am guaranteed no new timewait
sockets can be formed so this will get them all.  If the namespace
is one I am not acting for it might form a few more but I will
call inet_twsk_purge again and  shortly to get rid of them.  In
any even if the network namespace is dead timewait sockets are
useless.

Move the calls of inet_twsk_purge into batch_exit routines so
that if I am killing a bunch of namespaces at once I will just
call inet_twsk_purge once and save a lot of redundant unnecessary
work.

My simple 4k network namespace exit test the cleanup time dropped from
roughly 8.2s to 1.6s.  While the time spent running inet_twsk_purge fell
to about 2ms.  1ms for ipv4 and 1ms for ipv6.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:23:47 -08:00
Eric W. Biederman
575f4cd5a5 net: Use rcu lookups in inet_twsk_purge.
While we are looking up entries to free there is no reason to take
the lock in inet_twsk_purge.  We have to drop locks and restart
occassionally anyway so adding a few more in case we get on the
wrong list because of a timewait move is no big deal.  At the
same time not taking the lock for long periods of time is much
more polite to the rest of the users of the hash table.

In my test configuration of killing 4k network namespaces
this change causes 4k back to back runs of inet_twsk_purge on an
empty hash table to go from roughly 20.7s to 3.3s, and the total
time to destroy 4k network namespaces goes from roughly 44s to
3.3s.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:23:47 -08:00
Eric W. Biederman
e9c5158ac2 net: Allow fib_rule_unregister to batch
Refactor the code so fib_rules_register always takes a template instead
of the actual fib_rules_ops structure that will be used.  This is
required for network namespace support so 2 out of the 3 callers already
do this, it allows the error handling to be made common, and it allows
fib_rules_unregister to free the template for hte caller.

Modify fib_rules_unregister to use call_rcu instead of syncrhonize_rcu
to allw multiple namespaces to be cleaned up in the same rcu grace
period.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:22:55 -08:00
Eric W. Biederman
3a765edadb netns: Add an explicit rcu_barrier to unregister_pernet_{device|subsys}
This allows namespace exit methods to batch work that comes requires an
rcu barrier using call_rcu without having to treat the
unregister_pernet_operations cases specially.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:22:03 -08:00
Eric W. Biederman
d79d792ef9 net: Allow xfrm_user_net_exit to batch efficiently.
xfrm.nlsk is provided by the xfrm_user module and is access via rcu from
other parts of the xfrm code.  Add xfrm.nlsk_stash a copy of xfrm.nlsk that
will never be set to NULL.  This allows the synchronize_net and
netlink_kernel_release to be deferred until a whole batch of xfrm.nlsk sockets
have been set to NULL.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:22:03 -08:00
Eric W. Biederman
04dc7f6be3 net: Move network device exit batching
Move network device exit batching from a special case in
net_namespace.c to using common mechanisms in dev.c

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:22:02 -08:00
Eric W. Biederman
72ad937abd net: Add support for batching network namespace cleanups
- Add exit_list to struct net to support building lists of network
  namespaces to cleanup.

- Add exit_batch to pernet_operations to allow running operations only
  once during a network namespace exit.  Instead of once per network
  namespace.

- Factor opt ops_exit_list and ops_exit_free so the logic with cleanup
  up a network namespace does not need to be duplicated.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:22:01 -08:00
Patrick McHardy
8153a10c08 ipv4 05/05: add sysctl to accept packets with local source addresses
commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:16:35 2009 +0100

    ipv4: add sysctl to accept packets with local source addresses

    Change fib_validate_source() to accept packets with a local source address when
    the "accept_local" sysctl is set for the incoming inet device. Combined with the
    previous patches, this allows to communicate between multiple local interfaces
    over the wire.

    Signed-off-by: Patrick McHardy <kaber@trash.net>

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:14:38 -08:00
Patrick McHardy
5adef18091 net 04/05: fib_rules: allow to delete local rule
commit d124356ce314fff22a047ea334379d5105b2d834
Author: Patrick McHardy <kaber@trash.net>
Date:   Thu Dec 3 12:16:35 2009 +0100

    net: fib_rules: allow to delete local rule

    Allow to delete the local rule and recreate it with a higher priority. This
    can be used to force packets with a local destination out on the wire instead
    of routing them to loopback. Additionally this patch allows to recreate rules
    with a priority of 0.

    Combined with the previous patch to allow oif classification, a socket can
    be bound to the desired interface and packets routed to the wire like this:

    # move local rule to lower priority
    ip rule add pref 1000 lookup local
    ip rule del pref 0

    # route packets of sockets bound to eth0 to the wire independant
    # of the destination address
    ip rule add pref 100 oif eth0 lookup 100
    ip route add default dev eth0 table 100

    Signed-off-by: Patrick McHardy <kaber@trash.net>

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-12-03 12:14:37 -08:00