Commit Graph

10524 Commits

Author SHA1 Message Date
David S. Miller
ab55570d64 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-10-14 23:19:16 -07:00
Alexey Dobriyan
eef9d90dcd netns: correct mib stats in ip6_route_me_harder()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 22:55:21 -07:00
Alexey Dobriyan
4ef079ccc1 netns: fix net_generic array leak
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 22:54:48 -07:00
Johannes Berg
4233df6b74 ath9k/mac80211: disallow fragmentation in ath9k, report to userspace
As I've reported, ath9k currently fails utterly when fragmentation
is enabled. This makes ath9k "support" hardware fragmentation by
not supporting fragmentation at all to avoid the double-free issue.
The patch also changes mac80211 to report errors from the driver
operation to userspace.

That hack in ath9k should be removed once the rate control algorithm
it has is fixed, and we can at that time consider removing the hw
fragmentation support entirely since it's not used by any driver.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org
Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 21:12:37 -04:00
Jouni Malinen
d048e503a2 mac80211: Fix scan RX processing oops
ieee80211_bss_info_update() can return NULL. Verify that this is not the
case before calling ieee802111_rx_bss_put() which would trigger an oops
in interrupt context in atomic_dec_and_lock().

Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Benoit Papillault <benoit.papillault@free.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 21:08:11 -04:00
Johannes Berg
33c0360bf7 cfg80211: fix debugfs error handling
If something goes wrong creating the debugfs dir or when
debugfs is not compiled in, the current code might lead to
trouble; make it more robust.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:48:25 -04:00
Johannes Berg
c74e90a9e3 mac80211: fix debugfs netdev rename
If, for some reason, a netdev has no debugfs dir, we shouldn't
try to rename that dir.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:48:00 -04:00
Johannes Berg
09914813da mac80211: fix HT information element parsing
There's no checking that the HT IEs are of the right length
which can be used by an attacker to cause an out-of-bounds
access by sending a too short HT information/capability IE.
Fix it by simply pretending those IEs didn't exist when too
short.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:47:15 -04:00
Johannes Berg
63044e9f54 mac80211: fix debugfs lockup
When debugfs_create_dir fails, sta_info_debugfs_add_work will not
terminate because it will find the same station again and again.
This is possible whenever debugfs fails for whatever reason; one
reason is a race condition in mac80211, unfortunately we cannot
do much about it, so just document it, it just means some station
may be missing from debugfs.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-14 20:46:41 -04:00
Linus Torvalds
e413b210c5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (55 commits)
  HID: build drivers for all quirky devices by default
  HID: add missing blacklist entry for Apple ATV ircontrol
  HID: add support for Bright ABNT2 brazilian device
  HID: Don't let Avermedia Radio FM800 be handled by usb hid drivers
  HID: fix numlock led on Dell device 0x413c/0x2105
  HID: remove warn() macro from usb hid drivers
  HID: remove info() macro from usb HID drivers
  HID: add appletv IR receiver quirk
  HID: fix a lockup regression when using force feedback on a PID device
  HID: hiddev.h: Fix example code.
  HID: hiddev.h: Fix mixed space and tabs in example code.
  HID: convert to dev_* prints
  HID: remove hid-ff
  HID: move zeroplus FF processing
  HID: move thrustmaster FF processing
  HID: move pantherlord FF processing
  HID: fix incorrent length condition in hidraw_write()
  HID: fix tty<->hid deadlock
  HID: ignore iBuddy devices
  HID: report descriptor fix for remaining MacBook JIS keyboards
  ...
2008-10-14 16:35:43 -07:00
Jiri Slaby
93c10132a7 HID: move connect quirks
Move connecting from usbhid to the hid layer and fix also hidp in
that manner.
This removes all the ignore/force hidinput/hiddev connecting quirks.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:56 +02:00
Jiri Slaby
8c19a51591 HID: move apple quirks
Move them from the core code to a separate driver.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:49 +02:00
Jiri Slaby
d458a9dfc4 HID: move ignore quirks
Move ignore quirks from usbhid-quirks into hid-core code. Also don't output
warning when ENODEV is error code in usbhid and try ordinal input in hidp
when that error is returned.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:49 +02:00
Jiri Slaby
c500c97140 HID: hid, make parsing event driven
Next step for complete hid bus, this patch includes:
- call parser either from probe or from hid-core if there is no probe.
- add ll_driver structure and centralize some stuff there (open, close...)
- split and merge usb_hid_configure and hid_probe into several functions
  to allow hooks/fixes between them

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:48 +02:00
Jiri Slaby
85cdaf524b HID: make a bus from hid code
Make a bus from hid core. This is the first step for converting all the
quirks and separate almost-drivers into real drivers attached to this bus.

It's implemented to change behaviour in very tiny manner, so that no driver
needs to be changed this time.

Also add generic drivers for both usb and bt into usbhid or hidp
respectively which will bind all non-blacklisted device. Those blacklisted
will be either grabbed by special drivers or by nobody if they are broken at
the very rude base.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-10-14 23:50:48 +02:00
Linus Torvalds
8acd3a60bc Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits)
  svcrdma: Fix IRD/ORD polarity
  svcrdma: Update svc_rdma_send_error to use DMA LKEY
  svcrdma: Modify the RPC reply path to use FRMR when available
  svcrdma: Modify the RPC recv path to use FRMR when available
  svcrdma: Add support to svc_rdma_send to handle chained WR
  svcrdma: Modify post recv path to use local dma key
  svcrdma: Add a service to register a Fast Reg MR with the device
  svcrdma: Query device for Fast Reg support during connection setup
  svcrdma: Add FRMR get/put services
  NLM: Remove unused argument from svc_addsock() function
  NLM: Remove "proto" argument from lockd_up()
  NLM: Always start both UDP and TCP listeners
  lockd: Remove unused fields in the nlm_reboot structure
  lockd: Add helper to sanity check incoming NOTIFY requests
  lockd: change nlmclnt_grant() to take a "struct sockaddr *"
  lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses
  lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET
  lockd: Support non-AF_INET addresses in nlm_lookup_host()
  NLM: Convert nlm_lookup_host() to use a single argument
  svcrdma: Add Fast Reg MR Data Types
  ...
2008-10-14 12:31:14 -07:00
Pablo Neira Ayuso
e6a7d3c04f netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat
This patch removes the module dependency between ctnetlink and
nf_nat by means of an indirect call that is initialized when
nf_nat is loaded. Now, nf_conntrack_netlink only requires
nf_conntrack and nfnetlink.

This patch puts nfnetlink_parse_nat_setup_hook into the
nf_conntrack_core to avoid dependencies between ctnetlink,
nf_conntrack_ipv4 and nf_conntrack_ipv6.

This patch also introduces the function ctnetlink_change_nat
that is only invoked from the creation path. Actually, the
nat handling cannot be invoked from the update path since
this is not allowed. By introducing this function, we remove
the useless nat handling in the update path and we avoid
deadlock-prone code.

This patch also adds the required EAGAIN logic for nfnetlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:58:31 -07:00
Patrick McHardy
129404a1f1 netfilter: fix ebtables dependencies
Ingo Molnar reported a build error with ebtables:

ERROR: "ebt_register_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_do_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_unregister_table" [net/bridge/netfilter/ebtable_filter.ko] undefined!
ERROR: "ebt_register_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
ERROR: "ebt_do_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
ERROR: "ebt_unregister_table" [net/bridge/netfilter/ebtable_broute.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

This reason is a missing dependencies that got lost during Kconfig cleanups.
Restore it.

Tested-by: Ingo Molnar <mingo@elte.hu>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:57:33 -07:00
Patrick McHardy
38f7ac3eb7 netfilter: restore lost #ifdef guarding defrag exception
Nir Tzachar <nir.tzachar@gmail.com> reported a warning when sending
fragments over loopback with NAT:

[ 6658.338121] WARNING: at net/ipv4/netfilter/nf_nat_standalone.c:89 nf_nat_fn+0x33/0x155()

The reason is that defragmentation is skipped for already tracked connections.
This is wrong in combination with NAT and ip_conntrack actually had some ifdefs
to avoid this behaviour when NAT is compiled in.

The entire "optimization" may seem a bit silly, for now simply restoring the
lost #ifdef is the easiest solution until we can come up with something better.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14 11:56:59 -07:00
Linus Torvalds
43096597a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  qlge: Fix page size ifdef test.
  net: Rationalise email address: Network Specific Parts
  dsa: fix compile bug on s390
  netns: mib6 section fixlet
  enic: Fix Kconfig headline description
  de2104x: wrong MAC address fix
  s390: claw compile fixlet
  net: export genphy_restart_aneg
  cxgb3: extend copyrights to 2008
  cxgb3: update driver version
  net/phy: add missing kernel-doc
  pktgen: fix skb leak in case of failure
  mISDN/dsp_cmx.c: fix size checks
  misdn: use nonseekable_open()
  net: fix driver build errors due to missing net/ip6_checksum.h include
2008-10-14 10:28:49 -07:00
Geert Uytterhoeven
56f26f7b78 net/rfkill/rfkill-input.c needs <linux/sched.h>
For some m68k configs, I get:

| net/rfkill/rfkill-input.c: In function 'rfkill_start':
| net/rfkill/rfkill-input.c:208: error: dereferencing pointer to incomplete type

As the incomplete type is `struct task_struct', including <linux/sched.h> fixes
it.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-14 10:23:27 -07:00
Alan Cox
113aa838ec net: Rationalise email address: Network Specific Parts
Clean up the various different email addresses of mine listed in the code
to a single current and valid address. As Dave says his network merges
for 2.6.28 are now done this seems a good point to send them in where
they won't risk disrupting real changes.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 19:01:08 -07:00
Heiko Carstens
510149e319 dsa: fix compile bug on s390
git commit 45cec1bac0
"dsa: Need to select PHYLIB." causes this build bug on s390:

drivers/built-in.o: In function `phy_stop_interrupts':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:631: undefined reference to `free_irq'
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:646: undefined reference to `enable_irq'
drivers/built-in.o: In function `phy_start_interrupts':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:601: undefined reference to `request_irq'
drivers/built-in.o: In function `phy_interrupt':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:528: undefined reference to `disable_irq_nosync'
drivers/built-in.o: In function `phy_change':
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:674: undefined reference to `enable_irq'
/home/heicarst/linux-2.6/drivers/net/phy/phy.c:692: undefined reference to `disable_irq'

PHYLIB has alread a depend on !S390, however select PHYLIB at DSA overrides
that unfortunately. So add a depend on !S390 to DSA as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:58:48 -07:00
Alexey Dobriyan
e7dc849494 netns: mib6 section fixlet
LD      net/ipv6/ipv6.o
WARNING: net/ipv6/ipv6.o(.text+0xd8): Section mismatch in reference from the function inet6_net_init() to the function .init.text:ipv6_init_mibs()

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:54:07 -07:00
Ilpo Järvinen
b4bb4ac8cb pktgen: fix skb leak in case of failure
Seems that skb goes into void unless something magic happened
in pskb_expand_head in case of failure.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:43:59 -07:00
Steven Whitehouse
a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
Linus Torvalds
fffdedef69 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net/mac80211/rx.c: fix build error
  acpi: Make ACPI_TOSHIBA depend on INPUT.
  net/bfin_mac.c MDIO namespace fixes
  jme: remove unused #include <version.h>
  netfilter: remove unused #include <version.h>
  net: Fix off-by-one in skb_dma_map
  smc911x: Add support for LAN921{5,7,8} chips from SMSC
  qlge: remove duplicated #include
  wireless: remove duplicated #include
  net/au1000_eth.c MDIO namespace fixes
  net/tc35815.c: fix compilation
  sky2: Fix WOL regression
  r8169: NULL pointer dereference on r8169 load
2008-10-13 10:08:08 -07:00
Ingo Molnar
bf94e17bc8 net/mac80211/rx.c: fix build error
older versions of gcc do not recognize that ieee80211_rx_h_mesh_fwding()
is unused when CONFIG_MAC80211_MESH is disabled:

  net/built-in.o: In function `ieee80211_rx_h_mesh_fwding':
  rx.c:(.text+0xd89af): undefined reference to `mpp_path_lookup'
  rx.c:(.text+0xd89c6): undefined reference to `mpp_path_add'

as this code construct:

        if (ieee80211_vif_is_mesh(&sdata->vif))
                CALL_RXH(ieee80211_rx_h_mesh_fwding);

still causes ieee80211_rx_h_mesh_fwding() to be linked in.

Protect these places with an #ifdef.

commit b0dee578 ("Fix modpost failure when rx handlers are not inlined.")
solved part of this problem - this patch is still needed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 23:51:38 -07:00
Huang Weiyi
14717f811b netfilter: remove unused #include <version.h>
The file(s) below do not use LINUX_VERSION_CODE nor KERNEL_VERSION.
  net/netfilter/nf_tproxy_core.c

This patch removes the said #include <version.h>.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:08:34 -07:00
Dimitris Michailidis
ab396eb03f net: Fix off-by-one in skb_dma_map
The unwind loop iterates down to -1 instead of stopping at 0 and ends up
accessing ->frags[-1].

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:07:34 -07:00
Huang Weiyi
ff71268aa4 wireless: remove duplicated #include
Removed duplicated include <linux/list.h> in 
net/wireless/core.c.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-12 21:03:38 -07:00
James Morris
93db628658 Merge branch 'next' into for-linus 2008-10-13 09:35:14 +11:00
Herbert Xu
7bb82d9245 gre: Initialise rtnl_link tunnel parameters properly
Brown paper bag error of calling memset with sizeof(p) instead
of sizeof(*p).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-11 12:20:15 -07:00
David S. Miller
f901b64472 ipvs: Add proper dependencies on IP_VS, and fix description header line.
Linus noted a build failure case:

net/netfilter/ipvs/ip_vs_xmit.c: In function 'ip_vs_tunnel_xmit':
net/netfilter/ipvs/ip_vs_xmit.c:616: error: implicit declaration of function 'ip_select_ident'

The proper include file (net/ip.h) is being included in ip_vs_xmit.c to get
that declaration.  So the only possible case where this can happen is if
CONFIG_INET is not enabled.

This seems to be purely a missing dependency in the ipvs/Kconfig file IP_VS
entry.

Also, while we're here, remove the out of date "EXPERIMENTAL" string in the
IP_VS config help header line.  IP_VS no longer depends upon CONFIG_EXPERIMENTAL

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-11 12:18:04 -07:00
Tobias Brunner
1839faab9a af_key: fix SADB_X_SPDDELETE response
When deleting an SPD entry using SADB_X_SPDDELETE, c.data.byid is not
initialized to zero in pfkey_spddelete(). Thus, key_notify_policy()
responds with a PF_KEY message of type SADB_X_SPDDELETE2 instead of
SADB_X_SPDDELETE.

Signed-off-by: Tobias Brunner <tobias.brunner@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 14:07:03 -07:00
Tom Talpey
c055551e97 RPC/RDMA: ensure connection attempt is complete before signalling.
The RPC/RDMA connection logic could return early from reconnection
attempts, leading to additional spurious retries.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:15:06 -04:00
Tom Talpey
08ca0dce1e RPC/RDMA: correct the reconnect timer backoff
The RPC/RDMA code had a constant 5-second reconnect backoff, and
always performed it, even when re-establishing a connection to a
server after the RPC layer closed it due to being idle. Make it
an geometric backoff (up to 30 seconds), and don't delay idle
reconnect.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:15:02 -04:00
Tom Talpey
b3cd8d45a7 RPC/RDMA: optionally emit useful transport info upon connect/disconnect.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:13:59 -04:00
Tom Talpey
5f37d561e0 RPC/RDMA: reformat a debug printk to keep lines together.
The send marshaling code split a particular dprintk across two
lines, which makes it hard to extract from logfiles.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:13:42 -04:00
Tom Talpey
5675add36e RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls.
Add defensive timeouts to wait_for_completion() calls in RDMA
address resolution, and make them interruptible. Fix the timeout
units to milliseconds (formerly jiffies) and move to private header.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:13:31 -04:00
Tom Talpey
1a954051b0 RPC/RDMA: fix connect/reconnect resource leak.
The RPC/RDMA code can leak RDMA connection manager endpoints in
certain error cases on connect. Don't signal unwanted events,
and be certain to destroy any allocated qp.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:13:02 -04:00
Tom Talpey
926449ba66 RPC/RDMA: return a consistent error, when connect fails.
The xprt_connect call path does not expect such errors as ECONNREFUSED
to be returned from failed transport connection attempts, otherwise it
translates them to EIO and signals fatal errors. For example, mount.nfs
prints simply "internal error". Translate all such errors to ENOTCONN
from RPC/RDMA to match sockets behavior.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:12:44 -04:00
Tom Talpey
9191ca3b38 RPC/RDMA: adhere to protocol for unpadded client trailing write chunks.
The RPC/RDMA protocol allows clients and servers to avoid RDMA
operations for data which is purely the result of XDR padding.
On the client, automatically insert the necessary padding for
such server replies, and optionally don't marshal such chunks.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:12:33 -04:00
Tom Talpey
fee08caf94 RPC/RDMA: avoid an oops due to disconnect racing with async upcalls.
RDMA disconnects yield an upcall from the RDMA connection manager,
which can race with rpc transport close, e.g. on ^C of a mount.
Ensure any rdma cm_id and qp are fully destroyed before continuing.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:11:40 -04:00
Patrick McHardy
4d74f8ba1f gre: minor cleanups in netlink interface
- use typeful helpers for IFLA_GRE_LOCAL/IFLA_GRE_REMOTE
- replace magic value by FIELD_SIZEOF
- use MODULE_ALIAS_RTNL_LINK macro

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 12:11:06 -07:00
Tom Talpey
ad0e9e01da RPC/RDMA: maintain the RPC task bytes-sent statistic.
Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:10:40 -04:00
Tom Talpey
575448bd36 RPC/RDMA: suppress retransmit on RPC/RDMA clients.
An RPC/RDMA client cannot retransmit on an unbroken connection,
doing so violates its flow control with the server.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:10:36 -04:00
Patrick McHardy
ba9e64b1c2 gre: fix copy and paste error
The flags are dumped twice, the keys not at all.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 12:10:30 -07:00
Tom Tucker
b334eaabf4 RPC/RDMA: fix connection IRD/ORD setting
This logic sets the connection parameter that configures the local device
and informs the remote peer how many concurrent incoming RDMA_READ
requests are supported. The original logic didn't really do what was
intended for two reasons:

- The max number supported by the device is typically smaller than
any one factor in the calculation used, and

- The field in the connection parameter structure where the value is
stored is a u8 and always overflows for the default settings.

So what really happens is the value requested for responder resources
is the left over 8 bits from the "desired value". If the desired value
happened to be a multiple of 256, the result was zero and it wouldn't
connect at all.

Given the above and the fact that max_requests is almost always larger
than the max responder resources supported by the adapter, this patch
simplifies this logic and simply requests the max supported by the device,
subject to a reasonable limit.

This bug was found by Jim Schutt at Sandia.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:56 -04:00
Tom Talpey
3197d309f5 RPC/RDMA: support FRMR client memory registration.
Configure, detect and use "fastreg" support from IB/iWARP verbs
layer to perform RPC/RDMA memory registration.

Make FRMR the default memreg mode (will fall back if not supported
by the selected RDMA adapter).

This allows full and optimal operation over the cxgb3 adapter, and others.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:34 -04:00
Tom Talpey
bd7ed1d133 RPC/RDMA: check selected memory registration mode at runtime.
At transport creation, check for, and use, any local dma lkey.
Then, check that the selected memory registration mode is in fact
supported by the RDMA adapter selected for the mount. Fall back
to best alternative if not.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:26 -04:00
Tom Talpey
fe9053b30b RPC/RDMA: add data types and new FRMR memory registration enum.
Internal RPC/RDMA structure updates in preparation for FRMR support.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:24 -04:00
Tom Talpey
8d4ba0347c RPC/RDMA: refactor the inline memory registration code.
Refactor the memory registration and deregistration routines.
This saves stack space, makes the code more readable and prepares
to add the new FRMR registration methods.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-10 15:09:16 -04:00
Paul Moore
d91d407991 netlabel: Add configuration support for local labeling
Add the necessary NetLabel support for the new CIPSO mapping,
CIPSO_V4_MAP_LOCAL, which allows full LSM label/context support.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:34 -04:00
Paul Moore
15c45f7b2e cipso: Add support for native local labeling and fixup mapping names
This patch accomplishes three minor tasks: add a new tag type for local
labeling, rename the CIPSO_V4_MAP_STD define to CIPSO_V4_MAP_TRANS and
replace some of the CIPSO "magic numbers" with constants from the header
file.  The first change allows CIPSO to support full LSM labels/contexts,
not just MLS attributes.  The second change brings the mapping names inline
with what userspace is using, compatibility is preserved since we don't
actually change the value.  The last change is to aid readability and help
prevent mistakes.

Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-10-10 10:16:34 -04:00
Paul Moore
014ab19a69 selinux: Set socket NetLabel based on connection endpoint
Previous work enabled the use of address based NetLabel selectors, which while
highly useful, brought the potential for additional per-packet overhead when
used.  This patch attempts to solve that by applying NetLabel socket labels
when sockets are connect()'d.  This should alleviate the per-packet NetLabel
labeling for all connected sockets (yes, it even works for connected DGRAM
sockets).

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:33 -04:00
Paul Moore
948bf85c1b netlabel: Add functionality to set the security attributes of a packet
This patch builds upon the new NetLabel address selector functionality by
providing the NetLabel KAPI and CIPSO engine support needed to enable the
new packet-based labeling.  The only new addition to the NetLabel KAPI at
this point is shown below:

 * int netlbl_skbuff_setattr(skb, family, secattr)

... and is designed to be called from a Netfilter hook after the packet's
IP header has been populated such as in the FORWARD or LOCAL_OUT hooks.

This patch also provides the necessary SELinux hooks to support this new
functionality.  Smack support is not currently included due to uncertainty
regarding the permissions needed to expand the Smack network access controls.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:32 -04:00
Paul Moore
63c4168874 netlabel: Add network address selectors to the NetLabel/LSM domain mapping
This patch extends the NetLabel traffic labeling capabilities to individual
packets based not only on the LSM domain but the by the destination address
as well.  The changes here only affect the core NetLabel infrastructre,
changes to the NetLabel KAPI and individial protocol engines are also
required but are split out into a different patch to ease review.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:32 -04:00
Paul Moore
61e1068219 netlabel: Add a generic way to create ordered linked lists of network addrs
Create an ordered IP address linked list mechanism similar to the core
kernel's linked list construct.  The idea behind this list functionality
is to create an extensibile linked list ordered by IP address mask to
ease the matching of network addresses.  The linked list is ordered with
larger address masks at the front of the list and shorter address masks
at the end to facilitate overriding network entries with individual host
or subnet entries.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:32 -04:00
Paul Moore
b1edeb1023 netlabel: Replace protocol/NetLabel linking with refrerence counts
NetLabel has always had a list of backpointers in the CIPSO DOI definition
structure which pointed to the NetLabel LSM domain mapping structures which
referenced the CIPSO DOI struct.  The rationale for this was that when an
administrator removed a CIPSO DOI from the system all of the associated
NetLabel LSM domain mappings should be removed as well; a list of
backpointers made this a simple operation.

Unfortunately, while the backpointers did make the removal easier they were
a bit of a mess from an implementation point of view which was making
further development difficult.  Since the removal of a CIPSO DOI is a
realtively rare event it seems to make sense to remove this backpointer
list as the optimization was hurting us more then it was helping.  However,
we still need to be able to track when a CIPSO DOI definition is being used
so replace the backpointer list with a reference count.  In order to
preserve the current functionality of removing the associated LSM domain
mappings when a CIPSO DOI is removed we walk the LSM domain mapping table,
removing the relevant entries.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:31 -04:00
Paul Moore
dfaebe9825 selinux: Fix missing calls to netlbl_skbuff_err()
At some point I think I messed up and dropped the calls to netlbl_skbuff_err()
which are necessary for CIPSO to send error notifications to remote systems.
This patch re-introduces the error handling calls into the SELinux code.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:31 -04:00
Paul Moore
948a72438d netlabel: Remove unneeded in-kernel API functions
After some discussions with the Smack folks, well just Casey, I now have a
better idea of what Smack wants out of NetLabel in the future so I think it
is now safe to do some API "pruning".  If another LSM comes along that
needs this functionality we can always add it back in, but I don't see any
LSMs on the horizon which might make use of these functions.

Thanks to Rami Rosen who suggested removing netlbl_cfg_cipsov4_del() back
in February 2008.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Reviewed-by: James Morris <jmorris@namei.org>
2008-10-10 10:16:30 -04:00
Paul Moore
561967010e netlabel: Fix some sparse warnings
Fix a few sparse warnings.  One dealt with a RCU lock being held on error,
another dealt with an improper type caused by a signed/unsigned mixup while
the rest appeared to be caused by using rcu_dereference() in a
list_for_each_entry_rcu() call.  The latter probably isn't a big deal, but
I derive a certain pleasure from knowing that the net/netlabel is nice and
clean.

Thanks to James Morris for pointing out the issues and demonstrating how
to run sparse.

Signed-off-by: Paul Moore <paul.moore@hp.com>
2008-10-10 10:16:29 -04:00
Guo-Fu Tseng
fa3e5b4eb8 tcpv6: fix error with CONFIG_TCP_MD5SIG disabled
This patch fix error with CONFIG_TCP_MD5SIG disabled.

Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 21:11:56 -07:00
Eric Dumazet
f24d43c07e udp: complete port availability checking
While looking at UDP port randomization, I noticed it
was litle bit pessimistic, not looking at type of sockets
(IPV6/IPV4) and not looking at bound addresses if any.

We should perform same tests than when binding to a
specific port.

This permits a cleanup of udp_lib_get_port()

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:51:27 -07:00
Ilpo Järvinen
626e264dd1 tcpv6: combine tcp_v6_send_(reset|ack)
$ codiff tcp_ipv6.o.old tcp_ipv6.o.new
net/ipv6/tcp_ipv6.c:
  tcp_v6_md5_hash_hdr | -144
  tcp_v6_send_ack     | -585
  tcp_v6_send_reset   | -540
 3 functions changed, 1269 bytes removed, diff: -1269

net/ipv6/tcp_ipv6.c:
  tcp_v6_send_response | +791
 1 function changed, 791 bytes added, diff: +791

tcp_ipv6.o.new:
 4 functions changed, 791 bytes added, 1269 bytes removed, diff: -478

I choose to leave the reset related netns comment in place (not
the one that is killed) as I cannot understand its English so
it's a bit hard for me to evaluate its usefulness :-).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:42:40 -07:00
Ilpo Järvinen
81ada62d70 tcpv6: convert opt[] -> topt in tcp_v6_send_reset
after this I get:

$ diff-funcs tcp_v6_send_reset tcp_ipv6.c tcp_ipv6.c tcp_v6_send_ack
 --- tcp_ipv6.c:tcp_v6_send_reset()
 +++ tcp_ipv6.c:tcp_v6_send_ack()
@@ -1,4 +1,5 @@
-static void tcp_v6_send_reset(struct sock *sk, struct sk_buff *skb)
+static void tcp_v6_send_ack(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
u32 ts,
+                           struct tcp_md5sig_key *key)
 {
        struct tcphdr *th = tcp_hdr(skb), *t1;
        struct sk_buff *buff;
@@ -7,31 +8,14 @@
        struct sock *ctl_sk = net->ipv6.tcp_sk;
        unsigned int tot_len = sizeof(struct tcphdr);
        __be32 *topt;
-#ifdef CONFIG_TCP_MD5SIG
-       struct tcp_md5sig_key *key;
-#endif
-
-       if (th->rst)
-               return;
-
-       if (!ipv6_unicast_destination(skb))
-               return;

+       if (ts)
+               tot_len += TCPOLEN_TSTAMP_ALIGNED;
 #ifdef CONFIG_TCP_MD5SIG
-       if (sk)
-               key = tcp_v6_md5_do_lookup(sk, &ipv6_hdr(skb)->daddr);
-       else
-               key = NULL;
-
        if (key)
                tot_len += TCPOLEN_MD5SIG_ALIGNED;
 #endif

-       /*
-        * We need to grab some memory, and put together an RST,
-        * and then put it into the queue to be sent.
-        */
-
        buff = alloc_skb(MAX_HEADER + sizeof(struct ipv6hdr) + tot_len,
                         GFP_ATOMIC);
        if (buff == NULL)
@@ -46,18 +30,20 @@
        t1->dest = th->source;
        t1->source = th->dest;
        t1->doff = tot_len / 4;
-       t1->rst = 1;
-
-       if(th->ack) {
-               t1->seq = th->ack_seq;
-       } else {
-               t1->ack = 1;
-               t1->ack_seq = htonl(ntohl(th->seq) + th->syn + th->fin
-                                   + skb->len - (th->doff<<2));
-       }
+       t1->seq = htonl(seq);
+       t1->ack_seq = htonl(ack);
+       t1->ack = 1;
+       t1->window = htons(win);

        topt = (__be32 *)(t1 + 1);

+       if (ts) {
+               *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
+                               (TCPOPT_TIMESTAMP << 8) |
TCPOLEN_TIMESTAMP);
+               *topt++ = htonl(tcp_time_stamp);
+               *topt++ = htonl(ts);
+       }
+
 #ifdef CONFIG_TCP_MD5SIG
        if (key) {
                *topt++ = htonl((TCPOPT_NOP << 24) | (TCPOPT_NOP << 16) |
@@ -84,15 +70,10 @@
        fl.fl_ip_sport = t1->source;
        security_skb_classify_flow(skb, &fl);

-       /* Pass a socket to ip6_dst_lookup either it is for RST
-        * Underlying function will use this to retrieve the network
-        * namespace
-        */
        if (!ip6_dst_lookup(ctl_sk, &buff->dst, &fl)) {
                if (xfrm_lookup(&buff->dst, &fl, NULL, 0) >= 0) {
                        ip6_xmit(ctl_sk, buff, &fl, NULL, 0);
                        TCP_INC_STATS_BH(net, TCP_MIB_OUTSEGS);
-                       TCP_INC_STATS_BH(net, TCP_MIB_OUTRSTS);
                        return;
                }
        }


...which starts to be trivial to combine.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:42:01 -07:00
Ilpo Järvinen
77c676da1b tcpv6: trivial formatting changes to send_(ack|reset)
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:41:38 -07:00
Ilpo Järvinen
78e645cb89 tcpv[46]: fix md5 pseudoheader address field ordering
Maybe it's just me but I guess those md5 people made a mess
out of it by having *_md5_hash_* to use daddr, saddr order
instead of the one that is natural (and equal to what csum
functions use). For the segment were sending, the original
addresses are reversed so buff's saddr == skb's daddr and
vice-versa.

Maybe I can finally proceed with unification of some code
after fixing it first... :-)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:37:47 -07:00
Vlad Yasevich
a1080a8b0b sctp: update SNMP statiscts when T5 timer expired.
The T5 timer is the timer for the over-all shutdown procedure.  If
this timer expires, then shutdown procedure has not completed and we
ABORT the association.  We should update SCTP_MIB_ABORTED and
SCTP_MIB_CURRESTAB  when aborting.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:33:26 -07:00
Vlad Yasevich
56eb82bb8d sctp: Fix SNMP number of SCTP_MIB_ABORTED during violation handling.
If ABORT chunks require authentication and a protocol violation
is triggered, we do not tear down the association.  Subsequently,
we should not increment SCTP_MIB_ABORTED.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:33:01 -07:00
Wei Yongjun
3d5a019d57 sctp: Fix the SNMP number of SCTP_MIB_CURRESTAB
RFC3873 defined SCTP_MIB_CURRESTAB:
  sctpCurrEstab OBJECT-TYPE
    SYNTAX         Gauge32
    MAX-ACCESS     read-only
    STATUS         current
    DESCRIPTION
         "The number of associations for which the current state is
         either ESTABLISHED, SHUTDOWN-RECEIVED or SHUTDOWN-PENDING."
    REFERENCE
         "Section 4 in RFC2960 covers the SCTP   Association state
         diagram."

If the T4 RTO timer expires many times(timeout), the association will enter
CLOSED state, so we should dec the number of SCTP_MIB_CURRESTAB, not inc the
number of SCTP_MIB_CURRESTAB.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 14:32:24 -07:00
Herbert Xu
64194c31a0 inet: Make tunnel RX/TX byte counters more consistent
This patch makes the RX/TX byte counters for IPIP, GRE and SIT more
consistent.  Previously we included the external IP headers on the
way out but not when the packet is inbound.

The new scheme is to count payload only in both directions.  For
IPIP and SIT this simply means the exclusion of the external IP
header.  For GRE this means that we exclude the GRE header as
well.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 12:03:17 -07:00
Herbert Xu
e1a8000228 gre: Add Transparent Ethernet Bridging
This patch adds support for Ethernet over GRE encapsulation.
This is exposed to user-space with a new link type of "gretap"
instead of "gre".  It will create an ARPHRD_ETHER device in
lieu of the usual ARPHRD_IPGRE.

Note that to preserver backwards compatibility all Transparent
Ethernet Bridging packets are passed to an ARPHRD_IPGRE tunnel
if its key matches and there is no ARPHRD_ETHER device whose
key matches more closely.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 12:00:17 -07:00
Herbert Xu
c19e654ddb gre: Add netlink interface
This patch adds a netlink interface that will eventually displace
the existing ioctl interface.  It utilises the elegant rtnl_link_ops
mechanism.

This also means that user-space no longer needs to rely on the
tunnel interface being of type GRE to identify GRE tunnels.  The
identification can now occur using rtnl_link_ops.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 11:59:55 -07:00
Herbert Xu
42aa916265 gre: Move MTU setting out of ipgre_tunnel_bind_dev
This patch moves the dev->mtu setting out of ipgre_tunnel_bind_dev.
This is in prepartion of using rtnl_link where we'll need to make
the MTU setting conditional on whether the user has supplied an
MTU.  This also requires the move of the ipgre_tunnel_bind_dev
call out of the dev->init function so that we can access the user
parameters later.

This patch also adds a check to prevent setting the MTU below
the minimum of 68.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 11:59:32 -07:00
Herbert Xu
c95b819ad7 gre: Use needed_headroom
Now that we have dev->needed_headroom, we can use it instead of
having a bogus dev->hard_header_len.  This also allows us to
include dev->hard_header_len in the MTU computation so that when
we do have a meaningful hard_harder_len in future it is included
automatically in figuring out the MTU.

Incidentally, this fixes a bug where we ignored the needed_headroom
field of the underlying device in calculating our own hard_header_len.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-09 11:58:54 -07:00
David S. Miller
45cec1bac0 dsa: Need to select PHYLIB.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:33:01 -07:00
Lennert Buytenhek
2e16a77e1e dsa: add support for the Marvell 88E6060 switch chip
Add support for the Marvell 88E6060 switch chip.  This chip only
supports the Header and Trailer tagging formats, and we use it in
Trailer mode since that mode is slightly easier to handle than
Header mode.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Byron Bradley <byron.bbradley@gmail.com>
Tested-by: Tim Ellis <tim.ellis@mac.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:24:22 -07:00
Lennert Buytenhek
396138f03f dsa: add support for Trailer tagging format
This adds support for the Trailer switch tagging format.  This is
another tagging that doesn't explicitly mark tagged packets with a
distinct ethertype, so that we need to add a similar hack in the
receive path as for the Original DSA tagging format.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Byron Bradley <byron.bbradley@gmail.com>
Tested-by: Tim Ellis <tim.ellis@mac.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:24:16 -07:00
Lennert Buytenhek
2e5f032095 dsa: add support for the Marvell 88E6131 switch chip
Add support for the Marvell 88E6131 switch chip.  This chip only
supports the original (ethertype-less) DSA tagging format.

On the 88E6131, there is a PHY Polling Unit (PPU) which has exclusive
access to each of the PHYs's MII management registers.  If we want to
talk to the PHYs from software, we have to disable the PPU and wait
for it to complete its current transaction before we can do so, and we
need to re-enable the PPU afterwards to make sure that the switch will
notice changes in link state and speed on the individual ports as they
occur.

Since disabling the PPU is rather slow, and since MII management
accesses are typically done in bursts, this patch keeps the PPU disabled
for 10ms after a software access completes.  This makes handling the
PPU slightly more complex, but speeds up something like running ethtool
on one of the switch slave interfaces from ~300ms to ~30ms on typical
hardware.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Nicolas Pitre <nico@marvell.com>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:24:09 -07:00
Lennert Buytenhek
cf85d08fdf dsa: add support for original DSA tagging format
Most of the DSA switches currently in the field do not support the
Ethertype DSA tagging format that one of the previous patches added
support for, but only the original DSA tagging format.

The original DSA tagging format carries the same information as the
Ethertype DSA tagging format, but with the difference that it does not
have an ethertype field.  In other words, when receiving a packet that
is tagged with an original DSA tag, there is no way of telling in
eth_type_trans() that this packet is in fact a DSA-tagged packet.

This patch adds a hook into eth_type_trans() which is only compiled in
if support for a switch chip that doesn't support Ethertype DSA is
selected, and which checks whether there is a DSA switch driver
instance attached to this network device which uses the old tag format.
If so, it sets the protocol field to ETH_P_DSA without looking at the
packet, so that the packet ends up in the right place.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Nicolas Pitre <nico@marvell.com>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:19:56 -07:00
Lennert Buytenhek
91da11f870 net: Distributed Switch Architecture protocol support
Distributed Switch Architecture is a protocol for managing hardware
switch chips.  It consists of a set of MII management registers and
commands to configure the switch, and an ethernet header format to
signal which of the ports of the switch a packet was received from
or is intended to be sent to.

The switches that this driver supports are typically embedded in
access points and routers, and a typical setup with a DSA switch
looks something like this:

	+-----------+       +-----------+
	|           | RGMII |           |
	|           +-------+           +------ 1000baseT MDI ("WAN")
	|           |       |  6-port   +------ 1000baseT MDI ("LAN1")
	|    CPU    |       |  ethernet +------ 1000baseT MDI ("LAN2")
	|           |MIImgmt|  switch   +------ 1000baseT MDI ("LAN3")
	|           +-------+  w/5 PHYs +------ 1000baseT MDI ("LAN4")
	|           |       |           |
	+-----------+       +-----------+

The switch driver presents each port on the switch as a separate
network interface to Linux, polls the switch to maintain software
link state of those ports, forwards MII management interface
accesses to those network interfaces (e.g. as done by ethtool) to
the switch, and exposes the switch's hardware statistics counters
via the appropriate Linux kernel interfaces.

This initial patch supports the MII management interface register
layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and
supports the "Ethertype DSA" packet tagging format.

(There is no officially registered ethertype for the Ethertype DSA
packet format, so we just grab a random one.  The ethertype to use
is programmed into the switch, and the switch driver uses the value
of ETH_P_EDSA for this, so this define can be changed at any time in
the future if the one we chose is allocated to another protocol or
if Ethertype DSA gets its own officially registered ethertype, and
everything will continue to work.)

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Nicolas Pitre <nico@marvell.com>
Tested-by: Byron Bradley <byron.bbradley@gmail.com>
Tested-by: Tim Ellis <tim.ellis@mac.com>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:15:19 -07:00
J. Bruce Fields
107e0008df Merge branch 'from-tomtucker' into for-2.6.28 2008-10-08 18:22:18 -04:00
David S. Miller
4dd565134e Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/e1000e/ich8lan.c
	drivers/net/e1000e/netdev.c
2008-10-08 14:56:41 -07:00
Sven Wegener
071d7ab664 ipvs: Remove stray file left over from ipvs move
Commit cb7f6a7b71 ("IPVS: Move IPVS to
net/netfilter/ipvs") has left a stray file in the old location of ipvs.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 14:41:35 -07:00
Ilpo Järvinen
53b125779f tcpv6: fix option space offsets with md5
More breakage :-), part of timestamps just were previously
overwritten.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 14:36:33 -07:00
David S. Miller
db2bf2476b Merge branch 'lvs-next-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6
Conflicts:

	net/netfilter/Kconfig
2008-10-08 14:26:36 -07:00
Vlad Yasevich
02015180e2 sctp: shrink sctp_tsnmap some more by removing gabs array
The gabs array in the sctp_tsnmap structure is only used
in one place, sctp_make_sack().  As such, carrying the
array around in the sctp_tsnmap and thus directly in
the sctp_association is rather pointless since most
of the time it's just taking up space.  Now, let
sctp_make_sack create and populate it and then throw
it away when it's done.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 14:19:01 -07:00
Vlad Yasevich
8e1ee18c33 sctp: Rework the tsn map to use generic bitmap.
The tsn map currently use is 4K large and is stuck inside
the sctp_association structure making memory references REALLY
expensive.  What we really need is at most 4K worth of bits
so the biggest map we would have is 512 bytes.   Also, the
map is only really usefull when we have gaps to store and
report.  As such, starting with minimal map of say 32 TSNs (bits)
should be enough for normal low-loss operations.  We can grow
the map by some multiple of 32 along with some extra room any
time we receive the TSN which would put us outside of the map
boundry.  As we close gaps, we can shift the map to rebase
it on the latest TSN we've seen.  This saves 4088 bytes per
association just in the map alone along savings from the now
unnecessary structure members.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 14:18:39 -07:00
Eric Dumazet
3c689b7320 inet: cleanup of local_port_range
I noticed sysctl_local_port_range[] and its associated seqlock
sysctl_local_port_range_lock were on separate cache lines.
Moreover, sysctl_local_port_range[] was close to unrelated
variables, highly modified, leading to cache misses.

Moving these two variables in a structure can help data
locality and moving this structure to read_mostly section
helps sharing of this data among cpus.

Cleanup of extern declarations (moved in include file where
they belong), and use of inet_get_local_port_range()
accessor instead of direct access to ports values.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 14:18:04 -07:00
Eric Dumazet
9088c56095 udp: Improve port randomization
Current UDP port allocation is suboptimal.
We select the shortest chain to chose a port (out of 512)
that will hash in this shortest chain.

First, it can lead to give not so ramdom ports and ease
give attackers more opportunities to break the system.

Second, it can consume a lot of CPU to scan all table
in order to find the shortest chain.

Third, in some pathological cases we can fail to find
a free port even if they are plenty of them.

This patch zap the search for a short chain and only
use one random seed. Problem of getting long chains
should be addressed in another way, since we can
obtain long chains with non random ports.

Based on a report and patch from Vitaly Mayatskikh

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:44:17 -07:00
Jarek Poplawski
53e9150349 pkt_sched: Update qdisc requeue stats in dev_requeue_skb()
After the last change of requeuing there is no info about such
incidents in tc stats. This patch updates the counter, but we should
consider this should differ from previous stats because of additional
checks preventing to repeat this. On the other hand, previous stats
didn't include requeuing of gso_segmented skbs.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:36:22 -07:00
Ilpo Järvinen
52cd5750e8 tcp: fix length used for checksum in a reset
While looking for some common code I came across difference
in checksum calculation between tcp_v6_send_(reset|ack) I
couldn't explain. I checked both v4 and v6 and found out that
both seem to have the same "feature". I couldn't find anything
in rfc nor anywhere else which would state that md5 option
should be ignored like it was in case of reset so I came to
a conclusion that this is probably a genuine bug. I suspect
that addition of md5 just was fooled by the excessive
copy-paste code in those functions and the reset part was
never tested well enough to find out the problem.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:34:06 -07:00
Denis V. Lunev
2ca89cea5c ipv6: remove unused not init_ipv6_mibs/cleanup_ipv6_mibs
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:17:07 -07:00
Denis V. Lunev
9261e53701 ipv6: making ip and icmp statistics per/namespace
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:16:45 -07:00
Denis V. Lunev
55d43808eb ipv6: added net argument to ICMP6MSGIN_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:15:46 -07:00
Denis V. Lunev
5a57d4c7fd ipv6: added net argument to ICMP6MSGOUT_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:15:05 -07:00
Denis V. Lunev
5c5d244bd3 ipv6: added net argument to ICMP6MSGOUT_INC_STATS
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:14:44 -07:00
Denis V. Lunev
e41b5368e0 ipv6: added net argument to ICMP6_INC_STATS_BH
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 11:14:13 -07:00