Add basic netlink support to the Ethernet bridge. Including:
* dump interfaces in bridges
* monitor link status changes
* change state of bridge port
For some demo programs see:
http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz
These are to allow building a daemon that does alternative
implementations of Spanning Tree Protocol.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return address in use, if some other kernel code has the SAP.
Propogate out error codes from netfilter registration and unwind.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small optimizations of bridge forwarding path.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several bugs in error handling in br_add_bridge:
- when dev_alloc_name fails, allocated net_device is not freed
- unregister_netdev is called when rtnl lock is held
- free_netdev is called before netdev_run_todo has a chance to be run after
unregistering net_device
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge will OOPS on removal if other application has the SAP open.
The bridge SAP might be shared with other usages, so need
to do reference counting on module removal rather than explicit
close/delete.
Since packet might arrive after or during removal, need to clear
the receive function handle, so LLC only hands it to user (if any).
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The prefix argument for nf_log_packet is a format specifier,
so don't pass the user defined string directly to it.
Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that netdevice sysfs registration is done as part of
register_netdevice; bridge code no longer has to be tricky when adding
it's kobjects to bridges.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It makes sense to add this simple statistic to keep track of received
multicast packets.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to allow for VLAN header when bridging.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make all the vmalloc calls in net/bridge/netfilter/ebtables.c follow
the standard convention. Remove unnecessary casts, and use '*object'
instead of 'type'.
Signed-off-by: Jayachandran C. <c.jayachandran@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate an array of 'struct ebt_chainstack *', the current code allocates
array of 'struct ebt_chainstack'.
akpm: converted to use the
foo = alloc(sizeof(*foo))
form. Which would have prevented this from happening in the first place.
akpm: also removed unneeded typecast.
akpm: what on earth is this code doing anyway? cpu_possible_map can be
sparse..
Signed-off-by: Jayachandran C. <c.jayachandran@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change allows link local packets (like 802.3ad and Spanning Tree
Protocol) to be processed even when the bridge is not using the port.
It fixes the chicken-egg problem for bridging a bonded device, and
may also fix problems with spanning tree failover.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu under /net
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every netfilter module uses `init' for its module_init() function and
`fini' or `cleanup' for its module_exit() function.
Problem is, this creates uninformative initcall_debug output and makes
ctags rather useless.
So go through and rename them all to $(filename)_init and
$(filename)_fini.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I see lots of
kernel unaligned access to 0xa0000001009dbb6f, ip=0xa000000100811591
kernel unaligned access to 0xa0000001009dbb6b, ip=0xa0000001008115c1
kernel unaligned access to 0xa0000001009dbb6d, ip=0xa0000001008115f1
messages in my logs on IA64 when using the ethernet bridge with 2.6.16.
Appended is a patch to fix them.
Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge code can use existing LLC output code when building
spanning tree protocol packets.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge's communicate with each other using Spanning Tree Protocol
over a standard multicast address. There are times when testing or
layering bridges over existing topologies or tunnels, when it is
useful to use alternative multicast addresses for STP packets.
The 802.1d standard has some unused addresses, that can be used for this.
This patch is restrictive in that it only allows one of the possible
addresses in the standard.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use LLC for the receive path of Spanning Tree Protocol packets.
This allows link local multicast packets to be received by
other protocols (if they care), and uses the existing LLC
code to get STP packets back into bridge code.
The bridge multicast address is also checked, so bridges using
other link local multicast addresses are ignored. This allows
for use of different multicast addresses to define separate STP
domains.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup the get/set of bridge timer value in the packets.
It is clearer not to bury the conversion in macro.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Optimize the forwarding and transmit paths. Both places are
called with bottom half/no preempt so there is no need to use
spin_lock_bh or rcu_read_lock.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move nf_bridge_alloc from header file to the one place it is
used and optimize it.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the VLAN macros in bridge netfilter code. Macros should
not depend on magic variables.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only use__constant_htons() for initializers and switch cases.
For other uses, it is just as efficient and clearer to use htons
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Run br_netfilter through Lindent to fix whitespace.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netfilter hook that is used to receive frames doesn't need to be a
stub. It is only called in two ways, both of which ignore the return
value.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use kzalloc versus kmalloc+memset. Also don't need to do
memset() of bridge address since it is in netdev private data
that is already zero'd in alloc_netdev.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The STP timers run off softirq (kernel timers), so there is no need to
disable bottom half in the spin locks.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bridge/br_netfilter.c: In function `br_nf_pre_routing':
net/bridge/br_netfilter.c:427: warning: unused variable `vhdr'
net/bridge/br_netfilter.c:445: warning: unused variable `vhdr'
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer from integer without a cast
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We're now starting to have quite a number of places that do skb_pull
followed immediately by an skb_postpull_rcsum. We can merge these two
operations into one function with skb_pull_rcsum. This makes sense
since most pull operations on receive skb's need to update the
checksum.
I've decided to make this out-of-line since it is fairly big and the
fast path where hardware checksums are enabled need to call
csum_partial anyway.
Since this is a brand new function we get to add an extra check on the
len argument. As it is most callers of skb_pull ignore its return
value which essentially means that there is no check on the len
argument.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Gregor Maier <gregor@net.in.tum.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The earlier round of kobject/sysfs changes to bridge caused
it not to generate a uevent on removal. Don't think any application
cares (not sure about Xen) but since it generates add uevent
it should generate remove as well.
Signed-off-by: Stephen Hemminger <shemmigner@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initialize the STP timers for a port when it is created,
rather than when it is enabled. This will prevent future race conditions
where timer gets started before port is enabled.
Signed-off-by: Stephen Hemminger <shemmigner@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge would crash because of uninitailized timer if STP is used and
device was inserted into a bridge before bridge was up. This got
introduced when the delayed port checking was added. Fix is to not
enable STP on port unless bridge is up.
Bugzilla: http://bugzilla.kernel.org/show_bug.cgi?id=6140
Dup: http://bugzilla.kernel.org/show_bug.cgi?id=6156
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nfnetlink_log infrastructure changes broke compatiblity of the LOG
targets. They currently use whatever log backend was registered first,
which means that if ipt_ULOG was loaded first, no messages will be printed
to the ring buffer anymore.
Restore compatiblity by using the old log functions by default and only use
the nf_log backend if the user explicitly said so.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge-netfilter code attaches a fake dst_entry with dst->ops == NULL
to purely bridged packets. When these packets are SNATed and a policy
lookup is done, xfrm_lookup crashes because it tries to dereference
dst->ops.
Change xfrm_lookup not to dereference dst->ops before checking for the
DST_NOXFRM flag and set this flag in the fake dst_entry.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like somebody forgot to use the _bh spin_lock variant. We ran into a
deadlock where br->hello_timer expired while br_stp_disable_br() walked
br->port_list.
Signed-off-by: Adrian Drzewiecki <z@drze.net>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horms patch was the best of the three fixes. Dave, already applied
Harald's version, so this patch converts that to the better one.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bridge/br_netfilter.c: In function `br_nf_post_routing':
net/bridge/br_netfilter.c:808: warning: implicit declaration of function `has_bridge_parent'
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Cc: Harald Welte <laforge@netfilter.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Refactor how the bridge code interacts with kobject system.
It should still use kobjects even if not using sysfs.
Fix the error unwind handling in br_add_if.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bridge netfilter code needs to handle the case where device is
removed from bridge while packet in process. In these cases the
bridge_parent can become null while processing.
This should fix: http://bugzilla.kernel.org/show_bug.cgi?id=5803
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change Bridge receive path to correctly handle RCU removal of device
from bridge. Also fixes deadlock between carrier_check and del_nbp.
This replaces the previous deleted flag fix.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
netfilter's do_replace() can overflow on addition within SMP_ALIGN()
and/or on multiplication by NR_CPUS, resulting in a buffer overflow on
the copy_from_user(). In practice, the overflow on addition is
triggerable on all systems, whereas the multiplication one might require
much physical memory to be present due to the check above. Either is
sufficient to overwrite arbitrary amounts of kernel memory.
I really hate adding the same check to all 4 versions of do_replace(),
but the code is duplicate...
Found by Solar Designer during security audit of OpenVZ.org
Signed-Off-By: Kirill Korotaev <dev@openvz.org>
Signed-Off-By: Solar Designer <solar@openwall.com>
Signed-off-by: Patrck McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb allocated is always of size nlbufsize, even if that is smaller than
the size needed for the current packet.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Performance tests showed that ULOG may fail on heavy loaded systems
because of failed order-N allocations (N >= 1).
The default value of 4096 is not optimal in the sense that it actually
allocates _two_ contigous physical pages. Reasoning: ULOG uses
alloc_skb(), which adds another ~300 bytes for skb_shared_info.
This patch sets the default value to NLMSG_GOODSIZE and adds some
documentation at the top.
Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>