Do not need to use read_lock(&dev_base_lock), use RCU instead.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds an RCU macro for continuing search, useful for some
network devices like vlan.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change udp6_portaddr_hash() to use ipv6_addr_v4mapped()
inline instead of ipv6_addr_type().
Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use for_each_netdev_rcu() and dont lock dev_base_lock anymore
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch rearranges the SIT DF bit handling using the new IPIP DF
code. The only externally visible effect should be the case where
PMTU is enabled and the MTU is exactly 1280 bytes. In this case the
previous code would send packets out with DF off while the new code
would set the DF bit. This is inline with RFC 4213.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently, inet6_dump_addr() is not able to handle more than
64 ipv6 addresses per device. We must break from inner loops
in case skb is full, or else cursor is put at the end of list.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When handling large number of netdevice, inet6_dump_ifinfo()
is very slow because it has O(N^2) complexity.
Instead of scanning one single list, we can use the 256 sub lists
of the dev_index hash table, and RCU lookups.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use guard DECLARE_SOCKADDR in a few more places which allow
us to catch if the structure copied back is too big.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some usbnet drivers update link state while others do not due to
hardware limitations. Add a flag to distinguish those that do, and
set the link down initially for their devices.
This is intended to fix this bug: http://bugs.debian.org/444043
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use DMA_BIT_MASK(44) instead of deprecated DMA_44BIT_MASK
Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
UDP bind() can be O(N^2) in some pathological cases.
Thanks to secondary hash tables, we can make it O(N)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Failing to allocate MSI-X vectors is not an error and should not be
printed as such
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Making sure that whenever the FW/HW is configured for GSO, it is also
configured to CSUM offload
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace run-time string formatting with preprocessor string
manipulation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace run-time string formatting with preprocessor string
manipulation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the following bug in the current implementation of
net/xfrm: SAD entries timeouts do not count the time spent by the machine
in the suspended state. This leads to the connectivity problems because
after resuming local machine thinks that the SAD entry is still valid, while
it has already been expired on the remote server.
The cause of this is very simple: the timeouts in the net/xfrm are bound to
the old mod_timer() timers. This patch reassigns them to the
CLOCK_REALTIME hrtimer.
I have been using this version of the patch for a few months on my
machines without any problems. Also run a few stress tests w/o any
issues.
This version of the patch uses tasklet_hrtimer by Peter Zijlstra
(commit 9ba5f0).
This patch is against 2.6.31.4. Please CC me.
Signed-off-by: Yury Polyanskiy <polyanskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds compat_ioctl support for SIOCWANDEV, which has
always been missing.
The definition of struct compat_ifreq was missing an
ifru_settings fields that is needed to support SIOCWANDEV,
so add that and clean up the whitespace damage in the
struct definition.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
SIOCGMIIPHY and SIOCGMIIREG return data through ifreq,
so it needs to be converted on the way out as well.
SIOCGIFPFLAGS is unused, but has the same problem in theory.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When skb_clone() fails, we should increment sk_drops and SNMP counters.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6 UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.
Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
UDP multicast rx path is a bit complex and can hold a spinlock
for a long time.
Using a small (32 or 64 entries) stack of socket pointers can help
to perform expensive operations (skb_clone(), udp_queue_rcv_skb())
outside of the lock, in most cases.
It's also a base for a future RCU conversion of multicast recption.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Lucian Adrian Grijincu <lgrijincu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.
If too many sockets are listed, we take a look at the secondary
(port, address) hash chain.
We choose the shortest chain and proceed with a RCU lookup on the elected chain.
But, if we chose (port, address) chain, and fail to find a socket on given address,
we must try another lookup on (port, in6addr_any) chain to find sockets not bound
to a particular IP.
-> No extra cost for typical setups, where the first lookup will probabbly
be performed.
RCU lookups everywhere, we dont acquire spinlock.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We first locate the (local port) hash chain head
If few sockets are in this chain, we proceed with previous lookup algo.
If too many sockets are listed, we take a look at the secondary
(port, address) hash chain we added in previous patch.
We choose the shortest chain and proceed with a RCU lookup on the elected chain.
But, if we chose (port, address) chain, and fail to find a socket on given address,
we must try another lookup on (port, INADDR_ANY) chain to find socket not bound
to a particular IP.
-> No extra cost for typical setups, where the first lookup will probabbly
be performed.
RCU lookups everywhere, we dont acquire spinlock.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extends udp_table to contain a secondary hash table.
socket anchor for this second hash is free, because UDP
doesnt use skc_bind_node : We define an union to hold
both skc_bind_node & a new hlist_nulls_node udp_portaddr_node
udp_lib_get_port() inserts sockets into second hash chain
(additional cost of one atomic op)
udp_lib_unhash() deletes socket from second hash chain
(additional cost of one atomic op)
Note : No spinlock lockdep annotation is needed, because
lock for the secondary hash chain is always get after
lock for primary hash chain.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Union sk_hash with two u16 hashes for udp (no extra memory taken)
One 16 bits hash on (local port) value (the previous udp 'hash')
One 16 bits hash on (local address, local port) values, initialized
but not yet used. This second hash is using jenkin hash for better
distribution.
Because the 'port' is xored later, a partial hash is performed
on local address + net_hash_mix(net)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a counter in udp_hslot to keep an accurate count
of sockets present in chain.
This will permit to upcoming UDP lookup algo to chose
the shortest chain when secondary hash is added.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Christian Pellegrin <chripell@fsfe.org>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no good reason to not support userspace specifying the
network namespace during device creation, and it makes it easier
to create a network device and pass it to a child network namespace
with a well known name.
We have to be careful to ensure that the target network namespace
for the new device exists through the life of the call. To keep
that logic clear I have factored out the network namespace grabbing
logic into rtnl_link_get_net.
In addtion we need to continue to pass the source network namespace
to the rtnl_link_ops.newlink method so that we can find the base
device source network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
or it will taint the kernel and fail to load becuase
of_address_to_resource() is GPL only.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
On older kernels, e.g. 2.6.27, a WARN_ON dump in rtmsg_ifinfo()
is thrown when the CAN device is registered due to insufficient
skb space, as reported by various users. This patch adds the
rtnl_link_ops "get_size" to fix the problem. I think this patch
is required for more recent kernels as well, even if no WARN_ON
dumps are triggered. Maybe we also need "get_xstats_size" for
the CAN xstats.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
atalk_sum_partial can now use the rol16 function in bitops.h
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using RCU helps not touching device refcount in rawv6_bind()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit fba4ed030c ("gianfar: Add Multiple
Queue Support") introduced the following warnings:
CHECK gianfar.c
gianfar.c:333:8: warning: incorrect type in assignment (different address spaces)
gianfar.c:333:8: expected unsigned int [usertype] *baddr
gianfar.c:333:8: got unsigned int [noderef] <asn:2>*<noident>
[... 67 lines skipped ...]
gianfar.c:2565:3: warning: incorrect type in argument 1 (different type sizes)
gianfar.c:2565:3: expected unsigned long const *addr
gianfar.c:2565:3: got unsigned int *<noident>
CC gianfar.o
gianfar.c: In function 'gfar_probe':
gianfar.c:985: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:985: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:993: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:993: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c: In function 'gfar_configure_coalescing':
gianfar.c:1680: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:1680: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:1688: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:1688: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c: In function 'gfar_poll':
gianfar.c:2565: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:2565: warning: passing argument 1 of 'find_next_bit' from incompatible pointer type
gianfar.c:2566: warning: passing argument 2 of 'test_bit' from incompatible pointer type
gianfar.c:2585: warning: passing argument 2 of 'set_bit' from incompatible pointer type
Following warnings left unfixed (looks like sparse doesn't like
locks in loops, so __acquires/__releases() doesn't help):
gianfar.c:441:40: warning: context imbalance in 'lock_rx_qs': wrong count at exit
gianfar.c:441:40: context '<noident>': wanted 0, got 1
gianfar.c:449:40: warning: context imbalance in 'lock_tx_qs': wrong count at exit
gianfar.c:449:40: context '<noident>': wanted 0, got 1
gianfar.c:458:3: warning: context imbalance in 'unlock_rx_qs': __context__ statement expected different context
gianfar.c:458:3: context '<noident>': wanted >= 0, got -1
gianfar.c:466:3: warning: context imbalance in 'unlock_tx_qs': __context__ statement expected different context
gianfar.c:466:3: context '<noident>': wanted >= 0, got -1
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes following warnings:
fsl_pq_mdio.c:112:38: warning: cast adds address space to expression (<asn:2>)
fsl_pq_mdio.c:124:38: warning: cast adds address space to expression (<asn:2>)
fsl_pq_mdio.c:133:38: warning: cast adds address space to expression (<asn:2>)
fsl_pq_mdio.c:414:11: warning: cast adds address space to expression (<asn:2>)
Instead of adding __force all over the place, introduce convenient
fsl_pq_mdio_get_regs() call that does the ugly casting just once.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 1d2397d742 ("fsl_pq_mdio: Add
Suport for etsec2.0 devices") introduced the following warnings:
CHECK fsl_pq_mdio.c
fsl_pq_mdio.c:287:22: warning: incorrect type in initializer (different base types)
fsl_pq_mdio.c:287:22: expected unknown type 11 const *__mptr
fsl_pq_mdio.c:287:22: got unsigned long long [unsigned] [assigned] [usertype] addr
fsl_pq_mdio.c:287:19: warning: incorrect type in assignment (different base types)
fsl_pq_mdio.c:287:19: expected unsigned long long [unsigned] [usertype] ioremap_miimcfg
fsl_pq_mdio.c:287:19: got struct fsl_pq_mdio *<noident>
CC fsl_pq_mdio.o
fsl_pq_mdio.c: In function 'fsl_pq_mdio_probe':
fsl_pq_mdio.c:287: warning: initialization makes pointer from integer without a cast
fsl_pq_mdio.c:287: warning: assignment makes integer from pointer without a cast
These warnings are not easy to fix without ugly __force casts. So,
instead of introducing the casts, rework the code to substitute an
offset from an already mapped area. This makes the code a lot simpler
and less duplicated.
Plus, from now on we don't actually map reserved registers on
non-etsec2.0 devices, so we have more chances to catch programming
errors.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>