2b85a34e91
One of the problem with sock memory accounting is it uses a pair of sock_hold()/sock_put() for each transmitted packet. This slows down bidirectional flows because the receive path also needs to take a refcount on socket and might use a different cpu than transmit path or transmit completion path. So these two atomic operations also trigger cache line bounces. We can see this in tx or tx/rx workloads (media gateways for example), where sock_wfree() can be in top five functions in profiles. We use this sock_hold()/sock_put() so that sock freeing is delayed until all tx packets are completed. As we also update sk_wmem_alloc, we could offset sk_wmem_alloc by one unit at init time, until sk_free() is called. Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc) to decrement initial offset and atomicaly check if any packets are in flight. skb_set_owner_w() doesnt call sock_hold() anymore sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc reached 0 to perform the final freeing. Drawback is that a skb->truesize error could lead to unfreeable sockets, or even worse, prematurely calling __sk_free() on a live socket. Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt contention point. 5 % speedup on a UDP transmit workload (depends on number of flows), lowering TX completion cpu usage. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
---|---|---|
.. | ||
netfilter | ||
addrconf_core.c | ||
addrconf.c | ||
addrlabel.c | ||
af_inet6.c | ||
ah6.c | ||
anycast.c | ||
datagram.c | ||
esp6.c | ||
exthdrs_core.c | ||
exthdrs.c | ||
fib6_rules.c | ||
icmp.c | ||
inet6_connection_sock.c | ||
inet6_hashtables.c | ||
ip6_fib.c | ||
ip6_flowlabel.c | ||
ip6_input.c | ||
ip6_output.c | ||
ip6_tunnel.c | ||
ip6mr.c | ||
ipcomp6.c | ||
ipv6_sockglue.c | ||
Kconfig | ||
Makefile | ||
mcast.c | ||
mip6.c | ||
ndisc.c | ||
netfilter.c | ||
proc.c | ||
protocol.c | ||
raw.c | ||
reassembly.c | ||
route.c | ||
sit.c | ||
syncookies.c | ||
sysctl_net_ipv6.c | ||
tcp_ipv6.c | ||
tunnel6.c | ||
udp_impl.h | ||
udp.c | ||
udplite.c | ||
xfrm6_input.c | ||
xfrm6_mode_beet.c | ||
xfrm6_mode_ro.c | ||
xfrm6_mode_transport.c | ||
xfrm6_mode_tunnel.c | ||
xfrm6_output.c | ||
xfrm6_policy.c | ||
xfrm6_state.c | ||
xfrm6_tunnel.c |