android_kernel_xiaomi_sm8350/include/net
David S. Miller 5e9965c15b Merge branch 'kill_rtcache'
The ipv4 routing cache is non-deterministic, performance wise, and is
subject to reasonably easy to launch denial of service attacks.

The routing cache works great for well behaved traffic, and the world
was a much friendlier place when the tradeoffs that led to the routing
cache's design were considered.

What it boils down to is that the performance of the routing cache is
a product of the traffic patterns seen by a system rather than being a
product of the contents of the routing tables.  The former of which is
controllable by external entitites.

Even for "well behaved" legitimate traffic, high volume sites can see
hit rates in the routing cache of only ~%10.

The general flow of this patch series is that first the routing cache
is removed.  We build a completely new rtable entry every lookup
request.

Next we make some simplifications due to the fact that removing the
routing cache causes several members of struct rtable to become no
longer necessary.

Then we need to make some amends such that we can legally cache
pre-constructed routes in the FIB nexthops.  Firstly, we need to
invalidate routes which are hit with nexthop exceptions.  Secondly we
have to change the semantics of rt->rt_gateway such that zero means
that the destination is on-link and non-zero otherwise.

Now that the preparations are ready, we start caching precomputed
routes in the FIB nexthops.  Output and input routes need different
kinds of care when determining if we can legally do such caching or
not.  The details are in the commit log messages for those changes.

The patch series then winds down with some more struct rtable
simplifications and other tidy ups that remove unnecessary overhead.

On a SPARC-T3 output route lookups are ~876 cycles.  Input route
lookups are ~1169 cycles with rpfilter disabled, and about ~1468
cycles with rpfilter enabled.

These measurements were taken with the kbench_mod test module in the
net_test_tools GIT tree:

git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git

That GIT tree also includes a udpflood tester tool and stresses
route lookups on packet output.

For example, on the same SPARC-T3 system we can run:

	time ./udpflood -l 10000000 10.2.2.11

with routing cache:
real    1m21.955s       user    0m6.530s        sys     1m15.390s

without routing cache:
real    1m31.678s       user    0m6.520s        sys     1m25.140s

Performance undoubtedly can easily be improved further.

For example fib_table_lookup() performs a lot of excessive
computations with all the masking and shifting, some of it
conditionalized to deal with edge cases.

Also, Eric's no-ref optimization for input route lookups can be
re-instated for the FIB nexthop caching code path.  I would be really
pleased if someone would work on that.

In fact anyone suitable motivated can just fire up perf on the loading
of the test net_test_tools benchmark kernel module.  I spend much of
my time going:

bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2
bash# perf report

Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben
Hutchings, and others.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-22 17:04:15 -07:00
..
9p 9p: Reduce object size with CONFIG_NET_9P_DEBUG 2012-01-05 10:51:44 -06:00
bluetooth Bluetooth: Use tx window from config response for ack timing 2012-07-15 12:18:29 -03:00
caif caif-hsi: Remove use of module parameters 2012-06-25 16:44:12 -07:00
irda
iucv af_iucv: add shutdown for HS transport 2012-03-07 22:52:24 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-07-10 23:56:33 -07:00
netns tcp: use hash_32() in tcp_metrics 2012-07-20 10:59:41 -07:00
nfc NFC: Allow HCI driver to pre-open pipes to some gates 2012-07-09 16:42:12 -04:00
phonet net: remove my future former mail address 2012-06-17 16:29:38 -07:00
sctp sctp: Implement quick failover draft from tsvwg 2012-07-22 12:13:46 -07:00
tc_act
act_api.h
addrconf.h ipv6: add ipv6_addr_hash() helper 2012-07-18 11:28:46 -07:00
af_ieee802154.h
af_rxrpc.h
af_unix.h af_unix: speedup /proc/net/unix 2012-06-08 14:27:23 -07:00
ah.h
arp.h ipv4: Fix neigh lookup keying over loopback/point-to-point devices. 2012-07-20 16:06:10 -07:00
atmclip.h atm: clip: Use device neigh support on top of "arp_tbl". 2011-11-30 18:51:03 -05:00
ax25.h net ax25: Fix the build when sysctl support is disabled. 2012-04-23 22:14:47 -04:00
ax88796.h
cfg80211-wext.h cfg80211: remove unused wext handler exports 2011-08-08 14:26:29 -04:00
cfg80211.h cfg80211: support TX error rate CQM 2012-07-17 11:57:23 +02:00
checksum.h
cipso_ipv4.h cipso: handle CIPSO options correctly when NetLabel is disabled 2012-06-01 14:18:29 -04:00
cls_cgroup.h
codel.h fq_codel: should use qdisc backlog as threshold 2012-05-16 15:30:26 -04:00
compat.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
datalink.h
dcbevent.h dcb: Add stub routines for !CONFIG_DCB 2011-10-06 15:49:51 -04:00
dcbnl.h net/dcb: Add an optional max rate attribute 2012-04-05 05:08:04 -04:00
dn_dev.h
dn_fib.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dn_neigh.h
dn_nsp.h
dn_route.h decnet: Use neighbours privately in dn_route struct. 2012-07-05 01:12:14 -07:00
dn.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dsa.h dsa: Include linux/if_ether.h to fix build error 2011-12-01 11:41:06 -05:00
dsfield.h
dst_ops.h net: Fix warnings in dst_ops.h 2012-07-19 10:43:03 -07:00
dst.h ipv4: Kill routes during PMTU/redirect updates. 2012-07-20 13:31:22 -07:00
esp.h
ethoc.h
fib_rules.h ipv4: Elide fib_validate_source() completely when possible. 2012-06-29 01:36:36 -07:00
flow_keys.h flow_dissector: use a 64bit load/store 2011-11-29 13:17:03 -05:00
flow.h ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code. 2012-07-20 13:36:54 -07:00
garp.h
gen_stats.h
genetlink.h net: Use NLMSG_DEFAULT_SIZE in combination with nlmsg_new() 2012-06-28 17:56:43 -07:00
gre.h
icmp.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ieee80211_radiotap.h wireless: move ieee80211chan2mhz macro 2011-11-11 12:32:50 -05:00
ieee802154_netdev.h mac802154: declare reduced mlme operations 2012-05-16 15:16:56 -04:00
ieee802154.h 6LoWPAN: add fragmentation support 2011-11-14 00:19:42 -05:00
if_inet6.h net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
inet6_connection_sock.h ipv6: Add helper inet6_csk_update_pmtu(). 2012-07-16 03:44:56 -07:00
inet6_hashtables.h net: use IS_ENABLED(CONFIG_IPV6) 2011-12-11 18:25:16 -05:00
inet_common.h net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN) 2012-07-19 11:02:03 -07:00
inet_connection_sock.h ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code. 2012-07-20 13:36:54 -07:00
inet_ecn.h inet: add rfc 3168 extract in front of INET_ECN_encapsulate() 2011-10-22 01:25:23 -04:00
inet_frag.h ip_frag: struct inet_frags match() method returns a bool 2012-05-18 01:40:27 -04:00
inet_hashtables.h ipv4: Early TCP socket demux. 2012-06-19 21:22:05 -07:00
inet_sock.h inet: Kill FLOWI_FLAG_PRECOW_METRICS. 2012-07-10 22:40:12 -07:00
inet_timewait_sock.h inet: remove rcu protection on tw_net 2011-12-14 13:34:55 -05:00
inetpeer.h ipv4: Maintain redirect and PMTU info in struct rtable again. 2012-07-10 22:40:14 -07:00
ip6_checksum.h
ip6_fib.h ipv6: Store route neighbour in rt6_info struct. 2012-07-05 02:41:58 -07:00
ip6_route.h ipv6: fix inet6_csk_xmit() 2012-07-18 08:59:58 -07:00
ip6_tunnel.h ipv6_tunnel: Allow receiving packets on the fallback tunnel if they pass sanity checks 2012-06-29 00:52:32 -07:00
ip_fib.h ipv4: Cache input routes in fib_info nexthops. 2012-07-20 13:36:40 -07:00
ip_vs.h ipvs: fix oops on NAT reply in br_nf context 2012-07-17 12:00:46 +02:00
ip.h ipv4: tcp: remove per net tcp_sock 2012-07-19 10:35:30 -07:00
ipcomp.h
ipconfig.h
ipip.h tunnel: implement 64 bits statistics 2012-04-14 14:47:05 -04:00
ipv6.h ipv6: add ipv6_addr_hash() helper 2012-07-18 11:28:46 -07:00
ipx.h
iw_handler.h
lapb.h lapb: Neaten debugging 2012-05-17 18:45:20 -04:00
lib80211.h include: replace linux/module.h with "struct module" wherever possible 2011-10-31 19:32:32 -04:00
llc_c_ac.h
llc_c_ev.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
mac80211.h mac80211: add time synchronisation with BSS for assoc 2012-07-12 12:10:46 +02:00
mac802154.h mac802154: add wpan device-class support 2012-06-26 21:06:11 -07:00
mip6.h
mld.h
ndisc.h ipv6: Export ndisc option parsing from ndisc.c 2012-07-11 23:39:11 -07:00
neighbour.h net: Do delayed neigh confirmation. 2012-07-05 01:03:06 -07:00
net_namespace.h net: make sock diag per-namespace 2012-07-16 22:31:34 -07:00
net_ratelimit.h
netdma.h
netevent.h net: Pass neighbours and dest address into NETEVENT_REDIRECT events. 2012-07-05 02:21:55 -07:00
netlabel.h doc: Update the email address for Paul Moore in various source files 2011-08-01 17:58:33 -07:00
netlink.h netlink: Delete all NLA_PUT*() macros. 2012-04-02 04:33:45 -04:00
netprio_cgroup.h net: netprio_cgroup: rework update socket logic 2012-07-22 12:44:01 -07:00
netrom.h
nexthop.h
nl802154.h
p8022.h
ping.h
pkt_cls.h
pkt_sched.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
protocol.h ipv4: Kill early demux method return value. 2012-06-27 22:01:22 -07:00
psnap.h
raw.h
rawv6.h ipv6: bool/const conversions phase2 2012-05-19 01:08:16 -04:00
red.h net_sched: red: Make minor corrections to comments 2012-04-16 23:53:11 -04:00
regulatory.h cfg80211: add cellular base station regulatory hint support 2012-07-17 12:16:39 +02:00
request_sock.h tcp: Change possible SYN flooding messages 2011-09-15 14:49:43 -04:00
rose.h
route.h ipv4: Kill rt->fi 2012-07-20 13:40:07 -07:00
rtnetlink.h rtnl: allow to specify different num for rx and tx queue count 2012-07-20 11:06:59 -07:00
sch_generic.h net: rename bond_queue_mapping to slave_dev_queue_mapping 2012-07-20 11:07:00 -07:00
scm.h af_unix: dont send SCM_CREDENTIALS by default 2011-09-28 13:29:50 -04:00
secure_seq.h tcp: add const qualifiers where possible 2011-10-21 05:22:42 -04:00
slhc_vj.h
snmp.h Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2012-01-09 13:08:28 -08:00
sock.h tcp: TCP Small Queues 2012-07-11 18:12:59 -07:00
stp.h
tcp_memcontrol.h cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg 2012-04-10 10:04:07 -07:00
tcp_states.h
tcp.h tcp: improve latencies of timer triggered events 2012-07-20 10:59:41 -07:00
timewait_sock.h [PATCH] tcp: Cache inetpeer in timewait socket, and only when necessary. 2012-06-09 14:56:12 -07:00
transp_v6.h net: relax PKTINFO non local ipv6 udp xmit check 2011-08-30 17:39:01 -04:00
udp.h net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 2012-04-28 22:21:51 -04:00
udplite.h net: ipv4: Standardize prefixes for message logging 2012-03-12 17:05:21 -07:00
wext.h
wimax.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wpan-phy.h mac802154: monitor device support 2012-05-16 15:17:08 -04:00
x25.h net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
x25device.h
xfrm.h net/ipv4: VTI support rx-path hook in xfrm4_mode_tunnel. 2012-07-18 09:36:12 -07:00