OpenCloudOS-Kernel/include/net
Kuniyuki Iwashima 9804985bf2 udp: Introduce optional per-netns hash table.
The maximum hash table size is 64K due to the nature of the protocol. [0]
It's smaller than TCP, and fewer sockets can cause a performance drop.

On an EC2 c5.24xlarge instance (192 GiB memory), after running iperf3 in
different netns, creating 32Mi sockets without data transfer in the root
netns causes regression for the iperf3's connection.

  uhash_entries		sockets		length		Gbps
	    64K		      1		     1		5.69
			    1Mi		    16		5.27
			    2Mi		    32		4.90
			    4Mi		    64		4.09
			    8Mi		   128		2.96
			   16Mi		   256		2.06
			   32Mi		   512		1.12

The per-netns hash table breaks the lengthy lists into shorter ones.  It is
useful on a multi-tenant system with thousands of netns.  With smaller hash
tables, we can look up sockets faster, isolate noisy neighbours, and reduce
lock contention.

The max size of the per-netns table is 64K as well.  This is because the
possible hash range by udp_hashfn() always fits in 64K within the same
netns and we cannot make full use of the whole buckets larger than 64K.

  /* 0 < num < 64K  ->  X < hash < X + 64K */
  (num + net_hash_mix(net)) & mask;

Also, the min size is 128.  We use a bitmap to search for an available
port in udp_lib_get_port().  To keep the bitmap on the stack and not
fire the CONFIG_FRAME_WARN error at build time, we round up the table
size to 128.

The sysctl usage is the same with TCP:

  $ dmesg | cut -d ' ' -f 6- | grep "UDP hash"
  UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)

  # sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = 65536  # can be changed by uhash_entries

  # sysctl net.ipv4.udp_child_hash_entries
  net.ipv4.udp_child_hash_entries = 0  # disabled by default

  # ip netns add test1
  # ip netns exec test1 sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = -65536  # share the global table

  # sysctl -w net.ipv4.udp_child_hash_entries=100
  net.ipv4.udp_child_hash_entries = 100

  # ip netns add test2
  # ip netns exec test2 sysctl net.ipv4.udp_hash_entries
  net.ipv4.udp_hash_entries = 128  # own a per-netns table with 2^n buckets

We could optimise the hash table lookup/iteration further by removing
the netns comparison for the per-netns one in the future.  Also, we
could optimise the sparse udp_hslot layout by putting it in udp_table.

[0]: https://lore.kernel.org/netdev/4ACC2815.7010101@gmail.com/

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:43:35 +00:00
..
9p net/9p: add 'pooled_rbuffers' flag to struct p9_trans_module 2022-10-05 07:05:41 +09:00
bluetooth Bluetooth: Fix HCIGETDEVINFO regression 2022-09-09 12:25:18 -07:00
caif net: remove the caif_hsi driver 2021-07-01 13:19:48 -07:00
iucv net/af_iucv: Use struct_group() to zero struct iucv_sock region 2021-11-19 11:52:25 +00:00
mana Merge branch 'mana-shared-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma 2022-11-10 12:07:19 -08:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next 2022-11-15 20:33:20 -08:00
netns udp: Introduce optional per-netns hash table. 2022-11-16 09:43:35 +00:00
nfc NFC: add NCI_UNREG flag to eliminate the race 2021-11-17 20:17:05 -08:00
phonet net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
sctp sctp: remove unnecessary NULL check in sctp_association_init() 2022-10-20 21:43:10 -07:00
tc_act net: sched: add helper support in act_ct 2022-11-08 12:15:19 +01:00
6lowpan.h
Space.h wan: remove sbni/granch driver 2021-08-03 13:05:26 +01:00
act_api.h act_skbedit: skbedit queue mapping for receive queue 2022-10-25 10:32:40 +02:00
addrconf.h ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr 2022-07-28 10:42:44 -07:00
af_ieee802154.h
af_rxrpc.h rxrpc: Remove rxrpc_get_reply_time() which is no longer used 2022-09-01 11:44:13 +01:00
af_unix.h af_unix: Remove unix_table_locks. 2022-06-22 12:59:43 +01:00
af_vsock.h vsock: add API call for data ready 2022-08-23 10:43:11 +02:00
ah.h
amt.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
arp.h ipv4: Invalidate neighbour for broadcast address upon address addition 2022-02-21 11:44:30 +00:00
atmclip.h
ax25.h ax25: fix incorrect dev_tracker usage 2022-07-28 22:06:15 -07:00
ax88796.h ax88796: Fix some typo in a comment 2022-08-09 22:14:02 -07:00
bareudp.h bareudp: Move definition of struct bareudp_conf to bareudp.c 2021-12-13 12:34:09 +00:00
bond_3ad.h net: bonding: Share lacpdu_mcast_addr definition 2022-09-16 14:34:01 +01:00
bond_alb.h bonding (gcc13): synchronize bond_{a,t}lb_xmit() types 2022-11-02 20:38:13 -07:00
bond_options.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
bonding.h bond: Disable TLS features indication 2022-10-27 12:45:38 +02:00
bpf_sk_storage.h bpf: struct sock is declared twice in bpf_sk_storage header 2021-03-26 17:43:55 +01:00
busy_poll.h net: Fix a data-race around sysctl_net_busy_poll. 2022-08-24 13:46:58 +01:00
calipso.h
cfg80211-wext.h
cfg80211.h cfg80211: Update Transition Disable policy during port authorization 2022-10-07 15:27:40 +02:00
cfg802154.h mac802154: set filter at drv_start() 2022-10-12 12:56:58 +02:00
checksum.h powerpc/net: Implement powerpc specific csum_shift() to remove branch 2022-03-11 10:57:22 +00:00
cipso_ipv4.h cipso: Remove unused inline functions 2020-07-15 07:45:24 -07:00
cls_cgroup.h
codel.h codel: remove unnecessary pkt_sched.h include 2021-12-22 15:03:51 -08:00
codel_impl.h codel: remove unnecessary sock.h include 2021-12-22 15:03:47 -08:00
codel_qdisc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
compat.h net: copy from user before calling __get_compat_msghdr 2022-07-24 18:39:17 -06:00
datalink.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
dcbnl.h net: dcb: add new apptrust attribute 2022-11-03 15:16:50 +01:00
devlink.h devlink: Add packet traps for 802.1X operation 2022-11-09 19:06:14 -08:00
dropreason.h net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR 2022-10-31 20:14:27 -07:00
dsa.h net: dsa: remove phylink_validate() method 2022-11-15 20:34:27 -08:00
dsfield.h
dst.h net: Remove unused inline function dst_hold_and_use() 2022-09-26 11:48:14 -07:00
dst_cache.h wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst_metadata.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-10-03 07:52:13 +01:00
dst_ops.h
erspan.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
esp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
espintcp.h
ethoc.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
failover.h net: failover: add net device refcount tracker 2021-12-06 16:06:02 -08:00
fib_notifier.h
fib_rules.h fib: expand fib_rule_policy 2021-12-16 07:18:35 -08:00
firewire.h firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48() 2022-07-28 22:21:54 -07:00
flow.h net: Remove DECnet leftovers from flow.h. 2022-10-03 12:41:59 +01:00
flow_dissector.h flow_dissector: Add L2TPv3 dissectors 2022-09-20 09:13:38 +02:00
flow_offload.h net: flow_offload: add support for ARP frame matching 2022-11-14 11:24:16 +00:00
fou.h
fq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
fq_impl.h net/fq_impl: Use the bitmap API to allocate bitmaps 2022-07-11 19:49:38 -07:00
garp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gen_stats.h net: sched: Remove Qdisc::running sequence counter 2021-10-18 12:54:41 +01:00
genetlink.h genetlink: allow families to use split ops directly 2022-11-07 12:30:17 +00:00
geneve.h net: geneve: fix array of flexible structures warnings 2022-10-31 10:43:04 +00:00
gre.h ip_gre: add csum offload support for gre header 2021-01-29 20:39:14 -08:00
gro.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-08-25 16:07:42 -07:00
gro_cells.h
gtp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
gue.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
hwbm.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
icmp.h ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages 2021-06-28 14:29:45 -07:00
ieee80211_radiotap.h ieee80211: radiotap: fix -Wcast-qual warnings 2022-02-04 16:25:21 +01:00
ieee802154_netdev.h Merge tag 'ieee802154-for-net-next-2022-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next 2022-10-26 15:24:36 +01:00
if_inet6.h ipv6: fix locking issues with loops over idev->addr_list 2022-04-06 22:09:39 -07:00
ife.h
ila.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_connection_sock.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet6_hashtables.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_common.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
inet_connection_sock.h net: Add a bhash2 table hashed by port and address 2022-08-24 19:30:07 -07:00
inet_dscp.h ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules 2022-02-07 20:12:45 -08:00
inet_ecn.h net: add skb_get_dsfield() helper 2021-10-15 11:33:08 +01:00
inet_frag.h net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
inet_hashtables.h net: Fix incorrect address comparison when searching for a bind2 bucket 2022-09-28 18:52:17 -07:00
inet_sock.h net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set 2022-07-29 11:58:54 +01:00
inet_timewait_sock.h Revert "tcp/dccp: get rid of inet_twsk_purge()" 2022-05-13 12:24:12 +01:00
inetpeer.h
ioam6.h treewide: Replace zero-length arrays with flexible-array members 2022-02-17 07:00:39 -06:00
ip.h bpf: Change bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt() 2022-09-02 20:34:32 -07:00
ip6_checksum.h net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
ip6_fib.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-17 11:44:20 -08:00
ip6_route.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ip6_tunnel.h ip_gre, ip6_gre: Fix race condition on o_seqno in collect_md mode 2022-04-25 11:40:45 +01:00
ip_fib.h ipv4: Use dscp_t in struct fib_entry_notifier_info 2022-04-11 17:37:50 -07:00
ip_tunnels.h net: Add helper function to parse netlink msg of ip_tunnel_parm 2022-10-03 07:59:06 +01:00
ip_vs.h ipvs: add sysctl_run_estimation to support disable estimation 2021-10-07 19:52:58 +02:00
ipcomp.h xfrm: ipcomp: add extack to ipcomp{4,6}_init_state 2022-09-29 07:18:00 +02:00
ipconfig.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ipv6.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
ipv6_frag.h net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
ipv6_stubs.h bpf: Change bpf_getsockopt(SOL_IPV6) to reuse do_ipv6_getsockopt() 2022-09-02 20:34:32 -07:00
iw_handler.h
kcm.h
l3mdev.h l3mdev: add infrastructure for table to VRF mapping 2020-06-20 17:22:22 -07:00
lag.h
lapb.h net: lapb: Make "lapb_t1timer_running" able to detect an already running timer 2021-03-23 14:14:50 -07:00
lib80211.h
llc.h llc: fix out-of-bound array index in llc_sk_dev_hash() 2021-11-07 19:25:29 +00:00
llc_c_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_c_ev.h
llc_c_st.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_conn.h llc: add net device refcount tracker 2021-12-07 20:44:59 -08:00
llc_if.h llc/snap: constify dev_addr passing 2021-10-13 09:40:46 -07:00
llc_pdu.h net: llc: fix skb_over_panic 2021-07-27 13:05:56 +01:00
llc_s_ac.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_ev.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
llc_s_st.h add missing includes and forward declarations to networking includes under linux/ 2022-07-28 11:29:36 +02:00
llc_sap.h
lwtunnel.h netfilter: add netfilter hooks to SRv6 data plane 2021-08-30 01:51:36 +02:00
mac80211.h wifi: mac80211: add internal handler for wake_tx_queue 2022-10-10 10:54:17 +02:00
mac802154.h mac802154: Drop IEEE802154_HW_RX_DROP_BAD_CKSUM 2022-10-12 12:57:19 +02:00
macsec.h net: macsec: remove the prepare flag from the MACsec offloading context 2022-09-23 06:56:08 -07:00
mctp.h mctp: Use output netdev to allocate skb headroom 2022-04-01 12:04:15 +01:00
mctpdevice.h mctp: Pass flow data & flow release events to drivers 2021-10-29 13:23:51 +01:00
mip6.h
mld.h mld: add new workqueues for process mld events 2021-03-26 15:14:56 -07:00
mpls.h net: Make mpls_entry_encode() available for generic users 2020-05-29 21:20:20 -07:00
mpls_iptunnel.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
mptcp.h mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled 2022-08-08 15:30:45 +02:00
mrp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ncsi.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
ndisc.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-03-03 11:55:12 -08:00
neighbour.h neighbour: Remove unused inline function neigh_key_eq16() 2022-09-26 11:48:14 -07:00
net_debug.h net: add CONFIG_DEBUG_NET 2022-05-11 12:43:10 +01:00
net_failover.h
net_namespace.h net: add a refcount tracker for kernel sockets 2022-10-24 11:04:43 +01:00
net_ratelimit.h
net_trackers.h net: add networking namespace refcount tracker 2021-12-10 06:38:26 -08:00
netevent.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
netlabel.h
netlink.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-11-03 13:21:54 -07:00
netprio_cgroup.h
netrom.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
nexthop.h net: ipv4: Fix rtnexthop len when RTA_FLOW is present 2021-09-24 14:07:10 +01:00
nl802154.h net: ieee802154: Fix compilation error when CONFIG_IEEE802154_NL802154_EXPERIMENTAL is disabled 2022-09-02 19:59:08 -07:00
nsh.h
p8022.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
page_pool.h net: page_pool: introduce ethtool stats 2022-04-15 10:43:47 +01:00
pie.h
ping.h net: remove noblock parameter from recvmsg() entities 2022-04-12 15:00:25 +02:00
pkt_cls.h net: sched: cls_api: introduce tc_cls_bind_class() helper 2022-10-02 16:07:17 +01:00
pkt_sched.h net/sched: taprio: allow user input of per-tc max SDU 2022-09-29 18:52:05 -07:00
pptp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
protocol.h tcp/udp: Make early_demux back namespacified. 2022-07-15 18:50:35 -07:00
psample.h psample: Add a fwd declaration for skbuff 2021-08-09 15:34:21 -07:00
psnap.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
raw.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-07-14 15:27:35 -07:00
rawv6.h raw: convert raw sockets to RCU 2022-06-19 10:00:02 +01:00
red.h treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
regulatory.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
request_sock.h tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
rose.h net: rose: add netdev ref tracker to 'struct rose_sock' 2022-08-01 11:59:23 -07:00
route.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2022-07-25 13:25:39 +01:00
rpl.h net: ipv6: Use struct_size() helper and kcalloc() 2020-06-23 20:27:09 -07:00
rsi_91x.h
rtnetlink.h rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link 2022-10-31 18:10:21 -07:00
rtnh.h
sch_generic.h net/sched: query offload capabilities through ndo_setup_tc() 2022-09-29 18:52:01 -07:00
scm.h fs: Move __scm_install_fd() to __receive_fd() 2020-07-13 11:03:44 -07:00
secure_seq.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
seg6.h udp6: Use Segment Routing Header for dest address if present 2022-01-04 12:17:35 +00:00
seg6_hmac.h
seg6_local.h
selftests.h net: selftest: fix build issue if INET is disabled 2021-04-28 14:06:45 -07:00
slhc_vj.h
smc.h net/smc: Pass on DMBE bit mask in IRQ handler 2022-07-27 13:24:42 +01:00
snmp.h
sock.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-11-03 13:21:54 -07:00
sock_reuseport.h soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-10-25 11:35:16 +02:00
stp.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
strparser.h tls: rx: remove the message decrypted tracking 2022-07-18 11:24:10 +01:00
switchdev.h bridge: switchdev: Allow device drivers to install locked FDB entries 2022-11-09 19:06:13 -08:00
tcp.h tcp: add PLB functionality for TCP 2022-10-28 10:47:42 +01:00
tcp_states.h
timewait_sock.h
tipc.h
tls.h net/tls: Describe ciphers sizes by const structs 2022-09-22 17:27:41 -07:00
tls_toe.h
transp_v6.h inet6: Remove inet6_destroy_sock(). 2022-10-24 09:40:39 +01:00
tso.h net: tso: cache transport header length 2020-06-18 20:46:23 -07:00
tun_proto.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
udp.h udp: track the forward memory release threshold in an hot cacheline 2022-10-24 10:52:50 +01:00
udp_tunnel.h net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error() 2022-11-08 16:42:28 +00:00
udplite.h tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). 2022-10-12 17:50:37 -07:00
vsock_addr.h
vxlan.h drivers: vxlan: vnifilter: per vni stats 2022-03-01 08:38:02 +00:00
wext.h
x25.h
x25device.h
xdp.h xdp: Adjust xdp_frame layout to avoid using bitfields 2022-09-26 13:28:19 -07:00
xdp_priv.h net: add missing includes and forward declarations under net/ 2022-07-22 12:53:22 +01:00
xdp_sock.h net: Don't include filter.h from net/sock.h 2021-12-29 08:48:14 -08:00
xdp_sock_drv.h xsk: Remove unused xsk_buff_discard 2022-09-30 07:55:46 -07:00
xfrm.h xfrm: pass extack down to xfrm_type ->init_state 2022-09-29 07:17:58 +02:00
xsk_buff_pool.h xsk: Inherit need_wakeup flag for shared sockets 2022-09-22 17:16:22 +02:00