OpenCloudOS-Kernel/include/net
Eric Dumazet b178bb3dfc net: reorder struct sock fields
Right now, fields in struct sock are not optimally ordered, because each
path (RX softirq, TX completion, RX user,  TX user) has to touch fields
that are contained in many different cache lines.

The really critical thing is to shrink number of cache lines that are
used at RX softirq time : CPU handling softirqs for a device can receive
many frames per second for many sockets. If load is too big, we can drop
frames at NIC level. RPS or multiqueue cards can help, but better reduce
latency if possible.

This patch starts with UDP protocol, then additional patches will try to
reduce latencies of other ones as well.

At RX softirq time, fields of interest for UDP protocol are :
(not counting ones in inet struct for the lookup)

Read/Written:
sk_refcnt   (atomic increment/decrement)
sk_rmem_alloc & sk_backlog.len (to check if there is room in queues)
sk_receive_queue
sk_backlog (if socket locked by user program)
sk_rxhash
sk_forward_alloc
sk_drops

Read only:
sk_rcvbuf (sk_rcvqueues_full())
sk_filter
sk_wq
sk_policy[0]
sk_flags

Additional notes :

- sk_backlog has one hole on 64bit arches. We can fill it to save 8
bytes.
- sk_backlog is used only if RX sofirq handler finds the socket while
locked by user.
- sk_rxhash is written only once per flow.
- sk_drops is written only if queues are full

Final layout :

[1] One section grouping all read/write fields, but placing rxhash and
sk_backlog at the end of this section.

[2] One section grouping all read fields in RX handler
   (sk_filter, sk_rcv_buf, sk_wq)

[3] Section used by other paths

I'll post a patch on its own to put sk_refcnt at the end of struct
sock_common so that it shares same cache line than section [1]

New offsets on 64bit arch :

sizeof(struct sock)=0x268
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x48
offsetof(struct sock, sk_receive_queue)=0x68
offsetof(struct sock, sk_backlog)=0x80
offsetof(struct sock, sk_rmem_alloc)=0x80
offsetof(struct sock, sk_forward_alloc)=0x98
offsetof(struct sock, sk_rxhash)=0x9c
offsetof(struct sock, sk_rcvbuf)=0xa4
offsetof(struct sock, sk_drops) =0xa0
offsetof(struct sock, sk_filter)=0xa8
offsetof(struct sock, sk_wq)=0xb0
offsetof(struct sock, sk_policy)=0xd0
offsetof(struct sock, sk_flags) =0xe0

Instead of :

sizeof(struct sock)=0x270
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x50
offsetof(struct sock, sk_receive_queue)=0xc0
offsetof(struct sock, sk_backlog)=0x70
offsetof(struct sock, sk_rmem_alloc)=0xac
offsetof(struct sock, sk_forward_alloc)=0x10c
offsetof(struct sock, sk_rxhash)=0x128
offsetof(struct sock, sk_rcvbuf)=0x4c
offsetof(struct sock, sk_drops) =0x16c
offsetof(struct sock, sk_filter)=0x198
offsetof(struct sock, sk_wq)=0x88
offsetof(struct sock, sk_policy)=0x98
offsetof(struct sock, sk_flags) =0x130

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-11-16 11:17:43 -08:00
..
9p 9p: Add datasync to client side TFSYNC/RFSYNC for dotl 2010-10-28 09:08:49 -05:00
bluetooth Bluetooth: clean up rfcomm code 2010-10-12 12:44:53 -03:00
caif include/net/caif/cfctrl.h: Remove unnecessary semicolons 2010-11-15 11:07:16 -08:00
irda net: return operator cleanup 2010-09-23 14:33:39 -07:00
iucv include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2010-10-23 11:47:02 -07:00
netns netns: reorder fields in struct net 2010-10-17 13:49:14 -07:00
phonet Phonet: 'connect' socket implementation for Pipe controller 2010-10-13 14:40:34 -07:00
sctp net: return operator cleanup 2010-09-23 14:33:39 -07:00
tc_act net/sched: add ACT_CSUM action to update packets checksums 2010-08-20 01:42:59 -07:00
tipc tipc: cleanup function namespace 2010-10-16 11:13:24 -07:00
act_api.h pkt_sched: gen_kill_estimator() rcu fixes 2010-06-11 18:37:08 -07:00
addrconf.h ipv6: make __ipv6_isatap_ifid static 2010-10-05 00:47:39 -07:00
af_ieee802154.h af_ieee802154: add support for WANT_ACK socket option 2009-08-12 21:54:50 -07:00
af_rxrpc.h
af_unix.h af_unix: Allow credentials to work across user and pid namespaces. 2010-06-16 14:58:16 -07:00
ah.h net: cleanup include/net 2009-11-04 05:06:25 -08:00
arp.h arp: remove unnecessary export of arp_broken_ops 2010-09-29 19:45:35 -07:00
atmclip.h clip: convert to internal network_device_stats 2009-01-21 14:01:59 -08:00
ax25.h include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
ax88796.h ax88796: Add method to take MAC from platform data 2009-03-24 23:32:03 -07:00
cfg80211.h cfg80211: fix allowing country IEs for WIPHY_FLAG_STRICT_REGULATORY 2010-11-15 13:24:09 -05:00
checksum.h include/net net/ - csum_partial - remove unnecessary casts 2008-11-19 15:44:53 -08:00
cipso_ipv4.h netlabel: Label incoming TCP connections correctly in SELinux 2009-03-28 15:01:36 +11:00
cls_cgroup.h Merge commit 'v2.6.36-rc7' into core/rcu 2010-10-07 09:43:45 +02:00
compat.h net: fix compat_sys_recvmmsg parameter type 2009-12-11 15:07:56 -08:00
datalink.h
dcbnl.h dcbnl: Add support for setapp/getapp to netdev dcbnl_rtnl_ops 2009-09-01 01:24:30 -07:00
dn.h net: avoid limits overflow 2010-11-10 12:12:00 -08:00
dn_dev.h decnet: RCU conversion and get rid of dev_base_lock 2010-11-08 13:50:08 -08:00
dn_fib.h decnet: Remove unused FIB metric macros. 2010-03-27 19:23:46 -07:00
dn_neigh.h
dn_nsp.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
dn_route.h ipv4: Make rt->fl.iif tests lest obscure. 2010-11-11 17:07:48 -08:00
dsa.h dsa: add switch chip cascading support 2009-03-21 19:06:54 -07:00
dsfield.h
dst.h decnet: RCU conversion and get rid of dev_base_lock 2010-11-08 13:50:08 -08:00
dst_ops.h net dst: need linux/cache.h for ____cacheline_aligned_in_smp. 2010-11-07 19:58:05 -08:00
esp.h
ethoc.h net: Add support for the OpenCores 10/100 Mbps Ethernet MAC. 2009-03-27 00:16:21 -07:00
fib_rules.h fib_rules: __rcu annotates ctarget 2010-10-27 11:37:32 -07:00
flow.h xfrm: use gre key as flow upper protocol info 2010-11-15 10:44:04 -08:00
garp.h net/802: add __rcu annotations 2010-10-25 13:09:44 -07:00
gen_stats.h net: cleanup include/net 2009-11-04 05:06:25 -08:00
genetlink.h genetlink: introduce pre_doit/post_doit hooks 2010-10-05 13:35:30 -04:00
gre.h PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol) 2010-08-21 23:05:39 -07:00
icmp.h ipv4: raw: move struct raw_sock and raw_sk() to include/net/raw.h 2010-04-13 14:49:31 -07:00
ieee80211_radiotap.h wireless: update radiotap parser 2010-02-08 16:50:53 -05:00
ieee802154.h ieee802154: move headers out of extra directory 2009-07-23 17:08:51 +04:00
ieee802154_netdev.h ieee802154: add an mlme_ops call to retrieve PHY object 2009-11-06 14:32:18 +03:00
if_inet6.h ipv6: Replace inet6_ifaddr->dead with state 2010-05-18 15:36:06 -07:00
inet6_connection_sock.h net: replace ipfragok with skb->local_df 2010-04-15 23:36:37 -07:00
inet6_hashtables.h tcp: Fix a connect() race with timewait sockets 2009-12-08 20:17:51 -08:00
inet_common.h inet, inet6: make tcp_sendmsg() and tcp_sendpage() through inet_sendmsg() and inet_sendpage() 2010-07-12 20:21:46 -07:00
inet_connection_sock.h tcp: Add TCP_USER_TIMEOUT socket option. 2010-08-30 13:23:33 -07:00
inet_ecn.h net: return operator cleanup 2010-09-23 14:33:39 -07:00
inet_frag.h fragment: add fast path for in-order fragments 2010-06-30 13:44:29 -07:00
inet_hashtables.h tproxy: fix hash locking issue when using port redirection in __inet_inherit_port() 2010-10-21 13:06:43 +02:00
inet_sock.h igmp: RCU conversion of in_dev->mc_list 2010-11-12 13:18:57 -08:00
inet_timewait_sock.h net: suppress RCU lockdep false positive in twsk_net() 2010-04-27 12:39:01 -07:00
inetpeer.h inetpeer: __rcu annotations 2010-10-27 11:37:33 -07:00
ip.h ipv4: add __rcu annotations to ip_ra_chain 2010-10-25 14:18:28 -07:00
ip6_checksum.h
ip6_fib.h net-next: remove useless union keyword 2010-06-10 23:31:35 -07:00
ip6_route.h net: sk_dst_cache RCUification 2010-04-13 01:41:33 -07:00
ip6_tunnel.h tunnels: add _rcu annotations 2010-10-25 13:09:45 -07:00
ip_fib.h fib: Fix fib zone and its hash leak on namespace stop 2010-10-28 10:27:03 -07:00
ip_vs.h ipvs: provide address family for debugging 2010-10-21 11:04:43 +02:00
ipcomp.h percpu: add __percpu sparse annotations to net 2010-02-16 23:05:38 -08:00
ipconfig.h net: remove CVS keywords 2008-06-11 21:00:38 -07:00
ipip.h tunnels: add __rcu annotations 2010-10-27 11:37:32 -07:00
ipv6.h net: return operator cleanup 2010-09-23 14:33:39 -07:00
ipx.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
iw_handler.h include/net/iw_handler.h: Use SIOCIWFIRST not SIOCSIWCOMMIT in comment 2010-03-31 14:49:12 -04:00
lapb.h
lib80211.h lib80211: remove unused host_build_iv option 2010-07-26 15:09:04 -04:00
llc.h llc: convert llc_sap_list to RCU 2009-12-26 20:46:28 -08:00
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h llc: use a device based hash table to speed up multicast delivery 2009-12-26 20:43:57 -08:00
llc_if.h [LLC]: Kill static inline llc_addrany 2008-02-29 11:46:17 -08:00
llc_pdu.h [LLC]: skb allocation size for responses 2008-03-31 21:02:47 -07:00
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h [LLC]: skb allocation size for responses 2008-03-31 21:02:47 -07:00
mac80211.h mac80211: add probe request filter flag 2010-10-13 15:45:22 -04:00
mip6.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
mld.h ipv6 mcast: Introduce include/net/mld.h for MLD definitions. 2010-04-23 13:35:55 +09:00
ndisc.h net: use __packed annotation 2010-06-03 03:21:52 -07:00
neighbour.h neigh: reorder struct neighbour 2010-11-11 10:29:40 -08:00
net_namespace.h net_ns: add __rcu annotations 2010-10-25 14:18:27 -07:00
netdma.h net_dma: convert to dma_find_channel 2009-01-06 11:38:15 -07:00
netevent.h
netlabel.h include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
netlink.h netlink: let nlmsg and nla functions take pointer-to-const args 2010-11-16 09:52:32 -08:00
netrom.h include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
nexthop.h
nl802154.h ieee802154: add support for channel pages from IEEE 802.15.4-2006 2009-08-19 23:08:22 +04:00
p8022.h
pkt_cls.h net: rename skb->iif to skb->skb_iif 2009-11-20 15:35:04 -08:00
pkt_sched.h net: Define accessors to manipulate QDISC_STATE_RUNNING 2010-06-02 03:23:51 -07:00
protocol.h net: add __rcu annotations to protocol 2010-10-27 11:37:31 -07:00
psnap.h snap: use const for descriptor 2009-03-21 19:06:50 -07:00
raw.h include/net/raw.h: Convert raw_seq_private macro to inline 2010-09-08 13:42:22 -07:00
rawv6.h ipv6: Use correct data types for ICMPv6 type and code 2009-06-23 04:31:07 -07:00
red.h net: cleanup include/net 2009-11-04 05:06:25 -08:00
regulatory.h wireless: only use alpha2 regulatory information from country IE 2010-07-20 16:44:35 -04:00
request_sock.h tcp: account SYN-ACK timeouts & retransmissions 2010-01-17 19:09:39 -08:00
rose.h NET: ROSE: Don't use static buffer. 2009-07-26 19:11:14 -07:00
route.h ipv4: Make rt->fl.iif tests lest obscure. 2010-11-11 17:07:48 -08:00
rtnetlink.h rtnetlink: remove rtnl_kill_links 2010-10-21 03:09:45 -07:00
sch_generic.h net_sched: remove the unused parameter of qdisc_create_dflt() 2010-10-21 03:09:47 -07:00
scm.h scm: Capture the full credentials of the scm sender. 2010-06-16 14:55:56 -07:00
slhc_vj.h
snmp.h snmp: 64bit ipstats_mib for all arches 2010-06-30 13:31:19 -07:00
sock.h net: reorder struct sock fields 2010-11-16 11:17:43 -08:00
stp.h net: Add STP demux layer 2008-07-05 21:25:39 -07:00
tcp.h net: avoid limits overflow 2010-11-10 12:12:00 -08:00
tcp_states.h
timewait_sock.h net: Fix memory leak in the proto_register function 2008-11-21 16:45:22 -08:00
transp_v6.h IPv6: Add dontfrag argument to relevant functions 2010-04-23 23:35:28 -07:00
udp.h net: avoid limits overflow 2010-11-10 12:12:00 -08:00
udplite.h udp: introduce struct udp_table and multiple spinlocks 2008-10-29 01:41:45 -07:00
wext.h wext: refactor 2009-10-07 16:39:43 -04:00
wimax.h Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-12-09 19:43:33 -08:00
wpan-phy.h ieee802154: add support for creation/removal of logic interfaces 2009-11-06 14:32:24 +03:00
x25.h X25: Move accept approve flag to bitfield 2010-05-17 17:39:27 -07:00
x25device.h X25: Add if_x25.h and x25 to device identifiers 2010-04-22 16:12:36 -07:00
xfrm.h xfrm: use gre key as flow upper protocol info 2010-11-15 10:44:04 -08:00