OpenCloudOS-Kernel/net
David S. Miller 7d384846b9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains a second batch of Netfilter updates for
your net-next tree. This includes a rework of the core hook
infrastructure that improves Netfilter performance by ~15% according to
synthetic benchmarks. Then, a large batch with ipset updates, including
a new hash:ipmac set type, via Jozsef Kadlecsik. This also includes a
couple of assorted updates.

Regarding the core hook infrastructure rework to improve performance,
using this simple drop-all packets ruleset from ingress:

        nft add table netdev x
        nft add chain netdev x y { type filter hook ingress device eth0 priority 0\; }
        nft add rule netdev x y drop

And generating traffic through Jesper Brouer's
samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh script using -i
option. perf report shows nf_tables calls in its top 10:

    17.30%  kpktgend_0   [nf_tables]            [k] nft_do_chain
    15.75%  kpktgend_0   [kernel.vmlinux]       [k] __netif_receive_skb_core
    10.39%  kpktgend_0   [nf_tables_netdev]     [k] nft_do_chain_netdev

I'm measuring here an improvement of ~15% in performance with this
patchset, so we got +2.5Mpps more. I have used my old laptop Intel(R)
Core(TM) i5-3320M CPU @ 2.60GHz 4-cores.

This rework contains more specifically, in strict order, these patches:

1) Remove compile-time debugging from core.

2) Remove obsolete comments that predate the rcu era. These days it is
   well known that a Netfilter hook always runs under rcu_read_lock().

3) Remove threshold handling, this is only used by br_netfilter too.
   We already have specific code to handle this from br_netfilter,
   so remove this code from the core path.

4) Deprecate NF_STOP, as this is only used by br_netfilter.

5) Place nf_state_hook pointer into xt_action_param structure, so
   this structure fits into one single cacheline according to pahole.
   This also implicit affects nftables since it also relies on the
   xt_action_param structure.

6) Move state->hook_entries into nf_queue entry. The hook_entries
   pointer is only required by nf_queue(), so we can store this in the
   queue entry instead.

7) use switch() statement to handle verdict cases.

8) Remove hook_entries field from nf_hook_state structure, this is only
   required by nf_queue, so store it in nf_queue_entry structure.

9) Merge nf_iterate() into nf_hook_slow() that results in a much more
   simple and readable function.

10) Handle NF_REPEAT away from the core, so far the only client is
    nf_conntrack_in() and we can restart the packet processing using a
    simple goto to jump back there when the TCP requires it.
    This update required a second pass to fix fallout, fix from
    Arnd Bergmann.

11) Set random seed from nft_hash when no seed is specified from
    userspace.

12) Simplify nf_tables expression registration, in a much smarter way
    to save lots of boiler plate code, by Liping Zhang.

13) Simplify layer 4 protocol conntrack tracker registration, from
    Davide Caratti.

14) Missing CONFIG_NF_SOCKET_IPV4 dependency for udp4_lib_lookup, due
    to recent generalization of the socket infrastructure, from Arnd
    Bergmann.

15) Then, the ipset batch from Jozsef, he describes it as it follows:

* Cleanup: Remove extra whitespaces in ip_set.h
* Cleanup: Mark some of the helpers arguments as const in ip_set.h
* Cleanup: Group counter helper functions together in ip_set.h
* struct ip_set_skbinfo is introduced instead of open coded fields
  in skbinfo get/init helper funcions.
* Use kmalloc() in comment extension helper instead of kzalloc()
  because it is unnecessary to zero out the area just before
  explicit initialization.
* Cleanup: Split extensions into separate files.
* Cleanup: Separate memsize calculation code into dedicated function.
* Cleanup: group ip_set_put_extensions() and ip_set_get_extensions()
  together.
* Add element count to hash headers by Eric B Munson.
* Add element count to all set types header for uniform output
  across all set types.
* Count non-static extension memory into memsize calculation for
  userspace.
* Cleanup: Remove redundant mtype_expire() arguments, because
  they can be get from other parameters.
* Cleanup: Simplify mtype_expire() for hash types by removing
  one level of intendation.
* Make NLEN compile time constant for hash types.
* Make sure element data size is a multiple of u32 for the hash set
  types.
* Optimize hash creation routine, exit as early as possible.
* Make struct htype per ipset family so nets array becomes fixed size
  and thus simplifies the struct htype allocation.
* Collapse same condition body into a single one.
* Fix reported memory size for hash:* types, base hash bucket structure
  was not taken into account.
* hash:ipmac type support added to ipset by Tomasz Chilinski.
* Use setup_timer() and mod_timer() instead of init_timer()
  by Muhammad Falak R Wani, individually for the set type families.

16) Remove useless connlabel field in struct netns_ct, patch from
    Florian Westphal.

17) xt_find_table_lock() doesn't return ERR_PTR() anymore, so simplify
    {ip,ip6,arp}tables code that uses this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-13 22:41:25 -05:00
..
6lowpan 6lowpan: ndisc: no overreact if no short address is available 2016-09-19 20:19:34 +02:00
9p IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
802 net: use core MTU range checking in misc drivers 2016-10-20 14:51:10 -04:00
8021q Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
appletalk appletalk: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
atm net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ax25 AX.25: Close socket connection on session completion 2016-06-18 20:55:34 -07:00
batman-adv This feature and cleanup patchset includes the following changes: 2016-11-09 22:15:28 -05:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
bridge netfilter: remove hook_entries field from nf_hook_state 2016-11-03 11:52:58 +01:00
caif net caif: insert missing spaces in pr_* messages and unbreak multi-line strings 2016-10-28 13:47:33 -04:00
can can: only call can_stat_update with procfs 2016-06-23 11:23:49 +02:00
ceph mm: replace get_user_pages_unlocked() write/force parameters with gup_flags 2016-10-18 14:13:37 -07:00
core net: napi_hash_add() is no longer exported 2016-11-09 21:16:05 -05:00
dcb
dccp tcp/dccp: drop SYN packets if accept queue is full 2016-10-29 15:09:21 -04:00
decnet net: fix decnet rtnexthop parsing 2016-07-05 14:08:47 -07:00
dns_resolver KEYS: Add a facility to restrict new links into a keyring 2016-04-11 22:37:37 +01:00
dsa net: remove MTU limits on a few ether_setup callers 2016-10-21 13:57:50 -04:00
ethernet net: make default TX queue length a defined constant 2016-11-07 20:15:55 -05:00
hsr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
ieee802154 genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
ipv4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-11-13 22:41:25 -05:00
ipv6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-11-13 22:41:25 -05:00
ipx
irda genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
iucv Subject: [PATCH] af_iucv: drop skbs rejected by filter 2016-10-12 01:56:04 -04:00
kcm Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
key
l2tp net: l2tp: fix negative assignment to unsigned int 2016-11-09 18:55:36 -05:00
l3mdev net: ipv6: Remove l3mdev_get_saddr6 2016-09-10 23:12:53 -07:00
lapb net/lapb: tuse %*ph to dump buffers 2016-05-29 22:33:25 -07:00
llc llc: switch type to bool as the timeout is only tested versus 0 2016-09-17 10:05:05 -04:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
mac802154 mac802154: use rate limited warnings for malformed frames 2016-09-19 20:19:34 +02:00
mpls lwt: Remove unused len field 2016-10-23 17:45:01 -04:00
ncsi net/ncsi: Improve HNCDSC AEN handler 2016-10-20 11:23:08 -04:00
netfilter netfilter: x_tables: simplify IS_ERR_OR_NULL to NULL test 2016-11-13 22:26:13 +01:00
netlabel genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
netlink genetlink: fix error return code in genl_register_family() 2016-11-01 12:13:13 -04:00
netrom
nfc genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-11-13 22:41:25 -05:00
packet packet: on direct_xmit, limit tso and csum to supported devices 2016-10-29 15:02:15 -04:00
phonet net: use core MTU range checking in misc drivers 2016-10-20 14:51:10 -04:00
qrtr Merge tag 'qcom-soc-for-4.7-2' into net-next 2016-05-17 14:11:19 -04:00
rds RDS: TCP: start multipath acceptor loop at 0 2016-11-09 12:47:49 -05:00
rfkill rfkill: Use switch to demux userspace operations 2016-04-05 10:48:53 +02:00
rose rose: limit sk_filter trim to payload 2016-07-13 11:53:40 -07:00
rxrpc udp: do fwd memory scheduling on dequeue 2016-11-07 13:24:41 -05:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2016-11-13 22:41:25 -05:00
sctp sctp: clean up sctp_packet_transmit 2016-11-02 15:03:13 -04:00
strparser strparser: Propagate correct error code in strp_recv() 2016-10-12 01:51:49 -04:00
sunrpc udp: do fwd memory scheduling on dequeue 2016-11-07 13:24:41 -05:00
switchdev Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
tipc tipc: remove SS_CONNECTED sock state 2016-11-01 11:53:26 -04:00
unix udp: do fwd memory scheduling on dequeue 2016-11-07 13:24:41 -05:00
vmw_vsock VSOCK: Don't dec ack backlog twice for rejected connections 2016-09-27 07:59:25 -04:00
wimax genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
wireless Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
x25 net: x25: remove null checks on arrays calling_ae and called_ae 2016-09-09 18:13:30 -07:00
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2016-10-28 13:26:27 -04:00
Kconfig strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
Makefile strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
compat.c packet: compat support for sock_fprog 2016-06-09 23:41:03 -07:00
socket.c net: core: Add a UID field to struct sock. 2016-11-04 14:45:22 -04:00
sysctl_net.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00