Commit Graph

65165 Commits

Author SHA1 Message Date
Zheng Yongjun fe6c0262bd rxrpc: Fix a typo
targetted  ==> targeted

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 14:01:55 -07:00
Zheng Yongjun 91641b79e1 Bluetooth: Fix spelling mistakes
Fix some spelling mistakes in comments:
udpate  ==> update
retreive  ==> retrieve
accidentially  ==> accidentally
correspondig  ==> corresponding
adddress  ==> address
estabilish  ==> establish
commplete  ==> complete
Unkown  ==> Unknown
triggerd  ==> triggered
transtion  ==> transition

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 14:01:55 -07:00
Zheng Yongjun 8ab1784df6 9p/trans_virtio: Fix spelling mistakes
reseting  ==> resetting
alloced  ==> allocated
accomodate  ==> accommodate

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 14:01:55 -07:00
Zheng Yongjun b676c7f1c3 ethtool: Fix a typo
atribute  ==> attribute

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 14:01:55 -07:00
zhang kai 7f0e869c4e sit: replace 68 with micro IPV4_MIN_MTU
Use meaningfull micro IPV4_MIN_MTU

Signed-off-by: zhang kai <zhangkaiheb@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02 13:28:05 -07:00
David S. Miller 5fe8e519e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Support for SCTP chunks matching on nf_tables, from Phil Sutter.

2) Skip LDMXCSR, we don't need a valid MXCSR state. From Stefano Brivio.

3) CONFIG_RETPOLINE for nf_tables set lookups, from Florian Westphal.

4) A few Kconfig leading spaces removal, from Juerg Haefliger.

5) Remove spinlock from xt_limit, from Jason Baron.

6) Remove useless initialization in xt_CT, oneliner from Yang Li.

7) Tree-wide replacement of netlink_unicast() by nfnetlink_unicast().

8) Reduce footprint of several structures: xt_action_param,
   nft_pktinfo and nf_hook_state, from Florian.

10) Add nft_thoff() and nft_sk() helpers and use them, also from Florian.

11) Fix documentation in nf_tables pipapo avx2, from Florian Westphal.

12) Fix clang-12 fmt string warnings, also from Florian.
====================
2021-06-01 17:15:14 -07:00
Zheng Yongjun ca746c55a7 net: Return the correct errno code
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 17:02:15 -07:00
Zheng Yongjun b923cda963 net: dcb: Return the correct errno code
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 17:01:33 -07:00
Boris Sukholitko 8323b20f1d net/sched: act_vlan: No dump for unset priority
Dump vlan priority only if it has been previously set.

Fix the tests accordingly.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 16:54:42 -07:00
Boris Sukholitko 9c5eee0afc net/sched: act_vlan: Fix modify to allow 0
Currently vlan modification action checks existence of vlan priority by
comparing it to 0. Therefore it is impossible to modify existing vlan
tag to have priority 0.

For example, the following tc command will change the vlan id but will
not affect vlan priority:

tc filter add dev eth1 ingress matchall action vlan modify id 300 \
        priority 0 pipe mirred egress redirect dev eth2

The incoming packet on eth1:

ethertype 802.1Q (0x8100), vlan 200, p 4, ethertype IPv4

will be changed to:

ethertype 802.1Q (0x8100), vlan 300, p 4, ethertype IPv4

although the user has intended to have p == 0.

The fix is to add tcfv_push_prio_exists flag to struct tcf_vlan_params
and rely on it when deciding to set the priority.

Fixes: 45a497f2d1 (net/sched: act_vlan: Introduce TCA_VLAN_ACT_MODIFY vlan action)
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 16:54:42 -07:00
Yang Li 7cf85f8caa NFC: nci: Remove redundant assignment to len
Variable 'len' is set to conn_info->max_pkt_payload_len but this
value is never read as it is overwritten with a new value later on,
hence it is a redundant assignment and can be removed.

Clean up the following clang-analyzer warning:

net/nfc/nci/hci.c:164:3: warning: Value stored to 'len' is never read
[clang-analyzer-deadcode.DeadStores]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:52:24 -07:00
Wander Lairson Costa b0f6c9ac80 netpoll: don't require irqs disabled in rt kernels
write_msg(netconsole.c:836) calls netpoll_send_udp after a call to
spin_lock_irqsave, which normally disables interrupts; but in PREEMPT_RT
this call just locks an rt_mutex without disabling irqs. In this case,
netpoll_send_udp is called with interrupts enabled.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-01 15:15:11 -07:00
Florian Westphal 8a1c08ad19 netfilter: fix clang-12 fmt string warnings
nf_conntrack_h323_main.c:198:6: warning: format specifies type 'unsigned short' but
xt_AUDIT.c:121:9: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-01 23:53:51 +02:00
Florian Westphal 89258f8e41 netfilter: nft_set_pipapo_avx2: fix up description warnings
W=1:
net/netfilter/nft_set_pipapo_avx2.c:159: warning: Excess function parameter 'len' description in 'nft_pipapo_avx2_refill'
net/netfilter/nft_set_pipapo_avx2.c:1124: warning: Function parameter or member 'key' not described in 'nft_pipapo_avx2_lookup'
net/netfilter/nft_set_pipapo_avx2.c:1124: warning: Excess function parameter 'elem' description in 'nft_pipapo_avx2_lookup'

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-01 23:53:51 +02:00
Zheng Yongjun 0c2c366e0e sctp: sm_statefuns: Fix spelling mistakes
Fix some spelling mistakes in comments:
genereate ==> generate
correclty ==> correctly
boundries ==> boundaries
failes ==> fails
isses ==> issues
assocition ==> association
signe ==> sign
assocaition ==> association
managemement ==> management
restransmissions ==> retransmission
sideffect ==> sideeffect
bomming ==> booming
chukns ==> chunks
SHUDOWN ==> SHUTDOWN
violationg ==> violating
explcitly ==> explicitly
CHunk ==> Chunk

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20210601020801.3625358-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:45:15 -07:00
Zheng Yongjun 379aecbce0 rds: Fix spelling mistakes
Fix some spelling mistakes in comments:
alloced  ==> allocated

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Link: https://lore.kernel.org/r/20210531063617.3018637-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:45:05 -07:00
Zheng Yongjun 37f2ad2b90 net: sched: Fix spelling mistakes
Fix some spelling mistakes in comments:
sevaral  ==> several
sugestion  ==> suggestion
unregster  ==> unregister
suplied  ==> supplied
cirsumstances  ==> circumstances

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20210531020048.2920054-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:44:56 -07:00
Zheng Yongjun 01709d0977 nfc: hci: Fix spelling mistakes
Fix some spelling mistakes in comments:
occured  ==> occurred
negociate  ==> negotiate

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20210531020019.2919799-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:44:47 -07:00
Rocco Yue 12e64b3bb9 ipv6: align code with context
The Tab key is used three times, causing the code block to
be out of alignment with the context.

Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Link: https://lore.kernel.org/r/20210530113811.8817-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:14:15 -07:00
Willy Tarreau 62f20e068c ipv6: use prandom_u32() for ID generation
This is a complement to commit aa6dd211e4 ("inet: use bigger hash
table for IP ID generation"), but focusing on some specific aspects
of IPv6.

Contary to IPv4, IPv6 only uses packet IDs with fragments, and with a
minimum MTU of 1280, it's much less easy to force a remote peer to
produce many fragments to explore its ID sequence. In addition packet
IDs are 32-bit in IPv6, which further complicates their analysis. On
the other hand, it is often easier to choose among plenty of possible
source addresses and partially work around the bigger hash table the
commit above permits, which leaves IPv6 partially exposed to some
possibilities of remote analysis at the risk of weakening some
protocols like DNS if some IDs can be predicted with a good enough
probability.

Given the wide range of permitted IDs, the risk of collision is extremely
low so there's no need to rely on the positive increment algorithm that
is shared with the IPv4 code via ip_idents_reserve(). We have a fast
PRNG, so let's simply call prandom_u32() and be done with it.

Performance measurements at 10 Gbps couldn't show any difference with
the previous code, even when using a single core, because due to the
large fragments, we're limited to only ~930 kpps at 10 Gbps and the cost
of the random generation is completely offset by other operations and by
the network transfer time. In addition, this change removes the need to
update a shared entry in the idents table so it may even end up being
slightly faster on large scale systems where this matters.

The risk of at least one collision here is about 1/80 million among
10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs,
which remains very low compared to IPv4 where all IDs are reused
every 4 to 80ms on a 10 Gbps flow depending on packet sizes.

Reported-by: Amit Klein <aksecurity@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210529110746.6796-1-w@1wt.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-31 22:12:08 -07:00
Florian Westphal 897389de48 netfilter: nf_tables: remove xt_action_param from nft_pktinfo
Init it on demand in the nft_compat expression.  This reduces size
of nft_pktinfo from 48 to 24 bytes on x86_64.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 22:15:59 +02:00
Florian Westphal f06ad944b6 netfilter: nf_tables: remove unused arg in nft_set_pktinfo_unspec()
The functions pass extra skb arg, but either its not used or the helpers
can already access it via pkt->skb.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:54 +02:00
Florian Westphal 2d7b4ace07 netfilter: nf_tables: add and use nft_thoff helper
This allows to change storage placement later on without changing readers.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:54 +02:00
Florian Westphal 85554eb981 netfilter: nf_tables: add and use nft_sk helper
This allows to change storage placement later on without changing readers.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:53 +02:00
Florian Westphal 586d5a8bce netfilter: x_tables: reduce xt_action_param by 8 byte
The fragment offset in ipv4/ipv6 is a 16bit field, so use
u16 instead of unsigned int.

On 64bit: 40 bytes to 32 bytes. By extension this also reduces
nft_pktinfo (56 to 48 byte).

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:53 +02:00
Pablo Neira Ayuso e0241ae6ac netfilter: use nfnetlink_unicast()
Replace netlink_unicast() calls by nfnetlink_unicast() which already
deals with translating EAGAIN to ENOBUFS as the nfnetlink core expects.

nfnetlink_unicast() calls nlmsg_unicast() which returns zero in case of
success, otherwise the netlink core function netlink_rcv_skb() turns
err > 0 into an acknowlegment.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:53 +02:00
Yang Li 02d8514267 netfilter: xt_CT: Remove redundant assignment to ret
Variable 'ret' is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed

Clean up the following clang-analyzer warning:

net/netfilter/xt_CT.c:175:2: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:53 +02:00
Jason Baron 07df3fc90a netfilter: x_tables: improve limit_mt scalability
We've seen this spin_lock show up high in profiles. Let's introduce a
lockless version. I've tested this using pktgen_sample01_simple.sh.

Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:52 +02:00
Juerg Haefliger 06f0299302 netfilter: Remove leading spaces in Kconfig
Remove leading spaces before tabs in Kconfig file(s) by running the
following command:

  $ find net/netfilter -name 'Kconfig*' | xargs sed -r -i 's/^[ ]+\t/\t/'

Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:52 +02:00
Florian Westphal f227925e53 netfilter: nf_tables: prefer direct calls for set lookups
Extend nft_set_do_lookup() to use direct calls when retpoline feature
is enabled.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 01:04:27 +02:00
Matthieu Baerts 744ee14054 mptcp: restrict values of 'enabled' sysctl
To avoid confusions, it seems better to parse this sysctl parameter as a
boolean. We use it as a boolean, no need to parse an integer and bring
confusions if we see a value different from 0 and 1, especially with
this parameter name: enabled.

It seems fine to do this modification because the default value is 1
(enabled). Then the only other interesting value to set is 0 (disabled).
All other values would not have changed the default behaviour.

Suggested-by: Florian Westphal <fw@strlen.de>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:16 -07:00
Matthieu Baerts 804c72eeec mptcp: support SYSCTL only if enabled
Since the introduction of the sysctl support in MPTCP with
commit 784325e9f0 ("mptcp: new sysctl to control the activation per NS"),
we don't check CONFIG_SYSCTL.

Until now, that was not an issue: the register and unregister functions
were replaced by NO-OP one if SYSCTL was not enabled in the config. The
only thing we could have avoid is not to reserve memory for the table
but that's for the moment only a small table per net-ns.

But the following commit is going to use SYSCTL_ZERO and SYSCTL_ONE
which are not be defined if SYSCTL is not enabled in the config. This
causes 'undefined reference' errors from the linker.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:16 -07:00
Jianguo Wu eb5fb629f5 mptcp: make sure flag signal is set when add addr with port
When add address with port, it is mean to create a listening socket,
and send an ADD_ADDR to remote, so it must have flag signal set,
add this check in mptcp_pm_parse_addr().

Fixes: a77e9179c7 ("mptcp: deal with MPTCP_PM_ADDR_ATTR_PORT in PM netlink")
Acked-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:16 -07:00
Jianguo Wu ae514983f2 mptcp: remove redundant initialization in pm_nl_init_net()
Memory of struct pm_nl_pernet{} is allocated by kzalloc()
in setup_net()->ops_init(), so it's no need to reset counters
and zero bitmap in pm_nl_init_net().

Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:16 -07:00
Jianguo Wu 0a4d8e96e4 mptcp: generate subflow hmac after mptcp_finish_join()
For outgoing subflow join, when recv SYNACK, in subflow_finish_connect(),
the mptcp_finish_join() may return false in some cases, and send a RESET
to remote, and no local hmac is required.
So generate subflow hmac after mptcp_finish_join().

Fixes: ec3edaa7ca ("mptcp: Add handling of outgoing MP_JOIN requests")
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:16 -07:00
Jianguo Wu c68a0cd173 mptcp: using TOKEN_MAX_RETRIES instead of magic number
We have macro TOKEN_MAX_RETRIES for the number of token generate retries,
so using TOKEN_MAX_RETRIES in subflow_check_req().

And rename TOKEN_MAX_RETRIES to MPTCP_TOKEN_MAX_RETRIES as it is now
exposed.

Fixes: 535fb8152f ("mptcp: token: move retry to caller")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:15 -07:00
Jianguo Wu 2f1af441fd mptcp: fix pr_debug in mptcp_token_new_connect
After commit 2c5ebd001d ("mptcp: refactor token container"),
pr_debug() is called before mptcp_crypto_key_gen_sha() in
mptcp_token_new_connect(), so the output local_key, token and
idsn are 0, like:

  MPTCP: ssk=00000000f6b3c4a2, local_key=0, token=0, idsn=0

Move pr_debug() after mptcp_crypto_key_gen_sha().

Fixes: 2c5ebd001d ("mptcp: refactor token container")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-28 13:59:15 -07:00
Florian Westphal 0974cff3eb netfilter: add and use nft_set_do_lookup helper
Followup patch will add a CONFIG_RETPOLINE wrapper to avoid
the ops->lookup() indirection cost for retpoline builds.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28 21:11:41 +02:00
Stefano Brivio a58db7ad80 netfilter: nft_set_pipapo_avx2: Skip LDMXCSR, we don't need a valid MXCSR state
We don't need a valid MXCSR state for the lookup routines, none of
the instructions we use rely on or affect any bit in the MXCSR
register.

Instead of calling kernel_fpu_begin(), we can pass 0 as mask to
kernel_fpu_begin_mask() and spare one LDMXCSR instruction.

Commit 49200d17d2 ("x86/fpu/64: Don't FNINIT in kernel_fpu_begin()")
already speeds up lookups considerably, and by dropping the MCXSR
initialisation we can now get a much smaller, but measurable, increase
in matching rates.

The table below reports matching rates and a wild approximation of
clock cycles needed for a match in a "port,net" test with 10 entries
from selftests/netfilter/nft_concat_range.sh, limited to the first
field, i.e. the port (with nft_set_rbtree initialisation skipped), run
on a single AMD Epyc 7351 thread (2.9GHz, 512 KiB L1D$, 8 MiB L2$).

The (very rough) estimation of clock cycles is obtained by simply
dividing frequency by matching rate. The "cycles spared" column refers
to the difference in cycles compared to the previous row, and the rate
increase also refers to the previous row. Results are averages of six
runs.

Merely for context, I'm also reporting packet rates obtained by
skipping kernel_fpu_begin() and kernel_fpu_end() altogether (which
shows a very limited impact now), as well as skipping the whole lookup
function, compared to simply counting and dropping all packets using
the netdev hook drop (see nft_concat_range.sh for details). This
workload also includes packet generation with pktgen and the receive
path of veth.

                                      |matching|  est.  | cycles |  rate  |
                                      |  rate  | cycles | spared |increase|
                                      | (Mpps) |        |        |        |
--------------------------------------|--------|--------|--------|--------|
FNINIT, LDMXCSR (before 49200d17d2) |  5.245 |    553 |      - |      - |
LDMXCSR only (with 49200d17d2)      |  6.347 |    457 |     96 |  21.0% |
Without LDMXCSR (this patch)          |  6.461 |    449 |      8 |   1.8% |
-------- for reference only: ---------|--------|--------|--------|--------|
Without kernel_fpu_begin()            |  6.513 |    445 |      4 |   0.8% |
Without actual matching (return true) |  7.649 |    379 |     66 |  17.4% |
Without lookup operation (netdev drop)| 10.320 |    281 |     98 |  34.9% |

The clock cycles spared by avoiding LDMXCSR appear to be in line with CPI
and latency indicated in the manuals of comparable architectures: Intel
Skylake (CPI: 1, latency: 7) and AMD 12h (latency: 12) -- I couldn't find
this information for AMD 17h.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28 21:11:41 +02:00
Phil Sutter 133dc203d7 netfilter: nft_exthdr: Support SCTP chunks
Chunks are SCTP header extensions similar in implementation to IPv6
extension headers or TCP options. Reusing exthdr expression to find and
extract field values from them is therefore pretty straightforward.

For now, this supports extracting data from chunks at a fixed offset
(and length) only - chunks themselves are an extensible data structure;
in order to make all fields available, a nested extension search is
needed.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-28 21:11:33 +02:00
Jakub Kicinski b14b27fffa linux-can-next-for-5.14-20210527
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmCvTYUTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqTxPB/4xVeasYKcyinylU7adBp9HFIvKjOiK
 TpxQ7h4EhjJxkmQyONP529ZeQ5sjbnbc9IGkDQNhhVVm764LnEJ01aIi+kMtRs+M
 szAGbWcITSyv4iaYCcKtNDSi2m74TK4gtRhsKItkIBRAZCs5jb54DSjWae7cGH0A
 M/ts6WbYTbp89Lmww3mYtQ4dpmqvk/gXNbzKicrs2uGbPg0YTyq8rAQztt4yFaQR
 9cBzxnwcfgvTz/uihkItiClv7kZIYjwzFB8BO1S2qx0TaUE1n78uiuuzcIdfvPt8
 TKap7pwnjLYYToHokcWaU2t8Nd1Hy3KCaHuYO54TdXM4KYODJmbPydER
 =pVNd
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-5.14-20210527' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
can-next 2021-05-27

The first 2 patches are by Geert Uytterhoeven and convert the rcan_can
and rcan_canfd device tree bindings to yaml.

The next 2 patches are by Oliver Hartkopp and me and update the CAN
uapi headers.

zuoqilin's patch removes an unnecessary variable from the CAN proc
code.

Patrick Menschel contributes 3 patches for CAN ISOTP to enhance the
error messages.

Jiapeng Chong's patch removes two dead stores from the softing driver.

The next 4 patches are by me and silence several warnings found by
clang compiler.

Jimmy Assarsson's patches for the kvaser_usb driver add support for
the Kvaser hydra devices.

Dario Binacchi provides 2 patches for the c_can driver, first removing
an unused variable, then adding basic ethtool support to query driver
and ring parameter info.

The last 4 patches are by Torin Cooper-Bennun and clean up the m_can
driver.

* tag 'linux-can-next-for-5.14-20210527' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (21 commits)
  can: m_can: fix whitespace in a few comments
  can: m_can: make TXESC, RXESC config more explicit
  can: m_can: clean up CCCR reg defs, order by revs
  can: m_can: use bits.h macros for all regmasks
  can: c_can: add ethtool support
  can: c_can: remove unused variable struct c_can_priv::rxmasked
  can: kvaser_usb: Add new Kvaser hydra devices
  can: kvaser_usb: Rename define USB_HYBRID_{,PRO_}CANLIN_PRODUCT_ID
  can: at91_can: silence clang warning
  can: mcp251xfd: silence clang warning
  can: mcp251x: mcp251x_can_probe(): silence clang warning
  can: hi311x: hi3110_can_probe(): silence clang warning
  can: softing: Remove redundant variable ptr
  can: isotp: Add error message if txqueuelen is too small
  can: isotp: add symbolic error message to isotp_module_init()
  can: isotp: change error format from decimal to symbolic error names
  can: proc: remove unnecessary variables
  can: uapi: introduce CANFD_FDF flag for mixed content in struct canfd_frame
  can: uapi: update CAN-FD frame description
  dt-bindings: can: rcar_canfd: Convert to json-schema
  ...
====================

Link: https://lore.kernel.org/r/20210527084532.1384031-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 14:39:11 -07:00
Jiri Pirko f285f37cb1 devlink: append split port number to the port name
Instead of doing sprintf twice in case the port is split or not, append
the split port suffix in case the port is split.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20210527104819.789840-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 14:21:12 -07:00
Jakub Kicinski 5ada57a9a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
cdc-wdm: s/kill_urbs/poison_urbs/ to fix build

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-05-27 09:55:10 -07:00
Patrick Menschel c69d190f7b can: isotp: Add error message if txqueuelen is too small
This patch adds an additional error message in case that txqueuelen is
set too small and advices the user to increase txqueuelen.

This is likely to happen even with small transfers if txqueuelen is at
default value 10 frames.

Link: https://lore.kernel.org/r/20210427052150.2308-4-menschel.p@posteo.de
Signed-off-by: Patrick Menschel <menschel.p@posteo.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27 09:42:21 +02:00
Patrick Menschel 6a5ddae578 can: isotp: add symbolic error message to isotp_module_init()
This patch adds the value of err with format %pe to the already
existing error message.

Link: https://lore.kernel.org/r/20210427052150.2308-3-menschel.p@posteo.de
Signed-off-by: Patrick Menschel <menschel.p@posteo.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27 09:42:21 +02:00
Patrick Menschel 46d8657a6b can: isotp: change error format from decimal to symbolic error names
This patch changes the format string for errors from decimal %d to
symbolic error names %pe to achieve more comprehensive log messages.

Link: https://lore.kernel.org/r/20210427052150.2308-2-menschel.p@posteo.de
Signed-off-by: Patrick Menschel <menschel.p@posteo.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27 09:42:21 +02:00
zuoqilin 24a774a4f9 can: proc: remove unnecessary variables
There is no need to define the variable "rate" to receive, just return
directly.

Link: https://lore.kernel.org/r/20210514100806.792-1-zuoqilin1@163.com
Signed-off-by: zuoqilin <zuoqilin@yulong.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-05-27 09:42:21 +02:00
Linus Torvalds d7c5303fbc Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
 announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
 touching some core parts of the stack, too, but nothing hair-raising.
 
 Current release - regressions:
 
  - tipc: make node link identity publish thread safe
 
  - dsa: felix: re-enable TAS guard band mode
 
  - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
 
  - stmmac: fix system hang if change mac address after interface ifdown
 
 Current release - new code bugs:
 
  - mptcp: avoid OOB access in setsockopt()
 
  - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
 
  - ethtool: stats: fix a copy-paste error - init correct array size
 
 Previous releases - regressions:
 
  - sched: fix packet stuck problem for lockless qdisc
 
  - net: really orphan skbs tied to closing sk
 
  - mlx4: fix EEPROM dump support
 
  - bpf: fix alu32 const subreg bound tracking on bitwise operations
 
  - bpf: fix mask direction swap upon off reg sign change
 
  - bpf, offload: reorder offload callback 'prepare' in verifier
 
  - stmmac: Fix MAC WoL not working if PHY does not support WoL
 
  - packetmmap: fix only tx timestamp on request
 
  - tipc: skb_linearize the head skb when reassembling msgs
 
 Previous releases - always broken:
 
  - mac80211: address recent "FragAttacks" vulnerabilities
 
  - mac80211: do not accept/forward invalid EAPOL frames
 
  - mptcp: avoid potential error message floods
 
  - bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent
                  out of buffer writes
 
  - bpf: forbid trampoline attach for functions with variable arguments
 
  - bpf: add deny list of functions to prevent inf recursion of tracing
         programs
 
  - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
 
  - can: isotp: prevent race between isotp_bind() and isotp_setsockopt()
 
  - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
               fallback to non-AVX2 version
 
 Misc:
 
  - bpf: add kconfig knob for disabling unpriv bpf by default
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCuy2gACgkQMUZtbf5S
 IruE5BAAhihia5EaiV71Bz/Cqr/d+osv5u283riKT8kBft0bWFVFFnT3iweWyR0/
 5X+bB6zmr80Cuqh45ZeYyq+zJtiAAlsbD5hqBIGdMriSWLxciNKjVJRzuEjuqnek
 USMW/LqGyf4NhmLogmQKpx8XcKSG7VYuK7vPrsH8us1dL5vIssceIXn8R9Dzj9NN
 P77K5Z+Oka8XQJgetNLxR3tDAM/92RwIshotkhJbRwgiUvzb+wbnrnSOAZCIPgku
 ydJyOxOklln1Sx07SejgzEl33ri0CkioDPThBWpOn7Mu0JrYKukXPKludoZcRYuJ
 2jNLYfbH0ZS5EkOfk89h7j7MDoAJMUK72M+S1w5DEYz6eH2EjhAq9noZ6E1iQH+U
 9vfoIvQjPh6Zhyk5QeM4dpt0cvR7rSElXkLVxo/x0dSBAi2rIng1bKeCUtv2J689
 CsoD0oghtEzvUTYVxY6iNr15OFGl6KsZv4tVQ709gGA36sDlK8ozGbJH5WReobBl
 f8H2WJlj2tVW5V75yUoio8TumDw34yk/5xlJFzm9GOwkqBrUcqOraHtHdUIsa4qr
 KbELQQ9QVt4zYdLAiWy5BL/QLycp0ibmA1IB8W1bxEVSK1JXzREHzPxv85KOfZkn
 8+vzNHmk2PEZYYsExiEykc5jXKOCPs8L0rJ6p4OverlbpDZcwIg=
 =peMK
 -----END PGP SIGNATURE-----

Merge tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
  can and wireless trees. Notably including fixes for the recently
  announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
  touching some core parts of the stack, too, but nothing hair-raising.

  Current release - regressions:

   - tipc: make node link identity publish thread safe

   - dsa: felix: re-enable TAS guard band mode

   - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()

   - stmmac: fix system hang if change mac address after interface
     ifdown

  Current release - new code bugs:

   - mptcp: avoid OOB access in setsockopt()

   - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers

   - ethtool: stats: fix a copy-paste error - init correct array size

  Previous releases - regressions:

   - sched: fix packet stuck problem for lockless qdisc

   - net: really orphan skbs tied to closing sk

   - mlx4: fix EEPROM dump support

   - bpf: fix alu32 const subreg bound tracking on bitwise operations

   - bpf: fix mask direction swap upon off reg sign change

   - bpf, offload: reorder offload callback 'prepare' in verifier

   - stmmac: Fix MAC WoL not working if PHY does not support WoL

   - packetmmap: fix only tx timestamp on request

   - tipc: skb_linearize the head skb when reassembling msgs

  Previous releases - always broken:

   - mac80211: address recent "FragAttacks" vulnerabilities

   - mac80211: do not accept/forward invalid EAPOL frames

   - mptcp: avoid potential error message floods

   - bpf, ringbuf: deny reserve of buffers larger than ringbuf to
     prevent out of buffer writes

   - bpf: forbid trampoline attach for functions with variable arguments

   - bpf: add deny list of functions to prevent inf recursion of tracing
     programs

   - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT

   - can: isotp: prevent race between isotp_bind() and
     isotp_setsockopt()

   - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
     fallback to non-AVX2 version

  Misc:

   - bpf: add kconfig knob for disabling unpriv bpf by default"

* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
  net: phy: Document phydev::dev_flags bits allocation
  mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
  mptcp: avoid error message on infinite mapping
  mptcp: drop unconditional pr_warn on bad opt
  mptcp: avoid OOB access in setsockopt()
  nfp: update maintainer and mailing list addresses
  net: mvpp2: add buffer header handling in RX
  bnx2x: Fix missing error code in bnx2x_iov_init_one()
  net: zero-initialize tc skb extension on allocation
  net: hns: Fix kernel-doc
  sctp: fix the proc_handler for sysctl encap_port
  sctp: add the missing setting for asoc encap_port
  bpf, selftests: Adjust few selftest result_unpriv outcomes
  bpf: No need to simulate speculative domain for immediates
  bpf: Fix mask direction swap upon off reg sign change
  bpf: Wrap aux data inside bpf_sanitize_info container
  bpf: Fix BPF_LSM kconfig symbol dependency
  selftests/bpf: Add test for l3 use of bpf_redirect_peer
  bpftool: Add sock_release help info for cgroup attach/prog load command
  net: dsa: microchip: enable phy errata workaround on 9567
  ...
2021-05-26 17:44:49 -10:00
David S. Miller f5d287126f Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-05-26

The following pull-request contains BPF updates for your *net* tree.

We've added 14 non-merge commits during the last 14 day(s) which contain
a total of 17 files changed, 513 insertions(+), 231 deletions(-).

The main changes are:

1) Fix bpf_skb_change_head() helper to reset mac_len, from Jussi Maki.

2) Fix masking direction swap upon off-reg sign change, from Daniel Borkmann.

3) Fix BPF offloads in verifier by reordering driver callback, from Yinjun Zhang.

4) BPF selftest for ringbuf mmap ro/rw restrictions, from Andrii Nakryiko.

5) Follow-up fixes to nested bprintf per-cpu buffers, from Florent Revest.

6) Fix bpftool sock_release attach point help info, from Liu Jian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:59:24 -07:00
Davide Caratti d58300c318 mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
when Linux receives an echo-ed ADD_ADDR, it checks the IP address against
the list of "announced" addresses. In case of a positive match, the timer
that handles retransmissions is stopped regardless of the 'Address Id' in
the received packet: this behaviour does not comply with RFC8684 3.4.1.

Fix it by validating the 'Address Id' in received echo-ed ADD_ADDRs.
Tested using packetdrill, with the following captured output:

 unpatched kernel:

 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 3013740213], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0xfd2e62517888fe29,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 3013740213], length 0
        ^^^ retransmission is stopped here, but 'Address Id' is 90

 patched kernel:

 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 1.2.3.4,mptcp dss ack 1672384568], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 90 198.51.100.2,mptcp dss ack 1672384568], length 0
 Out <...> Flags [.], ack 1, win 256, options [mptcp add-addr v1 id 1 198.51.100.2 hmac 0x1cf372d59e05f4b8,mptcp dss ack 3007449509], length 0
 In  <...> Flags [.], ack 1, win 257, options [mptcp add-addr v1-echo id 1 198.51.100.2,mptcp dss ack 1672384568], length 0
        ^^^ retransmission is stopped here, only when both 'Address Id' and 'IP Address' match

Fixes: 00cfd77b90 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-25 15:56:20 -07:00