Commit Graph

1090150 Commits

Author SHA1 Message Date
Artem Savkov 920fd5e177 selftests/bpf: Fix attach tests retcode checks
Switching to libbpf 1.0 API broke test_sock and test_sysctl as they
check for return of bpf_prog_attach to be exactly -1. Switch the check
to '< 0' instead.

Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421130104.1582053-1-asavkov@redhat.com
2022-04-21 16:34:55 +02:00
Maciej Fijalkowski 9d87e41a6d i40e, xsk: Get rid of redundant 'fallthrough'
Intel drivers translate actions returned from XDP programs to their own
return codes that have the following mapping:

XDP_REDIRECT -> I40E_XDP_{REDIR,CONSUMED}
XDP_TX -> I40E_XDP_{TX,CONSUMED}
XDP_DROP -> I40E_XDP_CONSUMED
XDP_ABORTED -> I40E_XDP_CONSUMED
XDP_PASS -> I40E_XDP_PASS

Commit b8aef650e5 ("i40e, xsk: Terminate Rx side of NAPI when XSK Rx
queue gets full") introduced new translation

XDP_REDIRECT -> I40E_XDP_EXIT

which is set when XSK RQ gets full and to indicate that driver should
stop further Rx processing. This happens for unsuccessful
xdp_do_redirect() so it is valuable to call trace_xdp_exception() for
this case. In order to avoid I40E_XDP_EXIT -> IXGBE_XDP_CONSUMED
overwrite, XDP_DROP case was moved above which in turn made the
'fallthrough' that is in XDP_ABORTED useless as it became the last label
in the switch statement.

Simply drop this leftover.

Fixes: b8aef650e5 ("i40e, xsk: Terminate Rx side of NAPI when XSK Rx queue gets full")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220421132126.471515-3-maciej.fijalkowski@intel.com
2022-04-21 16:31:10 +02:00
Maciej Fijalkowski e130e8d543 ixgbe, xsk: Get rid of redundant 'fallthrough'
Intel drivers translate actions returned from XDP programs to their own
return codes that have the following mapping:

XDP_REDIRECT -> IXGBE_XDP_{REDIR,CONSUMED}
XDP_TX -> IXGBE_XDP_{TX,CONSUMED}
XDP_DROP -> IXGBE_XDP_CONSUMED
XDP_ABORTED -> IXGBE_XDP_CONSUMED
XDP_PASS -> IXGBE_XDP_PASS

Commit c7dd09fd46 ("ixgbe, xsk: Terminate Rx side of NAPI when XSK Rx
queue gets full") introduced new translation

XDP_REDIRECT -> IXGBE_XDP_EXIT

which is set when XSK RQ gets full and to indicate that driver should
stop further Rx processing. This happens for unsuccessful
xdp_do_redirect() so it is valuable to call trace_xdp_exception() for
this case. In order to avoid IXGBE_XDP_EXIT -> IXGBE_XDP_CONSUMED
overwrite, XDP_DROP case was moved above which in turn made the
'fallthrough' that is in XDP_ABORTED useless as it became the last label
in the switch statement.

Simply drop this leftover.

Fixes: c7dd09fd46 ("ixgbe, xsk: Terminate Rx side of NAPI when XSK Rx queue gets full")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220421132126.471515-2-maciej.fijalkowski@intel.com
2022-04-21 16:31:10 +02:00
Kumar Kartikeya Dwivedi e9147b4422 bpf: Move check_ptr_off_reg before check_map_access
Some functions in next patch want to use this function, and those
functions will be called by check_map_access, hence move it before
check_map_access.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Link: https://lore.kernel.org/bpf/20220415160354.1050687-3-memxor@gmail.com
2022-04-21 16:31:10 +02:00
Kumar Kartikeya Dwivedi 42ba130807 bpf: Make btf_find_field more generic
Next commit introduces field type 'kptr' whose kind will not be struct,
but pointer, and it will not be limited to one offset, but multiple
ones. Make existing btf_find_struct_field and btf_find_datasec_var
functions amenable to use for finding kptrs in map value, by moving
spin_lock and timer specific checks into their own function.

The alignment, and name are checked before the function is called, so it
is the last point where we can skip field or return an error before the
next loop iteration happens. Size of the field and type is meant to be
checked inside the function.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220415160354.1050687-2-memxor@gmail.com
2022-04-21 16:31:10 +02:00
Grant Seltzer a66ab9a9e6 libbpf: Add documentation to API functions
This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-21 16:31:07 +02:00
Grant Seltzer df28671632 libbpf: Update API functions usage to check error
This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-21 16:28:25 +02:00
Grant Seltzer 93442f132b libbpf: Add error returns to two API functions
This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-21 16:28:11 +02:00
Haowen Bai 9c8774e629 net: eql: Use kzalloc instead of kmalloc/memset
Use kzalloc rather than duplicating its implementation, which
makes code simple and easy to understand.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Link: https://lore.kernel.org/r/1650277333-31090-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-21 14:40:21 +02:00
Minghao Chi 4facbe3d44 drivers: net: davinci_mdio: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get is more appropriate
for simplifing code

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/r/20220418062921.2557884-1-chi.minghao@zte.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-21 14:28:38 +02:00
Duoming Zhou bc6de28784 drivers: net: hippi: Fix deadlock in rr_close()
There is a deadlock in rr_close(), which is shown below:

   (Thread 1)                |      (Thread 2)
                             | rr_open()
rr_close()                   |  add_timer()
 spin_lock_irqsave() //(1)   |  (wait a time)
 ...                         | rr_timer()
 del_timer_sync()            |  spin_lock_irqsave() //(2)
 (wait timer to stop)        |  ...

We hold rrpriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need rrpriv->lock in position (2) of thread 2.
As a result, rr_close() will block forever.

This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220417125519.82618-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-21 10:30:45 +02:00
Zhengchao Shao db69264f98 samples/bpf: Reduce the sampling interval in xdp1_user
If interval is 2, and sum - prev[key] = 1, the result = 0. This will
mislead the tester that the port has no traffic right now. So reduce the
sampling interval to 1.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419114746.291613-1-shaozhengchao@huawei.com
2022-04-20 14:58:29 -07:00
Liu Jian 127e7dca42 selftests/bpf: Add test for skb_load_bytes
Use bpf_prog_test_run_opts to test the skb_load_bytes function. Tests
the behavior when offset is greater than INT_MAX or a normal value.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220416105801.88708-4-liujian56@huawei.com
2022-04-20 23:48:34 +02:00
Liu Jian 92ece28072 net: Change skb_ensure_writable()'s write_len param to unsigned int type
Both pskb_may_pull() and skb_clone_writable()'s length parameters are of
type unsigned int already. Therefore, change this function's write_len
param to unsigned int type.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220416105801.88708-3-liujian56@huawei.com
2022-04-20 23:48:22 +02:00
Liu Jian 45969b4152 bpf: Enlarge offset check value to INT_MAX in bpf_skb_{load,store}_bytes
The data length of skb frags + frag_list may be greater than 0xffff, and
skb_header_pointer can not handle negative offset. So, here INT_MAX is used
to check the validity of offset. Add the same change to the related function
skb_store_bytes.

Fixes: 05c74e5e53 ("bpf: add bpf_skb_load_bytes helper")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220416105801.88708-2-liujian56@huawei.com
2022-04-20 23:47:40 +02:00
Linus Torvalds b253435746 Xtensa fixes for v5.18:
- fix patching CPU selection in patch_text
 - fix potential deadlock in ISS platform serial driver
 - fix potential register clobbering in coprocessor exception handler
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAmJbgwQTHGpjbXZia2Jj
 QGdtYWlsLmNvbQAKCRBR+cyR+D+gRPYwD/9/aVrsMKPkJsxUQDI/A2fX4ZTcM5fj
 cnBsKvKGc+CpKwtWDK5YGAlLLGskBicRSUXqqiNNYapUUx6R9nwyRzEz2srKUa+P
 V/4gUnKCCVUyBiTMf1gYa9srO+KdrgY2460uDZWCvPsA1PTLotrQQQcZR4lkIKeT
 RvHjflEAY5f1l0A96WoH4F3xSNqrDdDInB4QpqwmZ2oWP4qjO9AA1RY25HFsvWVz
 xGErCrPKlJeHR6N5q1N0IzJYJdJeT2lKIMi5kwepLurNnn60+8KL9eQwFFd546fz
 EHuzcThg8c1q/jOhaOabkp0HWtWxv5dZiWzKmYnGXnrtjZwDQFk4PFdrHqYyY6OU
 LKsyV/bQDsDmzxT+oU1ylWC/buUsY3ulI30yfESWGXZzPw3iQDimO9O0jEJJ//W9
 FNl6hU+KvVcr5bedYV6UIMrQeAXhGpUzpJL9qebuJmR/66sjiwZJGIYVB5RViF9o
 eVutlfkDByQsKY4ATJZyaj+8/p6Px2KNaOdMI4rzphrzpsQGAgUg7bNEgDt+dWMr
 vFPBcj+Ad8u1CQbBn6hmdovLutFdT1mmO/ePWepslcavhRQf4AnN7KMY7A8LIoiF
 dgegaICNrNA3XRAHp1UkBUQoGgqn++xvVYRZ8biHgjnBF8sVfythooc3tjHT+C6V
 RSyz+S+8Yiuh0w==
 =zDe5
 -----END PGP SIGNATURE-----

Merge tag 'xtensa-20220416' of https://github.com/jcmvbkbc/linux-xtensa

Pull xtensa fixes from Max Filippov:

 - fix patching CPU selection in patch_text

 - fix potential deadlock in ISS platform serial driver

 - fix potential register clobbering in coprocessor exception handler

* tag 'xtensa-20220416' of https://github.com/jcmvbkbc/linux-xtensa:
  xtensa: fix a7 clobbering in coprocessor context load/store
  arch: xtensa: platforms: Fix deadlock in rs_close()
  xtensa: patch_text: Fixup last cpu should be master
2022-04-20 12:43:27 -07:00
Linus Torvalds 10c5f102e2 Changes since last update:
- Fix use-after-free of the on-stack z_erofs_decompressqueue;
 
  - Fix sysfs documentation Sphinx warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYl2A0REceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBEe7AQCh7aNhYEncBnoHvrB276HCP0xmMwmc0gPq
 6UeSWgarpAEArgfgLyo8tFgS+gvXbhB8/P1FqEMwaRZVLWathXID3QU=
 =GFAS
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-5.18-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "One patch to fix a use-after-free race related to the on-stack
  z_erofs_decompressqueue, which happens very rarely but needs to be
  fixed properly soon.

  The other patch fixes some sysfs Sphinx warnings"

* tag 'erofs-for-5.18-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  Documentation/ABI: sysfs-fs-erofs: Fix Sphinx errors
  erofs: fix use-after-free of on-stack io[]
2022-04-20 12:35:20 -07:00
Linus Torvalds 906f904097 Revert "fs/pipe: use kvcalloc to allocate a pipe_buffer array"
This reverts commit 5a519c8fe4.

It turns out that making the pipe almost arbitrarily large has some
rather unexpected downsides.  The kernel test robot reports a kernel
warning that is due to pipe->max_usage now growing to the point where
the iter_file_splice_write() buffer allocation can no longer be
satisfied as a slab allocation, and the

        int nbufs = pipe->max_usage;
        struct bio_vec *array = kcalloc(nbufs, sizeof(struct bio_vec),
                                        GFP_KERNEL);

code sequence there will now always fail as a result.

That code could be modified to use kvcalloc() too, but I feel very
uncomfortable making those kinds of changes for a very niche use case
that really should have other options than make these kinds of
fundamental changes to pipe behavior.

Maybe the CRIU process dumping should be multi-threaded, and use
multiple pipes and multiple cores, rather than try to use one larger
pipe to minimize splice() calls.

Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/all/20220420073717.GD16310@xsang-OptiPlex-9020/
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-20 12:07:53 -07:00
Mikulas Patocka a6823e4e36 x86: __memcpy_flushcache: fix wrong alignment if size > 2^32
The first "if" condition in __memcpy_flushcache is supposed to align the
"dest" variable to 8 bytes and copy data up to this alignment.  However,
this condition may misbehave if "size" is greater than 4GiB.

The statement min_t(unsigned, size, ALIGN(dest, 8) - dest); casts both
arguments to unsigned int and selects the smaller one.  However, the
cast truncates high bits in "size" and it results in misbehavior.

For example:

	suppose that size == 0x100000001, dest == 0x200000002
	min_t(unsigned, size, ALIGN(dest, 8) - dest) == min_t(0x1, 0xe) == 0x1;
	...
	dest += 0x1;

so we copy just one byte "and" dest remains unaligned.

This patch fixes the bug by replacing unsigned with size_t.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-20 11:38:49 -07:00
Ido Schimmel 5e6242151d selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.

Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.

In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.

Fixes: d01724dd2a ("selftests: mlxsw: spectrum-2: Add a test for VxLAN flooding with IPv6")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:04:27 +01:00
Ido Schimmel 044011fdf1 selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.

Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.

In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.

Fixes: 94d302deae ("selftests: mlxsw: Add a test for VxLAN flooding")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:04:27 +01:00
David S. Miller 365014f5c3 Merge branch 'mlxsw-line-card-status-tracking'
Ido Schimmel says:

====================
mlxsw: Line cards status tracking

When a line card is provisioned, netdevs corresponding to the ports
found on the line card are registered. User space can then perform
various logical configurations (e.g., splitting, setting MTU) on these
netdevs.

However, since the line card is not present / powered on (i.e., it is
not in 'active' state), user space cannot access the various components
found on the line card. For example, user space cannot read the
temperature of gearboxes or transceiver modules found on the line card
via hwmon / thermal. Similarly, it cannot dump the EEPROM contents of
these transceiver modules. The above is only possible when the line card
becomes active.

This patchset solves the problem by tracking the status of each line
card and invoking callbacks from interested parties when a line card
becomes active / inactive.

Patchset overview:

Patch  adds the infrastructure in the line cards core that allows
users to registers a set of callbacks that are invoked when a line card
becomes active / inactive. To avoid races, if a line card is already
active during registration, the got_active() callback is invoked.

Patches #2-#3 are preparations.

Patch  changes the port module core to register a set of callbacks
with the line cards core. See detailed description with examples in the
commit message.

Patches #5-#6 do the same with regards to thermal / hwmon support, so
that user space will be able to monitor the temperature of various
components on the line card when it becomes active.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:22 +01:00
Vadim Pasternak 99a03b3193 mlxsw: core_hwmon: Add interfaces for line card initialization and de-initialization
Add callback functions for line card 'hwmon' initialization and
de-initialization. Each line card is associated with the relevant
'hwmon' device, which may contain thermal attributes for the cages
and gearboxes found on this line card.

The line card 'hwmon' initialization / de-initialization APIs are to be
called when line card is set to active / inactive state by
got_active() / got_inactive() callbacks from line card state machine.

For example cage temperature for module  located at line card  will
be exposed by utility 'sensors' like:
linecard#07
front panel 009:	+32.0C  (crit = +70.0C, emerg = +80.0C)
And temperature for gearbox  located at line card  will be exposed
like:
linecard#05
gearbox 003:		+41.0C  (highest = +41.0C)

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Vadim Pasternak f11a323da4 mlxsw: core_thermal: Add interfaces for line card initialization and de-initialization
Add callback functions for line card thermal area initialization and
de-initialization. Each line card is associated with the relevant
thermal area, which may contain thermal zones for cages and gearboxes
found on this line card.

The line card thermal initialization / de-initialization APIs are to be
called when line card is set to active / inactive state by
got_active() / got_inactive() callbacks from line card state machine.

For example thermal zone for module  located at line card  will
have type:
mlxsw-lc7-module9.
And thermal zone for gearbox  located at line card  will have type:
mlxsw-lc5-gearbox2.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Vadim Pasternak 06a0fc43bb mlxsw: core_env: Add interfaces for line card initialization and de-initialization
Netdevs for ports found on line cards are registered upon provisioning.
However, user space is not allowed to access the transceiver modules
found on a line card until the line card becomes active.

Therefore, register event operations with the line card core to get
notifications whenever a line card becomes active or inactive.

When user space tries to dump the EEPROM of a transceiver module or reset
it and the corresponding line card is inactive, emit an error
message:
ethtool -m enp1s0nl7p9
netlink error: mlxsw_core: Cannot read EEPROM of module on an inactive line card
netlink error: Input/output error

When user space tries to set the power mode policy of such a transceiver,
cache the configuration and apply it when the line card becomes active. This
is consistent with other port configuration (e.g., MTU setting) that user space
is able to perform while the line card is provisioned, but inactive.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Vadim Pasternak a11e1ec141 mlxsw: core_env: Split module power mode setting to a separate function
Move the code that applies the module power mode to the device to a
separate function. This function will be invoked by the next patch to
set the power mode on transceiver modules found on a line card when the
line card becomes active.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Vadim Pasternak 7b261af9f6 mlxsw: core: Add bus argument to environment init API
Pass bus argument to mlxsw_env_init(). The purpose is to get access to
device handle, which is to be provided to error message in case of line
card activation failure.

Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Jiri Pirko de28976d26 mlxsw: core_linecards: Introduce ops for linecards status change tracking
Introduce an infrastructure allowing users to register a set
of operations which are to be called whenever a line card gets
active/inactive.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:03:21 +01:00
Krzysztof Kozlowski c5d0fc54be nfc: MAINTAINERS: add Bug entry
Add a Bug section, indicating preferred mailing method for bug reports,
to NFC Subsystem entry.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 13:24:00 +01:00
David S. Miller 85ef87ba9b linux-can-next-for-5.19-20220419
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmJe0WETHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCtfkuQ2KDTXbLsCACm5O5idvwV7HLXyi0GN8JZxrZoShQ4
 ovsYTTG0moz2kcbj4x3Fi85Um3U/fSOhZRLo2mF757s+vb6un5bpS4eB1+NHgtM0
 L3aScyyGk8iBkf0VObvB/iGjVR3/zUH6fN+ov/C5I2G8mSxUaHbPRzxQTcMOB7XG
 HowAwE/d/gCvHxF7l/AmcZe21PaEC2OlOa9Xn9F4x5Lzn2E/TciyLaGQKmfnYl55
 0GZ4Mb4t4t0qtVFyMe2TGlFAhKDnrm+qdDsb7diDpEj6I2RYWHL2rckS9RHFrKdH
 J/q8uEsMIvQqT1vM9ts8EuIcz31shS5E6fb2nFXuvBRjpPkGxosZDcTe
 =4Ot+
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-5.19-20220419' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2022-04-19

this is a pull request of 17 patches for net-next/master.

The first 2 patches are by me and target the CAN driver
infrastructure. One patch renames a function in the rx_offload helper
the other one updates the CAN bitrate calculation to prefer small bit
rate pre-scalers over larger ones, which is encouraged by the CAN in
Automation.

Kris Bahnsen contributes a patch to fix the links to Technologic
Systems web resources in the sja1000 driver.

Christophe Leroy's patch prepares the mpc5xxx_can driver for upcoming
powerpc header cleanup.

Minghao Chi's patch converts the flexcan driver to use
pm_runtime_resume_and_get().

The next 2 patches target the Xilinx CAN driver. Lukas Bulwahn's patch
fixes an entry in the MAINTAINERS file. A patch by me marks the bit
timing constants as const.

Wolfram Sang's patch documents r8a77961 support on the
renesas,rcar-canfd bindings document.

The next 2 patches are by me and add support for the mcp251863 chip to
the mcp251xfd driver.

The last 7 patches are by Pavel Pisa, Martin Jerabek et al. and add
the ctucanfd driver for the CTU CAN FD IP Core.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:14:36 +01:00
Kevin Hao 234901de2b net: stmmac: Use readl_poll_timeout_atomic() in atomic state
The init_systime() may be invoked in atomic state. We have observed the
following call trace when running "phc_ctl /dev/ptp0 set" on a Intel
Agilex board.
  BUG: sleeping function called from invalid context at drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:74
  in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 381, name: phc_ctl
  preempt_count: 1, expected: 0
  RCU nest depth: 0, expected: 0
  Preemption disabled at:
  [<ffff80000892ef78>] stmmac_set_time+0x34/0x8c
  CPU: 2 PID: 381 Comm: phc_ctl Not tainted 5.18.0-rc2-next-20220414-yocto-standard+ 
  Hardware name: SoCFPGA Agilex SoCDK (DT)
  Call trace:
   dump_backtrace.part.0+0xc4/0xd0
   show_stack+0x24/0x40
   dump_stack_lvl+0x7c/0xa0
   dump_stack+0x18/0x34
   __might_resched+0x154/0x1c0
   __might_sleep+0x58/0x90
   init_systime+0x78/0x120
   stmmac_set_time+0x64/0x8c
   ptp_clock_settime+0x60/0x9c
   pc_clock_settime+0x6c/0xc0
   __arm64_sys_clock_settime+0x88/0xf0
   invoke_syscall+0x5c/0x130
   el0_svc_common.constprop.0+0x4c/0x100
   do_el0_svc+0x7c/0xa0
   el0_svc+0x58/0xcc
   el0t_64_sync_handler+0xa4/0x130
   el0t_64_sync+0x18c/0x190

So we should use readl_poll_timeout_atomic() here instead of
readl_poll_timeout().

Also adjust the delay time to 10us to fix a "__bad_udelay" build error
reported by "kernel test robot <lkp@intel.com>". I have tested this on
Intel Agilex and NXP S32G boards, there is no delay needed at all.
So the 10us delay should be long enough for most cases.

Fixes: ff8ed73786 ("net: stmmac: use readl_poll_timeout() function in init_systime()")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:10:27 +01:00
David S. Miller c1f6f1e673 Merge branch 'net-sched-flower-num-vlan-tags'
Boris Sukholitko says:

====================
net/sched: flower: match on the number of vlan tags

Our customers in the fiber telecom world have network configurations
where they would like to control their traffic according to the number
of tags appearing in the packet.

For example, TR247 GPON conformance test suite specification mostly
talks about untagged, single, double tagged packets and gives lax
guidelines on the vlan protocol vs. number of vlan tags.

This is different from the common IT networks where 802.1Q and 802.1ad
protocols are usually describe single and double tagged packet. GPON
configurations that we work with have arbitrary mix the above protocols
and number of vlan tags in the packet.

The following patch series implement number of vlans flower filter. They
add num_of_vlans flower filter as an alternative to vlan ethtype protocol
matching. The end result is that the following command becomes possible:

tc filter add dev eth1 ingress flower \
  num_of_vlans 1 vlan_prio 5 action drop

Also, from our logs, we have redirect rules such that:

tc filter add dev $GPON ingress flower num_of_vlans $N \
     action mirred egress redirect dev $DEV

where N can range from 0 to 3 and $DEV is the function of $N.

Also there are rules setting skb mark based on the number of vlans:

tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \
    $P action skbedit mark $M

More about the patch series:
  - patches 1-2 remove duplicate code by introducing is_key_vlan
    helper.
  - patch 3, 4 implement num_of_vlans in the dissector and in the
    flower.
  - patch 5 uses the num_of_vlans filter to allow further matching on
    vlan attributes.

Complementary iproute2 patches are being sent separately.

Thanks,
Boris.

- v4: rebased to the latest net-next
- v3:
    - more example commands in patch 3 description (request by Jamal)
    - patch 5 description made clearer (thanks to Jiri)
- v2:
    - add suitable subject prefixes
    - more evolved patch 5 description
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Boris Sukholitko 99fdb22bc5 net/sched: flower: Consider the number of tags for vlan filters
Before this patch the existence of vlan filters was conditional on the vlan
protocol being matched in the tc rule. For example, the following rule:

tc filter add dev eth1 ingress flower vlan_prio 5

was illegal because vlan protocol (e.g. 802.1q) does not appear in the rule.

Remove the above restriction by looking at the num_of_vlans filter to
allow further matching on vlan attributes. The following rule becomes
legal as a result of this commit:

tc filter add dev eth1 ingress flower num_of_vlans 1 vlan_prio 5

because having num_of_vlans==1 implies that the packet is single tagged.

Change is_vlan_key helper to look at the number of vlans in addition to
the vlan ethertype. The outcome of this change is that outer (e.g. vlan_prio)
and inner (e.g. cvlan_prio) tag vlan filters require the number of vlan
tags to be greater then 0 and 1 accordingly.

As a result of is_vlan_key change, the ethertype may be set to 0 when
matching on the number of vlans. Update fl_set_key_vlan to avoid setting
key, mask vlan_tpid for the 0 ethertype.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Boris Sukholitko b400031282 net/sched: flower: Add number of vlan tags filter
These are bookkeeping parts of the new num_of_vlans filter.
Defines, dump, load and set are being done here.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Boris Sukholitko 34951fcf26 flow_dissector: Add number of vlan tags dissector
Our customers in the fiber telecom world have network configurations
where they would like to control their traffic according to the number
of tags appearing in the packet.

For example, TR247 GPON conformance test suite specification mostly
talks about untagged, single, double tagged packets and gives lax
guidelines on the vlan protocol vs. number of vlan tags.

This is different from the common IT networks where 802.1Q and 802.1ad
protocols are usually describe single and double tagged packet. GPON
configurations that we work with have arbitrary mix the above protocols
and number of vlan tags in the packet.

The goal is to make the following TC commands possible:

tc filter add dev eth1 ingress flower \
  num_of_vlans 1 vlan_prio 5 action drop

From our logs, we have redirect rules such that:

tc filter add dev $GPON ingress flower num_of_vlans $N \
     action mirred egress redirect dev $DEV

where N can range from 0 to 3 and $DEV is the function of $N.

Also there are rules setting skb mark based on the number of vlans:

tc filter add dev $GPON ingress flower num_of_vlans $N vlan_prio \
    $P action skbedit mark $M

This new dissector allows extracting the number of vlan tags existing in
the packet.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Boris Sukholitko 6ee59e554d net/sched: flower: Reduce identation after is_key_vlan refactoring
Whitespace only.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Boris Sukholitko 285ba06b0e net/sched: flower: Helper function for vlan ethtype checks
There are somewhat repetitive ethertype checks in fl_set_key. Refactor
them into is_vlan_key helper function.

To make the changes clearer, avoid touching identation levels. This is
the job for the next patch in the series.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:09:13 +01:00
Haowen Bai e63dd41235 ar5523: Use kzalloc instead of kmalloc/memset
Use kzalloc rather than duplicating its implementation, which
makes code simple and easy to understand.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:04:09 +01:00
Luiz Angelo Daros de Luca fcd30c96af net: dsa: realtek: remove realtek,rtl8367s string
There is no need to add new compatible strings for each new supported
chip version. The compatible string is used only to select the subdriver
(rtl8365mb.c or rtl8366rb.c). Once in the subdriver, it will detect the
chip model by itself, ignoring which compatible string was used.

Link: https://lore.kernel.org/netdev/20220414014055.m4wbmr7tdz6hsa3m@bang-olufsen.dk/
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:02:28 +01:00
Luiz Angelo Daros de Luca 6f2d04ccae dt-bindings: net: dsa: realtek: cleanup compatible strings
Compatible strings are used to help the driver find the chip ID/version
register for each chip family. After that, the driver can setup the
switch accordingly. Keep only the first supported model for each family
as a compatible string and reference other chip models in the
description.

The removed compatible strings have never been used in a released kernel.

CC: devicetree@vger.kernel.org
Link: https://lore.kernel.org/netdev/20220414014055.m4wbmr7tdz6hsa3m@bang-olufsen.dk/
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:02:28 +01:00
David S. Miller e92453b9fe Merge branch 'hns3-next'
Guangbin Huang says:

====================
net: hns3: updates for -next

This series includes some updates for the HNS3 ethernet driver.

Change logs:
V1 -> V2:
 - Fix failed to apply to net-next problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:51 +01:00
Hao Chen 29c17cb672 net: hns3: remove unnecessary line wrap for hns3_set_tunable
Remove unnecessary line wrap for hns3_set_tunable to improve
function readability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:51 +01:00
Peng Li 350cb44092 net: hns3: replace magic value by HCLGE_RING_REG_OFFSET
Magic values are not recommended.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:51 +01:00
Peng Li 9c657cbc2c net: hns3: fix the wrong words in comments
This patch fixes wrong words in comments.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:51 +01:00
Peng Li 2e0f538870 net: hns3: update the comment of function hclgevf_get_mbx_resp
The param of function hclgevf_get_mbx_resp has been changed but the
comments not upodated. This patch updates it.

Signed-off-by: Peng Li<lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:51 +01:00
Hao Chen 2373b35c24 net: hns3: add log for setting tx spare buf size
For the active tx spare buffer size maybe changed according
to the page size, so add log to notice it.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:50 +01:00
Jie Wang bcc7a98f0d net: hns3: add failure logs in hclge_set_vport_mtu
Currently, There is a low probability that pf mtu configuration fails, but
the information in logs is insufficient for problem locating when the VF
mtu value is illegally modified.

So record the vf index and vf mtu value at the failure scenario.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:50 +01:00
Jian Shen 6fde96df04 net: hns3: refine the definition for struct hclge_pf_to_vf_msg
The struct hclge_pf_to_vf_msg is used for mailbox message from
PF to VF, including both response and request. But its definition
can only indicate respone, which makes the message data copy in
function hclge_send_mbx_msg() unreadable. So refine it by edding
a general message definition into it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:50 +01:00
Hao Chen 07fdc163ac net: hns3: refactor hns3_set_ringparam()
Use struct hns3_ring_param to replace variable new/old_xxx and
add hns3_is_ringparam_changed() to judge them if is changed to
improve code readability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:50 +01:00
Yufeng Mo 286c61e727 net: hns3: add ethtool parameter check for CQE/EQE mode
For DEVICE_VERSION_V2, the hardware does not support the CQE mode.
So add capability bit for coalesce CQE mode and add parameter check
for it in ethtool.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:45:50 +01:00