When failed to start nic or add interrupt service routine in
s2io_card_up() for opening device, napi isn't disabled. When open
s2io device next time, it will trigger a BUG_ON()in napi_enable().
Compile tested only.
Fixes: 5f490c9680 ("S2io: Fixed synchronization between scheduling of napi with card reset and close")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109023741.131552-1-shaozhengchao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The ndo_start_xmit field in net_device_ops is expected to be of type
netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev).
The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition. A new
warning in clang will catch this at compile time:
drivers/net/ethernet/microsoft/mana/mana_en.c:382:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
.ndo_start_xmit = mana_start_xmit,
^~~~~~~~~~~~~~~
1 error generated.
The return type of mana_start_xmit should be changed from int to
netdev_tx_t.
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
[nathan: Rebase on net-next and resolve conflicts
Add note about new clang warning]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20221109002629.1446680-1-nathan@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Antoine Tenart says:
====================
macsec: clear encryption keys in h/w drivers
Commit aaab73f8fb ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading but some h/w drivers did a copy of the key
which need to be zeroed as well.
The MSCC PHY driver can actually be converted not to copy the encryption
key at all, but such patch would be quite difficult to backport. I'll
send a following up patch doing this in net-next once this series lands.
Tested on the MSCC PHY but not on the atlantic NIC.
====================
Link: https://lore.kernel.org/r/20221108153459.811293-1-atenart@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit aaab73f8fb ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the atlantic driver made a copy and did
not clear it. Fix this.
[4 Fixes tags below, all part of the same series, no need to split this]
Fixes: 9ff40a751a ("net: atlantic: MACSec ingress offload implementation")
Fixes: b8f8a0b7b5 ("net: atlantic: MACSec ingress offload HW bindings")
Fixes: 27736563ce ("net: atlantic: MACSec egress offload implementation")
Fixes: 9d106c6dd8 ("net: atlantic: MACSec egress offload HW bindings")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit aaab73f8fb ("macsec: clear encryption keys from the stack after
setting up offload") made sure to clean encryption keys from the stack
after setting up offloading, but the MSCC PHY driver made a copy, kept
it in the flow data and did not clear it when freeing a flow. Fix this.
Fixes: 28c5107aa9 ("net: phy: mscc: macsec support")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
After searching for a protocol handler in dev_gro_receive, checking for
failure is redundant. Skip the failure code after finding the
corresponding handler.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221108123320.GA59373@debian
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yang Yingliang says:
====================
stmmac: dwmac-loongson: fixes three leaks
patch #2 fixes missing pci_disable_device() in the error path in probe()
patch #1 and pach #3 fix missing pci_disable_msi() and of_node_put() in
error and remove() path.
====================
Link: https://lore.kernel.org/r/20221108114647.4144952-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The node returned by of_get_child_by_name() with refcount decremented,
of_node_put() needs be called when finish using it. So add it in the
error path in loongson_dwmac_probe() and in loongson_dwmac_remove().
Fixes: 2ae34111fe ("stmmac: dwmac-loongson: fix invalid mdio_node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Add missing pci_disable_device() in the error path in loongson_dwmac_probe().
Fixes: 30bba69d7d ("stmmac: pci: Add dwmac support for Loongson")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
pci_enable_msi() has been called in loongson_dwmac_probe(),
so pci_disable_msi() needs be called in remove path and error
path of probe().
Fixes: 30bba69d7d ("stmmac: pci: Add dwmac support for Loongson")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The MANA hardware support protection domain and memory registration for use
in RDMA environment. Add those definitions and expose them for use by the
RDMA driver.
Signed-off-by: Ajay Sharma <sharmaajay@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-12-git-send-email-longli@linuxonhyperv.com
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
The RDMA device needs to allocate doorbell pages for each user context.
Define the GDMA data structures for use by the RDMA driver.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-11-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
When doing memory registration, the PF may respond with
GDMA_STATUS_MORE_ENTRIES to indicate a follow request is needed. This is
not an error and should be processed as expected.
Signed-off-by: Ajay Sharma <sharmaajay@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-10-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
The number of maximum SGl entries should be computed from the maximum
WQE size for the intended queue type and the corresponding OOB data
size. This guarantees the hardware queue can successfully queue requests
up to the queue depth exposed to the upper layer.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-9-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
In preparation to add MANA RDMA driver, move all the required header files
to a common location for use by both Ethernet and RDMA drivers.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-8-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
The port number is useful for user-mode application to identify this
net device based on port index. Set to the correct value in ndev.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-7-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
RDMA device may need to create Ethernet device queues for use by Queue
Pair type RAW. This allows a user-mode context accesses Ethernet hardware
queues. Export the supporting functions for use by the RDMA driver.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-6-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
MANA hardware doesn't have any restrictions on the DMA segment size, set it
to the max allowed value.
Signed-off-by: Ajay Sharma <sharmaajay@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-5-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
For outgoing packets, the PF requires the VF to configure the vport with
corresponding protection domain and doorbell ID for the kernel or user
context. The vport can't be shared between different contexts.
Implement the logic to exclusively take over the vport by either the
Ethernet device or RDMA device.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-4-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
For supporting RDMA device with multiple user contexts with their
individual doorbell pages, record the start address of doorbell page
region for use by the RDMA driver to allocate user context doorbell IDs.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-3-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
In preparation for supporting MANA RDMA driver, add support for auxiliary
device in the Ethernet driver. The RDMA device is modeled as an auxiliary
device to the Ethernet device.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Link: https://lore.kernel.org/r/1667502990-2559-2-git-send-email-longli@linuxonhyperv.com
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
John Fastabend says:
====================
Not sure if folks want to take this through BPF tree or networking tree.
I took a quick look and didn't see any pending fixes so seems no one
has noticed the panic yet. It reproducible and easy to repro.
I put bpf in the title thinking it woudl be great to run through the
BPF selftests given its XDP triggering the panic.
Sorry maintainers resent with CC'ing actual lists. Had a scripting
issue. Also dropped henqqi has they are bouncing.
Thanks!
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The following panic is observed when bringing up (veth_open) a veth device
that has an XDP program attached.
[ 61.519185] kernel BUG at net/core/dev.c:6442!
[ 61.519456] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 61.519752] CPU: 0 PID: 408 Comm: ip Tainted: G W 6.1.0-rc2-185930-gd9095f92950b-dirty #26
[ 61.520288] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 61.520806] RIP: 0010:napi_enable+0x3d/0x40
[ 61.521077] Code: f6 f6 80 61 08 00 00 02 74 0d 48 83 bf 88 01 00 00 00 74 03 80 cd 01 48 89 d0 f0 48 0f b1 4f 10 48 39 c2 75 c8 c3 cc cc cc cc <0f> 0b 90 48 8b 87 b0 00 00 00 48 81 c7 b0 00 00 00 45 31 c0 48 39
[ 61.522226] RSP: 0018:ffffbc9800cc36f8 EFLAGS: 00010246
[ 61.522557] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000001
[ 61.523004] RDX: 0000000000000010 RSI: ffffffff8b0de852 RDI: ffff9f03848e5000
[ 61.523452] RBP: 0000000000000000 R08: 0000000000000800 R09: 0000000000000000
[ 61.523899] R10: ffff9f0384a96800 R11: ffffffffffa48061 R12: ffff9f03849c3000
[ 61.524345] R13: 0000000000000300 R14: ffff9f03848e5000 R15: 0000001000000100
[ 61.524792] FS: 00007f58cb64d2c0(0000) GS:ffff9f03bbc00000(0000) knlGS:0000000000000000
[ 61.525301] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 61.525673] CR2: 00007f6cc629b498 CR3: 000000010498c000 CR4: 00000000000006f0
[ 61.526121] Call Trace:
[ 61.526284] <TASK>
[ 61.526425] __veth_napi_enable_range+0xd6/0x230
[ 61.526723] veth_enable_xdp+0xd0/0x160
[ 61.526969] veth_open+0x2e/0xc0
[ 61.527180] __dev_open+0xe2/0x1b0
[ 61.527405] __dev_change_flags+0x1a1/0x210
[ 61.527673] dev_change_flags+0x1c/0x60
This happens because we are calling veth_napi_enable() on already enabled
queues. The root cause is in commit 2e0de6366a changed the control logic
dropping this case,
if (priv->_xdp_prog) {
err = veth_enable_xdp(dev);
if (err)
return err;
- } else if (veth_gro_requested(dev)) {
+ /* refer to the logic in veth_xdp_set() */
+ if (!rtnl_dereference(peer_rq->napi)) {
+ err = veth_napi_enable(peer);
+ if (err)
+ return err;
+ }
so that now veth_napi_enable is called if the peer has not yet
initialiazed its peer_rq->napi. The issue is this will happen
even if the NIC is not up. Then in veth_enable_xdp just above
we have similar path,
veth_enable_xdp
napi_already_on = (dev->flags & IFF_UP) && rcu_access_pointer(rq->napi)
err = veth_enable_xdp_range(dev, 0, dev->real_num_rx_queues, napi_already_on);
The trouble is an xdp prog is assigned before bringing the device up each
of the veth_open path will enable the peers xdp napi structs. But then when
we bring the peer up it will similar try to enable again because from
veth_open the IFF_UP flag is not set until after the op in __dev_open so
we believe napi_alread_on = false.
To fix this just drop the IFF_UP test and rely on checking if the napi
struct is enabled. This also matches the peer check in veth_xdp for
disabling.
To reproduce run ./test_xdp_meta.sh I found adding Cilium/Tetragon tests
for XDP.
Fixes: 2e0de6366a ("veth: Avoid drop packets when xdp_redirect performs")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20221108221650.808950-2-john.fastabend@gmail.com
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
When showing the result of a test group, if one
of the subtests was skipped, while still having
passing subtests, the group result was marked as
SKIP. E.g.:
223/1 usdt/basic:SKIP
223/2 usdt/multispec:OK
223/3 usdt/urand_auto_attach:OK
223/4 usdt/urand_pid_attach:OK
223 usdt:SKIP
The test result of usdt in the example above
should be OK instead of SKIP, because the test
group did have passing tests and it would be
considered in "normal" state.
With this change, only if all of the subtests
were skipped, the group test is marked as SKIP.
When only some of the subtests are skipped, a
more detailed result is given, stating how
many of the subtests were skipped. E.g:
223/1 usdt/basic:SKIP
223/2 usdt/multispec:OK
223/3 usdt/urand_auto_attach:OK
223/4 usdt/urand_pid_attach:OK
223 usdt:OK (SKIP: 1/4)
Signed-off-by: Domenico Cerasuolo <dceras@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221109184039.3514033-1-cerasuolodomenico@gmail.com
Eduard Zingerman says:
====================
The patch-set is consists of the following parts:
- An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing to use both long and void* keys and values
w/o additional casts. Interface functions are hidden behind
auxiliary macro that add casts as necessary.
This simplifies many use cases in libbpf as hashmaps there are
mostly integer to integer and required awkward looking incantations
like `(void *)(long)off` previously. Also includes updates for perf
as it copies map implementation from libbpf.
- A change to `lib/bpf/btf.c:btf__dedup` that adds a new pass named
"Resolve unambiguous forward declaration". This pass builds a
hashmap `name_off -> uniquely named struct or union` and uses it to
replace FWD types by structs or unions. This is necessary for corner
cases when FWD is not used as a part of some struct or union
definition de-duplicated by `btf_dedup_struct_types`.
The goal of the patch-set is to resolve forward declarations that
don't take part in type graphs comparisons if declaration name is
unambiguous.
Example:
CU #1:
struct foo; // standalone forward declaration
struct foo *some_global;
CU #2:
struct foo { int x; };
struct foo *another_global;
Currently the de-duplicated BTF for this example looks as follows:
[1] STRUCT 'foo' size=4 vlen=1 ...
[2] INT 'int' size=4 ...
[3] PTR '(anon)' type_id=1
[4] FWD 'foo' fwd_kind=struct
[5] PTR '(anon)' type_id=4
The goal of this patch-set is to simplify it as follows:
[1] STRUCT 'foo' size=4 vlen=1
'x' type_id=2 bits_offset=0
[2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] PTR '(anon)' type_id=1
For defconfig kernel with BTF enabled this removes 63 forward
declarations.
For allmodconfig kernel with BTF enabled this removes ~5K out of ~21K
forward declarations in ko objects. This unlocks some additional
de-duplication in ko objects, but impact is tiny: ~13K less BTF ids
out of ~2M.
Changelog:
v3 -> v4
Changes suggested by Andrii:
- hashmap interface rework to allow use of integer and pointer keys
an values w/o casts as suggested in [1].
v2 -> v3
Changes suggested by Andrii:
- perf's util/hashtable.{c,h} are synchronized with libbpf
implementation, perf's source code updated accordingly;
- changes to libbpf, bpf selftests and perf are combined in a single
patch to simplify bisecting;
- hashtable interface updated to be long -> long instead of
uintptr_t -> uintptr_t;
- btf_dedup_resolve_fwds updated to correctly use IS_ERR / PTR_ERR
macro;
- test cases for btf_dedup_resolve_fwds are updated for better
clarity.
v1 -> v2
- Style fixes in btf_dedup_resolve_fwd and btf_dedup_resolve_fwds as
suggested by Alan.
[v1] https://lore.kernel.org/bpf/20221102110905.2433622-1-eddyz87@gmail.com/
[v2] https://lore.kernel.org/bpf/20221103033430.2611623-1-eddyz87@gmail.com/
[v3] https://lore.kernel.org/bpf/20221106202910.4193104-1-eddyz87@gmail.com/
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tests to verify the following behavior of `btf_dedup_resolve_fwds`:
- remapping for struct forward declarations;
- remapping for union forward declarations;
- no remapping if forward declaration kind does not match similarly
named struct or union declaration;
- no remapping if forward declaration name is ambiguous;
- base ids are considered for fwd resolution in split btf scenario.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20221109142611.879983-4-eddyz87@gmail.com
Resolve forward declarations that don't take part in type graphs
comparisons if declaration name is unambiguous. Example:
CU #1:
struct foo; // standalone forward declaration
struct foo *some_global;
CU #2:
struct foo { int x; };
struct foo *another_global;
The `struct foo` from CU #1 is not a part of any definition that is
compared against another definition while `btf_dedup_struct_types`
processes structural types. The the BTF after `btf_dedup_struct_types`
the BTF looks as follows:
[1] STRUCT 'foo' size=4 vlen=1 ...
[2] INT 'int' size=4 ...
[3] PTR '(anon)' type_id=1
[4] FWD 'foo' fwd_kind=struct
[5] PTR '(anon)' type_id=4
This commit adds a new pass `btf_dedup_resolve_fwds`, that maps such
forward declarations to structs or unions with identical name in case
if the name is not ambiguous.
The pass is positioned before `btf_dedup_ref_types` so that types
[3] and [5] could be merged as a same type after [1] and [4] are merged.
The final result for the example above looks as follows:
[1] STRUCT 'foo' size=4 vlen=1
'x' type_id=2 bits_offset=0
[2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[3] PTR '(anon)' type_id=1
For defconfig kernel with BTF enabled this removes 63 forward
declarations. Examples of removed declarations: `pt_regs`, `in6_addr`.
The running time of `btf__dedup` function is increased by about 3%.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221109142611.879983-3-eddyz87@gmail.com
An update for libbpf's hashmap interface from void* -> void* to a
polymorphic one, allowing both long and void* keys and values.
This simplifies many use cases in libbpf as hashmaps there are mostly
integer to integer.
Perf copies hashmap implementation from libbpf and has to be
updated as well.
Changes to libbpf, selftests/bpf and perf are packed as a single
commit to avoid compilation issues with any future bisect.
Polymorphic interface is acheived by hiding hashmap interface
functions behind auxiliary macros that take care of necessary
type casts, for example:
#define hashmap_cast_ptr(p) \
({ \
_Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\
#p " pointee should be a long-sized integer or a pointer"); \
(long *)(p); \
})
bool hashmap_find(const struct hashmap *map, long key, long *value);
#define hashmap__find(map, key, value) \
hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
- hashmap__find macro casts key and value parameters to long
and long* respectively
- hashmap_cast_ptr ensures that value pointer points to a memory
of appropriate size.
This hack was suggested by Andrii Nakryiko in [1].
This is a follow up for [2].
[1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/
[2] https://lore.kernel.org/bpf/af1facf9-7bc8-8a3d-0db4-7b3f333589a2@meta.com/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221109142611.879983-2-eddyz87@gmail.com
When t4vf_update_port_info() failed in cxgb4vf_open(), resources applied
during adapter goes up are not cleared. Fix it. Only be compiled, not be
tested.
Fixes: 18d79f721e ("cxgb4vf: Update port information in cxgb4vf_open()")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221109012100.99132-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
Clean up pcs-xpcs accessors
This series cleans up the pcs-xpcs code to use mdiodev accessors for
read/write just like xpcs_modify_changed() does. In order to do this,
we need to introduce the mdiodev clause 45 accessors.
====================
Link: https://lore.kernel.org/r/Y2pm13+SDg6N/IVx@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use mdiodev accessors rather than accessing the bus and address in
the mdio_device structure and using the mdiobus accessors.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The strlen() may go too far when estimating the length of
the given string. In some cases it may go over the boundary
and crash the system which is the case according to the commit
13a55372b6 ("ARM: orion5x: Revert commit 4904dbda41c8.").
Rectify this by switching to strnlen() for the expected
maximum length of the string.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221108141108.62974-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If mctp_neigh_init() return error, the routes resources should
be released in the error handling path. Otherwise some resources
leak.
Fixes: 4d8b931928 ("mctp: Add neighbour implementation")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20221108095517.620115-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The existing genphy_loopback() is not working for TI DP83867 PHY as it
will disable autoneg support while another side is still enabling autoneg.
This is causing the link is not established and results in timeout error
in genphy_loopback() function.
Thus, based on TI PHY datasheet, introduce a TI PHY loopback function by
just configuring BMCR_LOOPBACK(Bit-9) in MII_BMCR register (0x0).
Tested working on TI DP83867 PHY for all speeds (10/100/1000Mbps).
Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221108101527.612723-1-michael.wei.hong.sit@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Raju Lakkaraju says:
====================
net: lan743x: PCI11010 / PCI11414 devices Enhancements
This patch series continues with the addition of supported features for the
Ethernet function of the PCI11010 / PCI11414 devices to the LAN743x driver.
====================
Link: https://lore.kernel.org/r/20221107085650.991470-1-Raju.Lakkaraju@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test that locked bridge port configurations that are not supported by
mlxsw are rejected.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test that packets received via a locked bridge port whose {SMAC, VID}
does not appear in the bridge's FDB or appears with a different port,
trigger the "locked_port" packet trap.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test that packets with a destination MAC of 01:80:C2:00:00:03 trigger
the "eapol" packet trap.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merely checking whether a trap counter incremented or not without
logging a test result is useful on its own. Split this functionality to
a helper which will be used by subsequent patches.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add locked bridge port support by reacting to changes in the
'BR_PORT_LOCKED' flag. When set, enable security checks on the local
port via the previously added SPFSR register.
When security checks are enabled, an incoming packet will trigger an FDB
lookup with the packet's source MAC and the FID it was classified to. If
an FDB entry was not found or was found to be pointing to a different
port, the packet will be dropped. Such packets increment the
"discard_ingress_general" ethtool counter. For added visibility, user
space can trap such packets to the CPU by enabling the "locked_port"
trap. Example:
# devlink trap set pci/0000:06:00.0 trap locked_port action trap
Unlike other configurations done via bridge port flags (e.g., learning,
flooding), security checks are enabled in the device on a per-port basis
and not on a per-{port, VLAN} basis. As such, scenarios where user space
can configure different locking settings for different VLANs configured
on a port need to be vetoed. To that end, veto the following scenarios:
1. Locking is set on a bridge port that is a VLAN upper
2. Locking is set on a bridge port that has VLAN uppers
3. VLAN upper is configured on a locked bridge port
Examples:
# bridge link set dev swp1.10 locked on
Error: mlxsw_spectrum: Locked flag cannot be set on a VLAN upper.
# ip link add link swp1 name swp1.10 type vlan id 10
# bridge link set dev swp1 locked on
Error: mlxsw_spectrum: Locked flag cannot be set on a bridge port that has VLAN uppers.
# bridge link set dev swp1 locked on
# ip link add link swp1 name swp1.10 type vlan id 10
Error: mlxsw_spectrum: VLAN uppers are not supported on a locked port.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Propagate extack to mlxsw_sp_port_attr_br_pre_flags_set() in order to
communicate error messages related to bridge port flag validation.
Example:
# bridge link set dev swp1 locked on
Error: mlxsw_spectrum: Unsupported bridge port flag.
More error messages will be added in subsequent patches.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In Spectrum, learning happens in parallel to the security checks.
Therefore, regardless of the result of the security checks, a learning
notification will be generated by the device and polled later on by the
driver.
Currently, the driver reacts to learning notifications by programming
corresponding FDB entries to the device. When a port is locked (i.e.,
has security checks enabled), this can no longer happen, as otherwise
any host will blindly gain authorization.
Instead, notify the learned entry as a locked entry to the bridge driver
that will in turn notify it to user space, in case MAB is enabled. User
space can then decide to authorize the host by clearing the "locked"
flag, which will cause the entry to be programmed to the device.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Subsequent patches will need to report locked FDB entries to the bridge
driver. Prepare for that by adding a 'locked' argument to
mlxsw_sp_fdb_call_notifiers() according to which the 'locked' bit is set
in the FDB notification info. For now, always pass 'false'.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add an API to enable or disable security checks on a local port. It will
be used by subsequent patches when the 'BR_PORT_LOCKED' flag is toggled.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add the Switch Port FDB Security Register (SPFSR) that allows enabling
and disabling security checks on a given local port. In Linux terms, it
allows locking / unlocking a port.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Register the previously added packet traps with devlink. This allows
user space to tune their policers and in the case of the locked port
trap, user space can set its action to "trap" in order to gain
visibility into packets that were discarded by the device due to the
locked port check failure.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>