Commit Graph

1203107 Commits

Author SHA1 Message Date
Jiri Pirko 36e5a0efc8 net/mlx5: Fix error message in mlx5_sf_dev_state_change_handler()
sw_function_id contains sfnum, so fix the error message to name the
value properly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:22 -07:00
Jiri Pirko b63f8bde2f net/mlx5: Remove redundant check of mlx5_vhca_event_supported()
Since mlx5_vhca_event_supported() is called in mlx5_sf_dev_supported(),
remove the redundant call.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Jiri Pirko 88074d81e5 net/mlx5: Use mlx5_sf_start_function_id() helper instead of directly calling MLX5_CAP_GEN()
There is a helper called mlx5_sf_start_function_id() that
wraps up a query to get base SF function id. Use that instead of
calling MLX5_CAP_GEN() directly.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Jiri Pirko ae80d7a06f net/mlx5: Remove redundant SF supported check from mlx5_sf_hw_table_init()
Since mlx5_sf_supported() check is done as a first thing in
mlx5_sf_max_functions(), remove the redundant check.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Jiri Pirko 2ad0160c02 net/mlx5: Use auxiliary_device_uninit() instead of device_put()
Instead of using device_put(), use auxiliary_device_uninit() for
auxiliary device uninit which internally just calls device_put().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Jianbo Liu e0e22d59b4 net/mlx5: E-switch, Add checking for flow rule destinations
Firmware doesn't allow flow rules in FDB to do header rewrite and send
packets to both internal and uplink vports. The following syndrome
will be generated when trying to offload such kind of rules:

mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 23569): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8c8f08), err(-22)

To avoid this syndrome, add a checking before creating FTE. If a rule
with header rewrite action forwards packets to both VF and PF, an
error is returned directly.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Moshe Shemesh a9f168e4c6 net/mlx5: Check with FW that sync reset completed successfully
Even if the PF driver had no error on his part of the sync reset flow,
the firmware can see wider picture as it syncs all the PFs in the flow.
So add at end of sync reset flow check with firmware by reading MFRL
register and initialization segment that the flow had no issue from
firmware point of view too.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:21 -07:00
Shay Drory 6486c0f44e net/mlx5: Expose max possible SFs via devlink resource
Introduce devlink resource for exposing max possible SFs on mlx5
devices.

For example:
$ devlink resource show pci/0000:00:0b.0
pci/0000:00:0b.0:
  name max_local_SFs size 5 unit entry dpipe_tables none
  name max_external_SFs size 0 unit entry dpipe_tables none

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:20 -07:00
Rahul Rameshbabu 53b836a44d net/mlx5e: Add recovery flow for tx devlink health reporter for unhealthy PTP SQ
A new check for the tx devlink health reporter is introduced for
determining when the PTP port timestamping SQ is considered unhealthy. If
there are enough CQEs considered never to be delivered, the space that can
be utilized on the SQ decreases significantly, impacting performance and
usability of the SQ. The health reporter is triggered when the number of
likely never delivered port timestamping CQEs that utilize the space of the
PTP SQ is greater than 93.75% of the total capacity of the SQ. A devlink
health reporter recover method is also provided for this specific TX error
context that restarts the PTP SQ.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:20 -07:00
Rahul Rameshbabu 3178308ad4 net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs
Use a map structure for associating CQEs containing port timestamping
information with the appropriate skb. Track order of WQEs submitted using a
FIFO. Check if the corresponding port timestamping CQEs from the lookup
values in the FIFO are considered dropped due to time elapsed. Return the
lookup value to a freelist after consuming the skb. Reuse the freed lookup
in future WQE submission iterations.

The map structure uses an integer identifier for the key and returns an skb
corresponding to that identifier. Embed the integer identifier in the WQE
submitted to the WQ for the transmit path when the SQ is a PTP (port
timestamping) SQ. The embedded identifier can then be queried using a field
in the CQE of the corresponding port timestamping CQ. In the port
timestamping napi_poll context, the identifier is queried from the CQE
polled from CQ and used to lookup the corresponding skb from the WQE submit
path. The skb reference is removed from map and then embedded with the port
HW timestamp information from the CQE and eventually consumed.

The metadata freelist FIFO is an array containing integer identifiers that
can be pushed and popped in the FIFO. The purpose of this structure is
bookkeeping what identifier values can safely be used in a subsequent WQE
submission and should not contain identifiers that have still not been
reaped by processing a corresponding CQE completion on the port
timestamping CQ.

The ts_cqe_pending_list structure is a combination of an array and linked
list. The array is pre-populated with the nodes that will be added and
removed from the head of the linked list. Each node contains the unique
identifier value associated with the values submitted in the WQEs and
retrieved in the port timestamping CQEs. When a WQE is submitted, the node
in the array corresponding to the identifier popped from the metadata
freelist is added to the end of the CQE pending list and is marked as
"in-use". The node is removed from the linked list under two conditions.
The first condition is that the corresponding port timestamping CQE is
polled in the PTP napi_poll context. The second condition is that more than
a second has elapsed since the DMA timestamp value corresponding to the WQE
submission. When the first condition occurs, the "in-use" bit in the linked
list node is cleared, and the resources corresponding to the WQE submission
are then released. The second condition, however, indicates that the port
timestamping CQE will likely never be delivered. It's not impossible for
the device to post a CQE after an infinite amount of time though highly
improbable. In order to be resilient to this improbable case, resources
related to the corresponding WQE submission are still kept, the identifier
value is not returned to the freelist, and the "in-use" bit is cleared on
the node to indicate that it's no longer part of the linked list of "likely
to be delivered" port timestamping CQE identifiers. A count for the number
of port timestamping CQEs considered highly likely to never be delivered by
the device is maintained. This count gets decremented in the unlikely event
a port timestamping CQE considered unlikely to ever be delivered is polled
in the PTP napi_poll context.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:20 -07:00
Rahul Rameshbabu b608dd670b net/mlx5: Consolidate devlink documentation in devlink/mlx5.rst
De-duplicate documentation by removing mellanox/mlx5/devlink.rst. Instead,
only use the generic devlink documentation directory to document mlx5
devlink parameters. Avoid providing general devlink tool usage information
in mlx5-specific documentation.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-14 14:40:20 -07:00
Jeff Layton c96e2a695e sunrpc: set the bv_offset of first bvec in svc_tcp_sendmsg
svc_tcp_sendmsg used to factor in the xdr->page_base when sending pages,
but commit 5df5dd03a8 ("sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather
then sendpage") dropped that part of the handling. Fix it by setting
the bv_offset of the first bvec.

Fixes: 5df5dd03a8 ("sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-14 15:02:25 -04:00
Jakub Kicinski f3cc00303c Merge branch 'devlink-introduce-selective-dumps'
Jiri Pirko says:

====================
devlink: introduce selective dumps

Motivation:

For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know port
handle for specific SF, he needs to dump all devlink ports on the host
which does not scale good.

Solution:

Allow user to pass devlink handle (and possibly other attributes)
alongside the dump command and dump only objects which are matching
the selection.

Use split ops to generate policies for dump callbacks acccording to
the attributes used for selection.

The userspace can use ctrl genetlink GET_POLICY command to find out if
the selective dumps are supported by kernel for particular command.

Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false

$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

Extension:

patches #12 and #13 extends selection attributes by port index
for health reporter dumping.
====================

Link: https://lore.kernel.org/r/20230811155714.1736405-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:27 -07:00
Jiri Pirko 0149bca172 netlink: specs: devlink: extend health reporter dump attributes by port index
Allow user to pass port index for health reporter dump request.

Re-generate the related code.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-14-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko b03f13cb67 devlink: extend health reporter dump selector by port index
Introduce a possibility for devlink object to expose attributes it
supports for selection of dumped objects.

Use this by health reporter to indicate it supports port index based
selection of dump objects. Implement this selection mechanism in
devlink_nl_cmd_health_reporter_get_dump_one()

Example:
$ devlink health
pci/0000:08:00.0:
  reporter fw
    state healthy error 0 recover 0 auto_dump true
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 60000 auto_recover true auto_dump true
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32768:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32769:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32770:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.1:
  reporter fw
    state healthy error 0 recover 0 auto_dump true
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 60000 auto_recover true auto_dump true
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.1/98304:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.1/98305:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.1/98306:
  reporter vnic
    state healthy error 0 recover 0

$ devlink health show pci/0000:08:00.0
pci/0000:08:00.0:
  reporter fw
    state healthy error 0 recover 0 auto_dump true
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 60000 auto_recover true auto_dump true
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32768:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32769:
  reporter vnic
    state healthy error 0 recover 0
pci/0000:08:00.0/32770:
  reporter vnic
    state healthy error 0 recover 0

$ devlink health show pci/0000:08:00.0/32768
pci/0000:08:00.0/32768:
  reporter vnic
    state healthy error 0 recover 0

The last command is possible because of this patch.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-13-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 34493336e7 netlink: specs: devlink: extend per-instance dump commands to accept instance attributes
Extend per-instance dump command definitions to accept instance
attributes. Allow parsing of devlink handle attributes so they could
be used for instance selection.

Re-generate the related code.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-12-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 4a1b5aa8b5 devlink: allow user to narrow per-instance dumps by passing handle attrs
For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know port
handle for specific SF, he needs to dump all devlink ports on the host
which does not scale good.

Allow user to pass devlink handle attributes alongside the dump command
and dump only objects which are under selected devlink instance.

Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false

$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-11-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 833e479d33 devlink: remove converted commands from small ops
As the commands are already defined in split ops, remove them
from small ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko ddff283280 devlink: remove duplicate temporary netlink callback prototypes
Remove the duplicate temporary netlink callback prototype as the
generated ones are already in place.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-9-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 7199c86247 netlink: specs: devlink: add commands that do per-instance dump
Add the definitions for the commands that do per-instance dump
and re-generate the related code.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 7d3c6fec61 devlink: pass flags as an arg of dump_one() callback
In order to easily set NLM_F_DUMP_FILTERED for partial dumps, pass the
flags as an arg of dump_one() callback. Currently, it is always
NLM_F_MULTI.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-7-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:25 -07:00
Jiri Pirko 24c8e56d4f devlink: introduce dumpit callbacks for split ops
Introduce dumpit callbacks for generated split ops. Have them
as a thin wrapper around iteration function and allow to pass dump_one()
function pointer directly without need to store in devlink_cmd structs.

Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:24 -07:00
Jiri Pirko 8fa995ad1f devlink: rename doit callbacks for per-instance dump commands
Rename netlink doit callback functions for the commands that do
implement per-instance dump to match the generated names that are going
to be introduce in the follow-up patch.

Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:24 -07:00
Jiri Pirko ee6d78ac28 devlink: introduce devlink_nl_pre_doit_port*() helper functions
Define port handling helpers what don't rely on internal_flags.
Have __devlink_nl_pre_doit() to accept the flags as a function arg and
make devlink_nl_pre_doit() a wrapper helper function calling it.
Introduce new helpers devlink_nl_pre_doit_port() and
devlink_nl_pre_doit_port_optional() to be used by split ops in follow-up
patch.

Note that the function prototypes are temporary until the generated ones
will replace them in a follow-up patch.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:24 -07:00
Jiri Pirko 41a1d4d139 devlink: parse rate attrs in doit() callbacks
No need to give the rate any special treatment in netlink attributes
parsing, as unlike for ports, there is only a couple of commands
benefiting from that.

Remove DEVLINK_NL_FLAG_NEED_RATE*, make pre_doit() callback simpler
by moving the rate attributes parsing to rate_*_doit() ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:24 -07:00
Jiri Pirko 63618463cb devlink: parse linecard attr in doit() callbacks
No need to give the linecards any special treatment in netlink attribute
parsing, as unlike for ports, there is only a couple of commands
benefiting from that.

Remove DEVLINK_NL_FLAG_NEED_LINECARD, make pre_doit() callback simpler
by moving the linecard attribute parsing to linecard_[gs]et_doit() ops.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230811155714.1736405-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-14 11:47:24 -07:00
Yipeng Zou 83a89c4b6a selftests/bpf: Clean up fmod_ret in bench_rename test script
Running the bench_rename test script, the following error occurs:

  # ./benchs/run_bench_rename.sh
  base      :    0.819 ± 0.012M/s
  kprobe    :    0.538 ± 0.009M/s
  kretprobe :    0.503 ± 0.004M/s
  rawtp     :    0.779 ± 0.020M/s
  fentry    :    0.726 ± 0.007M/s
  fexit     :    0.691 ± 0.007M/s
  benchmark 'rename-fmodret' not found

The bench_rename_fmodret has been removed in commit b000def2e0
("selftests: Remove fmod_ret from test_overhead"), thus remove it
from the runners in the test script.

Fixes: b000def2e0 ("selftests: Remove fmod_ret from test_overhead")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230814030727.3010390-1-zouyipeng@huawei.com
2023-08-14 18:43:04 +02:00
Yipeng Zou 811915db67 selftests/bpf: Fix repeat option when kfunc_call verification fails
There is no way where topts.repeat can be set to 1 when tc_test fails.
Fix the typo where the break statement slipped by one line.

Fixes: fb66223a24 ("selftests/bpf: add test for accessing ctx from syscall program type")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/bpf/20230814031434.3077944-1-zouyipeng@huawei.com
2023-08-14 18:03:00 +02:00
Sven Eckelmann 6f96d46f9a batman-adv: Drop per algo GW section class code
This code was  only used in the past for the sysfs interface. But since
this was replace with netlink, it was never executed. The function pointer
was only checked to figure out whether the limit 255 (B.A.T.M.A.N. IV) or
2**32-1 (B.A.T.M.A.N. V) should be used as limit.

So instead of keeping the function pointer, just store the limits directly
in struct batadv_algo_gw_ops.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2023-08-14 18:01:21 +02:00
Sven Eckelmann 02e61f06a9 batman-adv: Keep batadv_netlink_notify_* static
The batadv_netlink_notify_*() functions are not used by any other source
file. Just keep them local to netlink.c to get informed by the compiler
when they are not used anymore.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2023-08-14 18:01:21 +02:00
Sven Eckelmann 950c92bbaa batman-adv: Drop unused function batadv_gw_bandwidth_set
This function is no longer used since the sysfs support was removed from
batman-adv.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2023-08-14 18:01:21 +02:00
Jani Nikula 423ffe62c0 drm/i915: fix display probe for IVB Q and IVB D GT2 server
The current display probe is unable to differentiate between IVB Q and
IVB D GT2 server, as they both have the same device id, but different
subvendor and subdevice. This leads to the latter being misidentified as
the former, and should just end up not having a display. However, the no
display case returns a NULL as the display device info, and promptly
oopses.

As the IVB Q case is rare, and we're anyway moving towards GMD ID,
handle the identification requiring subvendor and subdevice as a special
case first, instead of unnecessarily growing the intel_display_ids[]
array with subvendor and subdevice.

[    5.425298] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    5.426059] #PF: supervisor read access in kernel mode
[    5.426810] #PF: error_code(0x0000) - not-present page
[    5.427570] PGD 0 P4D 0
[    5.428285] Oops: 0000 [#1] PREEMPT SMP PTI
[    5.429035] CPU: 0 PID: 137 Comm: (udev-worker) Not tainted 6.4.0-1-amd64 #1  Debian 6.4.4-1
[    5.429759] Hardware name: HP HP Z220 SFF Workstation/HP Z220 SFF Workstation, BIOS 4.19-218-gb184e6e0a1 02/02/2023
[    5.430485] RIP: 0010:intel_device_info_driver_create+0xf1/0x120 [i915]
[    5.431338] Code: 48 8b 97 80 1b 00 00 89 8f c0 1b 00 00 48 89 b7 b0 1b 00 00 48 89 97 b8 1b 00 00 0f b7 fd e8 76 e8 14 00 48 89 83 50 1b 00 00 <48> 8b 08 48 89 8b c4 1b 00 00 48 8b 48 08 48 89 8b cc 1b 00 00 8b
[    5.432920] RSP: 0018:ffffb8254044fb98 EFLAGS: 00010206
[    5.433707] RAX: 0000000000000000 RBX: ffff923076e80000 RCX: 0000000000000000
[    5.434494] RDX: 0000000000000260 RSI: 0000000100001000 RDI: 000000000000016a
[    5.435277] RBP: 000000000000016a R08: ffffb8254044fb00 R09: 0000000000000000
[    5.436055] R10: ffff922d02761de8 R11: 00657361656c6572 R12: ffffffffc0e5d140
[    5.436867] R13: ffff922d00b720d0 R14: 0000000076e80000 R15: ffff923078c0cae8
[    5.437646] FS:  00007febd19a18c0(0000) GS:ffff92307c000000(0000) knlGS:0000000000000000
[    5.438434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.439218] CR2: 0000000000000000 CR3: 000000010256e002 CR4: 00000000001706f0
[    5.440009] Call Trace:
[    5.440824]  <TASK>
[    5.441611]  ? __die+0x23/0x70
[    5.442394]  ? page_fault_oops+0x17d/0x4c0
[    5.443173]  ? exc_page_fault+0x7f/0x180
[    5.443949]  ? asm_exc_page_fault+0x26/0x30
[    5.444756]  ? intel_device_info_driver_create+0xf1/0x120 [i915]
[    5.445652]  ? intel_device_info_driver_create+0xea/0x120 [i915]
[    5.446545]  i915_driver_probe+0x7f/0xb60 [i915]
[    5.447431]  ? drm_privacy_screen_get+0x15c/0x1a0 [drm]
[    5.448240]  local_pci_probe+0x45/0xa0
[    5.449013]  pci_device_probe+0xc7/0x240
[    5.449748]  really_probe+0x19e/0x3e0
[    5.450464]  ? __pfx___driver_attach+0x10/0x10
[    5.451172]  __driver_probe_device+0x78/0x160
[    5.451870]  driver_probe_device+0x1f/0x90
[    5.452601]  __driver_attach+0xd2/0x1c0
[    5.453293]  bus_for_each_dev+0x88/0xd0
[    5.453989]  bus_add_driver+0x116/0x220
[    5.454672]  driver_register+0x59/0x100
[    5.455336]  i915_init+0x25/0xc0 [i915]
[    5.456104]  ? __pfx_i915_init+0x10/0x10 [i915]
[    5.456882]  do_one_initcall+0x5d/0x240
[    5.457511]  do_init_module+0x60/0x250
[    5.458126]  __do_sys_finit_module+0xac/0x120
[    5.458721]  do_syscall_64+0x60/0xc0
[    5.459314]  ? syscall_exit_to_user_mode+0x1b/0x40
[    5.459897]  ? do_syscall_64+0x6c/0xc0
[    5.460510]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[    5.461082] RIP: 0033:0x7febd20b0eb9
[    5.461648] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2f 1f 0d 00 f7 d8 64 89 01 48
[    5.462905] RSP: 002b:00007fffabb1ba78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    5.463554] RAX: ffffffffffffffda RBX: 0000561e6304f410 RCX: 00007febd20b0eb9
[    5.464201] RDX: 0000000000000000 RSI: 00007febd2244f0d RDI: 0000000000000015
[    5.464869] RBP: 00007febd2244f0d R08: 0000000000000000 R09: 000000000000000a
[    5.465512] R10: 0000000000000015 R11: 0000000000000246 R12: 0000000000020000
[    5.466124] R13: 0000000000000000 R14: 0000561e63032b60 R15: 000000000000000a
[    5.466700]  </TASK>
[    5.467271] Modules linked in: i915(+) drm_buddy video crc32_pclmul sr_mod hid_generic wmi crc32c_intel i2c_algo_bit sd_mod cdrom drm_display_helper cec usbhid rc_core ghash_clmulni_intel hid sha512_ssse3 ttm sha512_generic xhci_pci ehci_pci xhci_hcd ehci_hcd nvme ahci drm_kms_helper nvme_core libahci t10_pi libata psmouse aesni_intel scsi_mod crypto_simd i2c_i801 scsi_common crc64_rocksoft_generic cryptd i2c_smbus drm lpc_ich crc64_rocksoft crc_t10dif e1000e usbcore crct10dif_generic usb_common crct10dif_pclmul crc64 crct10dif_common button
[    5.469750] CR2: 0000000000000000
[    5.470364] ---[ end trace 0000000000000000 ]---
[    5.470971] RIP: 0010:intel_device_info_driver_create+0xf1/0x120 [i915]
[    5.471699] Code: 48 8b 97 80 1b 00 00 89 8f c0 1b 00 00 48 89 b7 b0 1b 00 00 48 89 97 b8 1b 00 00 0f b7 fd e8 76 e8 14 00 48 89 83 50 1b 00 00 <48> 8b 08 48 89 8b c4 1b 00 00 48 8b 48 08 48 89 8b cc 1b 00 00 8b
[    5.473034] RSP: 0018:ffffb8254044fb98 EFLAGS: 00010206
[    5.473698] RAX: 0000000000000000 RBX: ffff923076e80000 RCX: 0000000000000000
[    5.474371] RDX: 0000000000000260 RSI: 0000000100001000 RDI: 000000000000016a
[    5.475045] RBP: 000000000000016a R08: ffffb8254044fb00 R09: 0000000000000000
[    5.475725] R10: ffff922d02761de8 R11: 00657361656c6572 R12: ffffffffc0e5d140
[    5.476405] R13: ffff922d00b720d0 R14: 0000000076e80000 R15: ffff923078c0cae8
[    5.477124] FS:  00007febd19a18c0(0000) GS:ffff92307c000000(0000) knlGS:0000000000000000
[    5.477811] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.478499] CR2: 0000000000000000 CR3: 000000010256e002 CR4: 00000000001706f0

Fixes: 69d439818f ("drm/i915/display: Make display responsible for probing its own IP")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8991
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230804084600.1005818-1-jani.nikula@intel.com
(cherry picked from commit 1435188307)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-08-14 11:59:30 -04:00
Jani Nikula 2002eb6d3e drm/i915/sdvo: fix panel_type initialization
Commit 3f9ffce576 ("drm/i915: Do panel VBT init early if the VBT
declares an explicit panel type") started using -1 as the value for
unset panel_type. It gets initialized in intel_panel_init_alloc(), but
the SDVO code never calls it.

Call intel_panel_init_alloc() to initialize the panel, including the
panel_type.

Reported-by: Tomi Leppänen <tomi@tomin.site>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8896
Fixes: 3f9ffce576 ("drm/i915: Do panel VBT init early if the VBT declares an explicit panel type")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Tested-by: Tomi Leppänen <tomi@tomin.site>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230803122706.838721-1-jani.nikula@intel.com
(cherry picked from commit 26e60294e8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-08-14 11:59:26 -04:00
Vinay Belgaumkar 5598c9bfdb drm/i915/guc/slpc: Restore efficient freq earlier
This should be done before the soft min/max frequencies are restored.
When we disable the "Ignore efficient frequency" flag, GuC does not
actually bring the requested freq down to RPn.

Specifically, this scenario-

- ignore efficient freq set to true
- reduce min to RPn (from efficient)
- suspend
- resume (includes GuC load, restore soft min/max, restore efficient freq)
- validate min freq has been resored to RPn

This will fail if we didn't first restore(disable, in this case) efficient
freq flag before setting the soft min frequency.

v2: Bring the min freq down to RPn when we disable efficient freq (Rodrigo)
Also made the change to set the min softlimit to RPn at init. Otherwise, we
were storing RPe there.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8736
Fixes: 55f9720dbf ("drm/i915/guc/slpc: Provide sysfs for efficient freq")
Fixes: 95ccf312a1 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230726010044.3280402-1-vinay.belgaumkar@intel.com
(cherry picked from commit 28e671114f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-08-14 11:59:22 -04:00
Marco Vedovati 8e50750f12 libbpf: Set close-on-exec flag on gzopen
Enable the close-on-exec flag when using gzopen. This is especially important
for multithreaded programs making use of libbpf, where a fork + exec could
race with libbpf library calls, potentially resulting in a file descriptor
leaked to the new process. This got missed in 59842c5451 ("libbpf: Ensure
libbpf always opens files with O_CLOEXEC").

Fixes: 59842c5451 ("libbpf: Ensure libbpf always opens files with O_CLOEXEC")
Signed-off-by: Marco Vedovati <marco.vedovati@crowdstrike.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230810214350.106301-1-martin.kelly@crowdstrike.com
2023-08-14 17:35:32 +02:00
Steve French 7b38f6ddc9 smb3: display network namespace in debug information
We recently had problems where a network namespace was deleted
causing hard to debug reconnect problems.  To help deal with
configuration issues like this it is useful to dump the network
namespace to better debug what happened.

So add this to information displayed in /proc/fs/cifs/DebugData for
the server (and channels if mounted with multichannel). For example:

   Local Users To Server: 1 SecMode: 0x1 Req On Wire: 0 Net namespace: 4026531840

This can be easily compared with what is displayed for the
processes on the system. For example /proc/1/ns/net in this case
showed the same thing (see below), and we can see that the namespace
is still valid in this example.

   'net:[4026531840]'

Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-14 08:41:29 -05:00
Russell Harmon via samba-technical 69513dd669 cifs: Release folio lock on fscache read hit.
Under the current code, when cifs_readpage_worker is called, the call
contract is that the callee should unlock the page. This is documented
in the read_folio section of Documentation/filesystems/vfs.rst as:

> The filesystem should unlock the folio once the read has completed,
> whether it was successful or not.

Without this change, when fscache is in use and cache hit occurs during
a read, the page lock is leaked, producing the following stack on
subsequent reads (via mmap) to the page:

$ cat /proc/3890/task/12864/stack
[<0>] folio_wait_bit_common+0x124/0x350
[<0>] filemap_read_folio+0xad/0xf0
[<0>] filemap_fault+0x8b1/0xab0
[<0>] __do_fault+0x39/0x150
[<0>] do_fault+0x25c/0x3e0
[<0>] __handle_mm_fault+0x6ca/0xc70
[<0>] handle_mm_fault+0xe9/0x350
[<0>] do_user_addr_fault+0x225/0x6c0
[<0>] exc_page_fault+0x84/0x1b0
[<0>] asm_exc_page_fault+0x27/0x30

This requires a reboot to resolve; it is a deadlock.

Note however that the call to cifs_readpage_from_fscache does mark the
page clean, but does not free the folio lock. This happens in
__cifs_readpage_from_fscache on success. Releasing the lock at that
point however is not appropriate as cifs_readahead also calls
cifs_readpage_from_fscache and *does* unconditionally release the lock
after its return. This change therefore effectively makes
cifs_readpage_worker work like cifs_readahead.

Signed-off-by: Russell Harmon <russ@har.mn>
Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-14 08:39:12 -05:00
Luca Ceresoli e8470c0a7b drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs
Commit 03e909acd9 ("drm/panel: simple: Add support for AUO G121EAN01.4
panel") added support for this panel model, but the timings it implements
are very different from what the datasheet describes. I checked both the
G121EAN01.0 datasheet from [0] and the G121EAN01.4 one from [1] and they
all have the same timings: for example the LVDS clock typical value is 74.4
MHz, not 66.7 MHz as implemented.

Replace the timings with the ones from the documentation. These timings
have been tested and the clock frequencies verified with an oscilloscope to
ensure they are correct.

Also use struct display_timing instead of struct drm_display_mode in order
to also specify the minimum and maximum values.

[0] https://embedded.avnet.com/product/g121ean01-0/
[1] https://embedded.avnet.com/product/g121ean01-4/

Fixes: 03e909acd9 ("drm/panel: simple: Add support for AUO G121EAN01.4 panel")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230804151239.835216-1-luca.ceresoli@bootlin.com
2023-08-14 14:50:47 +02:00
David Heidelberg ae6546835e drm/panel: JDI LT070ME05000 simplify with dev_err_probe()
Use the dev_err_probe() helper to simplify error handling during probe.
This also handle scenario, when EDEFER is returned and useless error is printed.

Fixes error:
panel-jdi-lt070me05000 4700000.dsi.0: cannot get enable-gpio -517

Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230812185239.378582-1-david@ixit.cz
2023-08-14 14:48:49 +02:00
Petr Machata 855067defa selftests: mirror_gre_changes: Tighten up the TTL test match
This test verifies whether the encapsulated packets have the correct
configured TTL. It does so by sending ICMP packets through the test
topology and mirroring them to a gretap netdevice. On a busy host
however, more than just the test ICMP packets may end up flowing
through the topology, get mirrored, and counted. This leads to
potential spurious failures as the test observes much more mirrored
packets than the sent test packets, and assumes a bug.

Fix this by tightening up the mirror action match. Change it from
matchall to a flower classifier matching on ICMP packets specifically.

Fixes: 45315673e0 ("selftests: forwarding: Test changes in mirror-to-gretap")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 11:14:24 +01:00
Helge Deller 7a894c8737 parisc: Fix CONFIG_TLB_PTLOCK to work with lightweight spinlock checks
For the TLB_PTLOCK checks we used an optimization to store the spc
register into the spinlock to unlock it. This optimization works as
long as the lightweight spinlock checks (CONFIG_LIGHTWEIGHT_SPINLOCK_CHECK)
aren't enabled, because they really check if the lock word is zero or
__ARCH_SPIN_LOCK_UNLOCKED_VAL and abort with a kernel crash
("Spinlock was trashed") otherwise.

Drop that optimization to make it possible to activate both checks
at the same time.

Noticed-by: Sam James <sam@gentoo.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Sam James <sam@gentoo.org>
Cc: stable@vger.kernel.org # v6.4+
Fixes: 15e64ef652 ("parisc: Add lightweight spinlock checks")
2023-08-14 09:54:49 +02:00
Vlad Buslov ace0ab3a4b Revert "vlan: Fix VLAN 0 memory leak"
This reverts commit 718cb09aaa.

The commit triggers multiple syzbot issues, probably due to possibility of
manually creating VLAN 0 on netdevice which will cause the code to delete
it since it can't distinguish such VLAN from implicit VLAN 0 automatically
created for devices with NETIF_F_HW_VLAN_CTAG_FILTER feature.

Reported-by: syzbot+662f783a5cdf3add2719@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000090196d0602a6167d@google.com/
Reported-by: syzbot+4b4f06495414e92701d5@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000096ae870602a61602@google.com/
Reported-by: syzbot+d810d3cd45ed1848c3f7@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/0000000000009f0f9c0602a616ce@google.com/
Fixes: 718cb09aaa ("vlan: Fix VLAN 0 memory leak")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:14:00 +01:00
Gabor Juhos 83b5f0253b net: phy: Introduce PSGMII PHY interface mode
The PSGMII interface is similar to QSGMII. The main difference
is that the PSGMII interface combines five SGMII lines into a
single link while in QSGMII only four lines are combined.

Similarly to the QSGMII, this interface mode might also needs
special handling within the MAC driver.

It is commonly used by Qualcomm with their QCA807x PHY series and
modern WiSoC-s.

Add definitions for the PHY layer to allow to express this type
of connection between the MAC and PHY.

Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:12:53 +01:00
Robert Marko de875d35e0 dt-bindings: net: ethernet-controller: add PSGMII mode
Add a new PSGMII mode which is similar to QSGMII with the difference being
that it combines 5 SGMII lines into a single link compared to 4 on QSGMII.

It is commonly used by Qualcomm on their QCA807x PHY series.

Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:12:53 +01:00
David S. Miller 2d93c30c4e Merge branch 'mlxsw-redirection'
Petr Machata says:

====================
mlxsw: Support traffic redirection from a locked bridge port

Ido Schimmel writes:

It is possible to add a filter that redirects traffic from the ingress
of a bridge port that is locked (i.e., performs security / SMAC lookup)
and has learning enabled. For example:

 # ip link add name br0 type bridge
 # ip link set dev swp1 master br0
 # bridge link set dev swp1 learning on locked on mab on
 # tc qdisc add dev swp1 clsact
 # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2

In the kernel's Rx path, this filter is evaluated before the Rx handler
of the bridge, which means that redirected traffic should not be
affected by bridge port configuration such as learning.

However, the hardware data path is a bit different and the redirect
action (FORWARDING_ACTION in hardware) merely attaches a pointer to the
packet, which is later used by the L2 lookup stage to understand how to
forward the packet. Between both stages - ingress ACL and L2 lookup -
learning and security lookup are performed, which means that redirected
traffic is affected by bridge port configuration, unlike in the kernel's
data path.

The learning discrepancy was handled in commit 577fa14d21 ("mlxsw:
spectrum: Do not process learned records with a dummy FID") by simply
ignoring learning notifications generated by the redirected traffic. A
similar solution is not possible for the security / SMAC lookup since
- unlike learning - the CPU is not involved and packets that failed the
lookup are dropped by the device.

Instead, solve this by prepending the ignore action to the redirect
action and use it to instruct the device to disable both learning and
the security / SMAC lookup for redirected traffic.

Patch #1 adds the ignore action.

Patch #2 prepends the action to the redirect action in flower offload
code.

Patch #3 removes the workaround in commit 577fa14d21 ("mlxsw:
spectrum: Do not process learned records with a dummy FID") since it is
no longer needed.

Patch #4 adds a test case.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:11:14 +01:00
Ido Schimmel 38c43a1ce7 selftests: forwarding: Add test case for traffic redirection from a locked port
Check that traffic can be redirected from a locked bridge port and that
it does not create locked FDB entries.

Cc: Hans J. Schultz <netdev@kapio-technology.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:11:14 +01:00
Ido Schimmel 9793a5a9c4 mlxsw: spectrum: Stop ignoring learning notifications from redirected traffic
As explained in the previous patch, with the ignore action prepended to
the redirect action, it is not longer possible for redirected traffic to
generate learning notifications.

Therefore, remove the workaround that was added in commit 577fa14d21
("mlxsw: spectrum: Do not process learned records with a dummy FID") as
it is no longer needed.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:11:14 +01:00
Ido Schimmel 0433670e13 mlxsw: spectrum_flower: Disable learning and security lookup when redirecting
It is possible to add a filter that redirects traffic from the ingress
of a bridge port that is locked (i.e., performs security / SMAC lookup)
and has learning enabled. For example:

 # ip link add name br0 type bridge
 # ip link set dev swp1 master br0
 # bridge link set dev swp1 learning on locked on mab on
 # tc qdisc add dev swp1 clsact
 # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2

In the kernel's Rx path, this filter is evaluated before the Rx handler
of the bridge, which means that redirected traffic should not be
affected by bridge port configuration such as learning.

However, the hardware data path is a bit different and the redirect
action (FORWARDING_ACTION in hardware) merely attaches a pointer to the
packet, which is later used by the L2 lookup stage to understand how to
forward the packet. Between both stages - ingress ACL and L2 lookup -
learning and security lookup are performed, which means that redirected
traffic is affected by bridge port configuration, unlike in the kernel's
data path.

The learning discrepancy was handled in commit 577fa14d21 ("mlxsw:
spectrum: Do not process learned records with a dummy FID") by simply
ignoring learning notifications generated by the redirected traffic. A
similar solution is not possible for the security / SMAC lookup since
- unlike learning - the CPU is not involved and packets that failed the
lookup are dropped by the device.

Instead, solve this by prepending the ignore action to the redirect
action and use it to instruct the device to disable both learning and
the security / SMAC lookup for redirected traffic.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:11:14 +01:00
Ido Schimmel d0d449c747 mlxsw: core_acl_flex_actions: Add IGNORE_ACTION
Add the IGNORE_ACTION which is used to ignore basic switching functions
such as learning on a per-packet basis.

The action will be prepended to the FORWARDING_ACTION in subsequent
patches.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:11:14 +01:00
Furong Xu 58c1e0bace net: stmmac: xgmac: show more MAC HW features in debugfs
1. Show TSSTSSEL(Timestamp System Time Source),
ADDMACADRSEL(additional MAC addresses), SMASEL(SMA/MDIO Interface),
HDSEL(Half-duplex Support) in debugfs.
2. Show exact number of additional MAC address registers for XGMAC2 core.
3. XGMAC2 core does not have different IP checksum offload types, so just
show rx_coe instead of rx_coe_type1 or rx_coe_type2.
4. XGMAC2 core does not have rxfifo_over_2048 definition, skip it.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-14 08:09:00 +01:00