Commit Graph

113753 Commits

Author SHA1 Message Date
Gal Pressman 16ab85e784 net/mlx5e: Expose rx_oversize_pkts_buffer counter
Add the rx_oversize_pkts_buffer counter to ethtool statistics.
This counter exposes the number of dropped received packets due to
length which arrived to RQ and exceed software buffer size allocated by
the device for incoming traffic. It might imply that the device MTU is
larger than the software buffers size.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:29 -07:00
Maxim Mikityanskiy c2c9e31dfa net/mlx5e: xsk: Optimize for unaligned mode with 3072-byte frames
When XSK frame size is 3072 (or another power of two multiplied by 3),
KLM mechanism for NIC virtual memory page mapping can be optimized by
replacing it with KSM.

Before this change, two KLM entries were needed to map an XSK frame that
is not a power of two: one entry maps the UMEM memory up to the frame
length, the other maps the rest of the stride to the garbage page.

When the frame length divided by 3 is a power of two, it can be mapped
using 3 KSM entries, and the fourth will map the rest of the stride to
the garbage page. All 4 KSM entries are of the same size, which allows
for a much faster lookup.

Frame size 3072 is useful in certain use cases, because it allows
packing 4 frames into 3 pages. Generally speaking, other frame sizes
equal to PAGE_SIZE minus a power of two can be optimized in a similar
way, but it will require many more KSMs per frame, which slows down UMRs
a little bit, but more importantly may hit the limit for the maximum
number of KSM entries.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:28 -07:00
Maxim Mikityanskiy c6f0420468 net/mlx5e: xsk: Print a warning in slow configurations
On striding RQ, when the XSK frame size doesn't match the MKey page
size, KLM is used for memory mappings, which is a slower mechanism than
MTT or KSM. It may happen in two cases:

1. Frame size is not a power of two (only possible in the unaligned mode
of XSK).

2. Frame size is 2048 bytes, and the firmware doesn't support MKey pages
smaller than 4096 bytes.

Depending on the case, print a warning and recommend to disable striding
RQ or upgrade the firmware.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:28 -07:00
Maxim Mikityanskiy 1392134510 net/mlx5e: xsk: Use KLM to protect frame overrun in unaligned mode
XSK RQs support striding RQ linear mode, but the stride size may be
bigger than the XSK frame size, because:

1. The stride size must be a power of two.

2. The stride size must be equal to the UMR page size. Each XSK frame is
treated as a separate page, because they aren't necessarily adjacent in
physical memory, so the driver can't put more than one stride per page.

3. The minimal MTT page size is 4096 on older firmware.

That means that if XSK frame size is 2048 or not a power of two, the
strides may be bigger than XSK frames. Normally, it's not a problem if
the hardware enforces the MTU. However, traffic between vports skips the
hardware MTU check, and oversized packets may be received.

If an oversized packet is bigger than the XSK frame but not bigger than
the stride, it will cause overwriting of the adjacent UMEM region. If
the packet takes more than one stride, they can be recycled for reuse,
so it's not a problem when the XSK frame size matches the stride size.

Work around the above issue by leveraging KLM to make a more
fine-grained mapping. The beginning of each stride is mapped to the
frame memory, and the padding up to the closest power of two is mapped
to the overflow page that doesn't belong to UMEM. This way, application
data corruption won't happen upon receiving packets bigger than MTU.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:28 -07:00
Maxim Mikityanskiy 9f123f7404 net/mlx5e: Improve MTT/KSM alignment
Make mlx5e_mpwrq_mtts_per_wqe take into account that KSM requires
smaller alignment than MTT.

Ensure that there is always an even amount of MTTs in a UMR WQE, so that
complete octwords are formed, and no garbage is mapped.

Drop extra alignment in MLX5_MTT_OCTW that may cause setting too big
ucseg->xlt_octowords, also leading to mapping garbage.

Generalize some calculations by introducing the MLX5_OCTWORD constant.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:28 -07:00
Maxim Mikityanskiy 168723c1f8 net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters
Instead of passing the unaligned flag, pass an enum that indicates the
UMR mode. The next commit will add the third mode (KLM for certain
configurations of XSK), which will be added to this enum instead of
adding another bool flag everywhere.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:27 -07:00
Maxim Mikityanskiy cfb4d09c30 net/mlx5e: xsk: Improve need_wakeup logic
XSK need_wakeup mechanism allows the driver to stop busy waiting for
buffers when the fill ring is empty, yield to the application and signal
it that the driver needs to be waken up after the application refills
the fill ring.

Add protection against the race condition on the RX (refill) side: if
the application refills buffers after xskrq->post_wqes is called, but
before mlx5e_xsk_update_rx_wakeup, NAPI will exit, skipping taking these
buffers to the hardware WQ, and the application won't wake it up again.

Optimize the whole need_wakeup logic, removing unneeded flows, to
compensate for this new check.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:27 -07:00
Maxim Mikityanskiy 1ca6492ec9 net/mlx5e: xsk: Include XSK skb_from_cqe callbacks in INDIRECT_CALL
XSK is a performance-critical data path. To avoid an indirect function
call with a retpoline, include XSK callbacks in the INDIRECT_CALL macro,
so that they are called directly in XSK flows.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:27 -07:00
Maxim Mikityanskiy a2740f529d net/mlx5e: xsk: Set napi_id to support busy polling
xdp_rxq_info_reg should get the actual napi_id, not 0, in order to
support socket busy polling properly.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:27 -07:00
Maxim Mikityanskiy 082a9edf12 net/mlx5e: xsk: Flush RQ on XSK activation to save memory
The regular RQ remains open after opening an XSK socket, in order to
guarantee that closing the XSK socket never fails due to an error when
reopening the regular RQ.

To save memory, the regular RQ can be deactivated and flushed, releasing
all pages, when an XSK socket is open.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:55:26 -07:00
Russell King (Oracle) 0152dfee23 net: mvpp2: fix mvpp2 debugfs leak
When mvpp2 is unloaded, the driver specific debugfs directory is not
removed, which technically leads to a memory leak. However, this
directory is only created when the first device is probed, so the
hardware is present. Removing the module is only something a developer
would to when e.g. testing out changes, so the module would be
reloaded. So this memory leak is minor.

The original attempt in commit fe2c9c61f6 ("net: mvpp2: debugfs: fix
memory leak when using debugfs_lookup()") that was labelled as a memory
leak fix was not, it fixed a refcount leak, but in doing so created a
problem when the module is reloaded - the directory already exists, but
mvpp2_root is NULL, so we lose all debugfs entries. This fix has been
reverted.

This is the alternative fix, where we remove the offending directory
whenever the driver is unloaded.

Fixes: 21da57a231 ("net: mvpp2: add a debugfs interface for the Header Parser")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:53:46 -07:00
Alex Elder a4388da51a net: ipa: update copyrights
Some source files state copyright dates that are earlier than the
last modification of the file.  Change the copyright year to 2022 in
all such cases.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220930224549.3503434-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:49:20 -07:00
Alex Elder ace5dc6162 net: ipa: update comments
This patch just updates comments throughout the IPA code.

Transaction state is now tracked using indexes into an array rather
than linked lists, and a few comments refer to the "old way" of
doing things.  The description of how transactions are used was
changed to refer to "operations" rather than "commands", to
(hopefully) remove a possible ambiguity.

IPA register offsets and fields are now handled differently as well,
and the register documentation is updated to better describe the
code.

A few minor updates to comments were made (e.g., adding a missing
word, fixing a typo or punctuation, etc.).

Finally, the local macro atomic_dec_not_zero() is no longer used, so
it is deleted.

Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220930224527.3503404-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:49:05 -07:00
Andrew Gaul 93e2be344a r8152: Rate limit overflow messages
My system shows almost 10 million of these messages over a 24-hour
period which pollutes my logs.

Signed-off-by: Andrew Gaul <gaul@google.com>
Link: https://lore.kernel.org/r/20221002034128.2026653-1-gaul@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:48:31 -07:00
Nathan Huckleberry 450a580fc4 net: lan966x: Fix return type of lan966x_port_xmit
The ndo_start_xmit field in net_device_ops is expected to be of type
netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev).

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.

The return type of lan966x_port_xmit should be changed from int to
netdev_tx_t.

Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Cc: llvm@lists.linux.dev
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220929182704.64438-1-nhuck@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-03 16:40:16 -07:00
Linus Torvalds a5088ee725 Thermal control updates for 6.1-rc1
- Rework the device tree initialization, convert the drivers to the
    new API and remove the old OF code (Daniel Lezcano).
 
  - Fix return value to -ENODEV when searching for a specific thermal
    zone which does not exist (Daniel Lezcano).
 
  - Fix the return value inspection in of_thermal_zone_find() (Dan
    Carpenter).
 
  - Fix kernel panic when KASAN is enabled as it detects use after
    free when unregistering a thermal zone (Daniel Lezcano).
 
  - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano).
 
  - Remove unnecessary error message as it is already shown in the
    underlying function (Jiapeng Chong).
 
  - Rework the monitoring path and move the locks upper in the call
    stack to fix some potentials race windows (Daniel Lezcano).
 
  - Fix lockdep_assert() warning introduced by the lock rework (Daniel
    Lezcano).
 
  - Do not lock thermal zone mutex in the user space governor (Rafael
    Wysocki).
 
  - Revert the Mellanox 'hotter thermal zone' feature because it is
    already handled in the thermal framework core code (Daniel Lezcano).
 
  - Increase maximum number of trip points in the thermal core (Sumeet
    Pawnikar).
 
  - Replace strlcpy() with unused retval with strscpy() in the core
    thermal control code (Wolfram Sang).
 
  - Use module_pci_driver() macro in the int340x processor_thermal
    driver (Shang XiaoJing).
 
  - Use get_cpu() instead of smp_processor_id() in the intel_powerclamp
    thermal driver to prevent it from crashing and remove unused
    accounting for IRQ wakes from it (Srinivas Pandruvada).
 
  - Consolidate priv->data_vault checks in int340x_thermal (Rafael
    Wysocki).
 
  - Check the policy first in cpufreq_cooling_register() (Xuewen Yan).
 
  - Drop redundant error message from da9062-thermal (zhaoxiao).
 
  - Drop of_match_ptr() from thermal_mmio (Jean Delvare).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OzUSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx8CMP/1kNnDUjtCBIvl/OJEM0yVlEGJNppNu9
 dCNQwuftAJRt7dBB292clKW15ef1fFAHrW/eOy+z62k6k/tfqnMLEK38hpS4pIaL
 PZRjRfpp2CDmBBVTJ0I2nNC9h0zZY4tQK+JCII1M2UmgtLaGbp3NuIu2Ga4i5bNR
 IYE3QvVz/g2/pRNa/BTsJySje/q7wv231Vd5jg4czg58EyntBGO4gmWNuXyZ6OkF
 ijzcu7K5Tkll6DT+Paw8bGcPCyjBtoBhv2A6HtsC3cYfc2tY9BVf3DG2HlG0qTAU
 pIZsLOlUozzJScVbu8ScKj1hlP/iCKcPgVjP4l+U57EtcY/ULBqrQtkDVtGedNOZ
 wa5a1VUsZDrigWRpj1u5SsDWmbAod5uK5X/3naUfH7XtomhqikfaZK7Euek6kfDN
 SEhZaBycAFWlI/L5cG2sd6BBcmWlBqkaHiQ0zEv2YlALY5SkNOUQrrq2A/1LP1Ae
 67xbzbWFXyR8DAq3YyfnLpqgcJcSiyGxmdKZ1u2XHudXeJHKdAB7xlcJz+hfWpYU
 Rd1ZTB3ATA/IMG1rLO2dKgaMmyQCWPG6oXtJjTH0g3sXm0U6K14dwEU1ey9u/Li6
 DmI4cWaXf8WoRvb3rkEcJliayUed4U/ohU+z1bInSE8EuCBuyMLRhoJdi3ZhunOC
 PAyKcHg8fLy9
 =Ufw7
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "The most significant part of this update is the thermal control DT
  initialization rework from Daniel Lezcano and the following conversion
  of drivers to use the new API introduced by it

  Apart from that, the maximum number of trip points in a thermal zone
  is increased and there are some fixes and code cleanups

  Specifics:

   - Rework the device tree initialization, convert the drivers to the
     new API and remove the old OF code (Daniel Lezcano)

   - Fix return value to -ENODEV when searching for a specific thermal
     zone which does not exist (Daniel Lezcano)

   - Fix the return value inspection in of_thermal_zone_find() (Dan
     Carpenter)

   - Fix kernel panic when KASAN is enabled as it detects use after free
     when unregistering a thermal zone (Daniel Lezcano)

   - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano)

   - Remove unnecessary error message as it is already shown in the
     underlying function (Jiapeng Chong)

   - Rework the monitoring path and move the locks upper in the call
     stack to fix some potentials race windows (Daniel Lezcano)

   - Fix lockdep_assert() warning introduced by the lock rework (Daniel
     Lezcano)

   - Do not lock thermal zone mutex in the user space governor (Rafael
     Wysocki)

   - Revert the Mellanox 'hotter thermal zone' feature because it is
     already handled in the thermal framework core code (Daniel Lezcano)

   - Increase maximum number of trip points in the thermal core (Sumeet
     Pawnikar)

   - Replace strlcpy() with unused retval with strscpy() in the core
     thermal control code (Wolfram Sang)

   - Use module_pci_driver() macro in the int340x processor_thermal
     driver (Shang XiaoJing)

   - Use get_cpu() instead of smp_processor_id() in the intel_powerclamp
     thermal driver to prevent it from crashing and remove unused
     accounting for IRQ wakes from it (Srinivas Pandruvada)

   - Consolidate priv->data_vault checks in int340x_thermal (Rafael
     Wysocki)

   - Check the policy first in cpufreq_cooling_register() (Xuewen Yan)

   - Drop redundant error message from da9062-thermal (zhaoxiao)

   - Drop of_match_ptr() from thermal_mmio (Jean Delvare)"

* tag 'thermal-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits)
  thermal: core: Increase maximum number of trip points
  thermal: int340x: processor_thermal: Use module_pci_driver() macro
  thermal: intel_powerclamp: Remove accounting for IRQ wakes
  thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash
  thermal: int340x_thermal: Consolidate priv->data_vault checks
  thermal: cpufreq_cooling: Check the policy first in cpufreq_cooling_register()
  thermal: Drop duplicate words from comments
  thermal: move from strlcpy() with unused retval to strscpy()
  thermal: da9062-thermal: Drop redundant error message
  thermal/drivers/thermal_mmio: Drop of_match_ptr()
  thermal: gov_user_space: Do not lock thermal zone mutex
  Revert "mlxsw: core: Add the hottest thermal zone detection"
  thermal/core: Fix lockdep_assert() warning
  thermal/core: Move the mutex inside the thermal_zone_device_update() function
  thermal/core: Move the thermal zone lock out of the governors
  thermal/governors: Group the thermal zone lock inside the throttle function
  thermal/core: Rework the monitoring a bit
  thermal/core: Rearm the monitoring only one time
  thermal/drivers/qcom/spmi-adc-tm5: Remove unnecessary print function dev_err()
  thermal/of: Remove old OF code
  ...
2022-10-03 15:33:38 -07:00
Alexander Potapenko 440fed95eb crypto: kmsan: disable accelerated configs under KMSAN
KMSAN is unable to understand when initialized values come from assembly. 
Disable accelerated configs in KMSAN builds to prevent false positive
reports.

Link: https://lkml.kernel.org/r/20220915150417.722975-27-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:22 -07:00
Rafael J. Wysocki a7ae50fc34 Merge branch 'thermal-core'
Merge core thermal control changes for 6.1-rc1:

 - Increase maximum number of trip points in the thermal core (Sumeet
   Pawnikar).

 - Replace strlcpy() with unused retval with strscpy() in the core
   thermal control code (Wolfram Sang).

 - Do not lock thermal zone mutex in the user space governor (Rafael
   Wysocki)

 - Rework the device tree initialization, convert the drivers to the
   new API and remove the old OF code (Daniel Lezcano)

 - Fix return value to -ENODEV when searching for a specific thermal
   zone which does not exist (Daniel Lezcano)

 - Fix the return value inspection in of_thermal_zone_find() (Dan
   Carpenter)

 - Fix kernel panic when KASAN is enabled as it detects use after
   free when unregistering a thermal zone (Daniel Lezcano)

 - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano)

 - Remove unnecessary error message as it is already showed in the
   underlying function (Jiapeng Chong)

 - Rework the monitoring path and move the locks upper in the call
   stack to fix some potentials race windows (Daniel Lezcano)

 - Fix lockdep_assert() warning introduced by the lock rework (Daniel
   Lezcano)

 - Revert the Mellanox 'hotter thermal zone' feature because it is
   already handled in the thermal framework core code (Daniel Lezcano)

* thermal-core: (47 commits)
  thermal: core: Increase maximum number of trip points
  thermal: move from strlcpy() with unused retval to strscpy()
  thermal: gov_user_space: Do not lock thermal zone mutex
  Revert "mlxsw: core: Add the hottest thermal zone detection"
  thermal/core: Fix lockdep_assert() warning
  thermal/core: Move the mutex inside the thermal_zone_device_update() function
  thermal/core: Move the thermal zone lock out of the governors
  thermal/governors: Group the thermal zone lock inside the throttle function
  thermal/core: Rework the monitoring a bit
  thermal/core: Rearm the monitoring only one time
  thermal/drivers/qcom/spmi-adc-tm5: Remove unnecessary print function dev_err()
  thermal/of: Remove old OF code
  thermal/core: Move set_trip_temp ops to the sysfs code
  thermal/drivers/samsung: Switch to new of thermal API
  regulator/drivers/max8976: Switch to new of thermal API
  Input: sun4i-ts - switch to new of thermal API
  iio/drivers/sun4i_gpadc: Switch to new of thermal API
  hwmon/drivers/core: Switch to new of thermal API
  hwmon: pm_bus: core: Switch to new of thermal API
  ata/drivers/ahci_imx: Switch to new of thermal API
  ...
2022-10-03 20:38:43 +02:00
Maxim Mikityanskiy ba0fbdb95d net: wwan: iosm: Call mutex_init before locking it
wwan_register_ops calls wwan_create_default_link, which ends up in the
ipc_wwan_newlink callback that locks ipc_wwan->if_mutex. However, this
mutex is not yet initialized by that point. Fix it by moving mutex_init
above the wwan_register_ops call. This also makes the order of
operations in ipc_wwan_init symmetric to ipc_wwan_deinit.

Fixes: 83068395bb ("net: iosm: create default link via WWAN core")
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 13:25:43 +01:00
Subbaraya Sundeep c54ffc7360 octeontx2-pf: mcs: Introduce MACSEC hardware offloading
This patch introduces the macsec offload feature to cn10k
PF netdev driver. The macsec offload ops like adding, deleting
and updating SecYs, SCs, SAs and stats are supported. XPN support
will be added in later patches. Some stats use same counter in hardware
which means based on the SecY mode the same counter represents different
stat. Hence when SecY mode/policy is changed then snapshot of current
stats are captured. Also there is no provision to specify the unique
flow-id/SCI per packet to hardware hence different mac address needs to
be set for macsec interfaces.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya d06c2aba51 octeontx2-af: cn10k: mcs: Add debugfs support
This patch adds debugfs entry to dump MCS secy, sc,
sa, flowid and port stats. This helps in debugging
the packet path and to figure out where exactly packet
was dropped.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya 6c635f78c4 octeontx2-af: cn10k: mcs: Handle MCS block interrupts
Hardware triggers an interrupt for events like PN wrap to zero,
PN crosses set threshold. This interrupt is received
by the MCS_AF. MCS AF then finds the PF/VF to which SA is mapped
and notifies them using mcs_intr_notify mbox message.

PF/VF using mcs_intr_cfg mbox can configure the list
of interrupts for which they want to receive the
notification from AF.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya 9312150af8 octeontx2-af: cn10k: mcs: Support for stats collection
Add mailbox messages to return the resource stats to the
caller. Stats of SecY, SC and SAs as per the macsec standard,
TCAM flow id hits/miss, mailbox to clear the stats are
implemented.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya bd69476e86 octeontx2-af: cn10k: mcs: Install a default TCAM for normal traffic
Out of all the TCAM entries, reserve last TX and RX TCAM flow
entry(low priority) so that normal traffic can be sent out and
received. The traffic which needs macsec processing hits the
high priority TCAM flows. Also install a FLR handler to free
the allocated resources for PF/VF.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya cfc14181d4 octeontx2-af: cn10k: mcs: Manage the MCS block hardware resources
To establish a macsec connection association netdev driver
needs hardware resources like SecY, TCAM flows, SCs and SAs.
This patch manages allocating, freeing and configuring those
resources. AF consumers can request resources and configure them
via these mailbox messages. AF can allocate until it runs out of
hardware resources.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:19 +01:00
Geetha sowjanya 080bbd19c9 octeontx2-af: cn10k: mcs: Add mailboxes for port related operations
There are set of configurations to be done at MCS port level like
bringing port out of reset, making port as operational or bypass.
This patch adds all the port related mailbox message handlers
so that AF consumers can use them.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:18 +01:00
Geetha sowjanya ca7f49ff88 octeontx2-af: cn10k: Introduce driver for macsec block.
CN10K-B and CNF10K-B has macsec block(MCS) to encrypt and
decrypt packets at MAC level. This block is a global resource
with hardware resources like SecYs, SCs and SAs and is in
between NIX block and RPM LMAC. CN10K-B silicon has only one MCS
block which receives packets from all LMACS whereas CNF10K-B has
seven MCS blocks for seven LMACs. Both MCS blocks are
similar in operation except for few register offsets and some
configurations require writing to different registers. Those
differences between IPs are handled using separate ops.
This patch adds basic driver and does the initial hardware
calibration and parser configuration.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:50:18 +01:00
Zheng Wang 12aece8b01 eth: sp7021: fix use after free bug in spl2sw_nvmem_get_mac_address
This frees "mac" and tries to display its address as part of the error
message on the next line.  Swap the order.

Fixes: fd3040b939 ("net: ethernet: Add driver for Sunplus SP7021")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:47:42 +01:00
Horatiu Vultur b69e95397c net: lan966x: Add port mirroring support using tc-matchall
Add support for port mirroring. It is possible to mirror only one port
at a time and it is possible to have both ingress and egress mirroring.
Frames injected by the CPU don't get egress mirrored because they are
bypassing the analyzer module.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:46:46 +01:00
Horatiu Vultur 5390334b59 net: lan966x: Add port police support using tc-matchall
Add support for port police. It is possible to police only on the
ingress side. To be able to add police support also it was required to
add tc-matchall classifier offload support.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:46:46 +01:00
Shenwei Wang 95698ff617 net: fec: using page pool to manage RX buffers
This patch optimizes the RX buffer management by using the page
pool. The purpose for this change is to prepare for the following
XDP support. The current driver uses one frame per page for easy
management.

Added __maybe_unused attribute to the following functions to avoid
the compiling warning. Those functions will be removed by a separate
patch once this page pool solution is accepted.
 - fec_enet_new_rxbdp
 - fec_enet_copybreak

The following are the comparing result between page pool implementation
and the original implementation (non page pool).

 --- small packet (64 bytes) testing are almost the same
 --- no matter what the implementation is
 --- on both i.MX8 and i.MX6SX platforms.

shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1 -l 64
------------------------------------------------------------
Client connecting to 10.81.16.245, TCP port 5001
TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
------------------------------------------------------------
[  1] local 10.81.17.20 port 39728 connected with 10.81.16.245 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-1.0000 sec  37.0 MBytes   311 Mbits/sec
[  1] 1.0000-2.0000 sec  36.6 MBytes   307 Mbits/sec
[  1] 2.0000-3.0000 sec  37.2 MBytes   312 Mbits/sec
[  1] 3.0000-4.0000 sec  37.1 MBytes   312 Mbits/sec
[  1] 4.0000-5.0000 sec  37.2 MBytes   312 Mbits/sec
[  1] 5.0000-6.0000 sec  37.2 MBytes   312 Mbits/sec
[  1] 6.0000-7.0000 sec  37.2 MBytes   312 Mbits/sec
[  1] 7.0000-8.0000 sec  37.2 MBytes   312 Mbits/sec
[  1] 0.0000-8.0943 sec   299 MBytes   310 Mbits/sec

 --- Page Pool implementation on i.MX8 ----

shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
------------------------------------------------------------
Client connecting to 10.81.16.245, TCP port 5001
TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
------------------------------------------------------------
[  1] local 10.81.17.20 port 43204 connected with 10.81.16.245 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-1.0000 sec   111 MBytes   933 Mbits/sec
[  1] 1.0000-2.0000 sec   111 MBytes   934 Mbits/sec
[  1] 2.0000-3.0000 sec   112 MBytes   935 Mbits/sec
[  1] 3.0000-4.0000 sec   111 MBytes   933 Mbits/sec
[  1] 4.0000-5.0000 sec   111 MBytes   934 Mbits/sec
[  1] 5.0000-6.0000 sec   111 MBytes   933 Mbits/sec
[  1] 6.0000-7.0000 sec   111 MBytes   931 Mbits/sec
[  1] 7.0000-8.0000 sec   112 MBytes   935 Mbits/sec
[  1] 8.0000-9.0000 sec   111 MBytes   933 Mbits/sec
[  1] 9.0000-10.0000 sec   112 MBytes   935 Mbits/sec
[  1] 0.0000-10.0077 sec  1.09 GBytes   933 Mbits/sec

 --- Non Page Pool implementation on i.MX8 ----

shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1
------------------------------------------------------------
Client connecting to 10.81.16.245, TCP port 5001
TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
------------------------------------------------------------
[  1] local 10.81.17.20 port 49154 connected with 10.81.16.245 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-1.0000 sec   104 MBytes   868 Mbits/sec
[  1] 1.0000-2.0000 sec   105 MBytes   878 Mbits/sec
[  1] 2.0000-3.0000 sec   105 MBytes   881 Mbits/sec
[  1] 3.0000-4.0000 sec   105 MBytes   879 Mbits/sec
[  1] 4.0000-5.0000 sec   105 MBytes   878 Mbits/sec
[  1] 5.0000-6.0000 sec   105 MBytes   878 Mbits/sec
[  1] 6.0000-7.0000 sec   104 MBytes   875 Mbits/sec
[  1] 7.0000-8.0000 sec   104 MBytes   875 Mbits/sec
[  1] 8.0000-9.0000 sec   104 MBytes   873 Mbits/sec
[  1] 9.0000-10.0000 sec   104 MBytes   875 Mbits/sec
[  1] 0.0000-10.0073 sec  1.02 GBytes   875 Mbits/sec

 --- Page Pool implementation on i.MX6SX ----

shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1
------------------------------------------------------------
Client connecting to 10.81.16.245, TCP port 5001
TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
------------------------------------------------------------
[  1] local 10.81.17.20 port 57288 connected with 10.81.16.245 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-1.0000 sec  78.8 MBytes   661 Mbits/sec
[  1] 1.0000-2.0000 sec  82.5 MBytes   692 Mbits/sec
[  1] 2.0000-3.0000 sec  82.4 MBytes   691 Mbits/sec
[  1] 3.0000-4.0000 sec  82.4 MBytes   691 Mbits/sec
[  1] 4.0000-5.0000 sec  82.5 MBytes   692 Mbits/sec
[  1] 5.0000-6.0000 sec  82.4 MBytes   691 Mbits/sec
[  1] 6.0000-7.0000 sec  82.5 MBytes   692 Mbits/sec
[  1] 7.0000-8.0000 sec  82.4 MBytes   691 Mbits/sec
[  1] 8.0000-9.0000 sec  82.4 MBytes   691 Mbits/sec
[  1] 9.0000-9.5506 sec  45.0 MBytes   686 Mbits/sec
[  1] 0.0000-9.5506 sec   783 MBytes   688 Mbits/sec

 --- Non Page Pool implementation on i.MX6SX ----

shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1
------------------------------------------------------------
Client connecting to 10.81.16.245, TCP port 5001
TCP window size:  416 KByte (WARNING: requested 1.91 MByte)
------------------------------------------------------------
[  1] local 10.81.17.20 port 36486 connected with 10.81.16.245 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-1.0000 sec  70.5 MBytes   591 Mbits/sec
[  1] 1.0000-2.0000 sec  64.5 MBytes   541 Mbits/sec
[  1] 2.0000-3.0000 sec  73.6 MBytes   618 Mbits/sec
[  1] 3.0000-4.0000 sec  73.6 MBytes   618 Mbits/sec
[  1] 4.0000-5.0000 sec  72.9 MBytes   611 Mbits/sec
[  1] 5.0000-6.0000 sec  73.4 MBytes   616 Mbits/sec
[  1] 6.0000-7.0000 sec  73.5 MBytes   617 Mbits/sec
[  1] 7.0000-8.0000 sec  73.4 MBytes   616 Mbits/sec
[  1] 8.0000-9.0000 sec  73.4 MBytes   616 Mbits/sec
[  1] 9.0000-10.0000 sec  73.9 MBytes   620 Mbits/sec
[  1] 0.0000-10.0174 sec   723 MBytes   605 Mbits/sec

Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:43:59 +01:00
Jianglei Nie b43f9acbb8 bnx2x: fix potential memory leak in bnx2x_tpa_stop()
bnx2x_tpa_stop() allocates a memory chunk from new_data with
bnx2x_frag_alloc(). The new_data should be freed when gets some error.
But when "pad + len > fp->rx_buf_size" is true, bnx2x_tpa_stop() returns
without releasing the new_data, which will lead to a memory leak.

We should free the new_data with bnx2x_frag_free() when "pad + len >
fp->rx_buf_size" is true.

Fixes: 07b0f00964 ("bnx2x: fix possible panic under memory stress")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:41:07 +01:00
Raju Lakkaraju cb4b12071a eth: lan743x: reject extts for non-pci11x1x devices
Remove PTP_PF_EXTTS support for non-PCI11x1x devices since they do not support
the PTP-IO Input event triggered timestamping mechanisms added

Fixes: 60942c397a ("net: lan743x: Add support for PTP-IO Event Input External Timestamp (extts)")
Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:39:39 +01:00
Jiasheng Jiang 9e6fd874c7 net: prestera: acl: Add check for kmemdup
As the kemdup could return NULL, it should be better to check the return
value and return error if fails.
Moreover, the return value of prestera_acl_ruleset_keymask_set() should
be checked by cascade.

Fixes: 604ba23090 ("net: prestera: flower template support")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Taras Chornyi<tchornyi@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 12:35:21 +01:00
Marek Behún 324e88cbe3 net: sfp: add support for multigig RollBall transceivers
This adds support for multigig copper SFP modules from RollBall/Hilink.
These modules have a specific way to access clause 45 registers of the
internal PHY.

We also need to wait at least 22 seconds after deasserting TX disable
before accessing the PHY. The code waits for 25 seconds just to be sure.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:33 +01:00
Marek Behún 09bbedac72 net: phy: mdio-i2c: support I2C MDIO protocol for RollBall SFP modules
Some multigig SFPs from RollBall and Hilink do not expose functional
MDIO access to the internal PHY of the SFP via I2C address 0x56
(although there seems to be read-only clause 22 access on this address).

Instead these SFPs PHY can be accessed via I2C via the SFP Enhanced
Digital Diagnostic Interface - I2C address 0x51. The SFP_PAGE has to be
selected to 3 and the password must be filled with 0xff bytes for this
PHY communication to work.

This extends the mdio-i2c driver to support this protocol by adding a
special parameter to mdio_i2c_alloc function via which this RollBall
protocol can be selected.

Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:33 +01:00
Marek Behún e85b1347ac net: sfp: create/destroy I2C mdiobus before PHY probe/after PHY release
Instead of configuring the I2C mdiobus when SFP driver is probed,
create/destroy the mdiobus before the PHY is probed for/after it is
released.

This way we can tell the mdio-i2c code which protocol to use for each
SFP transceiver.

Move the code that determines MDIO I2C protocol from
sfp_sm_probe_for_phy() to sfp_sm_mod_probe(), where most of the SFP ID
parsing is done. Don't allocate I2C bus if no PHY is expected.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:33 +01:00
Marek Behún 13c8adcf22 net: sfp: Add and use macros for SFP quirks definitions
Add macros SFP_QUIRK(), SFP_QUIRK_M() and SFP_QUIRK_F() for defining SFP
quirk table entries. Use them to deduplicate the code a little bit.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:33 +01:00
Marek Behún 31eb8907aa net: phylink: allow attaching phy for SFP modules on 802.3z mode
Some SFPs may contain an internal PHY which may in some cases want to
connect with the host interface in 1000base-x/2500base-x mode.
Do not fail if such PHY is being attached in one of these PHY interface
modes.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Pali Rohár <pali@kernel.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:33 +01:00
Russell King d6d2929264 net: phy: marvell10g: select host interface configuration
Select the host interface configuration according to the capabilities of
the host if the host provided them. This is currently provided only when
connecting PHY that is inside a SFP.

The PHY supports several configurations of host communication:
- always communicate with host in 10gbase-r, even if copper speed is
  lower (rate matching mode),
- the same as above but use xaui/rxaui instead of 10gbase-r,
- switch host SerDes mode between 10gbase-r, 5gbase-r, 2500base-x and
  sgmii according to copper speed,
- the same as above but use xaui/rxaui instead of 10gbase-r.

This mode of host communication, called MACTYPE, is by default selected
by strapping pins, but it can be changed in software.

This adds support for selecting this mode according to which modes are
supported by the host.

This allows the kernel to:
- support SFP modules with 88X33X0 or 88E21X0 inside them

Note: we use mv3310_select_mactype() for both 88X3310 and 88X3340,
although 88X3340 does not support XAUI. This is not a problem because
88X3340 does not declare XAUI in it's supported_interfaces, and so this
function will never choose that MACTYPE.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
[ rebase, updated, also added support for 88E21X0 ]
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Marek Behún 3891569b2f net: phy: marvell10g: Use tabs instead of spaces for indentation
Some register definitions were defined with spaces used for indentation.
Change them to tabs.

Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Marek Behún eca68a3c7d net: phylink: pass supported host PHY interface modes to phylib for SFP's PHYs
Pass the supported PHY interface types to phylib if the PHY we are
connecting is inside a SFP, so that the PHY driver can select an
appropriate host configuration mode for their interface according to
the host capabilities.

For example the Marvell 88X3310 PHY inside RollBall SFP modules
defaults to 10gbase-r mode on host's side, and the marvell10g
driver currently does not change this setting. But a host may not
support 10gbase-r. For example Turris Omnia only supports sgmii,
1000base-x and 2500base-x modes. The PHY can be configured to use
those modes, but in order for the PHY driver to do that, it needs
to know which modes are supported.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Russell King (Oracle) e60846370c net: phylink: rename phylink_sfp_config()
phylink_sfp_config() now only deals with configuring the MAC for a
SFP containing a PHY. Rename it to be specific.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Russell King f81fa96d8a net: phylink: use phy_interface_t bitmaps for optical modules
Where a MAC provides a phy_interface_t bitmap, use these bitmaps to
select the operating interface mode for optical SFP modules, rather
than using the linkmode bitmaps.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Russell King fd580c9830 net: sfp: augment SFP parsing with phy_interface_t bitmap
We currently parse the SFP EEPROM to a bitmap of ethtool link modes,
and then attempt to convert the link modes to a PHY interface mode.
While this works at present, there are cases where this is sub-optimal.
For example, where a module can operate with several different PHY
interface modes.

To start addressing this, arrange for the SFP EEPROM parsing to also
provide a bitmap of the possible PHY interface modes.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Russell King (Oracle) 1645f44dd5 net: phylink: add ability to validate a set of interface modes
Rather than having the ability to validate all supported interface
modes or a single interface mode, introduce the ability to validate
a subset of supported modes.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
[ rebased on current net-next ]
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 11:08:32 +01:00
Nathan Huckleberry 73ea735073 net: sparx5: Fix return type of sparx5_port_xmit_impl
The ndo_start_xmit field in net_device_ops is expected to be of type
netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev).

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.

The return type of sparx5_port_xmit_impl should be changed from int to
netdev_tx_t.

Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Cc: llvm@lists.linux.dev
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03 08:07:40 +01:00
Maxim Mikityanskiy 3db4c85cde net/mlx5e: xsk: Use queue indices starting from 0 for XSK queues
In the initial implementation of XSK in mlx5e, XSK RQs coexisted with
regular RQs in the same channel. The main idea was to allow RSS work the
same for regular traffic, without need to reconfigure RSS to exclude XSK
queues.

However, this scheme didn't prove to be beneficial, mainly because of
incompatibility with other vendors. Some tools don't properly support
using higher indices for XSK queues, some tools get confused with the
double amount of RQs exposed in sysfs. Some use cases are purely XSK,
and allocating the same amount of unused regular RQs is a waste of
resources.

This commit changes the queuing scheme to the standard one, where XSK
RQs replace regular RQs on the channels where XSK sockets are open. Two
RQs still exist in the channel to allow failsafe disable of XSK, but
only one is exposed at a time. The next commit will achieve the desired
memory save by flushing the buffers when the regular RQ is unused.

As the result of this transition:

1. It's possible to use RSS contexts over XSK RQs.

2. It's possible to dedicate all queues to XSK.

3. When XSK RQs coexist with regular RQs, the admin should make sure no
unwanted traffic goes into XSK RQs by either excluding them from RSS or
settings up the XDP program to return XDP_PASS for non-XSK traffic.

4. When using a mixed fleet of mlx5e devices and other netdevs, the same
configuration can be applied. If the application supports the fallback
to copy mode on unsupported drivers, it will work too.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:21 -07:00
Maxim Mikityanskiy d9ba64deb2 net/mlx5e: Introduce the mlx5e_flush_rq function
Add a function to flush an RQ: clean up descriptors, release pages and
reset the RQ. This procedure is used by the recovery flow, and it will
also be used in a following commit to free some memory when switching a
channel to the XSK mode.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:21 -07:00
Maxim Mikityanskiy a752b2edb5 net/mlx5e: xsk: Support XDP metadata on XSK RQs
Add support for XDP metadata on XSK RQs for cross-program
communication. The driver no longer calls xdp_set_data_meta_invalid and
copies the metadata to a newly allocated SKB on XDP_PASS.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:21 -07:00
Maxim Mikityanskiy ddb7afeee2 net/mlx5e: Optimize RQ page deallocation
mlx5e_free_rx_mpwqe loops over all pages of a MPWQE, calling
mlx5e_page_release for ones that are not scheduled for XDP_TX or
XDP_REDIRECT; and mlx5e_page_release checks whether it's an XSK RQ or a
regular one for each page/XSK frame. This check can be moved outside the
loop to reduce the number of branches.

mlx5e_free_rx_wqe loops over all fragments, calling mlx5e_page_release
for the ones that are last in a page; and mlx5e_page_release checks
whether it's an XSK RQ or a regular one for each fragment. Using the
fact that XSK doesn't support multiple fragments, it can be optimized
for both XSK and regular usages:

1. Make an early check for XSK and call its deallocator directly, saving
3 branches (loop condition, frag->last_in_page and selection of
deallocator).

2. Call the regular deallocator directly in the non-XSK case, saving a
branch per fragment, except the first one.

After the changes, mlx5e_page_release is removed, as there are no
callers left.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:21 -07:00
Maxim Mikityanskiy 96d37d861a net/mlx5e: Call mlx5e_page_release_dynamic directly where possible
mlx5e_page_release calls the appropriate deallocator depending on
whether it's an XSK RQ or a regular one. Some flows that call this
function are not compatible with XSK, so they can call the non-XSK
deallocator directly to save a branch.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:21 -07:00
Maxim Mikityanskiy 132857d912 net/mlx5e: Use non-XSK page allocator in SHAMPO
The SHAMPO flow is not compatible with XSK, it can call the page pool
allocator directly to save a branch.

mlx5e_page_alloc is removed, as it's no longer used in any flow.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:20 -07:00
Maxim Mikityanskiy cf544517c4 net/mlx5e: xsk: Use xsk_buff_alloc_batch on striding RQ
XSK provides a function to allocate frames in batches for more efficient
processing. This commit starts using this function on striding RQ and
creates an optimized flow for XSK. A side effect is an opportunity to
optimize the regular RX flow by dropping branching for XSK cases.

Performance improvement is up to 6.4% in the aligned mode and up to 7.5%
in the unaligned mode.

Aligned mode, 2048-byte frames: 12.9 Mpps -> 13.8 Mpps
Aligned mode, 4096-byte frames: 11.8 Mpps -> 12.5 Mpps
Unaligned mode, 2048-byte frames: 11.9 Mpps -> 12.8 Mpps
Unaligned mode, 3072-byte frames: 11.4 Mpps -> 12.1 Mpps
Unaligned mode, 4096-byte frames: 11.0 Mpps -> 11.2 Mpps

CPU: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:20 -07:00
Maxim Mikityanskiy 259bbc6436 net/mlx5e: xsk: Use xsk_buff_alloc_batch on legacy RQ
XSK provides a function to allocate frames in batches for more efficient
processing. This commit starts using this function on legacy RQ, adding
a special case for XSK. The new branch introduced basically replaces the
branch that was removed from the same place a few commits before.

A check is made that DMA sync is not needed, because the batching
allocator falls back to returning one frame when DMA sync is needed, and
this is best handled by the loop in the standard case.

Performance improvement is up to 8% in the aligned mode and up to 9% in
the unaligned mode.

Aligned mode, 2048-byte frames: 12.8 Mpps -> 13.5 Mpps
Aligned mode, 4096-byte frames: 11.5 Mpps -> 12.4 Mpps
Unaligned mode, 2048-byte frames: 12.2 Mpps -> 13.4 Mpps
Unaligned mode, 3072-byte frames: 11.6 Mpps -> 12.5 Mpps
Unaligned mode, 4096-byte frames: 11.2 Mpps -> 12.2 Mpps

CPU: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:20 -07:00
Maxim Mikityanskiy a2e5ba242c net/mlx5e: xsk: Split out WQE allocation for legacy XSK RQ
Allocation of XSK frames on legacy RQ may be made more efficient with a
specialized routine that relies on certain assumptions, such as there is
only one fragment, allocation units (XSK frames) are not shared among
multiple packets. It reduces the number of branches both in the XSK code
and in the regular RQ, because with this approach there is only a single
check whether it's an XSK or regular RQ.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:20 -07:00
Maxim Mikityanskiy 0b48223237 net/mlx5e: Remove the outer loop when allocating legacy RQ WQEs
Legacy RQ WQEs are allocated in a loop in small batches (8 WQEs). As
partial batches are allowed, there is no point to have a loop in a loop,
so the outer loop is removed, and the batch size is increased up to the
total number of WQEs to allocate, still not smaller than 8.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:20 -07:00
Maxim Mikityanskiy 3f5fe0b2e6 net/mlx5e: xsk: Use partial batches in legacy RQ with XSK
The previous commit allowed allocating WQE batches in legacy RQ
partially, however, XSK still checks whether there are enough frames in
the fill ring. Remove this check to allow to allocate batches partially
also with XSK.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Maxim Mikityanskiy 42847fed55 net/mlx5e: Use partial batches in legacy RQ
Legacy RQ allocates WQEs in batches. If the batch allocation fails, the
pages of the allocated part are released. This commit changes this
behavior to allow to use the pages that have been already allocated.

After this change, we need to be careful about indexing rq->wqe.frags[].
The WQ size is a power of two that divides by wqe_bulk (8), and the old
code used whole bulks, which allowed to use indices [8*K; 8*K+7] without
overflowing. Now that the bulks may be partial, the range can start at
any location (not only at 8*K), so we need to wrap them around to avoid
out-of-bounds array access.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Maxim Mikityanskiy 5758c3145b net/mlx5e: Make the wqe_index_mask calculation more exact
The old calculation of wqe_index_mask may give false positives, i.e.
request bulking of pairs of WQEs when not strictly needed, for example,
when the first fragment size is equal to the PAGE_SIZE, bulking is not
needed, even if the number of fragments is odd.

Make the calculation more exact to cut false positives.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Maxim Mikityanskiy a064c60984 net/mlx5e: Introduce wqe_index_mask for legacy RQ
When fragments of different WQEs share the same page, mlx5e_post_rx_wqes
must wait until the old WQE stops using the page, only then the new WQE
can allocate the new page. Essentially, it means that if WQE index i is
still in use, the allocation must stop before `i % bulk`, where bulk is
the number of WQEs that may share the same page.

As bulk is always a power of two, `i % bulk = i & (bulk - 1)`, and the
new wqe_index_mask field will be equal to `bulk - 1`.

At the same time, wqe_bulk remains for optimization purposes and stores
`max(bulk, 8)`, which allows to skip the allocation until we have at
least 8 WQEs free.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Maxim Mikityanskiy 8cbcafcee1 net/mlx5e: xsk: Drop the check for XSK state in mlx5e_xsk_wakeup
The MLX5E_CHANNEL_STATE_XSK flag checked in mlx5e_xsk_wakeup indicates
that XSK queues are open, but not necessarily activated. This check is
not very useful, because:

0. Both XSK setup and netdev state transitions take the same state_lock
mutex, so they can't happen at the same time.

1. If the netdev is up, xsk_is_bound can return true only when
MLX5E_CHANNEL_STATE_XSK is set on the corresponding channel.
mlx5e_xsk_wakeup is only called when xsk_is_bound is true.

2. If the XSK socket is bound, and the netdev is going up or down,
mlx5e_xsk_wakeup can take one of two branches, depending on the return
value of napi_if_scheduled_mark_missed:

2.1. True means one of two things: either NAPI was enabled at this
point, which means MLX5E_CHANNEL_STATE_XSK was also set; or NAPI was
disabled, and nothing really happened.

2.2. False means that NAPI was enabled by this point, which also implies
MLX5E_CHANNEL_STATE_XSK was set. Additionally, mlx5e_xsk_wakeup contains
a following check for MLX5E_SQ_STATE_ENABLED on async_icosq, and this
flag implies MLX5E_CHANNEL_STATE_XSK too on XSK channels.

As checking this flag doesn't cut any flows, remove the check.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Maxim Mikityanskiy d54d7194ba net/mlx5e: xsk: Use mlx5e_trigger_napi_icosq for XSK wakeup
mlx5e_xsk_wakeup triggers an IRQ by posting a NOP to async_icosq, taking
a spinlock to protect from concurrent access. There is already a
function that does the same: mlx5e_trigger_napi_icosq. Use this function
in mlx5e_xsk_wakeup.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-01 13:30:19 -07:00
Daniel Golle ae3ed15da5 net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear
Setting ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear
routine as done by commit 0e80707d94 ("net: ethernet: mtk_eth_soc:
fix typo in __mtk_foe_entry_clear") breaks flow offloading, at least
on older MTK_NETSYS_V1 SoCs, OpenWrt users have confirmed the bug on
MT7622 and MT7621 systems.
Felix Fietkau suggested to use MTK_FOE_STATE_INVALID instead which
works well on both, MTK_NETSYS_V1 and MTK_NETSYS_V2.

Tested on MT7622 (Linksys E8450) and MT7986 (BananaPi BPI-R3).

Suggested-by: Felix Fietkau <nbd@nbd.name>
Fixes: 0e80707d94 ("net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear")
Fixes: 33fc42de33 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/YzY+1Yg0FBXcnrtc@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 19:05:38 -07:00
Fei Qin 2820a400df nfp: add support restart of link auto-negotiation
Add support restart of link auto-negotiation.
This may be initiated using:

  # ethtool -r <intf>

Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:47:53 -07:00
Yinjun Zhang 8d545385bf nfp: add support for link auto negotiation
Report the auto negotiation capability if it's supported
in management firmware, and advertise it if it's enabled.
Changing port speed is not allowed when autoneg is enabled.

The ethtool <intf> command displays the auto-neg capability:

  # ethtool enp1s0np0
  Settings for enp1s0np0:
          Supported ports: [ FIBRE ]
          Supported link modes:   Not reported
          Supported pause frame use: Symmetric
          Supports auto-negotiation: Yes
          Supported FEC modes: None        RS      BASER
          Advertised link modes:  Not reported
          Advertised pause frame use: Symmetric
          Advertised auto-negotiation: Yes
          Advertised FEC modes: None       RS      BASER
          Speed: 25000Mb/s
          Duplex: Full
          Auto-negotiation: on
          Port: FIBRE
          PHYAD: 0
          Transceiver: internal
          Link detected: yes

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:47:53 -07:00
Yinjun Zhang b1e4f11e42 nfp: refine the ABI of getting `sp_indiff` info
Considering that whether application firmware is indifferent to
port speed is a firmware property instead of port property, now use
a new rtsym to get the property instead of parsing per-port tlv caps.
With this change, relevant code is moved to `nfp_main` layer.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:47:52 -07:00
Yinjun Zhang 965dd27d98 nfp: avoid halt of driver init process when non-fatal error happens
It's not a fatal error when setting `hwinfo` into management firmware
fails, no need to halt the whole driver initialization process.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:47:52 -07:00
Yinjun Zhang fc26e70f8a nfp: add support for reporting active FEC mode
The latest management firmware can now report the active FEC
mode. Adapt driver accordingly so that user can get the active
FEC mode by running command:

  # ethtool --show-fec <intf>

Also correct use of `fec` field.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:47:52 -07:00
Chunhao Lin 3406079bbb r8169: add rtl_disable_rxdvgate()
rtl_disable_rxdvgate() is used for disable RXDV_GATE. It is opposite function
of rtl_enable_rxdvgate().

Disable RXDV_GATE does not have to delay. So in this patch, also remove the
delay after disale RXDV_GATE.

Signed-off-by: Chunhao Lin <hau@realtek.com>
Link: https://lore.kernel.org/r/20220928171356.3951-1-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:06 -07:00
Jakub Kicinski 915b96c527 wireless-next patches for v6.1
Few stack changes and lots of driver changes in this round. brcmfmac
 has more activity as usual and it gets new hardware support. ath11k
 improves WCN6750 support and also other smaller features. And of
 course changes all over.
 
 Note: in early September wireless tree was merged to wireless-next to
 avoid some conflicts with mac80211 patches, this shouldn't cause any
 problems but wanted to mention anyway.
 
 Major changes:
 
 mac80211
 
 * refactoring and preparation for Wi-Fi 7 Multi-Link Operation (MLO)
   feature continues
 
 brcmfmac
 
 * support CYW43439 SDIO chipset
 
 * support BCM4378 on Apple platforms
 
 * support CYW89459 PCIe chipset
 
 rtw89
 
 * more work to get rtw8852c supported
 
 * P2P support
 
 * support for enabling and disabling MSDU aggregation via nl80211
 
 mt76
 
 * tx status reporting improvements
 
 ath11k
 
 * cold boot calibration support on WCN6750
 
 * Target Wake Time (TWT) debugfs support for STA interface
 
 * support to connect to a non-transmit MBSSID AP profile
 
 * enable remain-on-channel support on WCN6750
 
 * implement SRAM dump debugfs interface
 
 * enable threaded NAPI on all hardware
 
 * WoW support for WCN6750
 
 * support to provide transmit power from firmware via nl80211
 
 * support to get power save duration for each client
 
 * spectral scan support for 160 MHz
 
 wcn36xx
 
 * add SNR from a received frame as a source of system entropy
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmM3BGYRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZuR3Af/XiuMlnDB6flq+M/kQHLWWvHybLw5aCJ7
 l3yXhNFWxpBl2hQXtj17JSjVCYQmxbfrgRqhbNhyACO25bpymCb5QctB9X+Y7TwL
 250JmuKvQfFx5oJNRfJ67dKTf3raloQYbdEMJNqySgebL+eSfrDskc9vaCLVDmCK
 I994fl0Q1wUbJ6fbuIFd07ti8ay6UlSS/iakv4+nEeimabtZWJWlXBWYRpKpikdP
 h9z2kPtss6yz6seaQuw6ny+qysYLi11Tp+Cued9XR3dWOOhB2X1tLHH0H02xPw76
 9OJZEJHycP2juxjMfAaktHY+VX36GPLsMLUTVusH0h/Fdy3VG8YSAw==
 =emmG
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2022-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.1

Few stack changes and lots of driver changes in this round. brcmfmac
has more activity as usual and it gets new hardware support. ath11k
improves WCN6750 support and also other smaller features. And of
course changes all over.

Note: in early September wireless tree was merged to wireless-next to
avoid some conflicts with mac80211 patches, this shouldn't cause any
problems but wanted to mention anyway.

Major changes:

mac80211

 - refactoring and preparation for Wi-Fi 7 Multi-Link Operation (MLO)
  feature continues

brcmfmac

 - support CYW43439 SDIO chipset

 - support BCM4378 on Apple platforms

 - support CYW89459 PCIe chipset

rtw89

 - more work to get rtw8852c supported

 - P2P support

 - support for enabling and disabling MSDU aggregation via nl80211

mt76

 - tx status reporting improvements

ath11k

 - cold boot calibration support on WCN6750

 - Target Wake Time (TWT) debugfs support for STA interface

 - support to connect to a non-transmit MBSSID AP profile

 - enable remain-on-channel support on WCN6750

 - implement SRAM dump debugfs interface

 - enable threaded NAPI on all hardware

 - WoW support for WCN6750

 - support to provide transmit power from firmware via nl80211

 - support to get power save duration for each client

 - spectral scan support for 160 MHz

wcn36xx

 - add SNR from a received frame as a source of system entropy

* tag 'wireless-next-2022-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (231 commits)
  wifi: rtl8xxxu: Improve rtl8xxxu_queue_select
  wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
  wifi: rtl8xxxu: gen2: Enable 40 MHz channel width
  wifi: rtw89: 8852b: configure DLE mem
  wifi: rtw89: check DLE FIFO size with reserved size
  wifi: rtw89: mac: correct register of report IMR
  wifi: rtw89: pci: set power cut closed for 8852be
  wifi: rtw89: pci: add to do PCI auto calibration
  wifi: rtw89: 8852b: implement chip_ops::{enable,disable}_bb_rf
  wifi: rtw89: add DMA busy checking bits to chip info
  wifi: rtw89: mac: define DMA channel mask to avoid unsupported channels
  wifi: rtw89: pci: mask out unsupported TX channels
  iwlegacy: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
  ipw2x00: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
  wifi: iwlwifi: Track scan_cmd allocation size explicitly
  brcmfmac: Remove the call to "dtim_assoc" IOVAR
  brcmfmac: increase dcmd maximum buffer size
  brcmfmac: Support 89459 pcie
  brcmfmac: increase default max WOWL patterns to 16
  cw1200: fix incorrect check to determine if no element is found in list
  ...
====================

Link: https://lore.kernel.org/r/20220930150413.A7984C433D6@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 10:07:31 -07:00
Maxim Mikityanskiy 8f5ed1c140 net/mlx5e: Clean up and fix error flows in mlx5e_alloc_rq
Although mlx5e_rq_free_shampo can be called unconditionally, it belongs
to case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ. Move it there to allow to
add more init/cleanup actions to the striding RQ case.

If xdp_rxq_info_reg_mem_model fails, don't forget to destroy the page
pool.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:48 -07:00
Maxim Mikityanskiy e64d71d055 net/mlx5e: Move repeating clear_bit in mlx5e_rx_reporter_err_rq_cqe_recover
The same clear_bit is called in both error and success flows. Move the
call to do it only once and remove the out label.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy d32c225316 net/mlx5e: Split out channel (de)activation in rx_res
To decrease the nesting level and reduce duplication of code, create
functions to redirect direct RQTs to the actual RQs or drop_rq, which
are used in the activation and deactivation flows of channels.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy 2d0765f78c net/mlx5e: xsk: Remove mlx5e_xsk_page_alloc_pool
mlx5e_xsk_page_alloc_pool became a thin wrapper around xsk_buff_alloc.
Drop it and call xsk_buff_alloc directly.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy 672db02433 net/mlx5e: Convert struct mlx5e_alloc_unit to a union
struct mlx5e_alloc_unit consists of a single union. Convert it to a
union itself to simplify casting it to struct xdp_buff *, which will be
used to implement XSK batching on striding RQ.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy 6bdeb96382 net/mlx5e: Remove DMA address from mlx5e_alloc_unit
mlx5e_alloc_unit stores the DMA address and a pointer to either struct
page (regular RQ) or struct xdp_buff (XSK RQ). This DMA address is
redundant, because when a page or an XSK frame is allocated, the same
address is also stored there. Some flows take the address from struct
mlx5e_alloc_unit, and some take it from struct page or xdp_buff.

This commit removes the address from struct mlx5e_alloc_unit, which
makes it twice as small and improves locality (this struct is used in an
array), also saving on unnecessary stores to the addr field. Almost all
flows know unambiguously whether the DMA address should be taken from
page or from xdp_buff. The exception is the allocation flows, where a
new branch appeared, which will be optimized out in the next commits.

struct mlx5e_alloc_unit used to be called mlx5e_dma_info.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy 79008676d5 net/mlx5e: Rename mlx5e_dma_info to prepare for removal of DMA address
The next commit will remove the DMA address from the struct currently
called mlx5e_dma_info, because the same value can be retrieved with
page_pool_get_dma_addr(page) in almost all cases, with the notable
exception of SHAMPO (HW GRO implementation) that modifies this address
on the fly, after the initial allocation.

To keep the SHAMPO logic intact, struct mlx5e_dma_info remains in the
SHAMPO code, consisting of addr and page (XSK is not compatible with
SHAMPO). The struct used in all other places is renamed to
mlx5e_alloc_unit, allowing the next commit to remove the addr field
without affecting SHAMPO.

The new name means "allocation unit", and it's more appropriate after
the field with the DMA address gets removed.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:47 -07:00
Maxim Mikityanskiy 707f908e31 net/mlx5e: Optimize the page cache reducing its size 2x
RX page cache stores dma_info structs, that consist of a pointer to
struct page and a DMA address. In fact, the DMA address is extracted
from struct page using page_pool_get_dma_addr when a page is pushed to
the cache. By moving this call to the point when a page is popped from
the cache, we can avoid storing the DMA address in the cache,
effectively reducing its size by two times without losing any
functionality.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:46 -07:00
Maxim Mikityanskiy 0b9c86c785 net/mlx5e: Fix calculations for ICOSQ size
WQEs must not cross page boundaries, they are padded with NOPs if they
don't fit the page. mlx5e_mpwrq_total_umr_wqebbs doesn't take into
account this padding, risking reserving not enough space.

The padding is not straightforward to add to this calculation, because
WQEs of different sizes may be mixed together in the queue. If each page
ends with a big WQE that doesn't fit and requires at most its size minus
1 WQEBB of padding, the total space can be much bigger than in case when
smaller WQEs take advantage of this padding.

Replace the wrong exact calculation by the following estimation. Each
padding can be at most the size of the maximum WQE used in the queue
minus one WQEBB. Let's call the rest of the page "useful space". If we
divide the total size of all needed WQEs by this useful space, rounding
up, we'll get the number of pages, which is enough to contain all these
WQEs. It's correct, because every WQE that appeared on the boundary
between two blocks of useful space would start in the useful space of
one page and end in the padding of the same page, while our estimation
reserved space for its tail in the next space, making the estimation not
smaller than the real space occupied in the queue.

The code actually uses a looser estimation: instead of taking the
maximum size of all used WQE types minus 1 WQEBB, it takes the maximum
hardware size of a WQE. It's made for simplicity and extensibility.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:46 -07:00
Maxim Mikityanskiy 6470d2e7e8 net/mlx5e: xsk: Use KSM for unaligned XSK
UMR MTTs used in striding RQ have certain alignment requirements. While
it's guaranteed to work when UMR pages are aligned to the UMR page size,
in practice it works then UMR pages are aligned to 8 bytes. However,
it's still not enough flexibility for the unaligned mode of XSK. This
patch leverages KSM to map UMR pages without alignment requirements,
when unaligned XSK is active. The downside is that KSM entries are twice
as big as MTTs, which limits the maximum WQE size, so regular RQs and
aligned XSK continue using MTTs.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:46 -07:00
Maxim Mikityanskiy c4418f3495 net/mlx5: Add MLX5_FLEXIBLE_INLEN to safely calculate cmd inlen
Some commands use a flexible array after a common header. Add a macro to
safely calculate the total input length of the command, detecting
overflows and printing errors with specific values when such overflows
happen.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:46 -07:00
Maxim Mikityanskiy ecc7ad2eab net/mlx5e: Keep a separate MKey for striding RQ
Currently, rq->mkey_be keeps a big-endian value of either the PA MKey
(for legacy RQ, no address translation) or MTT MKey (for striding RQ,
direct address translation). Striding RQ stores the same value in
rq->umr_mkey in the native endianness.

The next commit will make striding RQ use KSM MKey (indirect address
translation) for the unaligned mode of XSK, which will require storing
both KSM MKey and PA MKey in the RQ struct. This commit optimizes fields
of mlx5e_rq: umr_mkey is removed (it's redundant), mkey_be always points
to the PA MKey, and mpwqe.umr_mkey_be points to the MTT MKey (or to the
KSM MKey, starting from the next commit).

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:45 -07:00
Maxim Mikityanskiy fa5573359a net/mlx5e: xsk: Use XSK frame size as striding RQ page size
XSK RQs support striding RQ linear mode, but the stride size is always
set to PAGE_SIZE. It may be larger than the XSK frame size,
unnecessarily reducing the useful space in a WQE, but more importantly
causing UMEM data corruption in certain cases.

Normally, stride size bigger than XSK frame size is not a problem if the
hardware enforces the MTU. However, traffic between vports skips the
hardware MTU check, and oversized packets may be received.

If an oversized packet is bigger than the XSK frame but not bigger than
the stride, it will cause overwriting of the adjacent UMEM region. If
the packet takes more than one stride, they can be recycled for reuse
so it's not a problem when the XSK frame size matches the stride size.

To reduce the impact of the above issue, attempt to use the MTT page
size for striding RQ that matches the XSK frame size, allowing to safely
use 2048-byte frames on an up-to-date firmware.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:45 -07:00
Maxim Mikityanskiy e5a3cc83d5 net/mlx5e: Use runtime page_shift for striding RQ
This commit allows striding RQ to determine MTT page size at runtime,
instead of sticking to the compile-time PAGE_SIZE. This functionality
will be used by a following commit that adjusts the MTT page size to the
XSK frame size.

Stick with PAGE_SIZE for XSK on legacy RQ, as frag_stride is not used in
data path, it only helps calculate how pages are partitioned into
fragments, and PAGE_SIZE will ensure each fragment starts at the
beginning of a new allocation unit (XSK frame).

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 07:55:45 -07:00
Jianguo Zhang 83936ea8d8 net: stmmac: add a parse for new property 'snps,clk-csr'
Parse new property 'snps,clk-csr' firstly because the new property
is documented in binding file, if failed, fall back to old property
'clk_csr' for legacy case

Signed-off-by: Jianguo Zhang <jianguo.zhang@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 13:04:23 +01:00
Colin Ian King fd01b9b5b0 net/mlx5: Fix spelling mistake "syndrom" -> "syndrome"
There is a spelling mistake in a devlink_health_report message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 13:01:16 +01:00
Colin Ian King 3b882a7bf6 net: bna: Fix spelling mistake "muliple" -> "multiple"
There is a spelling mistake in a literal string in the array
bnad_net_stats_strings. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 13:01:16 +01:00
Nick Child 10c2aba89c ibmveth: Ethtool set queue support
Implement channel management functions to allow dynamic addition and
removal of transmit queues. The `ethtool --show-channels` and
`ethtool --set-channels` commands can be used to get and set the
number of queues, respectively. Allow the ability to add as many
transmit queues as available processors but never allow more than the
hard maximum of 16. The number of receive queues is one and cannot be
modified.

Depending on whether the requested number of queues is larger or
smaller than the current value, either allocate or free long term
buffers. Since long term buffer construction and destruction can
occur in two different areas, from either channel set requests or
device open/close, define functions for performing this work. If
allocation of a new buffer fails, then attempt to revert back to the
previous number of queues.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:40:22 +01:00
Nick Child d926793c1d ibmveth: Implement multi queue on xmit
The `ndo_start_xmit` function is protected by a spinlock on the tx queue
being used to transmit the skb. Allow concurrent calls to
`ndo_start_xmit` by using more than one tx queue. This allows for
greater throughput when several jobs are trying to transmit data.

Introduce 16 tx queues (leave single rx queue as is) which each
correspond to one DMA mapped long term buffer.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:40:22 +01:00
Nick Child d6832ca48d ibmveth: Copy tx skbs into a premapped buffer
Rather than DMA mapping and unmapping every outgoing skb, copy the skb
into a buffer that was mapped during the drivers open function. Copying
the skb and its frags have proven to be more time efficient than
mapping and unmapping. As an effect, performance increases by 3-5
Gbits/s.

Allocate and DMA map one continuous 64KB buffer at `ndo_open`. This
buffer is maintained until `ibmveth_close` is called. This buffer is
large enough to hold the largest possible linnear skb. During
`ndo_start_xmit`, copy the skb and all of it's frags into the continuous
buffer. By manually linnearizing all the socket buffers, time is saved
during memcpy as well as more efficient handling in FW.
As a result, we no longer need to worry about the firmware limitation
of handling a max of 6 frags. So, we only need to maintain 1 descriptor
instead of 6 and can hardcode 0 for the other 5 descriptors during
h_send_logical_lan.

Since, DMA allocation/mapping issues can no longer arise in xmit
functions, we can further reduce code size by removing the need for a
bounce buffer on DMA errors.

Signed-off-by: Nick Child <nnac123@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:40:22 +01:00
Colin Ian King ea9b9a985d bnx2: Fix spelling mistake "bufferred" -> "buffered"
There are spelling mistakes in two literal strings. Fix these.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:38:58 +01:00
Colin Ian King db7fccc122 net: lan966x: Fix spelling mistake "tarffic" -> "traffic"
There is a spelling mistake in a netdev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:34:01 +01:00
Wang Yufen 96e0718165 net: bonding: Convert to use sysfs_emit()/sysfs_emit_at() APIs
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the value
to be returned to user space.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:29:45 +01:00
Wang Yufen aff3069954 net: tun: Convert to use sysfs_emit() APIs
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the value
to be returned to user space.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 12:27:43 +01:00
Gerhard Engleder bb837a37db tsnep: Use page pool for RX
Use page pool for RX buffer handling. Makes RX path more efficient and
is required prework for future XDP support.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 11:32:27 +01:00
Gerhard Engleder 308ce14265 tsnep: Add EtherType RX flow classification support
Received Ethernet frames are assigned to first RX queue per default.
Based on EtherType Ethernet frames can be assigned to other RX queues.
This enables processing of real-time Ethernet protocols on dedicated
RX queues.

Add RX flow classification interface for EtherType based RX queue
assignment.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 11:32:27 +01:00
Gerhard Engleder 762031375d tsnep: Support multiple TX/RX queue pairs
Support additional TX/RX queue pairs if dedicated interrupt is
available. Interrupts are detected by name in device tree.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 11:32:26 +01:00
Gerhard Engleder 58eaa8abe4 tsnep: Move interrupt from device to queue
For multiple queues multiple interrupts shall be used. Therefore, rework
global interrupt to per queue interrupt.

Every interrupt name shall contain interface name and queue information.
To get a valid interface name, the interrupt request needs to by done
during open like in other drivers. Additionally, this allows the removal
of some initialisation checks in the interrupt handler.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30 11:32:26 +01:00
Jakub Kicinski d742ea6b8e Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-09-28 (ice)

Arkadiusz implements a single pin initialization function, checking feature
bits, instead of having separate device functions and updates sub-device
IDs for recognizing E810T devices.

Martyna adds support for switchdev filters on VLAN priority field.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: Add support for VLAN priority filters in switchdev
  ice: support features on new E810T variants
  ice: Merge pin initialization of E810 and E810T adapters
====================

Link: https://lore.kernel.org/r/20220928203217.411078-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 19:29:03 -07:00
Jakub Kicinski 6ad1c94e1e eth: alx: take rtnl_lock on resume
Zbynek reports that alx trips an rtnl assertion on resume:

 RTNL: assertion failed at net/core/dev.c (2891)
 RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0
 Call Trace:
  <TASK>
  __alx_open+0x230/0x570 [alx]
  alx_resume+0x54/0x80 [alx]
  ? pci_legacy_resume+0x80/0x80
  dpm_run_callback+0x4a/0x150
  device_resume+0x8b/0x190
  async_resume+0x19/0x30
  async_run_entry_fn+0x30/0x130
  process_one_work+0x1e5/0x3b0

indeed the driver does not hold rtnl_lock during its internal close
and re-open functions during suspend/resume. Note that this is not
a huge bug as the driver implements its own locking, and does not
implement changing the number of queues, but we need to silence
the splat.

Fixes: 4a5fe57e77 ("alx: use fine-grained locking instead of RTNL")
Reported-and-tested-by: Zbynek Michl <zbynek.michl@gmail.com>
Reviewed-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220928181236.1053043-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 19:27:17 -07:00
Wang Yufen b5155ddd22 net: phy: Convert to use sysfs_emit() APIs
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the value
to be returned to user space.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/1664364860-29153-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 19:19:41 -07:00
Vladimir Oltean dfc7175de3 net: enetc: offload per-tc max SDU from tc-taprio
The driver currently sets the PTCMSDUR register statically to the max
MTU supported by the interface. Keep this logic if tc-taprio is absent
or if the max_sdu for a traffic class is 0, and follow the requested max
SDU size otherwise.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:06 -07:00
Vladimir Oltean 9a2ea26d97 net: enetc: use common naming scheme for PTGCR and PTGCAPR registers
The Port Time Gating Control Register (PTGCR) and Port Time Gating
Capability Register (PTGCAPR) have definitions in the driver which
aren't in line with the other registers. Rename these.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:06 -07:00
Vladimir Oltean 715bf2610f net: enetc: cache accesses to &priv->si->hw
The &priv->si->hw construct dereferences 2 pointers and makes lines
longer than they need to be, in turn making the code harder to read.

Replace &priv->si->hw accesses with a "hw" variable when there are 2 or
more accesses within a function that dereference this. This includes
loops, since &priv->si->hw is a loop invariant.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:06 -07:00
Kurt Kanzenbach a745c69783 net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio
Add support for configuring the max SDU per priority and per port. If not
specified, keep the default.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:06 -07:00
Vladimir Oltean 248376b1b1 net: dsa: hellcreek: refactor hellcreek_port_setup_tc() to use switch/case
The following patch will need to make this function also respond to
TC_QUERY_BASE, so make the processing more structured around the
tc_setup_type.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:05 -07:00
Vladimir Oltean 1712be05a8 net: dsa: felix: offload per-tc max SDU from tc-taprio
Our current vsc9959_tas_guard_bands_update() algorithm has a limitation
imposed by the hardware design. To avoid packet overruns between one
gate interval and the next (which would add jitter for scheduled traffic
in the next gate), we configure the switch to use guard bands. These are
as large as the largest packet which is possible to be transmitted.

The problem is that at tc-taprio intervals of sizes comparable to a
guard band, there isn't an obvious place in which to split the interval
between the useful portion (for scheduling) and the guard band portion
(where scheduling is blocked).

For example, a 10 us interval at 1Gbps allows 1225 octets to be
transmitted. We currently split the interval between the bare minimum of
33 ns useful time (required to schedule a single packet) and the rest as
guard band.

But 33 ns of useful scheduling time will only allow a single packet to
be sent, be that packet 1200 octets in size, or 60 octets in size. It is
impossible to send 2 60 octets frames in the 10 us window. Except that
if we reduced the guard band (and therefore the maximum allowable SDU
size) to 5 us, the useful time for scheduling is now also 5 us, so more
packets could be scheduled.

The hardware inflexibility of not scheduling according to individual
packet lengths must unfortunately propagate to the user, who needs to
tune the queueMaxSDU values if he wants to fit more small packets into a
10 us interval, rather than one large packet.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 18:52:05 -07:00
Jakub Kicinski accc3b4a57 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 14:30:51 -07:00
ruanjinjie 510bbf82f8 net: cpmac: Add __init/__exit annotations to module init/exit funcs
Add __init/__exit annotations to module init/exit funcs

Signed-off-by: ruanjinjie <ruanjinjie@huawei.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220928031708.89120-1-ruanjinjie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 13:39:58 +02:00
Yang Yingliang 0e9804cff1 ethernet: 8390: remove unnecessary check of mem
The 'mem' returned by platform_get_resource() has been checked in probe
function, so it is no need do this check in remove function.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220927151406.797800-1-yangyingliang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 11:05:23 +02:00
Shang XiaoJing d49e265b66 nfp: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is clear.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Link: https://lore.kernel.org/r/20220927141835.19221-1-shangxiaojing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 10:46:42 +02:00
Yuan Can 01c617d73f net: liquidio: Remove unused struct lio_trusted_vf_ctx
After commit 6870957ed5bc("liquidio: make soft command calls synchronous"), no
one use struct lio_trusted_vf_ctx, so remove it.

Signed-off-by: Yuan Can <yuancan@huawei.com>
Link: https://lore.kernel.org/r/20220927133940.104181-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 10:14:05 +02:00
Liu Shixin 1a0c667ea8 net: ethernet: mtk_eth_soc: use DEFINE_SHOW_ATTRIBUTE to simplify code
Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.
No functional change.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Link: https://lore.kernel.org/r/20220927111925.2424100-1-liushixin2@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 09:57:23 +02:00
Shang XiaoJing 6db239f01a wwan_hwsim: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is clear.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Link: https://lore.kernel.org/r/20220927024511.14665-1-shangxiaojing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 09:37:40 +02:00
Shang XiaoJing 85e69a7dd6 net: ax88796c: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is clear.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Link: https://lore.kernel.org/r/20220927023043.17769-1-shangxiaojing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 09:37:29 +02:00
Shang XiaoJing 1469327bb3 ethernet: s2io: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is shorter
and clear. Drop the tmp variable that is not needed any more.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Link: https://lore.kernel.org/r/20220927022802.16050-1-shangxiaojing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29 09:19:09 +02:00
Bitterblue Smith 2fc6de5c69 wifi: rtl8xxxu: Improve rtl8xxxu_queue_select
Remove the unused ieee80211_hw* parameter, and pass ieee80211_hdr*
instead of relying on skb->data having the right value at the time
the function is called.

This doesn't change the functionality at all.

Fixes: 26f1fad29a ("New driver: rtl8xxxu (mac80211)")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/2af44c28-1c12-46b9-85b9-011560bf7f7e@gmail.com
2022-09-29 09:18:42 +03:00
Bitterblue Smith 5574d32904 wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
ieee80211_tx_queue_params.aifs is not supposed to be written directly
to the REG_EDCA_*_PARAM registers. Instead process it like the vendor
drivers do. It's kinda hacky but it works.

This change boosts the download speed and makes it more stable.

Tested with RTL8188FU but all the other supported chips should also
benefit.

Fixes: 26f1fad29a ("New driver: rtl8xxxu (mac80211)")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/038cc03f-3567-77ba-a7bd-c4930e3b2fad@gmail.com
2022-09-29 09:18:41 +03:00
Bitterblue Smith a8b5aef2cc wifi: rtl8xxxu: gen2: Enable 40 MHz channel width
The module parameter ht40_2g was supposed to enable 40 MHz operation,
but it didn't.

Tell the firmware about the channel width when updating the rate mask.
This makes it work with my gen 2 chip RTL8188FU.

I'm not sure if anything needs to be done for the gen 1 chips, if 40
MHz channel width already works or not. They update the rate mask with
a different structure which doesn't have a field for the channel width.

Also set the channel width correctly for sta_statistics.

Fixes: f653e69009 ("rtl8xxxu: Implement basic 8723b specific update_rate_mask() function")
Fixes: bd917b3d28 ("rtl8xxxu: fill up txrate info for gen1 chips")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/3a950997-7580-8a6b-97a0-e0a81a135456@gmail.com
2022-09-29 09:18:41 +03:00
Maxim Mikityanskiy 997ce6affe net/mlx5e: Use runtime values of striding RQ parameters in datapath
Some of the parameters of striding RQ are compile-time constants, but
they are going to become dynamically calculated at runtime in a
following commit. This commit prepares the datapath to take cached
runtime parameters, prefilled at queue creation.

New fields added to struct mlx5e_rq fit into an existing 7-byte hole.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:39 -07:00
Maxim Mikityanskiy 258e655c00 net/mlx5e: Make dma_info array dynamic in struct mlx5e_mpw_info
This commit moves the dma_info array to the end of struct mlx5e_mpw_info
to make it a flexible array. It also removes the intermediate struct
mlx5e_umr_dma_info, which used to contain only this array. The
flexibility of dma_info will allow to choose its size dynamically in a
following commit.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:39 -07:00
Maxim Mikityanskiy 3904d2afad net/mlx5e: Improve the MTU change shortcut
Normally, the MTU change requires reopening the channels, but it can be
skipped if the new MTU doesn't change any of the queue parameters and if
MTU is not used in the data path.

The shortcut is applicable to the non-linear mode of striding RQ,
because the only thing affected by MTU is the queue length. As ethtool
sets the queue length in packets, but striding RQ length is defined in
strides or bytes, we estimate the RQ length to be at least as big as the
requested number of MTU-sized packets, that's why it depends on MTU.

Improve the shortcut by actually checking whether the RQ length stayed
the same, instead of an intermediate step in the calculation.

As MTU also affects the SHAMPO parameters, skip the shortcut if SHAMPO
is in use.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy 411295fbe6 net/mlx5e: xsk: Fix SKB headroom calculation in validation
In a typical scenario, if an XSK socket is opened first, then an XDP
program is attached, mlx5e_validate_xsk_param will be called twice:
first on XSK bind, second on channel restart caused by enabling XDP. The
validation includes a call to mlx5e_rx_is_linear_skb, which checks the
presence of the XDP program.

The above means that mlx5e_rx_is_linear_skb might return true the first
time, but false the second time, as mlx5e_rx_get_linear_sz_skb's return
value will increase, because of a different headroom used with XDP.

As XSK RQs never exist without XDP, it would make sense to trick
mlx5e_rx_get_linear_sz_skb into thinking XDP is enabled at the first
check as well. This way, if MTU is too big, it would be detected on XSK
bind, without giving false hope to the userspace application.

However, it turns out that this check is too restrictive in the first
place. SKBs created on XDP_PASS on XSK RQs don't have any headroom. That
means that big MTUs filtered out on the first and the second checks
might actually work.

So, address this issue in the proper way, but taking into account the
absence of the SKB headroom on XSK RQs, when calculating the buffer
size.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy 8c654a1bb6 net/mlx5e: xsk: Remove dead code in validation
One of the checks in mlx5e_rx_is_linear_skb verifies that the RX buffer
fits into the XSK frame size. Remove the duplicating check from
mlx5e_validate_xsk_param. It allows to make mlx5e_rx_get_min_frag_sz
static.

Remove mlx5e_rx_is_xdp altogether, as its only usage is located in a
branch where xsk == NULL.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:38 -07:00
Maxim Mikityanskiy ddbef36560 net/mlx5e: Simplify stride size calculation for linear RQ
Linear RX buffers must be big enough to fit the MTU-sized packet along
with the headroom. On the other hand, they must be small enough to fit
into a page (or into an XSK frame). A straightforward way to check
whether the linear mode is possible would be comparing the required
buffer size to PAGE_SIZE or XSK frame size.

Stride size in the linear mode is defined by the following constraints:

1. A stride is at least as big as the buffer size, and it's a power of
two.

2. If non-XSK XDP is enabled, the stride size is PAGE_SIZE, because
mlx5e requires each packet to be in its own page when XDP is in use. The
previous constraint is automatically fulfilled, because buffer size
can't be bigger than PAGE_SIZE.

3. XSK uses stride size equal to PAGE_SIZE, but the following commits
will allow it to use roundup_pow_of_two(XSK frame size), by allowing the
NIC's MMU to use page sizes not equal to the CPU page size.

This commit puts the above requirements and constraints straight to the
code in an attempt to simplify it and to prepare it for changes made in
the next patches.

For the reference, the old code uses an equivalent, but trickier
calculation (high-level simplified pseudocode):

    if XDP or XSK:
        mlx5e_rx_get_linear_frag_sz := max(buffer size, PAGE_SIZE)
    else:
        mlx5e_rx_get_linear_frag_sz := buffer size
    mlx5e_rx_is_linear_skb := mlx5e_rx_get_linear_frag_sz <= PAGE_SIZE
    stride size := roundup_pow_of_two(mlx5e_rx_get_linear_frag_sz)

The new code effectively removes mlx5e_rx_get_linear_frag_sz that used
to return either buffer size or stride size, depending on the situation,
making it hard to work with and to make changes:

    if XDP or XSK:
        mlx5e_rx_get_linear_stride_sz := PAGE_SIZE
    else
        mlx5e_rx_get_linear_stride_sz := roundup_pow_of_two(buffer size)
    mlx5e_rx_is_linear_skb := buffer size <= (PAGE_SIZE or XSK frame sz)
    stride size := mlx5e_rx_get_linear_stride_sz

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:37 -07:00
Maxim Mikityanskiy 4c78782e2e net/mlx5e: kTLS, Check ICOSQ WQE size in advance
Instead of WARNing in runtime when TLS offload WQEs posted to ICOSQ are
over the hardware limit, check their size before enabling TLS RX
offload, and block the offload if the condition fails. It also allows to
drop a u16 field from struct mlx5e_icosq.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:37 -07:00
Maxim Mikityanskiy 21a0502d59 net/mlx5e: Use the aligned max TX MPWQE size
TX MPWQE size is limited to the cacheline-aligned maximum. Use the same
value for the stop room and the capability check.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy e3c4c496dc net/mlx5e: Fix a typo in mlx5e_xdp_mpwqe_is_full
Fix a typo in the function name: mpqwe -> mpwqe (stands for multi-packet
work queue element).

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy 527918e9cc net/mlx5e: Use mlx5e_stop_room_for_max_wqe where appropriate
mlx5e_alloc_xdpsq calculates sq->stop_room internally, but there is
already a function for that: mlx5e_stop_room_for_max_wqe. This commit
makes use of this function.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:36 -07:00
Maxim Mikityanskiy ed5c92ff0f net/mlx5e: Let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev
To shorten and simplify code, let mlx5e_get_sw_max_sq_mpw_wqebbs accept
mdev and derive max SQ WQEBBs from it. Also rename the function to a
more generic name mlx5e_get_max_sq_aligned_wqebbs, because the following
patches will use it in non-MPWQE contexts.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy 44f4fd03b5 net/mlx5e: Validate striding RQ before enabling XDP
Currently, the driver can silently fall back to legacy RQ after enabling
XDP, even if striding RQ was active before. It happens when PAGE_SIZE is
bigger than the maximum supported stride size. This commit changes this
behavior to more straightforward: if an operation (enabling XDP) doesn't
support the current parameters (striding RQ mode), it fails.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy 7e49abb1e3 net/mlx5e: Make mlx5e_verify_rx_mpwqe_strides static
mlx5e_verify_rx_mpwqe_strides is only used in en/params.c, so it can be
made static and removed from en/params.h.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:35 -07:00
Maxim Mikityanskiy 665f29de4c net/mlx5e: Remove unused fields from datapath structs
No need to keep max_sq_wqebbs in mlx5e_txqsq and mlx5e_xdpsq, as it's
only used when allocating the queues. Removing an extra field reduces
the struct size.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:34 -07:00
Maxim Mikityanskiy f060ccc2af net/mlx5e: Convert mlx5e_get_max_sq_wqebbs to u8
The return value of mlx5e_get_max_sq_wqebbs is clamped down to
MLX5_SEND_WQE_MAX_WQEBBS = 16, which fits into u8. This commit changes
the return type of this function to u8 for stricter type safety.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:36:34 -07:00
Jakub Kicinski 0d5bfebf74 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:

====================
updates from mlx5-next 2022-09-24

Updates form mlx5-next including[1]:

1) HW definitions and support for NPPS clock settings.

2) various cleanups

3) Enable hash mode by default for all NICs

4) page tracker and advanced virtualization HW definitions for vfio

[1] https://lore.kernel.org/netdev/20220907233636.388475-1-saeed@kernel.org/

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
  net/mlx5: Remove from FPGA IFC file not-needed definitions
  net/mlx5: Remove unused structs
  net/mlx5: Remove unused functions
  net/mlx5: detect and enable bypass port select flow table
  net/mlx5: Lag, enable hash mode by default for all NICs
  net/mlx5: Lag, set active ports if support bypass port select flow table
  RDMA/mlx5: Don't set tx affinity when lag is in hash mode
  net/mlx5: add IFC bits for bypassing port select flow table
  net/mlx5: Add support for NPPS with real time mode
  net/mlx5: Expose NPPS related registers
  net/mlx5: Query ADV_VIRTUALIZATION capabilities
  net/mlx5: Introduce ifc bits for page tracker
  RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib
====================

Link: https://lore.kernel.org/all/20220927201906.234015-1-saeed@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:20:49 -07:00
Sean Anderson d4ddeefa64 net: sunhme: Fix undersized zeroing of quattro->happy_meals
Just use kzalloc instead.

Fixes: d6f1e89bdb ("sunhme: Return an ERR_PTR from quattro_pci_find")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Link: https://lore.kernel.org/r/20220928004157.279731-1-seanga2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:18:42 -07:00
Shang XiaoJing f45892f750 net: wwan: iosm: Use skb_put_data() instead of skb_put/memcpy pair
Use skb_put_data() instead of skb_put() and memcpy(), which is clear.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com>
Link: https://lore.kernel.org/r/20220927023254.30342-1-shangxiaojing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:18:03 -07:00
Vladimir Oltean 1109b97b61 net: dsa: felix: update regmap requests to be string-based
Existing felix DSA drivers (vsc9959, vsc9953) are all switches that were
integrated in NXP SoCs, which makes them a bit unusual compared to the
usual Microchip branded Ocelot switches.

To be precise, looking at
Documentation/devicetree/bindings/net/mscc,vsc7514-switch.yaml, one can
see 21 memory regions for the "switch" node, and these correspond to the
"targets" of the switch IP, which are spread throughout the guts of that
SoC's memory space.

In NXP integrations, those targets still exist, but they were condensed
within a single memory region, with no other peripheral in between them,
so it made more sense for the driver to ioremap the entire memory space
of the switch, and then find the targets within that memory space via
some offsets hardcoded in the driver.

The effect of this design decision is that now, the felix driver expects
hardware instantiations to provide their own resource definitions, which
is kind of odd when considering a typical device (those are retrieved
from 'reg' properties in the device tree, using platform_get_resource()
or similar).

Allow other hardware instantiations that share the felix driver to not
provide a hardcoded array of resources in the future. Instead, make the
common denominator based on which regmaps are created be just the
resource "names". Each instantiation comes with its own array of names
that are mandatory for it, and with an optional array of resources.

So we split the resources in 2 arrays, one is what's requested and the
other is what's provided. There is one pool of provided resources, in
felix->info->resources (of length felix->info->num_resources). There are
2 different ways of requesting a resource. One is by enum ocelot_target
(this handles the global regmaps), and one is by int port (this handles
the per-port ones).

For the existing vsc9959 and vsc9953, it would be a bit stupid to
request something that's not provided, given that the 2 arrays are both
defined in the same place.

The advantage is that we can now modify felix_request_regmap_by_name()
to make felix->info->resources[] optional, and if absent, the
implementation can call dev_get_regmap() and this is something that is
compatible with MFD.

Co-developed-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:15:02 -07:00
Vladimir Oltean 044d447a80 net: dsa: felix: use DEFINE_RES_MEM_NAMED for resources
Use less verbose resource definitions in vsc9959 and vsc9953. This also
sets IORESOURCE_MEM in the constant array of resources, so we don't have
to do this from felix_init_structs() - in fact, in the future, we may
even support IORESOURCE_REG resources.

Note that this macro takes start and length as argument, and we had
start and end before. So transform end into length.

While at it, sort the resources according to their offset.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:56 -07:00
Vladimir Oltean 8f66c64bfc net: dsa: felix: remove felix_info :: init_regmap
It turns out that the idea of having a customizable implementation of a
regmap creation from a resource is not exactly useful. The idea was for
the new MFD-based VSC7512 driver to use something that creates a SPI
regmap from a resource. But there are problems in actually getting those
resources (it involves getting them from MFD).

To avoid all that, we'll be getting resources by name, so this custom
init_regmap() method won't be needed. Remove it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:56 -07:00
Vladimir Oltean 1382ba68a0 net: dsa: felix: remove felix_info :: imdio_base
This address is only relevant for the vsc9959, which is a PCIe device
that holds its switch registers in a different PCIe BAR compared to the
registers for the internal MDIO controller.

Hide this aspect from the common felix driver and move the
pci_resource_start() call to the only place that needs it, which is in
vsc9959_mdio_bus_alloc().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:55 -07:00
Vladimir Oltean 5fc080de89 net: dsa: felix: remove felix_info :: imdio_res
The imdio_res is used only by vsc9959, which references its own
vsc9959_imdio_res through the common felix_info->imdio_res pointer.
Since the common code doesn't care about this resource (and it can't be
part of the common array of resources, either, because it belongs in a
different PCI BAR), just reference it directly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:14:55 -07:00
Jakub Kicinski 3e1308a7c8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
ice: xsk: ZC changes

Maciej Fijalkowski says:

This set consists of two fixes to issues that were either pointed out on
indirectly (John was reviewing AF_XDP selftests that were testing ice's
ZC support) mailing list or were directly reported by customers.

First patch allows user space to see done descriptor in CQ even after a
single frame being transmitted and second patch removes the need for
having HW rings sized to power of 2 number of descriptors when used
against AF_XDP.

I also forgot to mention that due to the current Tx cleaning algorithm,
4k HW ring was broken and these two patches bring it back to life, so we
kill two birds with one stone.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: xsk: drop power of 2 ring size restriction for AF_XDP
  ice: xsk: change batched Tx descriptor cleaning
====================

Link: https://lore.kernel.org/r/20220927164112.4011983-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:04:43 -07:00
Daniel Golle c9da02bfb1 net: ethernet: mtk_eth_soc: fix mask of RX_DMA_GET_SPORT{,_V2}
The bitmasks applied in RX_DMA_GET_SPORT and RX_DMA_GET_SPORT_V2 macros
were swapped. Fix that.

Reported-by: Chen Minqiang <ptpt52@gmail.com>
Fixes: 160d3a9b19 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/YzMW+mg9UsaCdKRQ@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:03:57 -07:00
Vladimir Oltean 276d37eb44 net: mscc: ocelot: fix tagged VLAN refusal while under a VLAN-unaware bridge
Currently the following set of commands fails:

$ ip link add br0 type bridge # vlan_filtering 0
$ ip link set swp0 master br0
$ bridge vlan
port              vlan-id
swp0              1 PVID Egress Untagged
$ bridge vlan add dev swp0 vid 10
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.

Dumping ocelot->vlans, one can see that the 2 egress-untagged VLANs on swp0 are
vid 1 (the bridge PVID) and vid 4094, a PVID used privately by the driver for
VLAN-unaware bridging. So this is why bridge vid 10 is refused, despite
'bridge vlan' showing a single egress untagged VLAN.

As mentioned in the comment added, having this private VLAN does not impose
restrictions to the hardware configuration, yet it is a bookkeeping problem.

There are 2 possible solutions.

One is to make the functions that operate on VLAN-unaware pvids:
- ocelot_add_vlan_unaware_pvid()
- ocelot_del_vlan_unaware_pvid()
- ocelot_port_setup_dsa_8021q_cpu()
- ocelot_port_teardown_dsa_8021q_cpu()
call something different than ocelot_vlan_member_(add|del)(), the latter being
the real problem, because it allocates a struct ocelot_bridge_vlan *vlan which
it adds to ocelot->vlans. We don't really *need* the private VLANs in
ocelot->vlans, it's just that we have the extra convenience of having the
vlan->portmask cached in software (whereas without these structures, we'd have
to create a raw ocelot_vlant_rmw_mask() procedure which reads back the current
port mask from hardware).

The other solution is to filter out the private VLANs from
ocelot_port_num_untagged_vlans(), since they aren't what callers care about.
We only need to do this to the mentioned function and not to
ocelot_port_num_tagged_vlans(), because private VLANs are never egress-tagged.

Nothing else seems to be broken in either solution, but the first one requires
more rework which will conflict with the net-next change  36a0bf4435 ("net:
mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity"),
and I'd like to avoid that. So go with the other one.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220927122042.1100231-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:03:28 -07:00
Jakub Kicinski b48b89f9c1 net: drop the weight argument from netif_napi_add
We tell driver developers to always pass NAPI_POLL_WEIGHT
as the weight to netif_napi_add(). This may be confusing
to newcomers, drop the weight argument, those who really
need to tweak the weight can use netif_napi_add_weight().

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN
Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:57:14 -07:00
Pavel Begunkov b63ca3e822 xen/netback: use struct ubuf_info_msgzc
struct ubuf_info will be changed, use ubuf_info_msgzc instead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:51:24 -07:00
Martyna Szapar-Mudlaw 34800178b3 ice: Add support for VLAN priority filters in switchdev
Enable support for adding TC rules that filter on the VLAN priority
in switchdev mode.

VLAN priority are the first 3 bits of 16b switch field vector word
which contain also vlan id value within its last 12 bits.
When getting vlan priority value from tc match.key it
has to be shifted first to proper bits positions (by VLAN_PRIO_SHIFT)
and then can be added to the joint 'vlan' field in ice_vlan_hdr
in lookup element.

The mask of lookup changes accordingly.
0x0FFF - when only vlan id is added in filter
0xE000 - when only vlan priority is added in filter
0xEFFF - when both these values are specified

Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-28 11:40:57 -07:00
Arkadiusz Kubalewski 793189a2fc ice: support features on new E810T variants
Add new sub-device ids required for proper initialization of features
on E810T devices supported by ice driver.

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-28 11:40:57 -07:00
Arkadiusz Kubalewski 43c4958a3d ice: Merge pin initialization of E810 and E810T adapters
Remove separate function initializing pins for E810T-based adapters
and initialize pins based on feature bits.

Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-28 11:40:57 -07:00
Edward Cree d902e1a737 sfc: bare bones TC offload on EF100
This is the absolute minimum viable TC implementation to get traffic to
 VFs and allow them to be tested; it supports no match fields besides
 ingress port, no actions besides mirred and drop, and no stats.
Example usage:
    tc filter add dev $PF parent ffff: flower skip_sw \
        action mirred egress mirror dev $VFREP
    tc filter add dev $VFREP parent ffff: flower skip_sw \
        action mirred egress redirect dev $PF
 gives a VF unfiltered access to the network out the physical port ($PF
 acts here as a physical port representor).
More matches, actions, and counters will be added in subsequent patches.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree 7ce3e235f2 sfc: interrogate MAE capabilities at probe time
Different versions of EF100 firmware and FPGA bitstreams support different
 matching capabilities in the Match-Action Engine.  Probe for these at
 start of day; subsequent patches will validate TC offload requests
 against the reported capabilities.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree f54a28a211 sfc: add a hashtable for offloaded TC rules
Nothing inserts into this table yet, but we have code to remove rules
 on FLOW_CLS_DESTROY or at driver teardown time, in both cases also
 attempting to remove the corresponding hardware rules.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree 7c9d266d8f sfc: optional logging of TC offload errors
TC offload support will involve complex limitations on what matches and
 actions a rule can do, in some cases potentially depending on rules
 already offloaded.  So add an ethtool private flag "log-tc-errors" which
 controls reporting the reasons for un-offloadable TC rules at NETIF_INFO.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree 5b2e12d51b sfc: bind indirect blocks for TC offload on EF100
Bind indirect blocks for recognised tunnel netdevices.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree 9dc0cad203 sfc: bind blocks for TC offload on EF100
Bind direct blocks for the MAE-admin PF and each VF representor.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Gustavo A. R. Silva c87e4ad1d3 net: ethernet: rmnet: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated and we are moving towards adopting
C99 flexible-array members, instead. So, replace zero-length arrays
declarations in anonymous union with the new DECLARE_FLEX_ARRAY()
helper macro.

This helper allows for flexible-array members in unions.

Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/221
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:40:31 +01:00
Horatiu Vultur 29aaf3d40e net: lan966x: Add offload support for ets
Add ets qdisc which allows to mix strict priority with bandwidth-sharing
bands. The ets qdisc needs to be attached as root qdisc.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
Horatiu Vultur 21ce14a8e7 net: lan966x: Add offload support for cbs
Lan966x switch supports credit based shaper in hardware according to
IEEE Std 802.1Q-2018 Section 8.6.8.2. Add support for cbs configuration
on egress port of lan966x switch.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
Horatiu Vultur 94644b6d72 net: lan966x: Add offload support for tbf
The tbf qdisc allows to attach a shaper on traffic egress on a port or
on a queue. On port they are attached directly to the root and on queue
they are attached on one of the classes of the parent qdisc.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:36:28 +01:00
Marc Kleine-Budde 81d192c2ce can: c_can: don't cache TX messages for C_CAN cores
As Jacob noticed, the optimization introduced in 387da6bc7a ("can:
c_can: cache frames to operate as a true FIFO") doesn't properly work
on C_CAN, but on D_CAN IP cores. The exact reasons are still unknown.

For now disable caching if CAN frames in the TX path for C_CAN cores.

Fixes: 387da6bc7a ("can: c_can: cache frames to operate as a true FIFO")
Link: https://lore.kernel.org/all/20220928083354.1062321-1-mkl@pengutronix.de
Link: https://lore.kernel.org/all/15a8084b-9617-2da1-6704-d7e39d60643b@gmail.com
Reported-by: Jacob Kroon <jacob.kroon@gmail.com>
Tested-by: Jacob Kroon <jacob.kroon@gmail.com>
Cc: stable@vger.kernel.org # v5.15
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-28 10:34:04 +02:00
Ping-Ke Shih a1cb097168 wifi: rtw89: 8852b: configure DLE mem
Configure DLE (data link engine) memory size for operating modes.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-10-pkshih@realtek.com
2022-09-28 09:45:59 +03:00
Ping-Ke Shih 5f8c35b932 wifi: rtw89: check DLE FIFO size with reserved size
For SCC mode, some FIFO are reserved, so compare the quantity after minus
the reserved size.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-9-pkshih@realtek.com
2022-09-28 09:45:59 +03:00
Ping-Ke Shih 75f1ed29e4 wifi: rtw89: mac: correct register of report IMR
The register of report IMR is chip specific, so add a field to strut to
correct them.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-8-pkshih@realtek.com
2022-09-28 09:45:58 +03:00
Ping-Ke Shih 3d7475897a wifi: rtw89: pci: set power cut closed for 8852be
Entering LPS with PCIe APHY power cut closed would cause PCIe link issue.
To avoid the combinational issue, keep PCIe APHY power cut always on.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-7-pkshih@realtek.com
2022-09-28 09:45:58 +03:00
Ping-Ke Shih 9e6e66ffba wifi: rtw89: pci: add to do PCI auto calibration
8852be needs this with n times calibration to correct hardware clock.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-6-pkshih@realtek.com
2022-09-28 09:45:58 +03:00
Ping-Ke Shih 14b6e9f4b0 wifi: rtw89: 8852b: implement chip_ops::{enable,disable}_bb_rf
Implement to power on/off BB and RF via MAC registers.

Add return type of chip_ops::disable_bb_rf, because it could fail to
disable. Also, correct naming of register 0x0200 used by the ops as well.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-5-pkshih@realtek.com
2022-09-28 09:45:58 +03:00
Ping-Ke Shih 61bdf7aacd wifi: rtw89: add DMA busy checking bits to chip info
8852B has less DMA channels, so its checking bits are different from other
chips.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-4-pkshih@realtek.com
2022-09-28 09:45:57 +03:00
Ping-Ke Shih a1b7163aab wifi: rtw89: mac: define DMA channel mask to avoid unsupported channels
Six channels are unsupported by 8852b, so mask them out to prevent to
access undefined registers in this chip.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-3-pkshih@realtek.com
2022-09-28 09:45:57 +03:00
Ping-Ke Shih 1bebcf08a3 wifi: rtw89: pci: mask out unsupported TX channels
8852BE doesn't support some TX channels, so mask them out, or it access
undefined registers.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220927062611.30484-2-pkshih@realtek.com
2022-09-28 09:45:57 +03:00
Gustavo A. R. Silva 56df3d408a iwlegacy: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated and we are moving towards adopting
C99 flexible-array members, instead. So, replace zero-length arrays
declarations in anonymous union with the new DECLARE_FLEX_ARRAY()
helper macro.

This helper allows for flexible-array members in unions.

Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/223
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/YzIvzc0jsYLigO8a@work
2022-09-28 09:45:15 +03:00
Gustavo A. R. Silva 413cda9564 ipw2x00: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Zero-length arrays are deprecated and we are moving towards adopting
C99 flexible-array members, instead. So, replace zero-length arrays
declarations in anonymous union with the new DECLARE_FLEX_ARRAY()
helper macro.

This helper allows for flexible-array members in unions.

Link: https://github.com/KSPP/linux/issues/193
Link: https://github.com/KSPP/linux/issues/220
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/YzIeULWc17XSIglv@work
2022-09-28 09:44:48 +03:00
Kees Cook 72c08d9f4c wifi: iwlwifi: Track scan_cmd allocation size explicitly
In preparation for reducing the use of ksize(), explicitly track the
size of scan_cmd allocations. This also allows for noticing if the scan
size changes unexpectedly. Note that using ksize() was already incorrect
here, in the sense that ksize() would not match the actual allocation
size, which would trigger future run-time allocation bounds checking.
(In other words, memset() may know how large scan_cmd was allocated for,
but ksize() will return the upper bounds of the actually allocated memory,
causing a run-time warning about an overflow.)

Cc: Gregory Greenman <gregory.greenman@intel.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Cc: Ilan Peer <ilan.peer@intel.com>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220923220853.3302056-1-keescook@chromium.org
2022-09-28 09:43:58 +03:00
Kees Cook d89318bbdf mlxsw: core_acl_flex_actions: Split memcpy() of struct flow_action_cookie flexible array
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), split the memcpy() of the header and the payload
so no false positive run-time overflow warning will be generated.

[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org

Cc: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20220927004033.1942992-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:44:21 -07:00
Alex Elder 181ca02026 net: ipa: define remaining IPA register fields
Define the fields for the ENDP_INIT_DEAGGR, ENDP_INIT_RSRC_GRP,
ENDP_INIT_SEQ, ENDP_STATUS, and ENDP_FILTER_ROUTER_HSH_CFG, and
IPA_IRQ_UC IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_FIELDS() and IPA_REG_STRIDE_FIELDS() to specify the
field mask values defined for these registers, for each supported
version of IPA.

Use ipa_reg_encode() and ipa_reg_bit() to build up the values to be
written to these registers, remove an inline function and all the
*_FMASK symbols that are now no longer used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:52 -07:00
Alex Elder 216b409d09 net: ipa: define more IPA endpoint register fields
Define the fields for the ENDP_INIT_MODE, ENDP_INIT_AGGR,
ENDP_INIT_HOL_BLOCK_EN, and ENDP_INIT_HOL_BLOCK_TIMER IPA
registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Change aggr_time_limit_encode() and hol_block_timer_encode() so they
take an ipa_reg pointer, and use those register's fields to compute
their encoded results.  Have aggr_time_limit_encode() take an IPA
pointer rather than version, to match hol_block_timer_encode().

Use ipa_reg_encode(), ipa_reg_bit(), and ipa_reg_field_max() to
manipulate values to be written to these registers, remove the
definitions of the various inline functions and *_FMASK symbols that
are now no longer used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder 4468a3448b net: ipa: define some IPA endpoint register fields
Define the fields for the ENDP_INIT_CTRL, ENDP_INIT_CFG, ENDP_INIT_NAT,
ENDP_INIT_HDR, and ENDP_INIT_HDR_EXT IPA registers for all supported
IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Move ipa_header_size_encoded() and ipa_metadata_offset_encoded() out
of "ipa_reg.h" and into "ipa_endpoint.c".  Change them so they take
an additional ipa_reg structure argument, and use ipa_reg_encode()
to encode the parts of the header size and offset prior to writing
to the register.  Change their names to be verbs rather than nouns.

Use ipa_reg_encode(), ipa_reg_bit, and ipa_reg_field_max() to
manipulate values to be written to these registers, remove the
definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder 1c418c4a92 net: ipa: define resource group/type IPA register fields
Define the fields for the {SRC,DST}_RSRC_GRP_{01,23,45,67}_RSRC_TYPE
IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Use ipa_reg_encode() to build up the values to be written to these
registers.

Remove the definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder 9265a4f0f0 net: ipa: define even more IPA register fields
Define the fields for the FLAVOR_0, IDLE_INDICATION_CFG,
QTIME_TIMESTAMP_CFG, TIMERS_XO_CLK_DIV_CFG and TIMERS_PULSE_GRAN_CFG
IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers.  Use ipa_reg_decode() to extract field
values from the FLAVOR_0 register.

Remove the definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder b5c35fa470 net: ipa: define more IPA register fields
Define the fields for the LOCAL_PKT_PROC_CNTXT, COUNTER_CFG, and
IPA_TX_CFG IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers.  Remove the definition of the *_FMASK
symbols as well as proc_cntxt_base_addr_encoded(), because they are
no longer needed.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder 62b9c009a8 net: ipa: define some more IPA register fields
Define the fields for the SHARED_MEM_SIZE, QSB_MAX_WRITES,
QSB_MAX_READS, FILT_ROUT_HASH_EN, and FILT_ROUT_HASH_FLUSH IPA
registers for all supported IPA versions.

Create enumerated types to identify fields for these registers.  Use
IPA_REG_FIELDS() to specify the field mask values defined for these
registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers rather than using the *_FMASK
preprocessor symbols.

Remove the definition of the now unused *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder 479deb3298 net: ipa: define CLKON_CFG and ROUTE IPA register fields
Create the ipa_reg_clkon_cfg_field_id enumerated type, which
identifies the fields for the CLKON_CFG IPA register.  Add "CLKON_"
to a few short names to try to avoid name conflicts.  Create the
ipa_reg_route_field_id enumerated type, which identifies the fields
for the ROUTE IPA register.

Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers rather than using the *_FMASK
preprocessor symbols.

Remove the definition of the now unused *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder 12c7ea7dfd net: ipa: define COMP_CFG IPA register fields
Create the ipa_reg_comp_cfg_field_id enumerated type, which
identifies the fields for the COMP_CFG IPA register.

Use IPA_REG_FIELDS() to specify the field mask values defined for
this register, for each supported version of IPA.

Use ipa_reg_bit() to build up the value to be written to this
register rather than using the *_FMASK preprocessor symbols.

Remove the definition of the *_FMASK symbols, along with the inline
functions that were used to encode certain fields whose position
and/or width within the register was dependent on IPA version.

Take this opportunity to represent all one-bit fields using BIT(x)
rather than GENMASK(x, x).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder a5ad8956f9 net: ipa: introduce ipa_reg field masks
Add register field descriptors to the ipa_reg structure.  A field in
a register is defined by a field mask, which is a 32-bit mask having
a single contiguous range of bits set.

For each register that has at least one field defined, an enumerated
type will identify the register's fields.  The ipa_reg structure for
that register will include an array fmask[] of field masks, indexed
by that enumerated type.  Each field mask defines the position and
bit width of a field.  An additional "fcount" records how many
fields (masks) are defined for a given register.

Introduce two macros to be used to define registers that have at
least one field.

Introduce a few new functions related to field masks.  The first
simply returns a field mask, given an IPA register pointer and field
mask ID.  A variant of that is meant to be used for the special case
of single-bit field masks.

Next, ipa_reg_encode(), identifies a field with an IPA register
pointer and a field ID, and takes a value to represent in that
field.  The result encodes the value in the appropriate place to be
stored in the register.  This is roughly modeled after the bitmask
operations (like u32_encode_bits()).

Another function (ipa_reg_decode()) similarly identifies a register
field, but the value supplied to it represents a full register
value.  The value encoded in the field is extracted from the value
and returned.  This is also roughly modeled after bitmask operations
(such as u32_get_bits()).

Finally, ipa_reg_field_max() returns the maximum value representable
by a field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder 6a244b75cf net: ipa: introduce ipa_reg()
Create a new function that returns a register descriptor given its
ID.  Change ipa_reg_offset() and ipa_reg_n_offset() so they take a
register descriptor argument rather than an IPA pointer and register
ID.  Have them accept null pointers (and return an invalid 0 offset),
to avoid the need for excessive error checking.  (A warning is issued
whenever ipa_reg() returns 0).

Call ipa_reg() or ipa_reg_n() to look up information about the
register before calls to ipa_reg_offset() and ipa_reg_n_offset().
Delay looking up offsets until they're needed to read or write
registers.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder 82a0680745 net: ipa: use ipa_reg[] array for register offsets
Use the array of register descriptors assigned at initialization
time to determine the offset (and where used, stride) for IPA
registers.  Issue a warning if an offset is requested for a register
that's not valid for the current system.

Remove all IPE_REG_*_OFFSET macros, as well as inline static
functions that returned register offsets.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder 07f120bcf7 net: ipa: add per-version IPA register definition files
Create a new subdirectory "reg", which contains a register
definition file for each supported version of IPA.  Each register
definition contains the register's offset, and for parameterized
registers, the stride (distance between consecutive instances of the
register).  Finally, it includes an all-caps printable register name.

In these files, each IPA version defines an array of IPA register
definition pointers, with unsupported registers defined with a null
pointer.  The array is indexed by the ipa_reg_id enumerated type.

At initialization time, the appropriate register definition array to
use is selected based on the IPA version, and assigned to a new
"regs" field in the IPA structure.

Extend ipa_reg_valid() so it fails if a valid register is not
defined.

This patch simply puts this infrastructure in place; the next will
use it.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Alex Elder 6bfb753850 net: ipa: use IPA register IDs to determine offsets
Expose two inline functions that return the offset for a register
whose ID is provided; one of them takes an additional argument
that's used for registers that are parameterized.  These both use
a common helper function __ipa_reg_offset(), which just uses the
offset symbols already defined.

Replace all references to the offset macros defined for IPA
registers with calls to ipa_reg_offset() or ipa_reg_n_offset().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Alex Elder 98e2dd71a8 net: ipa: introduce IPA register IDs
Create a new ipa_reg_id enumerated type, which identifies each IPA
register with a symbolic identifier.  Use short names, but in some
cases (such as "BCR") add "IPA_" to the name to help avoid name
conflicts.

Create two functions that indicate register validity.  The first
concisely indicates whether a register is valid for a given version
of IPA, and if so, whether it is defined.  The second indicates
whether a register is valid for TX or RX endpoints.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Bhupesh Sharma c64655f32f net: stmmac: Minor spell fix related to 'stmmac_clk_csr_set()'
Minor spell fix related to 'stmmac_clk_csr_set()' inside a
comment used in the 'stmmac_probe_config_dt()' function.

Cc: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20220924104514.1666947-1-bhupesh.sharma@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 17:26:56 -07:00
Gal Pressman b53ff37fcd net/mlx5: Remove unused structs
Remove structs which are no longer used in the driver:
  mlx5dr_cmd_qp_create_attr
  mlx5_fs_dr_ns
  mlx5_pas

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Gal Pressman 66af4fe371 net/mlx5: Remove unused functions
Remove functions which are no longer used in the driver:
  mlx5e_ipsec_is_tx_flow
  mlx5_health_flush
  get_cqe_enhanced_num_mini_cqes
  get_cqe_l3_hdr_type
  mlx5_health_flush
  mlx5_fs_is_ipsec_flow
  _mlx5_fs_is_outer_ipproto_flow
  mlx5_fs_is_outer_tcp_flow
  mlx5_fs_is_outer_udp_flow
  mlx5_fs_is_vxlan_flow
  mlx5_fs_is_outer_ipsec_flow

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Liu, Changcheng 90b1df74b5 net/mlx5: detect and enable bypass port select flow table
Use port selection capability port_select_flow_table_bypass
bit to detect and enable explicit port affinity even when
in lag hash mode.

Signed-off-by: Liu, Changcheng <jerrliu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Liu, Changcheng b146a7cd0a net/mlx5: Lag, enable hash mode by default for all NICs
The firmware supports adding a steering rule to catch egress traffic
of the QPs/TISs which are set port affinity explicitly in hash mode.
Enable that mode for NICS with 2 ports as well.

Signed-off-by: Liu, Changcheng <jerrliu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Liu, Changcheng c5c13b456c net/mlx5: Lag, set active ports if support bypass port select flow table
active_port bit mask indicates the current active ports. Set bit indicates
the port is active. Update active ports info to FW to redirect the QP/TIS
from inactive ports to other ports.

Signed-off-by: Liu, Changcheng <jerrliu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Liu, Changcheng a83bb5df2a RDMA/mlx5: Don't set tx affinity when lag is in hash mode
In hash mode, without setting tx affinity explicitly, the port select
flow table decides which port is used for the traffic.
If port_select_flow_table_bypass capability is supported and tx affinity
is set explicitly for QP/TIS, they will be added into the explicit affinity
table in FW to check which port is used for the traffic.
1. The overloaded explicit affinity table may affect performance.
   To avoid this, do not set tx affinity explicitly by default.
2. The packets of the same flow need to be transmitted on the same port.
   Because the packets of the same flow use different QPs in slow & fast
   path, it shouldn't set tx affinity explicitly for these QPs.

Signed-off-by: Liu, Changcheng <jerrliu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:27 -07:00
Aya Levin f0462bc3e9 net/mlx5: Add support for NPPS with real time mode
Add support for setting NPPS. NPPS is currently available in
REAL_TIME_CLOCK mode only. In addition allow the user to set the pulse
duration.

When NPPS pulse duration is not set explicitly by the user, driver set
it to 50% of the NPPS period.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-09-27 12:50:26 -07:00
Wolfram Sang 85f17d677f Merge branch 'master' into i2c/for-mergewindow 2022-09-27 21:33:37 +02:00
Maciej Fijalkowski b3056ae2b5 ice: xsk: drop power of 2 ring size restriction for AF_XDP
We had multiple customers in the past months that reported commit
296f13ff38 ("ice: xsk: Force rings to be sized to power of 2")
makes them unable to use ring size of 8160 in conjunction with AF_XDP.
Remove this restriction.

Fixes: 296f13ff38 ("ice: xsk: Force rings to be sized to power of 2")
CC: Alasdair McWilliam <alasdair.mcwilliam@outlook.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27 09:01:01 -07:00