Here is the big set of tty and serial driver updates for 5.16-rc1.
Nothing major in here at all, just lots of tiny serial and tty driver
updates for various reported things, and some good cleanups. These
include:
- more good tty api cleanups from Jiri
- stm32 serial driver updates
- softlockup fix for non-preempt systems under high serial load
- rpmsg serial driver update
- 8250 drivers updates and fixes
- n_gsm line discipline fixes and updates as people are finally
starting to use it.
All of these have been in linux-next for a while now with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPczQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykWbwCfaIScbUoCUx+h/uP93nKKD8B3KgYAoMvFuhhD
D/fTLggs12x5NsvLBgtZ
=rq0R
-----END PGP SIGNATURE-----
Merge tag 'tty-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the big set of tty and serial driver updates for 5.16-rc1.
Nothing major in here at all, just lots of tiny serial and tty driver
updates for various reported things, and some good cleanups. These
include:
- more good tty api cleanups from Jiri
- stm32 serial driver updates
- softlockup fix for non-preempt systems under high serial load
- rpmsg serial driver update
- 8250 drivers updates and fixes
- n_gsm line discipline fixes and updates as people are finally
starting to use it.
All of these have been in linux-next for a while now with no reported
issues"
* tag 'tty-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (86 commits)
tty: Fix extra "not" in TTY_DRIVER_REAL_RAW description
serial: cpm_uart: Protect udbg definitions by CONFIG_SERIAL_CPM_CONSOLE
tty: rpmsg: Define tty name via constant string literal
tty: rpmsg: Add pr_fmt() to prefix messages
tty: rpmsg: Use dev_err_probe() in ->probe()
tty: rpmsg: Unify variable used to keep an error code
tty: rpmsg: Assign returned id to a local variable
serial: stm32: push DMA RX data before suspending
serial: stm32: terminate / restart DMA transfer at suspend / resume
serial: stm32: rework RX dma initialization and release
serial: 8250_pci: Remove empty stub pci_quatech_exit()
serial: 8250_pci: Replace custom pci_match_id() implementation
serial: xilinx_uartps: Fix race condition causing stuck TX
serial: sunzilog: Mark sunzilog_putchar() __maybe_unused
Revert "tty: hvc: pass DMA capable memory to put_chars()"
Revert "virtio-console: remove unnecessary kmemdup()"
serial: 8250_pci: Replace dev_*() by pci_*() macros
serial: 8250_pci: Get rid of redundant 'else' keyword
serial: 8250_pci: Refactor the loop in pci_ite887x_init()
tty: add rpmsg driver
...
Hardening work by Jason
vdpa driver for Alibaba ENI
Performance tweaks for virtio blk
virtio rng rework using an internal buffer
mac/mtu programming for mlx5 vdpa
Misc fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmF/su8PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpo+sIAJjBTvbET+d0KuIMt8YBGsU+NWgHaJxW06hm
GHZzinueZYWaS20MPMYMPfcHUgigWj0kk7F0gMljG5rtDFzR85JkOi7+e5V6RVJW
8SQc7JjA1Krde6EiPJdlv3mVkLz/5VbrGUgjAW9di/O04Xc/jAc9kmJ41wTYr0mL
E5IsUi9QBmgHtKQ+2ofb9QEWuebJclGK+NVd3Kp08BtAQOrdn5UrIzY30/sjkZwE
ul2ll6v/k7T3dCQbHD/wMgfFycPvoZVfxaVj+FWQnwdWkqZwzMRmpKPRes4KJfww
Lfx1qox10mJzH7XJCs7xmOM8f7HgNNWi0gD5FxNlfkiz1AwHYOk=
=DTXY
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"vhost and virtio fixes and features:
- Hardening work by Jason
- vdpa driver for Alibaba ENI
- Performance tweaks for virtio blk
- virtio rng rework using an internal buffer
- mac/mtu programming for mlx5 vdpa
- Misc fixes, cleanups"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits)
vdpa/mlx5: Forward only packets with allowed MAC address
vdpa/mlx5: Support configuration of MAC
vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit
vdpa_sim_net: Enable user to set mac address and mtu
vdpa: Enable user to set mac and mtu of vdpa device
vdpa: Use kernel coding style for structure comments
vdpa: Introduce query of device config layout
vdpa: Introduce and use vdpa device get, set config helpers
virtio-scsi: don't let virtio core to validate used buffer length
virtio-blk: don't let virtio core to validate used length
virtio-net: don't let virtio core to validate used length
virtio_ring: validate used buffer length
virtio_blk: correct types for status handling
virtio_blk: allow 0 as num_request_queues
i2c: virtio: Add support for zero-length requests
virtio-blk: fixup coccinelle warnings
virtio_ring: fix typos in vring_desc_extra
virtio-pci: harden INTX interrupts
virtio_pci: harden MSI-X interrupts
virtio_config: introduce a new .enable_cbs method
...
The VF can be configured via the PF's ndo ops at the same time the PF is
receiving/handling virtchnl messages. This has many issues, with
one of them being the ndo op could be actively resetting a VF (i.e.
resetting it to the default state and deleting/re-adding the VF's VSI)
while a virtchnl message is being handled. The following error was seen
because a VF ndo op was used to change a VF's trust setting while the
VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by making sure the virtchnl handling and VF ndo ops that
trigger VF resets cannot run concurrently. This is done by adding a
struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex
will be locked around the critical operations and VFR. Since the ndo ops
will trigger a VFR, the virtchnl thread will use mutex_trylock(). This
is done because if any other thread (i.e. VF ndo op) has the mutex, then
that means the current VF message being handled is no longer valid, so
just ignore it.
This issue can be seen using the following commands:
for i in {0..50}; do
rmmod ice
modprobe ice
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
sleep 2
echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
done
Fixes: 7c710869d6 ("ice: Add handlers for VF netdevice operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When a VF is removed and/or reset its Tx queues need to be
stopped from the PF. This is done by calling the ice_dis_vf_qs()
function, which calls ice_vsi_stop_lan_tx_rings(). Currently
ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
Unfortunately, this is causing the Tx queues to not be disabled in some
cases and when the VF tries to re-enable/reconfigure its Tx queues over
virtchnl the op is failing. This is because a VF can be reset and/or
removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
were already configured via ice_vsi_cfg_single_txq() in the
VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
This was causing the following error message when loading the ice
driver, creating VFs, and modifying VF trust in an endless loop:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by always calling ice_dis_vf_qs() and silencing the error
message in ice_vsi_stop_tx_ring() since the calling code ignores the
return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
catch the error, so this doesn't affect those flows since there was no
change to the values the function returns.
Other solutions were considered (i.e. tracking which VF queues had been
"started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
more complicated than it was worth. This solution also brings in the
chance for other unexpected conditions due to invalid state bit checks.
So, the proposed solution seemed like the best option since there is no
harm in failing to stop Tx queues that were never started.
This issue can be seen using the following commands:
for i in {0..50}; do
rmmod ice
modprobe ice
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
sleep 2
echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
done
Fixes: 77ca27c417 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
VF was not able to change its hardware MAC address in case
the new address was already present in the MAC filter list.
Change the handling of VF add mac request to not return
if requested MAC address is already present on the list
and check if its hardware MAC needs to be updated in this case.
Fixes: ed4c068d46 ("ice: Enable ip link show on the PF to display VF unicast MAC(s)")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently when a trusted VF enables promiscuous mode spoofchk will be
disabled. This is wrong and should only be modified from the
ndo_set_vf_spoofchk callback. Fix this by removing the call to toggle
spoofchk for trusted VFs.
Fixes: 01b5e89aab ("ice: Add VF promiscuous support")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When a VF requests promiscuous mode and it's trusted and true promiscuous
mode is enabled the PF driver attempts to enable unicast and/or
multicast promiscuous mode filters based on the request. This is fine,
but there are a couple issues with the current code.
[1] The define to configure the unicast promiscuous mode mask also
includes bits to configure the multicast promiscuous mode mask, which
causes multicast to be set/cleared unintentionally.
[2] All 4 cases for enable/disable unicast/multicast mode are not
handled in the promiscuous mode message handler, which causes
unexpected results regarding the current promiscuous mode settings.
To fix [1] make sure any promiscuous mask defines include the correct
bits for each of the promiscuous modes.
To fix [2] make sure that all 4 cases are handled since there are 2 bits
(FLAG_VF_UNICAST_PROMISC and FLAG_VF_MULTICAST_PROMISC) that can be
either set or cleared. Also, since either unicast and/or multicast
promiscuous configuration can fail, introduce two separate error values
to handle each of these cases.
Fixes: 01b5e89aab ("ice: Add VF promiscuous support")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
A typical collection of patches this cycle, mostly fixing with a few new
features:
- Fixes from static tools. clang warnings, dead code, unused variable,
coccinelle sweeps, etc
- Driver bug fixes and minor improvements in rxe, bnxt_re, hfi1, mlx5,
irdma, qedr
- rtrs ULP bug fixes an improvments
- Additional counters for bnxt_re
- Support verbs CQ notifications in EFA
- Continued reworking and fixing of rxe
- netlink control to enable/disable optional device counters
- rxe now can use AH objects for its UD path, fixing various bugs in the
process
- Add DMABUF support to EFA
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmGCiiEACgkQOG33FX4g
mxrXlg//bsbMvwkgfS4DGtHyBgrkcqlilQuoNiNq8ezc1mWXWnvx5N/tq0biUlqu
UOsZXgLk45pJa8k3pfCp2+7eWROobt5pfA0lSHoJWHY3bsoJmoM7mggg+PUTi6nm
fAyN2qlzfGuPvcwfXeoj0qlhJMQuV0y2sDCDdh9nHMqsznPAu37cG4zcvXx2+8BL
hQiW3GG1zg2fL1ZF4vaju6CT/6RwnIOYeYnCLYMkTrAjq7r9VyRYuKagUWrG82yN
jeO9bBfhv9FV46C0yzWTbX0hC17YgnelmG0HOQuINr4vWE8cPkO0OakpIJNPGw7a
OCK6+KmdtKLuDEcPx099xEaB1ZqFLh8L1FNuR/VdOIyrERmh4rAFcgsyDhlcZ1wQ
vKOxZFqtUrz5vuGCteQUh9q841/sRorMFtefJIYuxbFDWLBft8heslL9Xn8APAuT
7kFHGBewdcuOSWQ016F5v8x25wY1dCYgdE9VSvxJnbjCp/Aeq8VzAnyNvfnmsg8e
0dsqWwP28mLrLMKtT9EBJ+sud/33x+JTZVBBF+XxtpKYma06OStiRqyUoFZiWmQU
d6qexLelRldLRY9OUoPYpShTZ5X7e0LDJj/1sRV4J1pxDhX3v9rnubtrusWADxG5
VDKx7aJ5BT5xc+pOOkIKHyiw2WZ8ERF4J/YCRyGqT4Yl5fxxyto=
=CxD0
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A typical collection of patches this cycle, mostly fixing with a few
new features:
- Fixes from static tools. clang warnings, dead code, unused
variable, coccinelle sweeps, etc
- Driver bug fixes and minor improvements in rxe, bnxt_re, hfi1,
mlx5, irdma, qedr
- rtrs ULP bug fixes an improvments
- Additional counters for bnxt_re
- Support verbs CQ notifications in EFA
- Continued reworking and fixing of rxe
- netlink control to enable/disable optional device counters
- rxe now can use AH objects for its UD path, fixing various bugs in
the process
- Add DMABUF support to EFA"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (103 commits)
RDMA/core: Require the driver to set the IOVA correctly during rereg_mr
RDMA/bnxt_re: Remove unsupported bnxt_re_modify_ah callback
RDMA/irdma: optimize rx path by removing unnecessary copy
RDMA/qed: Use helper function to set GUIDs
RDMA/hns: Use the core code to manage the fixed mmap entries
IB/opa_vnic: Rebranding of OPA VNIC driver to Cornelis Networks
IB/qib: Rebranding of qib driver to Cornelis Networks
IB/hfi1: Rebranding of hfi1 driver to Cornelis Networks
RDMA/bnxt_re: Use helper function to set GUIDs
RDMA/bnxt_re: Fix kernel panic when trying to access bnxt_re_stat_descs
RDMA/qedr: Fix NULL deref for query_qp on the GSI QP
RDMA/hns: Modify the value of MAX_LP_MSG_LEN to meet hardware compatibility
RDMA/hns: Fix initial arm_st of CQ
RDMA/rxe: Make rxe_type_info static const
RDMA/rxe: Use 'bitmap_zalloc()' when applicable
RDMA/rxe: Save a few bytes from struct rxe_pool
RDMA/irdma: Remove the unused variable local_qp
RDMA/core: Fix missed initialization of rdma_hw_stats::lock
RDMA/efa: Add support for dmabuf memory regions
RDMA/umem: Allow pinned dmabuf umem usage
...
Normally it is expected that the dsa_device_ops :: rcv() method finishes
parsing the DSA tag and consumes it, then never looks at it again.
But commit c0bcf53766 ("net: dsa: ocelot: add hardware timestamping
support for Felix") added support for RX timestamping in a very
unconventional way. On this switch, a partial timestamp is available in
the DSA header, but the driver got away with not parsing that timestamp
right away, but instead delayed that parsing for a little longer:
dsa_switch_rcv():
nskb = cpu_dp->rcv(skb, dev); <------------- not here
-> ocelot_rcv()
...
skb = nskb;
skb_push(skb, ETH_HLEN);
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, skb->dev);
...
if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
-> felix_rxtstamp()
return 0;
When in felix_rxtstamp(), this driver accounted for the fact that
eth_type_trans() happened in the meanwhile, so it got a hold of the
extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
from the current skb->data.
This worked for quite some time but was quite fragile from the very
beginning. Not to mention that having DSA tag parsing split in two
different files, under different folders (net/dsa/tag_ocelot.c vs
drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
come that they might break this.
Finally, the blamed commit does the following: at the end of
ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
If it does, and this port is under a VLAN-aware bridge, that VLAN ID
might not be correct in the sense that the packet might have suffered
VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
from the skb payload using __skb_vlan_pop(), and take the classified
VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
classified VLAN, and the skb payload is VLAN-untagged.
The big problem is that __skb_vlan_pop() does:
memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
__skb_pull(skb, VLAN_HLEN);
aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
from the skb headroom (effectively also moving skb->data, by definition).
So for felix_rxtstamp()'s fragile logic, all bets are off now.
Instead of having the "extraction" pointer point to the DSA header,
it actually points to 4 bytes _inside_ the extraction header.
Corollary, the last 4 bytes of the "extraction" header are in fact 4
stale bytes of the destination MAC address from the Ethernet header,
from prior to the __skb_vlan_pop() movement.
So of course, RX timestamps are completely bogus when the system is
configured in this way.
The fix is actually very simple: just don't structure the code like that.
For better or worse, the DSA PTP timestamping API does not offer a
straightforward way for drivers to present their RX timestamps, but
other drivers (sja1105) have established a simple mechanism to carry
their RX timestamp from dsa_device_ops :: rcv() all the way to
dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
simply save the partial timestamp to the skb->cb, and complete it later.
Question: why don't we simply populate the skb's struct
skb_shared_hwtstamps from ocelot_rcv(), and bother with this
complication of propagating the timestamp to felix_rxtstamp()?
Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
PTP packets need sleepable context to retrieve the full RX timestamp.
Currently felix_rxtstamp() answers "no, thanks" to that question, and
calls ocelot_ptp_gettime64() from softirq atomic context. This is
understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
hardware access does not require sleeping. But the felix driver is
preparing for the introduction of other switches where hardware access
is over a slow bus like SPI or MDIO:
https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/
So I would like to keep this code structure, so the rework needed when
that driver will need PTP support will be minimal (answer "yes, I need
deferred context for this skb's RX timestamp", then the partial
timestamp will still be found in the skb->cb.
Fixes: ea440cd2d9 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available")
Reported-by: Po Liu <po.liu@nxp.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some device set MAC06 exchange in the bootloader. This cause some
problem as we don't support this strange mode and we just set the port6
as the primary CPU port. With MAC06 exchange, PAD0 reg configure port6
instead of port0. Add an extra check and explicitly disable MAC06 exchange
to correctly configure the port PAD config.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Fixes: 3fcf734aa4 ("net: dsa: qca8k: add support for cpu port 6")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up the following includecheck warning:
./drivers/net/amt.c: net/protocol.h is included more than once.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On m68k:
In function ‘prestera_hw_build_tests’,
inlined from ‘prestera_hw_switch_init’ at drivers/net/ethernet/marvell/prestera/prestera_hw.c:788:2:
././include/linux/compiler_types.h:335:38: error: call to ‘__compiletime_assert_345’ declared with attribute error: BUILD_BUG_ON failed: sizeof(struct prestera_msg_switch_attr_req) != 16
...
The driver assumes structure members are naturally aligned, but does not
add explicit padding, thus breaking architectures where integral values
are not always naturally aligned (e.g. on m68k, __alignof(int) is 2, not
4).
Fixes: bb5dbf2cc6 ("net: marvell: prestera: add firmware v4.0 support")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211102082433.3820514-1-geert@linux-m68k.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix following coccicheck warning:
./drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c:446:8-56: WARNING
avoid newline at end of message in NL_SET_ERR_MSG_MOD.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Link: https://lore.kernel.org/r/20211102020312.16567-1-wanjiabing@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch allows to use 0 for `coal->rx_coalesce_usecs` param to
disable rx irq coalescing.
Previously we could enable rx irq coalescing via ethtool
(For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable
it because this part rejects 0 value:
if (!coal->rx_coalesce_usecs)
return -EINVAL;
Fixes: 84da2658a6 ("TI DaVinci EMAC : Implement interrupt pacing functionality.")
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20211101152343.4193233-1-bigunclemax@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a function to initialize phy rgmii delay according to phydev->interface.
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211101162119.29275-1-yuiko.oshino@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmGBMQUTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXmE5B/9MK3Ju+tc6C8eyR3Ic4XBYHJ3voEKO
M+R90gggBriDOgkz4B8vF+k0aD8wevXAUtmCSXonDzCh5H7GoyfrVZmJEVkwlioH
ZMSMlFHcjGhCPIXhLbNtfo/NsAYEtT/lRM2lLGCSbdGuKabylXKujVdhuSIcRPdj
Rj5innUgcAywOoxG6WzFt3JBzM33UQErCGfUF2b7Rvp9E+Zii4vIMxkMzUpnkEHH
F8WMEdL0DqH5ThOs0MslNgy03pUC9wk1d5DNd9ytYHqiSQtcQZhFHw/P6dxzUFlW
OptWv31PXUIsiJf4Zi9hmfjgUl+KZHeacZ2hXtidAo86VPcIjVs25OQW
=40fn
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20211102' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Initial patch set for Hyper-V isolation VM support (Tianyu Lan)
- Fix a warning on preemption (Vitaly Kuznetsov)
- A bunch of misc cleanup patches
* tag 'hyperv-next-signed-20211102' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Protect set_hv_tscchange_cb() against getting preempted
Drivers: hv : vmbus: Adding NULL pointer check
x86/hyperv: Remove duplicate include
x86/hyperv: Remove duplicated include in hv_init
Drivers: hv: vmbus: Remove unused code to check for subchannels
Drivers: hv: vmbus: Initialize VMbus ring buffer for Isolation VM
Drivers: hv: vmbus: Add SNP support for VMbus channel initiate message
x86/hyperv: Add ghcb hvcall support for SNP VM
x86/hyperv: Add Write/Read MSR registers via ghcb page
Drivers: hv: vmbus: Mark vmbus ring buffer visible to host in Isolation VM
x86/hyperv: Add new hvcall guest address host visibility support
x86/hyperv: Initialize shared memory boundary in the Isolation VM.
x86/hyperv: Initialize GHCB page in Isolation VM
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space
and avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace
to work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction,
by exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec
offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP
buffer pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
and Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we
can add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements
to CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
+qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
=srde
-----END PGP SIGNATURE-----
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
The end goal of the current buffer overflow detection work[0] is to gain
full compile-time and run-time coverage of all detectable buffer overflows
seen via array indexing or memcpy(), memmove(), and memset(). The str*()
family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this series
contains the foundational elements of several related buffer overflow
detection improvements by providing new common helpers and FORTIFY_SOURCE
changes needed to gain the introspection required for compiler visibility
into array sizes. Also included are a handful of already Acked instances
using the helpers (or related clean-ups), with many more waiting at the
ready to be taken via subsystem-specific trees[2]. The new helpers are:
- struct_group() for gaining struct member range introspection.
- memset_after() and memset_startat() for clearing to the end of structures.
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage under
GCC, and to prepare to fix the currently broken Clang support. Finishing
this work is part of the larger series[0], but depends on all the false
positives and buffer overflow bug fixes to have landed already and those
that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
functions respectively. The compile time tests have found a legitimate
(though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage that
result in no known object code differences.
After this series (and the changes that have now landed via netdev
and usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds. However, due corner cases in
GCC[3] and Clang[4], I have not included the last two patches that turn
on these options, as I don't want to introduce any known warnings to
the build. Hopefully these can be solved soon.
[0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
[2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
[3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
[4] https://bugs.llvm.org/show_bug.cgi?id=51682
[5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
[6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
+kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
PdA9XB82G2tmqKc9cQ==
=9xbT
-----END PGP SIGNATURE-----
Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"The end goal of the current buffer overflow detection work[0] is to
gain full compile-time and run-time coverage of all detectable buffer
overflows seen via array indexing or memcpy(), memmove(), and
memset(). The str*() family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this
series contains the foundational elements of several related buffer
overflow detection improvements by providing new common helpers and
FORTIFY_SOURCE changes needed to gain the introspection required for
compiler visibility into array sizes. Also included are a handful of
already Acked instances using the helpers (or related clean-ups), with
many more waiting at the ready to be taken via subsystem-specific
trees[2].
The new helpers are:
- struct_group() for gaining struct member range introspection
- memset_after() and memset_startat() for clearing to the end of
structures
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
structs
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage
under GCC, and to prepare to fix the currently broken Clang support.
Finishing this work is part of the larger series[0], but depends on
all the false positives and buffer overflow bug fixes to have landed
already and those that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a
compile-time and run-time tests are added for FORTIFY_SOURCE and the
mem*()-family functions respectively. The compile time tests have
found a legitimate (though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage
that result in no known object code differences.
After this series (and the changes that have now landed via netdev and
usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds.
However, due corner cases in GCC[3] and Clang[4], I have not included
the last two patches that turn on these options, as I don't want to
introduce any known warnings to the build. Hopefully these can be
solved soon"
Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]
* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
fortify: strlen: Avoid shadowing previous locals
compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
treewide: Replace 0-element memcpy() destinations with flexible arrays
treewide: Replace open-coded flex arrays in unions
stddef: Introduce DECLARE_FLEX_ARRAY() helper
btrfs: Use memset_startat() to clear end of struct
string.h: Introduce memset_startat() for wiping trailing members and padding
xfrm: Use memset_after() to clear padding
string.h: Introduce memset_after() for wiping trailing members/padding
lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
fortify: Add compile-time FORTIFY_SOURCE tests
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
fortify: Prepare to improve strnlen() and strlen() warnings
fortify: Fix dropped strcpy() compile-time write overflow check
fortify: Explicitly disable Clang support
fortify: Move remaining fortify helpers into fortify-string.h
lib/string: Move helper functions out of string.c
compiler_types.h: Remove __compiletime_object_size()
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
...
In one if branch, (ec->rx_coalesce_usecs != 0) is checked. When it is
checked again in two more places, it is always false and has no effect
on the whole check expression. We should remove it in both places.
In another if branch, (ec->use_adaptive_rx_coalesce != 0) is checked.
When it is checked again, it is always false. We should remove the
entire branch with it.
In addition we might as well let C precedence dictate by getting rid of
two pairs of parentheses in the neighboring lines in order to keep
expressions on both sides of '||' in balance with checkpatch warning
silenced.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Link: https://lore.kernel.org/r/20211031012728.8325-1-sakiwit@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Build bot points out that I missed initializing ret
after refactoring.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1c401078bc ("netdevsim: move details of vf config to dev")
Link: https://lore.kernel.org/r/20211101221845.3188490-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF/AjYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1hkIAJ6sFDbvb4M4LMwf
Slh2NVL9o5sLMBDzVwnVlyMSKDbMn1WBKreGssaLgZjGDc74lxsdSmw5l9MZm0JN
xlq95Q6XFiuu+0qDHPWwfDz3JFO4TqW2ZLLPWk9NnkNbRXqccSrlVRi1RpgE1t3/
NUtS8CQLu6A2BYMc6mkk3aV6IwSNKOkWbM5eBHSvU4j8B6lLbNQop0AfO/wyY1xB
U6LiVE1RpN/b7Yv+75ITtNzuHzVIBx6305FvSnOlKbMKKvIClt96Vd2OeuoEkK+6
wGU8JraB1+fc0GckAhynNrjWQWdvi0MAhFWWEJxjS20OGcV1rXDduNfkVNauO1Zn
+dNyJ3s=
=g9fz
-----END PGP SIGNATURE-----
Merge tag 'v5.15' into rdma.git for-next
Pull in the accepted for-rc patches as the next merge needs a newer base.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In the previous patch, igmp report handler was added.
That handler can be used for mld too.
So, it uses that common code to parse mld report message.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amt 'Relay' interface manages multicast groups(igmp/mld) and sources.
In order to manage, it should have the function to parse igmp/mld
report messages. So, this adds the logic for parsing igmp report messages
and saves them on their own data structure.
struct amt_group_node means one group(igmp/mld).
struct amt_source_node means one source.
The same source can't exist in the same group.
The same group can exist in the same tunnel because it manages
the host address too.
The group information is used when forwarding multicast data.
If there are no groups in the specific tunnel, Relay doesn't forward it.
Although Relay manages sources, it doesn't support the source filtering
feature. Because the reason to manage sources is just that in order
to manage group more correctly.
In the next patch, MLD part will be added.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before forwarding multicast traffic, the amt interface establishes between
gateway and relay. In order to establish, amt defined some message type
and those message flow looks like the below.
Gateway Relay
------- -----
: Request :
[1] | N |
|---------------------->|
| Membership Query | [2]
| N,MAC,gADDR,gPORT |
|<======================|
[3] | Membership Update |
| ({G:INCLUDE({S})}) |
|======================>|
| |
---------------------:-----------------------:---------------------
| | | |
| | *Multicast Data | *IP Packet(S,G) |
| | gADDR,gPORT |<-----------------() |
| *IP Packet(S,G) |<======================| |
| ()<-----------------| | |
| | | |
---------------------:-----------------------:---------------------
~ ~
~ Request ~
[4] | N' |
|---------------------->|
| Membership Query | [5]
| N',MAC',gADDR',gPORT' |
|<======================|
[6] | |
| Teardown |
| N,MAC,gADDR,gPORT |
|---------------------->|
| | [7]
| Membership Update |
| ({G:INCLUDE({S})}) |
|======================>|
| |
---------------------:-----------------------:---------------------
| | | |
| | *Multicast Data | *IP Packet(S,G) |
| | gADDR',gPORT' |<-----------------() |
| *IP Packet (S,G) |<======================| |
| ()<-----------------| | |
| | | |
---------------------:-----------------------:---------------------
| |
: :
1. Discovery
- Sent by Gateway to Relay
- To find Relay unique ip address
2. Advertisement
- Sent by Relay to Gateway
- Contains the unique IP address
3. Request
- Sent by Gateway to Relay
- Solicit to receive 'Query' message.
4. Query
- Sent by Relay to Gateway
- Contains General Query message.
5. Update
- Sent by Gateway to Relay
- Contains report message.
6. Multicast Data
- Sent by Relay to Gateway
- encapsulated multicast traffic.
7. Teardown
- Not supported at this time.
Except for the Teardown message, it supports all messages.
In the next patch, IGMP/MLD logic will be added.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It adds definitions and control plane code for AMT.
this is very similar to udp tunneling interfaces such as gtp, vxlan, etc.
In the next patch, data plane code will be added.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename functions serving as driver entry points
from nsim_dev_... to nsim_drv_... this makes the
API boundary between bus and dev clearer.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
max_vfs is a strange little beast because the file
hangs off of nsim's debugfs, but it configures a field
in the bus device. Move it to dev.c, let's look at it
as if the device driver was imposing VF limit based
on FW info (like pci_sriov_set_totalvfs()).
Again, when moving refactor the function not to hold
the vfs lock pointlessly while parsing the input.
Wrap the access from the read side in READ_ONCE()
to appease concurrency checkers. Do not check if
return value from snprintf() is negative...
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since "eswitch" configuration was added bus.c contains
a lot of device details which really belong to dev.c.
Restructure the code while moving it.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When netdevsim got split into the faux bus vfconfig ended
up in the bus device (think pci_dev) which is strange because
it contains very networky not to say netdevy information.
Move it to nsim_dev, which is the driver "priv" structure
for the device.
To make sure we don't race with probe/remove take
the device lock (much like PCI).
While at it remove the NULL-checking of vfconfigs.
It appears to be pointless.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Legacy VF NDOs look at num_vfs and then based on that
index into vfconfig. If we don't rtnl_lock() num_vfs
may get set to 0 and vfconfig freed/replaced while
the NDO is running.
We don't need to protect replacing vfconfig since it's
only done when num_vfs is 0.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the suspend/resume/shutdown callbacks for hibernation/kexec.
Add mana_gd_setup() and mana_gd_cleanup() for some common code, and
use them in the mand_gd_* callbacks.
Reuse mana_probe/remove() for the hibernation path.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when the HWC creation fails, the error handling is flawed,
e.g. if mana_hwc_create_channel() -> mana_hwc_establish_channel() fails,
the resources acquired in mana_hwc_init_queues() is not released.
Enhance mana_hwc_destroy_channel() to do the proper cleanup work and
call it accordingly.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF driver might use the OS info for statistical purposes.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the correct port index rather than 0.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we get CRQ_INIT, we set errno to -EIO and first call complete() to
notify the waiter. Then we try to schedule a FAILOVER reset. If this
occurs while adapter is in PROBING state, ibmvnic_reset() changes the
error code to EAGAIN and returns without scheduling the FAILOVER. The
purpose of setting error code to EAGAIN is to ask the waiter to retry.
But due to the earlier complete() call, the waiter may already have seen
the -EIO response and decided not to retry. This can cause intermittent
failures when bringing up ibmvnic adapters during boot, specially in
in kexec/kdump kernels.
Defer the complete() call until after scheduling the reset.
Also streamline the error code to EAGAIN. Don't see why we need EIO
sometimes. All 3 callers of ibmvnic_reset_init() can handle EAGAIN.
Fixes: 17c8705838 ("ibmvnic: Return error code if init interrupted by transport event")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Soon after registering a CRQ it is possible that we get a fail over or
maybe a CRQ_INIT from the VIOS while interrupts were disabled.
Look for any such CRQs after enabling interrupts.
Otherwise we can intermittently fail to bring up ibmvnic adapters during
boot, specially in kexec/kdump kernels.
Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If adapter's resetting bit is on, discard the packet but don't stop the
transmit queue - instead leave that to the reset code. With this change,
it is possible that we may get several calls to ibmvnic_xmit() that simply
discard packets and return.
But if we stop the queue here, we might end up doing so just after
__ibmvnic_open() started the queues (during a hard/soft reset) and before
the ->resetting bit was cleared. If that happens, there will be no one to
restart queue and transmissions will be blocked indefinitely.
This can cause a TIMEOUT reset and with auto priority failover enabled,
an unnecessary FAILOVER reset to less favored backing device and then a
FAILOVER back to the most favored backing device. If we hit the window
repeatedly, we can get stuck in a loop of TIMEOUT, FAILOVER, FAILOVER
resets leaving the adapter unusable for extended periods of time.
Fixes: 7f5b030830 ("ibmvnic: Free skb's in cases of failure in transmit")
Reported-by: Abdul Haleem <abdhalee@in.ibm.com>
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tunnel_type check only allows for "netif_is_gretap", but for
OVS the port is actually "netif_is_ip6gretap" when setting up GRE
for ipv6, which means offloading request was rejected before.
Therefore, adding "netif_is_ip6gretap" allow ipv6gretap interface
for offloading.
Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-10-29
This series contains updates to ice and iavf drivers and virtchnl header
file.
Brett removes vlan_promisc argument from a function call for ice driver.
In the virtchnl header file he removes an unused, reserved define and
converts raw value defines to instead use the BIT macro.
Marcin adds syncing of MAC addresses when creating switchdev VFs to
remove error messages on link up and stops showing buffer information
for port representors to remove duplicated entries being displayed for
ice driver.
Karen introduces a helper to go from pci_dev to iavf_adapter in the
iavf driver.
Przemyslaw fixes an issue where iavf was attempting to free IRQs before
calling disable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Minor trivial refactoring and improvements
2) Check for unsupported parameters fields in SW steering
3) Support TC offload for OVS internal port, from Ariel, see below.
Ariel Levkovich says:
=====================
Support HW offload of TC rules involving OVS internal port
device type as the filter device or the destination
device.
The support is for flows which explicitly use the internal
port as source or destination device as well as indirect offload
for flows performing tunnel set or unset via a tunnel device
and the internal port is the tunnel overlay device.
Since flows with internal port as source port are added
as egress rules while redirecting to internal port is done
as an ingress redirect, the series introduces the necessary
changes in mlx5_core driver to support the new types of flows
and actions.
=====================
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmF8X0sACgkQSD+KveBX
+j5PsQf/RfsE+spW0yJriJQ6Et+o+/CYR+AQYat5MaXjRw8uMz6uBcfXWCIBbYjw
OwNP4ZagWXIHMkelj2Ap0Qlu4yqkUBy1A0le7HcAzOeje1vc9BObS15w9pJvQ9cp
br3ZK5VZnQccSfF/LQpSjlGhD9083kETA2uXlCz7vitn8MVaya6ue6GU+wFC4Wnz
LjOJ4PMXCEfhpA+efD0nD4EK6FJjqvJoVQkxWNmgOW7yg5PcyWXZD/tsDZUI8DGl
0GlnM6W2H8bC0YhW01cnOsWPU+vtMLCsaF0YKqsLhnWUsaYSD5lXPIHqH6VpucZ7
LSv/c2U9pBnWkf7UoFyuEeQxAhz1rg==
=Uxa9
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2021-10-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-10-29
1) Minor trivial refactoring and improvements
2) Check for unsupported parameters fields in SW steering
3) Support TC offload for OVS internal port, from Ariel, see below.
Ariel Levkovich says:
=====================
Support HW offload of TC rules involving OVS internal port
device type as the filter device or the destination
device.
The support is for flows which explicitly use the internal
port as source or destination device as well as indirect offload
for flows performing tunnel set or unset via a tunnel device
and the internal port is the tunnel overlay device.
Since flows with internal port as source port are added
as egress rules while redirecting to internal port is done
as an ingress redirect, the series introduces the necessary
changes in mlx5_core driver to support the new types of flows
and actions.
=====================
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
For RX virtuqueue, the used length is validated in all the three paths
(big, small and mergeable). For control vq, we never tries to use used
length. So this patch forbids the core to validate the used length.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027022107.14357-3-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make tailroom math follow same logic as everything else, subtracing
values in the order in which things are laid out in the buffer.
Tested-by: Corentin Noël <corentin.noel@collabora.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tony Nguyen says:
====================
1GbE Intel Wired LAN Driver Updates 2021-10-29
This series contains updates to igc driver only.
Sasha removes an unnecessary media type check, adds a new device ID, and
changes a device reset to a port reset command.
====================
Link: https://lore.kernel.org/r/20211029174101.2970935-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2021-10-29
This series contains updates to i40e, ice, igb, and ixgbevf drivers.
Yang Li simplifies return statements of bool values for i40e and ice.
Jan Kundrát corrects problems with I2C bit-banging for igb.
Colin Ian King removes unneeded variable initialization for ixgbevf.
====================
Link: https://lore.kernel.org/r/20211029164641.2714265-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When performing route device lookup for decap action, support
the case of ovs internal port as the lookup result.
In such case, an internal port struct is mapped and attached
to the flow attributes so that the source port matching of the
rule will match on the internal port's metadata value.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Adjust termination table logic to handle rules which
involve internal port as filter or forwarding device.
For cases where the rule forwards from internal port
to uplink, always choose to go via termination table.
This is because it is not known from where the packet
originally arrived to the internal port and it is possible
that it came from the uplink itself, in which case
a term table is required to perform hairpin.
If the packet arrived from a vport, going via term
table has no effect.
For cases where the rule forwards to an internal port
from uplink the rep pointer will point to the uplink rep,
avoid going via termination table as it is not required.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Register callbacks for tc blocks of ovs internal port devices.
This allows an indirect offloading rules that apply on
such devices as the filter device.
In case a rule is added to a tc block of an internal port,
the mlx5 driver will implicitly add a matching on the internal
port's unique vport metadata value to the rule's matching list.
Therefore, only packets that previously hit a rule that redirects
to an internal port and got the vport metadata overwritten to the
internal port's unique metadata, can match on such indirect rule.
Offloading of both ingress and egress tc blocks of internal ports
is supported as opposed to other devices where only ingress block
offloading is supported.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When pefroming encap action, a route lookup is performed
to find the routing device the packet should be forwarded
to after the encapsulation. This is the device that has the
local tunnel ip address.
This change adds support to offload an encap rule where the
route device ends up being an ovs internal port.
In such case, the driver will add a HW rule that will encapsulate
the packet with the tunnel header and will overwrite the vport
metadata in reg_c0 to the internal port metadata value.
Finally, the packet will be forwarded to the root table to be
processed again with the indication that it came from an internal
port.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Allow offloading rules that redirect to ovs internal port
ingress and egress.
To support redirect to ingress device, offloading of REDIRECT_INGRESS
action is added.
When a tc rule redirects to ovs internal port, the hw rule will
overwrite the input vport value in reg_c0 with a new vport metadata
value that is mapped for this internal port using the internal
port mapping api that is introduce in previous patches.
After that the hw rule will redirect the packet to the root table
to continue processing with the new vport metadata value.
The new vport metadata value indicates that this packet is now
arriving through an internal port and therefore should be processed
using rules that apply on the same internal port as the filter device.
Therefore, following rules that apply on this internal port will have
to match on the same vport metadata value as part of their matching
keys to make sure the packet belongs to the internal port.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Setting the skb packet type field to host is usually
done when performing forwarding to ingress device.
This is required since the receive handling that is used
by the redirect to ingress action checks whether the packet
doesn't belong to this host and drops the packet in such case.
In order to be able to offload action redirect ingress, tc offload
code needs to accept the skbedit ptype action as well.
There's no special handling in HW for such action since it will
be followed by a redirect action and therefore, this code
only allows us to accept such action in the actions list but
not performing anything specific in HW for it.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Adding infrastructure to map ovs internal port device to vport
match metadata to support offload of rules with internal port as
the filter device or as the destination device.
The infrastructure allows adding and removing internal port device
to an eswitch database and getting a unique vport metadata value to
be placed and match on in reg_c0 when offloading rules that are coming
from or going to an internal port.
The new int port metadata can be written to the source port register
in HW to indicate that current source port of the packet is the
internal port and not one of the actual HW vports (uplink or VF).
Using this method, it is possible to offload TC rules with an OVS
internal port as their destination port (overwriting the src vport
register) or as the filter port (matching on the value of the src
vport register and making sure it matches to the internal port's
value).
There is also a need to handle a miss case where the packet's
src port value was changed in HW to an internal port but a following
rule which matches on this new src port value wasn't found in HW.
In such case, the packet will be forwarded to the driver with
metadata which allows driver to restore the info of the internal
port's netdevice. Once this info is restored, the uplink driver
can forward the packet to the relevant netdevice in SW.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Rename tun_dev to fwd_dev within mlx5e_tc_update_priv struct
since future implementation may introduce other device types
which the handler is forwarding to.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Move the ownership of skb forwarding to network stack to the
tc update_skb handler as different cases will require different
handling of the skb.
While the tc handler will take care of the various cases and
properly handle the handover of the skb to the network stack
and freeing the skb, the main rx handler will be kept clean
from branches and usage of flags.
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When a matcher is being built, we "consume" (clear) mask fields one by one,
and to verify that we do support all the required fields we check if the
whole mask was consumed, else the matching request includes unsupported
fields.
Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
CT creates a counter for each CT rule, and for each such counter,
fs_counters tries to queue mlx5_fc_stats_work() work again via
mod_delayed_work(0) call to refresh all counters. This call has a
large performance impact when reaching high insertion rate and
accounts for ~8% of the insertion time when using software steering.
Allow skipping the refresh of all counters during counter creation.
Change CT to use this refresh skipping for it's counters.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Part of code that is related solely to IPsec is always compiled in the
driver code regardless if the IPsec functionality is enabled or disabled
in the driver code, this will add unnecessary branch in case IPsec is
disabled at Tx data path.
Move IPsec related code to IPsec related file such that in case of IPsec
is disabled and because of unlikely macro the compiler should be able to
optimize and omit the checksum IPsec code all together from Tx data path
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
ignore_flow_level isn't supported for VFs, and so it causes
post_act and ct to warn about it.
Instead of disabling CT for VFs, and a driver update will be need
to enable CT again once firmware support this, remove this warning
specifically for VFs. This way, it could be automatically enabled on
future firmwares where VFs support ignore_flow_level capability.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Clang warns:
drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c:635:34: error: variable 'esw' is uninitialized when used here [-Werror,-Wuninitialized]
mlx5_eswitch_del_offloaded_rule(esw, sample_flow->pre_rule, sample_flow->pre_attr);
^~~
drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c:626:26: note: initialize the variable 'esw' to silence this warning
struct mlx5_eswitch *esw;
^
= NULL
1 error generated.
It appears that the assignment should have been shuffled instead of
removed outright like in mlx5e_tc_sample_offload(). Add it back so there
is no use of esw uninitialized.
Fixes: a64c5edbd2 ("net/mlx5: Remove unnecessary checks for slow path flag")
Link: https://github.com/ClangBuiltLinux/linux/issues/1494
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Since the GPIO driver (gpio-mlxbf2.c) supports interrupt handling,
replace the custom routine with simple IRQ request.
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Fix driver not freeing VF's traffic irqs, prior to calling
pci_disable_msix in iavf_remove.
There were possible 2 erroneous states in which, iavf_close would
not be called.
One erroneous state is fixed by allowing netdev to register, when state
is already running. It was possible for VF adapter to enter state loop
from running to resetting, where iavf_open would subsequently fail.
If user would then unload driver/remove VF pci, iavf_close would not be
called, as the netdev was not registered, leaving traffic pcis still
allocated.
Fixed this by breaking loop, allowing netdev to open device when adapter
state is __IAVF_RUNNING and it is not explicitily downed.
Other possiblity is entering to iavf_remove from __IAVF_RESETTING state,
where iavf_close would not free irqs, but just return 0.
Fixed this by checking for last adapter state and then removing irqs.
Kernel panic:
[ 2773.628585] kernel BUG at drivers/pci/msi.c:375!
...
[ 2773.631567] RIP: 0010:free_msi_irqs+0x180/0x1b0
...
[ 2773.640939] Call Trace:
[ 2773.641572] pci_disable_msix+0xf7/0x120
[ 2773.642224] iavf_reset_interrupt_capability.part.41+0x15/0x30 [iavf]
[ 2773.642897] iavf_remove+0x12e/0x500 [iavf]
[ 2773.643578] pci_device_remove+0x3b/0xc0
[ 2773.644266] device_release_driver_internal+0x103/0x1f0
[ 2773.644948] pci_stop_bus_device+0x69/0x90
[ 2773.645576] pci_stop_and_remove_bus_device+0xe/0x20
[ 2773.646215] pci_iov_remove_virtfn+0xba/0x120
[ 2773.646862] sriov_disable+0x2f/0xe0
[ 2773.647531] ice_free_vfs+0x2f8/0x350 [ice]
[ 2773.648207] ice_sriov_configure+0x94/0x960 [ice]
[ 2773.648883] ? _kstrtoull+0x3b/0x90
[ 2773.649560] sriov_numvfs_store+0x10a/0x190
[ 2773.650249] kernfs_fop_write+0x116/0x190
[ 2773.650948] vfs_write+0xa5/0x1a0
[ 2773.651651] ksys_write+0x4f/0xb0
[ 2773.652358] do_syscall_64+0x5b/0x1a0
[ 2773.653075] entry_SYSCALL_64_after_hwframe+0x65/0xca
Fixes: 22ead37f8a ("i40evf: Add longer wait after remove module")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add helper function to go from pci_dev to adapter to make work simple -
to go from a pci_dev to the adapter structure and make netdev assignment
instead of having to go to the net_device then the adapter.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Disable showing bus-info information for port representors in switchdev
mode. This fixes a bug that caused displaying wrong netdev descriptions in
lshw tool - one port representor displayed PF branding string, and in turn
one PF displayed a "generic" description. The bug occurs when many devices
show the same bus-info in ethtool, which was the case in switchdev mode (PF
and its port representors displayed the same bus-info). The bug occurs only
if a port representor netdev appears before PF netdev in /proc/net/dev.
In the examples below:
ens6fX is PF
ens6fXvY is VF
ethX is port representor
One irrelevant column was removed from output
Before:
$ sudo lshw -c net -businfo
Bus info Device Description
=========================================
pci@0000:02:00.0 eth102 Ethernet Controller E810-XXV for SFP
pci@0000:02:00.1 ens6f1 Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0 ens6f0v0 Ethernet Adaptive Virtual Function
pci@0000:02:01.1 ens6f0v1 Ethernet Adaptive Virtual Function
pci@0000:02:01.2 ens6f0v2 Ethernet Adaptive Virtual Function
pci@0000:02:00.0 ens6f0 Ethernet interface
Notice that eth102 and ens6f0 have the same bus-info and their descriptions
are swapped.
After:
$ sudo lshw -c net -businfo
Bus info Device Description
=========================================
pci@0000:02:00.0 ens6f0 Ethernet Controller E810-XXV for SFP
pci@0000:02:00.1 ens6f1 Ethernet Controller E810-XXV for SFP
pci@0000:02:01.0 ens6f0v0 Ethernet Adaptive Virtual Function
pci@0000:02:01.1 ens6f0v1 Ethernet Adaptive Virtual Function
pci@0000:02:01.2 ens6f0v2 Ethernet Adaptive Virtual Function
Fixes: 7aae80cef7 ("ice: add port representor ethtool ops and stats")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When spawning VFs in switchdev mode, internal filter list of VSIs is
cleared, which includes MAC rules. However MAC entries stay on netdev's
multicast list, which causes error message when bringing link up after
spawning VFs ("Failed to delete MAC filters"). __dev_mc_sync() is
called and tries to unsync addresses that were already removed
internally when adding VFs.
This can be reproduced with:
1) Load ice driver
2) Change PF to switchdev mode
3) Bring PF link up
4) Bring PF link down
5) Create a VF on PF
6) Bring PF link up
Added clearing of netdev's multicast (and also unicast) list when
spawning VFs in switchdev mode, so the state of internal rule list and
netdev's MAC list is consistent.
Fixes: 1a1c40df2e ("ice: set and release switchdev environment")
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently, the vlan_promisc flag is used exclusively by VF VSI to
determine whether or not to toggle VLAN pruning along with
trusted/true-promiscuous mode. This is not needed for a couple of
reasons. First, trusted/true-promiscuous mode is only supposed to allow
all MAC filters within VLANs that a VF has added filters for, so VLAN
pruning should not be disabled. Second, the boolean argument makes the
function confusing and unintuitive. Remove this flag.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The _reset_hw_base method switched from port reset (CTRL[26]) to device
reset (CTRL[29]) since the FW was receiving an interrupt on CTRL[29].
FW code was later modified to also receive an interrupt on CTRL[26].
Since certain HW values are not reset to default by CTRL[29], we go back
to CTRL[26] for the HW reset, as it meets all current requirements.
This reverts commit bb4265ec24 ("igc: Update the MAC reset flow").
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add new device ID for the next step of the silicon and
reflect the I226_LMVP part.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
i225 devices only have copper phy media type. There is no point in
checking phy media type during the phy initialization. This patch cleans
up a pointless check.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The variable ret_val is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The driver tried to use Linux' native software I2C bus master
(i2c-algo-bits) for exporting the I2C interface that talks to the SFP
cage(s) towards userspace. As-is, however, the physical SCL/SDA pins
were not moving at all, staying at logical 1 all the time.
The main culprit was the I2CPARAMS register where igb was not setting
the I2CBB_EN bit. That meant that all the careful signal bit-banging was
actually not being propagated to the chip pads (I verified this with a
scope).
The bit-banging was not correct either, because I2C is supposed to be an
open-collector bus, and the code was driving both lines via a totem
pole. The code was also trying to do operations which did not make any
sense with the i2c-algo-bits, namely manipulating both SDA and SCL from
igb_set_i2c_data (which is only supposed to set SDA). I'm not sure if
that was meant as an optimization, or was just flat out wrong, but given
that the i2c-algo-bits is set up to work with a totally dumb GPIO-ish
implementation underneath, there's no need for this code to be smart.
The open-drain vs. totem-pole is fixed by the usual trick where the
logical zero is implemented via regular output mode and outputting a
logical 0, and the logical high is implemented via the IO pad configured
as an input (thus floating), and letting the mandatory pull-up resistors
do the rest. Anything else is actually wrong on I2C where all devices
are supposed to have open-drain connection to the bus.
The missing I2CBB_EN is set (along with a safe initial value of the
GPIOs) just before registering this software I2C bus.
The chip datasheet mentions HW-implemented I2C transactions (SFP EEPROM
reads and writes) as well, but I'm not touching these for simplicity.
Tested on a LR-Link LRES2203PF-2SFP (which is an almost-miniPCIe form
factor card, a cable, and a module with two SFP cages). There was one
casualty, an old broken SFP we had laying around, which was used to
solder some thin wires as a DIY I2C breakout. Thanks for your service.
With this patch in place, I can `i2cdump -y 3 0x51 c` and read back data
which make sense. Yay.
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
See-also: https://www.spinics.net/lists/netdev/msg490554.html
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix the following coccicheck warning:
./drivers/net/ethernet/intel/i40e/i40e_xsk.c:229:35-40: WARNING:
conversion to bool not needed here
./drivers/net/ethernet/intel/ice/ice_xsk.c:399:35-40: WARNING:
conversion to bool not needed here
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fourth set of patches for v5.16. Mostly fixes this time, wcn36xx and
iwlwifi have some new features but nothing really out of ordinary. We
have one conflict with kspp tree.
Conflicts:
kspp tree has a conflict in drivers/net/wireless/intel/iwlwifi/fw/api/tx.h:
https://lkml.kernel.org/r/20211028192934.01520d7e@canb.auug.org.au
Major changes:
ath11k
* fix QCA6390 A-MSDU handling (CVE-2020-24588)
wcn36xx
* enable hardware scan offload for 5Ghz band
* add missing 5GHz channels 136 and 144
iwlwifi
* support a new ACPI table revision
* improvements in the device selection code
* new hardware support
* support for WiFi 6E enablement via BIOS
* support firmware API version 67
* support for 160MHz in ranging measurements
-----BEGIN PGP SIGNATURE-----
iQFJBAABCgAzFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmF7+wUVHGt2YWxvQGNv
ZGVhdXJvcmEub3JnAAoJEG4XJFUm622bRGoH/0XrfEwzH4iR3j/xRPTBMJjBZO/Z
ZN0PQ2L8402c7iG9M0psSFOdqFe8vtKzuuV367ifQzwmGFoQIzAckL9nCA1yFUBg
EXF3nP0/mb8R0w6rzjkLUQ9EejLLeX35Kh6B1oLgglPdLvE+Yv6/Iqs1yOHwHZFF
RBHqyBc9YcQBuhah4JPqIXr8tUCkRgWwK/VvCwhoGDWimBlH7LhgOHTFQhVJ+Z8b
/RsLS2iIv4wGkiQX9cwilmX3QSz/jK0od1FXHW347v+nm96Fs3d2F1juAyhE6io2
xN1qzEU4SQ90byxhCsBOvZC6JTpRJvw49XME76MrzaLP+FuZ8I2zdAe7wd8=
=Y1xX
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-2021-10-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for v5.16
Fourth set of patches for v5.16. Mostly fixes this time, wcn36xx and
iwlwifi have some new features but nothing really out of ordinary.
We have one conflict with kspp tree.
Major changes:
ath11k
* fix QCA6390 A-MSDU handling (CVE-2020-24588)
wcn36xx
* enable hardware scan offload for 5Ghz band
* add missing 5GHz channels 136 and 144
iwlwifi
* support a new ACPI table revision
* improvements in the device selection code
* new hardware support
* support for WiFi 6E enablement via BIOS
* support firmware API version 67
* support for 160MHz in ranging measurements
====================
Link: https://lore.kernel.org/r/20211029134707.DE2B0C4360D@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver no longer depends on this option, but it fails to
build if it's disabled because the skb->tc_skip_classify is
hidden behind an #ifdef:
drivers/net/ifb.c:81:8: error: no member named 'tc_skip_classify' in 'struct sk_buff'
skb->tc_skip_classify = 1;
Use the same #ifdef around the assignment.
Fixes: 046178e726 ("ifb: Depend on netfilter alternatively to tc")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add firmware (FW) version 4.0 support for Marvell Prestera
driver.
Major changes have been made to new v4.0 FW ABI to add support
of new features, introduce the stability of the FW ABI and ensure
better forward compatibility for the future driver vesrions.
Current v4.0 FW feature set support does not expect any changes
to ABI, as it was defined and tested through long period of time.
The ABI may be extended in case of new features, but it will not
break the backward compatibility.
ABI major changes done in v4.0:
- L1 ABI, where MAC and PHY API configuration are split.
- ACL has been split to low-level TCAM and Counters ABI
to provide more HW ACL capabilities for future driver
versions.
To support backward support, the addition compatibility layer is
required in the driver which will have two different codebase under
"if FW-VER elif FW-VER else" conditions that will be removed
in the future anyway, So, the idea was to break backward support
and focus on more stable FW instead of supporting old version
with very minimal and limited set of features/capabilities.
Improve FW msg validation:
* Use __le64, __le32, __le16 types in msg to/from FW to
catch endian mismatch by sparse.
* Use BUILD_BUG_ON for structures sent/recv to/from FW.
Co-developed-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both !data[IFLA_BAREUDP_PORT] and !data[IFLA_BAREUDP_ETHERTYPE] are
checked. We should remove the checks of data[IFLA_BAREUDP_PORT] and
data[IFLA_BAREUDP_ETHERTYPE] that follow since they are always true.
Put both statements together in group and balance the space on both
sides of '=' sign.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove additional character in the source to properly indent if branch.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increase the rx ring size (LAN743X_RX_RING_SIZE) to improve rx performance on some platforms.
Tested on x86 PC with EVB-LAN7430.
The iperf3.7 TCPIP improved from 881 Mbps to 922 Mbps, and UDP improved from 817 Mbps to 936 Mbps.
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvneta does not support asymetric pause modes, and it flags this by the
lack of AsymPause in the supported field. When setting pause modes, we
check that pause->rx_pause == pause->tx_pause, but only when pause
autoneg is enabled. When pause autoneg is disabled, we still allow
pause->rx_pause != pause->tx_pause, which is incorrect when the MAC
does not support asymetric pause, and causes mvneta to issue a warning.
Fix this by removing the test for pause->autoneg, so we always check
that pause->rx_pause == pause->tx_pause for network devices that do not
support AsymPause.
Fixes: 9525ae8395 ("phylink: add phylink infrastructure")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When port is linked down, the process which has acquired rtnl_lock
will wait for the in-progress dim work to finish, and the work also
acquires rtnl_lock, which may cause deadlock.
Currently IRQ_MOD registers can be configured by `ethtool -C` and
dim work, and which will take effect depends on the execution order,
rtnl_lock is useless here, so remove them.
Fixes: 9d32e4e7e9 ("nfp: add support for coalesce adaptive feature")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each rx/tx ring has a related dim work, when rx/tx ring number is
decreased by `ethtool -L`, the corresponding rx_ring or tx_ring is
assigned NULL, while its related work is not destroyed. When scheduled,
the work will access NULL pointer.
Fixes: 9d32e4e7e9 ("nfp: add support for coalesce adaptive feature")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For each rate change command submission, the FW has to do a phy
power off sequence internally. For this to happen correctly, the
PLL re-initialization control setting has to be turned off before
sending mailbox commands and re-enabled once the command submission
is complete.
Without the PLL control setting, the link up takes longer time in a
fixed phy configuration.
Fixes: 47f164deab ("amd-xgbe: Add PCI device support")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-10-28
This series contains updates to ice driver only.
Michal adds support for eswitch drop and redirect filters from and to
tunnel devices. From meaning from uplink to VF and to means from VF to
uplink. This is accomplished by adding support for indirect TC tunnel
notifications and adding appropriate training packets and match fields
for UDP tunnel headers. He also adds returning virtchannel responses for
blocked operations as returning a response is still needed.
Marcin sets netdev min and max MTU values on port representors to allow
for MTU changes over default values.
Brett adds detecting and reporting of PHY firmware load issues for devices
which support this.
Nathan Chancellor fixes a clang warning for implicit fallthrough.
Wang Hai fixes a return value for failed allocation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
All SoCs with this IP core support 8 burst length. Hauke
suggested to hardcode this value and simplify the driver.
Link: https://lkml.org/lkml/2021/9/14/1533
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
On older firmware that doesn't support the HWRM_NVM_GET_DEV_INFO
command that returns detailed stored firmware versions, fallback
to use the same firmware package version that is reported to ethtool.
Refactor bnxt_get_pkgver() in bnxt_ethtool.c so that devlink can call
and get the package version.
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Live patches are activated by using the 'limit no_reset' option when
performing a devlink dev reload fw_activate operation. These packages
must first be installed on the device in the usual way. For example,
via devlink dev flash or ethtool -f.
The devlink device info has also been enhanced to render stored and
running live patch versions.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The main changes are firmware live patch support and 2 additional FEC
standard counters.
Add the matching FEC counters to ethtool counter array. Firmware older
than 220 does not return the proper size of the extended RX counters so
we need to cap it at the smaller legacy size. Otherwise the new FEC
counters may show up with garbage values.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Populate the dump with firmware 'live' coredump data. This includes
the information stored in NVRAM by the firmware exception handler
prior to recovery. Thus, the live dump includes the desired crash
context.
Firmware does not support HWRM calls after RESET_NOTIFY, so there is
no supported way to capture a coredump during the auto dump phase.
Detect this and abort when called from devlink_health_report().
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tools other than 'ethtool -w' may be used to produce a coredump. For
devlink health, such dumps could even be driver initiated in response
to a health event. In these cases, the kernel thread information will
be placed in the coredump record instead.
v2: use min_t() instead of min() to fix the mismatched type warning
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent firmware provides coredump and crashdump size info via
DBG_QCFG command. Read the dump sizes from firmware, instead of
computing in the driver. This patch reduces the time taken
to collect the dump via ethtool.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware sets compression flags for each segment, add this information
while filling segment header.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bnxt_get_coredump() and bnxt_get_coredump_length() to non-static
functions.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The coredump functionality will be used by devlink health. Refactor
these functions that get coredump and coredump length. There is no
functional change, but the following checkpatch warnings were
addressed:
- strscpy is preferred over strlcpy.
- sscanf results should be checked, with an additional warning.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add firmware event counters as well as health state severity. In
the unhealthy state, recommend a remedy and inform the user as to
its impact.
Readability of the devlink tool's output is negatively impacted by
adding these fields to the diagnosis. The single line of text, as
rendered by devlink health diagnose, benefits from more terse
descriptions, which can be substituted without loss of clarity, even
in pretty printed JSON mode.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge 'fw' and 'fw_fatal' health reporters. There is no longer a need
to distinguish between firmware reporters. Only bonafide errors are
reported now and no reports were being generated for the 'fw' reporter.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware resets initiated by the user are not errors and should not
be reported via devlink. Once only unsolicited resets remain, it is no
longer sensible to maintain a separate fw_reset reporter.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recovery election messages are often mistaken for errors. Improve
the wording to clarify the meaning of these frequent and expected
events. Also, take the first step towards more inclusive language.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>