Commit Graph

37804 Commits

Author SHA1 Message Date
Yufeng Mo 8c9200e387 net: hns3: clear unnecessary reset request in hclge_reset_rebuild
HW error and global reset are reported through MSIX interrupts.
The same error may be reported to different functions at the
same time. When global reset begins, the pending reset request
set by this error is unnecessary. So clear the pending reset
request after the reset is complete to avoid the repeated reset.

Fixes: f6162d4412 ("net: hns3: add handling of hw errors reported through MSIX")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Peng Li b416e872be net: hns3: use netif_tx_disable to stop the transmit queue
Currently, netif_tx_stop_all_queues() is used to ensure that
the xmit is not running, but for the concurrent case it will
not take effect, since netif_tx_stop_all_queues() just sets
a flag without locking to indicate that the xmit queue(s)
should not be run.

So use netif_tx_disable() to replace netif_tx_stop_all_queues(),
it takes the xmit queue lock while marking the queue stopped.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Hao Chen 905416f18f net: hns3: fix for vxlan gpe tx checksum bug
When skb->ip_summed is CHECKSUM_PARTIAL, for non-tunnel udp packet,
which has a dest port as the IANA assigned, the hardware is expected
to do the checksum offload, but the hardware whose version is below
V3 will not do the checksum offload when udp dest port is 4790.

So fixes it by doing the checksum in software for this case.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Mohammad Athari Bin Ismail db7c691d7f net: stmmac: cleared __FPE_REMOVING bit in stmmac_fpe_start_wq()
An issue found when network interface is down and up again, FPE handshake
fails to trigger. This is due to __FPE_REMOVING bit remains being set in
stmmac_fpe_stop_wq() but not cleared in stmmac_fpe_start_wq(). This
cause FPE workqueue task, stmmac_fpe_lp_task() not able to be executed.

To fix this, add clearing __FPE_REMOVING bit in stmmac_fpe_start_wq().

Fixes: 5a5586112b ("net: stmmac: support FPE link partner hand-shaking procedure")
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:43:53 -07:00
Heiner Kallweit 4cf0abbce6 PCI/VPD: Remove pci_vpd_find_tag() 'offset' argument
All callers pass 0 as offset. Therefore remove the parameter and use a
fixed offset 0 in pci_vpd_find_tag().

Link: https://lore.kernel.org/r/f62e6e19-5423-2ead-b2bd-62844b23ef8f@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2021-04-30 14:38:32 -05:00
Yang Li 8343b1f8b9 bnx2x: Remove redundant assignment to err
Variable 'err' is set to -EIO but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Clean up the following clang-analyzer warning:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1195:2: warning: Value
stored to 'err' is never read [clang-analyzer-deadcode.DeadStores]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:52:07 -07:00
Jiapeng Chong bbf6acea6e net: macb: Remove redundant assignment to queue
Variable queue is set to bp->queues but these values is not used as it
is overwritten later on, hence redundant assignment  can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/cadence/macb_main.c:4919:21: warning: Value stored
to 'queue' during its initialization is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/cadence/macb_main.c:4832:21: warning: Value stored
to 'queue' during its initialization is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:50:20 -07:00
Jian Shen b4047aac4e net: hns3: add check for HNS3_NIC_STATE_INITED in hns3_reset_notify_up_enet()
In some cases, the device is not initialized because reset failed.
If another task calls hns3_reset_notify_up_enet() before reset
retry, it will cause an error since uninitialized pointer access.
So add check for HNS3_NIC_STATE_INITED before calling
hns3_nic_net_open() in hns3_reset_notify_up_enet().

Fixes: bb6b94a896 ("net: hns3: Add reset interface implementation in client")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:41:43 -07:00
Yufeng Mo 568a54bdf7 net: hns3: initialize the message content in hclge_get_link_mode()
The message sent to VF should be initialized, otherwise random
value of some contents may cause improper processing by the target.
So add a initialization to message in hclge_get_link_mode().

Fixes: 9194d18b05 ("net: hns3: fix the problem that the supported port is empty")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:41:43 -07:00
Yufeng Mo 2867298dd4 net: hns3: fix incorrect configuration for igu_egu_hw_err
According to the UM, the type and enable status of igu_egu_hw_err
should be configured separately. Currently, the type field is
incorrect when disable this error. So fix it by configuring these
two fields separately.

Fixes: bf1faf9415 ("net: hns3: Add enable and process hw errors from IGU, EGU and NCSI")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:41:43 -07:00
Linus Torvalds 9d31d23389 Networking changes for 5.13.
Core:
 
  - bpf:
 	- allow bpf programs calling kernel functions (initially to
 	  reuse TCP congestion control implementations)
 	- enable task local storage for tracing programs - remove the
 	  need to store per-task state in hash maps, and allow tracing
 	  programs access to task local storage previously added for
 	  BPF_LSM
 	- add bpf_for_each_map_elem() helper, allowing programs to
 	  walk all map elements in a more robust and easier to verify
 	  fashion
 	- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
 	  redirection
 	- lpm: add support for batched ops in LPM trie
 	- add BTF_KIND_FLOAT support - mostly to allow use of BTF
 	  on s390 which has floats in its headers files
 	- improve BPF syscall documentation and extend the use of kdoc
 	  parsing scripts we already employ for bpf-helpers
 	- libbpf, bpftool: support static linking of BPF ELF files
 	- improve support for encapsulation of L2 packets
 
  - xdp: restructure redirect actions to avoid a runtime lookup,
 	improving performance by 4-8% in microbenchmarks
 
  - xsk: build skb by page (aka generic zerocopy xmit) - improve
 	performance of software AF_XDP path by 33% for devices
 	which don't need headers in the linear skb part (e.g. virtio)
 
  - nexthop: resilient next-hop groups - improve path stability
 	on next-hops group changes (incl. offload for mlxsw)
 
  - ipv6: segment routing: add support for IPv4 decapsulation
 
  - icmp: add support for RFC 8335 extended PROBE messages
 
  - inet: use bigger hash table for IP ID generation
 
  - tcp: deal better with delayed TX completions - make sure we don't
 	give up on fast TCP retransmissions only because driver is
 	slow in reporting that it completed transmitting the original
 
  - tcp: reorder tcp_congestion_ops for better cache locality
 
  - mptcp:
 	- add sockopt support for common TCP options
 	- add support for common TCP msg flags
 	- include multiple address ids in RM_ADDR
 	- add reset option support for resetting one subflow
 
  - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
 	co-existence with UDP tunnel GRO, allowing the first to take
 	place correctly	even for encapsulated UDP traffic
 
  - micro-optimize dev_gro_receive() and flow dissection, avoid
 	retpoline overhead on VLAN and TEB GRO
 
  - use less memory for sysctls, add a new sysctl type, to allow using
 	u8 instead of "int" and "long" and shrink networking sysctls
 
  - veth: allow GRO without XDP - this allows aggregating UDP
 	packets before handing them off to routing, bridge, OvS, etc.
 
  - allow specifing ifindex when device is moved to another namespace
 
  - netfilter:
 	- nft_socket: add support for cgroupsv2
 	- nftables: add catch-all set element - special element used
 	  to define a default action in case normal lookup missed
 	- use net_generic infra in many modules to avoid allocating
 	  per-ns memory unnecessarily
 
  - xps: improve the xps handling to avoid potential out-of-bound
 	accesses and use-after-free when XPS change race with other
 	re-configuration under traffic
 
  - add a config knob to turn off per-cpu netdev refcnt to catch
 	underflows in testing
 
 Device APIs:
 
  - add WWAN subsystem to organize the WWAN interfaces better and
    hopefully start driving towards more unified and vendor-
    -independent APIs
 
  - ethtool:
 	- add interface for reading IEEE MIB stats (incl. mlx5 and
 	  bnxt support)
 	- allow network drivers to dump arbitrary SFP EEPROM data,
 	  current offset+length API was a poor fit for modern SFP
 	  which define EEPROM in terms of pages (incl. mlx5 support)
 
  - act_police, flow_offload: add support for packet-per-second
 	policing (incl. offload for nfp)
 
  - psample: add additional metadata attributes like transit delay
 	for packets sampled from switch HW (and corresponding egress
 	and policy-based sampling in the mlxsw driver)
 
  - dsa: improve support for sandwiched LAGs with bridge and DSA
 
  - netfilter:
 	- flowtable: use direct xmit in topologies with IP
 	  forwarding, bridging, vlans etc.
 	- nftables: counter hardware offload support
 
  - Bluetooth:
 	- improvements for firmware download w/ Intel devices
 	- add support for reading AOSP vendor capabilities
 	- add support for virtio transport driver
 
  - mac80211:
 	- allow concurrent monitor iface and ethernet rx decap
 	- set priority and queue mapping for injected frames
 
  - phy: add support for Clause-45 PHY Loopback
 
  - pci/iov: add sysfs MSI-X vector assignment interface
 	to distribute MSI-X resources to VFs (incl. mlx5 support)
 
 New hardware/drivers:
 
  - dsa: mv88e6xxx: add support for Marvell mv88e6393x -
 	11-port Ethernet switch with 8x 1-Gigabit Ethernet
 	and 3x 10-Gigabit interfaces.
 
  - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365
 	and BCM63xx switches
 
  - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
 
  - ath11k: support for QCN9074 a 802.11ax device
 
  - Bluetooth: Broadcom BCM4330 and BMC4334
 
  - phy: Marvell 88X2222 transceiver support
 
  - mdio: add BCM6368 MDIO mux bus controller
 
  - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
 
  - mana: driver for Microsoft Azure Network Adapter (MANA)
 
  - Actions Semi Owl Ethernet MAC
 
  - can: driver for ETAS ES58X CAN/USB interfaces
 
 Pure driver changes:
 
  - add XDP support to: enetc, igc, stmmac
  - add AF_XDP support to: stmmac
 
  - virtio:
 	- page_to_skb() use build_skb when there's sufficient tailroom
 	  (21% improvement for 1000B UDP frames)
 	- support XDP even without dedicated Tx queues - share the Tx
 	  queues with the stack when necessary
 
  - mlx5:
 	- flow rules: add support for mirroring with conntrack,
 	  matching on ICMP, GTP, flex filters and more
 	- support packet sampling with flow offloads
 	- persist uplink representor netdev across eswitch mode
 	  changes
 	- allow coexistence of CQE compression and HW time-stamping
 	- add ethtool extended link error state reporting
 
  - ice, iavf: support flow filters, UDP Segmentation Offload
 
  - dpaa2-switch:
 	- move the driver out of staging
 	- add spanning tree (STP) support
 	- add rx copybreak support
 	- add tc flower hardware offload on ingress traffic
 
  - ionic:
 	- implement Rx page reuse
 	- support HW PTP time-stamping
 
  - octeon: support TC hardware offloads - flower matching on ingress
 	and egress ratelimitting.
 
  - stmmac:
 	- add RX frame steering based on VLAN priority in tc flower
 	- support frame preemption (FPE)
 	- intel: add cross time-stamping freq difference adjustment
 
  - ocelot:
 	- support forwarding of MRP frames in HW
 	- support multiple bridges
 	- support PTP Sync one-step timestamping
 
  - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
 	learning, flooding etc.
 
  - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
 	SC7280 SoCs)
 
  - mt7601u: enable TDLS support
 
  - mt76:
 	- add support for 802.3 rx frames (mt7915/mt7615)
 	- mt7915 flash pre-calibration support
 	- mt7921/mt7663 runtime power management fixes
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCKFPIACgkQMUZtbf5S
 Irtw0g/+NA8bWdHNgG4H5rya0pv2z3IieLRmSdDfKRQQXcJpklawc5MKVVaTee/Q
 5/QqgPdCsu1LAU6JXBKsKmyDDaMlQKdWuKbOqDSiAQKoMesZStTEHf9d851ZzgxA
 Cdb6O7BD3lBl/IN+oxNG+KcmD1LKquTPKGySq2mQtEdLO12ekAsranzmj4voKffd
 q9tBShpXQ7Dq77DLYfiQXVCvsizNcbbJFuxX0o9Lpb9+61ZyYAbogZSa9ypiZZwR
 I/9azRBtJg7UV1aD/cLuAfy66Qh7t63+rCxVazs5Os8jVO26P/jQdisnnOe/x+p9
 wYEmKm3GSu0V4SAPxkWW+ooKusflCeqDoMIuooKt6kbP6BRj540veGw3Ww/m5YFr
 7pLQkTSP/tSjuGQIdBE1LOP5LBO8DZeC8Kiop9V0fzAW9hFSZbEq25WW0bPj8QQO
 zA4Z7yWlslvxcfY2BdJX3wD8klaINkl/8fDWZFFsBdfFX2VeLtm7Xfduw34BJpvU
 rYT3oWr6PhtkPAKR32SUcemSfeWgIVU41eSshzRz3kez1NngBUuLlSGGSEaKbes5
 pZVt6pYFFVByyf6MTHFEoQvafZfEw04JILZpo4R5V8iTHzom0kD3Py064sBiXEw2
 B6t+OW4qgcxGblpFkK2lD4kR2s1TPUs0ckVO6sAy1x8q60KKKjY=
 =vcbA
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - bpf:
        - allow bpf programs calling kernel functions (initially to
          reuse TCP congestion control implementations)
        - enable task local storage for tracing programs - remove the
          need to store per-task state in hash maps, and allow tracing
          programs access to task local storage previously added for
          BPF_LSM
        - add bpf_for_each_map_elem() helper, allowing programs to walk
          all map elements in a more robust and easier to verify fashion
        - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
          redirection
        - lpm: add support for batched ops in LPM trie
        - add BTF_KIND_FLOAT support - mostly to allow use of BTF on
          s390 which has floats in its headers files
        - improve BPF syscall documentation and extend the use of kdoc
          parsing scripts we already employ for bpf-helpers
        - libbpf, bpftool: support static linking of BPF ELF files
        - improve support for encapsulation of L2 packets

   - xdp: restructure redirect actions to avoid a runtime lookup,
     improving performance by 4-8% in microbenchmarks

   - xsk: build skb by page (aka generic zerocopy xmit) - improve
     performance of software AF_XDP path by 33% for devices which don't
     need headers in the linear skb part (e.g. virtio)

   - nexthop: resilient next-hop groups - improve path stability on
     next-hops group changes (incl. offload for mlxsw)

   - ipv6: segment routing: add support for IPv4 decapsulation

   - icmp: add support for RFC 8335 extended PROBE messages

   - inet: use bigger hash table for IP ID generation

   - tcp: deal better with delayed TX completions - make sure we don't
     give up on fast TCP retransmissions only because driver is slow in
     reporting that it completed transmitting the original

   - tcp: reorder tcp_congestion_ops for better cache locality

   - mptcp:
        - add sockopt support for common TCP options
        - add support for common TCP msg flags
        - include multiple address ids in RM_ADDR
        - add reset option support for resetting one subflow

   - udp: GRO L4 improvements - improve 'forward' / 'frag_list'
     co-existence with UDP tunnel GRO, allowing the first to take place
     correctly even for encapsulated UDP traffic

   - micro-optimize dev_gro_receive() and flow dissection, avoid
     retpoline overhead on VLAN and TEB GRO

   - use less memory for sysctls, add a new sysctl type, to allow using
     u8 instead of "int" and "long" and shrink networking sysctls

   - veth: allow GRO without XDP - this allows aggregating UDP packets
     before handing them off to routing, bridge, OvS, etc.

   - allow specifing ifindex when device is moved to another namespace

   - netfilter:
        - nft_socket: add support for cgroupsv2
        - nftables: add catch-all set element - special element used to
          define a default action in case normal lookup missed
        - use net_generic infra in many modules to avoid allocating
          per-ns memory unnecessarily

   - xps: improve the xps handling to avoid potential out-of-bound
     accesses and use-after-free when XPS change race with other
     re-configuration under traffic

   - add a config knob to turn off per-cpu netdev refcnt to catch
     underflows in testing

  Device APIs:

   - add WWAN subsystem to organize the WWAN interfaces better and
     hopefully start driving towards more unified and vendor-
     independent APIs

   - ethtool:
        - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
          support)
        - allow network drivers to dump arbitrary SFP EEPROM data,
          current offset+length API was a poor fit for modern SFP which
          define EEPROM in terms of pages (incl. mlx5 support)

   - act_police, flow_offload: add support for packet-per-second
     policing (incl. offload for nfp)

   - psample: add additional metadata attributes like transit delay for
     packets sampled from switch HW (and corresponding egress and
     policy-based sampling in the mlxsw driver)

   - dsa: improve support for sandwiched LAGs with bridge and DSA

   - netfilter:
        - flowtable: use direct xmit in topologies with IP forwarding,
          bridging, vlans etc.
        - nftables: counter hardware offload support

   - Bluetooth:
        - improvements for firmware download w/ Intel devices
        - add support for reading AOSP vendor capabilities
        - add support for virtio transport driver

   - mac80211:
        - allow concurrent monitor iface and ethernet rx decap
        - set priority and queue mapping for injected frames

   - phy: add support for Clause-45 PHY Loopback

   - pci/iov: add sysfs MSI-X vector assignment interface to distribute
     MSI-X resources to VFs (incl. mlx5 support)

  New hardware/drivers:

   - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
     Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
     interfaces.

   - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
     BCM63xx switches

   - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches

   - ath11k: support for QCN9074 a 802.11ax device

   - Bluetooth: Broadcom BCM4330 and BMC4334

   - phy: Marvell 88X2222 transceiver support

   - mdio: add BCM6368 MDIO mux bus controller

   - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips

   - mana: driver for Microsoft Azure Network Adapter (MANA)

   - Actions Semi Owl Ethernet MAC

   - can: driver for ETAS ES58X CAN/USB interfaces

  Pure driver changes:

   - add XDP support to: enetc, igc, stmmac

   - add AF_XDP support to: stmmac

   - virtio:
        - page_to_skb() use build_skb when there's sufficient tailroom
          (21% improvement for 1000B UDP frames)
        - support XDP even without dedicated Tx queues - share the Tx
          queues with the stack when necessary

   - mlx5:
        - flow rules: add support for mirroring with conntrack, matching
          on ICMP, GTP, flex filters and more
        - support packet sampling with flow offloads
        - persist uplink representor netdev across eswitch mode changes
        - allow coexistence of CQE compression and HW time-stamping
        - add ethtool extended link error state reporting

   - ice, iavf: support flow filters, UDP Segmentation Offload

   - dpaa2-switch:
        - move the driver out of staging
        - add spanning tree (STP) support
        - add rx copybreak support
        - add tc flower hardware offload on ingress traffic

   - ionic:
        - implement Rx page reuse
        - support HW PTP time-stamping

   - octeon: support TC hardware offloads - flower matching on ingress
     and egress ratelimitting.

   - stmmac:
        - add RX frame steering based on VLAN priority in tc flower
        - support frame preemption (FPE)
        - intel: add cross time-stamping freq difference adjustment

   - ocelot:
        - support forwarding of MRP frames in HW
        - support multiple bridges
        - support PTP Sync one-step timestamping

   - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
     learning, flooding etc.

   - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
     SC7280 SoCs)

   - mt7601u: enable TDLS support

   - mt76:
        - add support for 802.3 rx frames (mt7915/mt7615)
        - mt7915 flash pre-calibration support
        - mt7921/mt7663 runtime power management fixes"

* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
  net: selftest: fix build issue if INET is disabled
  net: netrom: nr_in: Remove redundant assignment to ns
  net: tun: Remove redundant assignment to ret
  net: phy: marvell: add downshift support for M88E1240
  net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
  net/sched: act_ct: Remove redundant ct get and check
  icmp: standardize naming of RFC 8335 PROBE constants
  bpf, selftests: Update array map tests for per-cpu batched ops
  bpf: Add batched ops support for percpu array
  bpf: Implement formatted output helpers with bstr_printf
  seq_file: Add a seq_bprintf function
  sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
  net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
  net: fix a concurrency bug in l2tp_tunnel_register()
  net/smc: Remove redundant assignment to rc
  mpls: Remove redundant assignment to err
  llc2: Remove redundant assignment to rc
  net/tls: Remove redundant initialization of record
  rds: Remove redundant assignment to nr_sig
  dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
  ...
2021-04-29 11:57:23 -07:00
Oleksij Rempel 4a52dd8fef net: selftest: fix build issue if INET is disabled
In case ethernet driver is enabled and INET is disabled, selftest will
fail to build.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: 3e1e58d64c ("net: add generic selftest support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210428130947.29649-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-28 14:06:45 -07:00
Ignat Korchagin 99ba0ea616 sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
efx->xdp_tx_queue_count is initially initialized to num_possible_cpus() and is
later used to allocate and traverse efx->xdp_tx_queues lookup array. However,
we may end up not initializing all the array slots with real queues during
probing. This results, for example, in a NULL pointer dereference, when running
"# ethtool -S <iface>", similar to below

[2570283.664955][T4126959] BUG: kernel NULL pointer dereference, address: 00000000000000f8
[2570283.681283][T4126959] #PF: supervisor read access in kernel mode
[2570283.695678][T4126959] #PF: error_code(0x0000) - not-present page
[2570283.710013][T4126959] PGD 0 P4D 0
[2570283.721649][T4126959] Oops: 0000 [#1] SMP PTI
[2570283.734108][T4126959] CPU: 23 PID: 4126959 Comm: ethtool Tainted: G           O      5.10.20-cloudflare-2021.3.1 #1
[2570283.752641][T4126959] Hardware name: <redacted>
[2570283.781408][T4126959] RIP: 0010:efx_ethtool_get_stats+0x2ca/0x330 [sfc]
[2570283.796073][T4126959] Code: 00 85 c0 74 39 48 8b 95 a8 0f 00 00 48 85 d2 74 2d 31 c0 eb 07 48 8b 95 a8 0f 00 00 48 63 c8 49 83 c4 08 83 c0 01 48 8b 14 ca <48> 8b 92 f8 00 00 00 49 89 54 24 f8 39 85 a0 0f 00 00 77 d7 48 8b
[2570283.831259][T4126959] RSP: 0018:ffffb79a77657ce8 EFLAGS: 00010202
[2570283.845121][T4126959] RAX: 0000000000000019 RBX: ffffb799cd0c9280 RCX: 0000000000000018
[2570283.860872][T4126959] RDX: 0000000000000000 RSI: ffff96dd970ce000 RDI: 0000000000000005
[2570283.876525][T4126959] RBP: ffff96dd86f0a000 R08: ffff96dd970ce480 R09: 000000000000005f
[2570283.892014][T4126959] R10: ffffb799cd0c9fff R11: ffffb799cd0c9000 R12: ffffb799cd0c94f8
[2570283.907406][T4126959] R13: ffffffffc11b1090 R14: ffff96dd970ce000 R15: ffffffffc11cd66c
[2570283.922705][T4126959] FS:  00007fa7723f8740(0000) GS:ffff96f51fac0000(0000) knlGS:0000000000000000
[2570283.938848][T4126959] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2570283.952524][T4126959] CR2: 00000000000000f8 CR3: 0000001a73e6e006 CR4: 00000000007706e0
[2570283.967529][T4126959] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[2570283.982400][T4126959] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[2570283.997308][T4126959] PKRU: 55555554
[2570284.007649][T4126959] Call Trace:
[2570284.017598][T4126959]  dev_ethtool+0x1832/0x2830

Fix this by adjusting efx->xdp_tx_queue_count after probing to reflect the true
value of initialized slots in efx->xdp_tx_queues.

Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Fixes: e26ca4b535 ("sfc: reduce the number of requested xdp ev queues")
Cc: <stable@vger.kernel.org> # 5.12.x
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 15:38:29 -07:00
Yangbo Lu 39e5308b32 net: mscc: ocelot: support PTP Sync one-step timestamping
Although HWTSTAMP_TX_ONESTEP_SYNC existed in ioctl for hardware timestamp
configuration, the PTP Sync one-step timestamping had never been supported.

This patch is to truely support it.

- ocelot_port_txtstamp_request()
  This function handles tx timestamp request by storing
  ptp_cmd(tx timestamp type) in OCELOT_SKB_CB(skb)->ptp_cmd,
  and additionally for two-step timestamp storing ts_id in
  OCELOT_SKB_CB(clone)->ptp_cmd.

- ocelot_ptp_rew_op()
  During xmit, this function is called to get rew_op (rewriter option) by
  checking skb->cb for tx timestamp request, and configure to transmitting.

Non-onestep-Sync packet with one-step timestamp request falls back to use
two-step timestamp.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu 682eaad93e net: mscc: ocelot: convert to ocelot_port_txtstamp_request()
Convert to a common ocelot_port_txtstamp_request() for TX timestamp
request handling.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Yangbo Lu c4b364ce12 net: dsa: free skb->cb usage in core driver
Free skb->cb usage in core driver and let device drivers decide to
use or not. The reason having a DSA_SKB_CB(skb)->clone was because
dsa_skb_tx_timestamp() which may set the clone pointer was called
before p->xmit() which would use the clone if any, and the device
driver has no way to initialize the clone pointer.

This patch just put memset(skb->cb, 0, sizeof(skb->cb)) at beginning
of dsa_slave_xmit(). Some new features in the future, like one-step
timestamp may need more bytes of skb->cb to use in
dsa_skb_tx_timestamp(), and p->xmit().

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:10:15 -07:00
Lv Yunlong 6d72e7c767 net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
In emac_mac_tx_buf_send, it calls emac_tx_fill_tpd(..,skb,..).
If some error happens in emac_tx_fill_tpd(), the skb will be freed via
dev_kfree_skb(skb) in error branch of emac_tx_fill_tpd().
But the freed skb is still used via skb->len by netdev_sent_queue(,skb->len).

As i observed that emac_tx_fill_tpd() haven't modified the value of skb->len,
thus my patch assigns skb->len to 'len' before the possible free and
use 'len' instead of skb->len later.

Fixes: b9b17debc6 ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-26 13:07:30 -07:00
Jiapeng Chong 9176e38027 net: davicom: Remove redundant assignment to ret
Variable ret is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/davicom/dm9000.c:1527:5: warning: Value stored to
'ret' is never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-26 12:48:30 -07:00
Jiapeng Chong 930d2d619d pcnet32: Remove redundant variable prev_link and curr_link
Variable prev_link and curr_link is being assigned a value from a
calculation however the variable is never read, so this redundant
variable can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/amd/pcnet32.c:2857:4: warning: Value stored to
'prev_link' is never read [clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/amd/pcnet32.c:2856:4: warning: Value stored to
'curr_link' is never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-26 12:46:10 -07:00
Linus Torvalds 37f00ab4a0 ARM: SoC drivers for v5.13
Updates for SoC specific drivers include a few subsystems that
 have their own maintainers but send them through the soc tree:
 
 TEE/OP-TEE:
  -  Add tracepoints around calls to secure world
 
 Memory controller drivers:
  - Minor fixes for Renesas, Exynos, Mediatek and Tegra platforms
  - Add debug statistics to Tegra20 memory controller
  - Update Tegra bindings and convert to dtschema
 
 ARM SCMI Firmware:
  - Support for modular SCMI protocols and vendor specific extensions
  - New SCMI IIO driver
  - Per-cpu DVFS
 
 The other driver changes are all from the platform maintainers
 directly and reflect the drivers that don't fit into any other
 subsystem as well as treewide changes for a particular platform.
 
 SoCFPGA:
  - Various cleanups contributed by Krzysztof Kozlowski
 
 Mediatek:
  - add MT8183 support to mutex driver
  - MMSYS: use per SoC array to describe the possible routing
  - add MMSYS support for MT8183 and MT8167
  - add support for PMIC wrapper with integrated arbiter
  - add support for MT8192/MT6873
 
 Tegra:
  - Bug fixes to PMC and clock drivers
 
 NXP/i.MX:
  - Update SCU power domain driver to keep console domain power on.
  - Add missing ADC1 power domain to SCU power domain driver.
  - Update comments for single global power domain in SCU power domain
    driver.
  - Add i.MX51/i.MX53 unique id support to i.MX SoC driver.
 
 NXP/FSL SoC driver updates for v5.13
  - Add ACPI support for RCPM driver
  - Use generic io{read,write} for QE drivers after performance optimized
    for PowerPC
  - Fix QBMAN probe to cleanup HW states correctly for kexec
  - Various cleanup and style fix for QBMAN/QE/GUTS drivers
 
 OMAP:
  - Preparation to use devicetree for genpd
  - ti-sysc needs iorange check improved when the interconnect target module
    has no control registers listed
  - ti-sysc needs to probe l4_wkup and l4_cfg interconnects first to avoid
    issues with missing resources and unnecessary deferred probe
  - ti-sysc debug option can now detect more devices
  - ti-sysc now warns if an old incomplete devicetree data is found as we
    now rely on it being complete for am3 and 4
  - soc init code needs to check for prcm and prm nodes for omap4/5 and dra7
  - omap-prm driver needs to enable autoidle retention support for omap4
  - omap5 clocks are missing gpmc and ocmc clock registers
  - pci-dra7xx now needs to use builtin_platform_driver instead of using
    builtin_platform_driver_probe for deferred probe to work
 
 Raspberry Pi:
  - Fix-up all RPi firmware drivers so as for unbind to happen in an
    orderly fashion
  - Support for RPi's PoE hat PWM bus
 
 Qualcomm
  - Improved detection for SCM calling conventions
  - Support for OEM specific wifi firmware path
  - Added drivers for SC7280/SM8350: RPMH, LLCC< AOSS QMP
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmCC2JwACgkQmmx57+YA
 GNkgRg//cBtq2NyDbjiNABxFSkmGCfcc0w0C2wjVzr4cfg6BLTbuvvlpZxI912pu
 P1G2sbsdfQJ8sSeIyZos+PilWK0zHrqlaGZfKI19US45dMjpteDBgsPd7wNZwBjQ
 jbops3YLjztZK1HpY4dIdvMnfxt7yRqhBWaTbPuCwQ35c5KsOM8NHB3cP3BUINWK
 x1uuBCv9svppzwdDiPxneV93WKEzabOUo+WBMPyh5vnyvmW17Iif4BA/VKQxzymm
 mWUi8HHpKBpvntJOKwAD2hnLAdpR3SwX20SLOpyLhnJMotbzNUEqq3LdRxDNPdHk
 ry+rarJ78JGlYfpcfegf2bLf5ITNMfOyRGkjtzeYpcZIXPjufOg9DA9YtAy37k0u
 L0T/9gQ+tQ01WGMca77OyUtIqJKdblZrQMfuH/yGlR99bqFQMV7rNc7GNlX1MXp/
 zw4aOYrRWGtGEeAjx5JJWcYydvMSJpCrqxTz3YhgeJECHB2iA6YkV3NROR4TLW//
 tfxaKqxR/KmSqE6hoVOAuuQ0BLXNlql/+4EE6MKsAOBiKPJclvmJg4CyuY8G21ev
 9Su0zJnXMzai7gNu32v1pizGj26+AOhxCEgAG0mGgk2jlQSn24CKgm5e7kCUewcF
 j/1XksNPT95v/K8MsLpXe5xGvF3jhA1BlFfvjJNZOrcZywBXRxg=
 =iidq
 -----END PGP SIGNATURE-----

Merge tag 'arm-drivers-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "Updates for SoC specific drivers include a few subsystems that have
  their own maintainers but send them through the soc tree:

  TEE/OP-TEE:
   - Add tracepoints around calls to secure world

  Memory controller drivers:
   - Minor fixes for Renesas, Exynos, Mediatek and Tegra platforms
   - Add debug statistics to Tegra20 memory controller
   - Update Tegra bindings and convert to dtschema

  ARM SCMI Firmware:
   - Support for modular SCMI protocols and vendor specific extensions
   - New SCMI IIO driver
   - Per-cpu DVFS

  The other driver changes are all from the platform maintainers
  directly and reflect the drivers that don't fit into any other
  subsystem as well as treewide changes for a particular platform.

  SoCFPGA:
   - Various cleanups contributed by Krzysztof Kozlowski

  Mediatek:
   - add MT8183 support to mutex driver
   - MMSYS: use per SoC array to describe the possible routing
   - add MMSYS support for MT8183 and MT8167
   - add support for PMIC wrapper with integrated arbiter
   - add support for MT8192/MT6873

  Tegra:
   - Bug fixes to PMC and clock drivers

  NXP/i.MX:
   - Update SCU power domain driver to keep console domain power on.
   - Add missing ADC1 power domain to SCU power domain driver.
   - Update comments for single global power domain in SCU power domain
     driver.
   - Add i.MX51/i.MX53 unique id support to i.MX SoC driver.

  NXP/FSL SoC driver updates for v5.13
   - Add ACPI support for RCPM driver
   - Use generic io{read,write} for QE drivers after performance
     optimized for PowerPC
   - Fix QBMAN probe to cleanup HW states correctly for kexec
   - Various cleanup and style fix for QBMAN/QE/GUTS drivers

  OMAP:
   - Preparation to use devicetree for genpd
   - ti-sysc needs iorange check improved when the interconnect target
     module has no control registers listed
   - ti-sysc needs to probe l4_wkup and l4_cfg interconnects first to
     avoid issues with missing resources and unnecessary deferred probe
   - ti-sysc debug option can now detect more devices
   - ti-sysc now warns if an old incomplete devicetree data is found as
     we now rely on it being complete for am3 and 4
   - soc init code needs to check for prcm and prm nodes for omap4/5 and
     dra7
   - omap-prm driver needs to enable autoidle retention support for
     omap4
   - omap5 clocks are missing gpmc and ocmc clock registers
   - pci-dra7xx now needs to use builtin_platform_driver instead of
     using builtin_platform_driver_probe for deferred probe to work

  Raspberry Pi:
   - Fix-up all RPi firmware drivers so as for unbind to happen in an
     orderly fashion
   - Support for RPi's PoE hat PWM bus

  Qualcomm
   - Improved detection for SCM calling conventions
   - Support for OEM specific wifi firmware path
   - Added drivers for SC7280/SM8350: RPMH, LLCC< AOSS QMP"

* tag 'arm-drivers-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits)
  soc: aspeed: fix a ternary sign expansion bug
  memory: mtk-smi: Add device-link between smi-larb and smi-common
  memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
  memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
  clk: socfpga: fix iomem pointer cast on 64-bit
  soc: aspeed: Adapt to new LPC device tree layout
  pinctrl: aspeed-g5: Adapt to new LPC device tree layout
  ipmi: kcs: aspeed: Adapt to new LPC DTS layout
  ARM: dts: Remove LPC BMC and Host partitions
  dt-bindings: aspeed-lpc: Remove LPC partitioning
  soc: fsl: enable acpi support in RCPM driver
  soc: qcom: mdt_loader: Detect truncated read of segments
  soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
  soc: qcom: pdr: Fix error return code in pdr_register_listener
  firmware: qcom_scm: Fix kernel-doc function names to match
  firmware: qcom_scm: Suppress sysfs bind attributes
  firmware: qcom_scm: Workaround lack of "is available" call on SC7180
  firmware: qcom_scm: Reduce locking section for __get_convention()
  firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
  Revert "soc: fsl: qe: introduce qe_io{read,write}* wrappers"
  ...
2021-04-26 12:11:52 -07:00
David S. Miller 6876a18d33 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-26 12:00:00 -07:00
Linus Torvalds 91552ab8ff The usual updates from the irq departement:
Core changes:
 
  - Provide IRQF_NO_AUTOEN as a flag for request*_irq() so drivers can be
    cleaned up which either use a seperate mechanism to prevent auto-enable
    at request time or have a racy mechanism which disables the interrupt
    right after request.
 
  - Get rid of the last usage of irq_create_identity_mapping() and remove
    the interface.
 
  - An overhaul of tasklet_disable(). Most usage sites of tasklet_disable()
    are in task context and usually in cleanup, teardown code pathes.
    tasklet_disable() spinwaits for a tasklet which is currently executed.
    That's not only a problem for PREEMPT_RT where this can lead to a live
    lock when the disabling task preempts the softirq thread. It's also
    problematic in context of virtualization when the vCPU which runs the
    tasklet is scheduled out and the disabling code has to spin wait until
    it's scheduled back in. Though there are a few code pathes which invoke
    tasklet_disable() from non-sleepable context. For these a new disable
    variant which still spinwaits is provided which allows to switch
    tasklet_disable() to a sleep wait mechanism. For the atomic use cases
    this does not solve the live lock issue on PREEMPT_RT. That is mitigated
    by blocking on the RT specific softirq lock.
 
  - The PREEMPT_RT specific implementation of softirq processing and
    local_bh_disable/enable().
 
    On RT enabled kernels soft interrupt processing happens always in task
    context and all interrupt handlers, which are not explicitly marked to
    be invoked in hard interrupt context are forced into task context as
    well. This allows to protect against softirq processing with a per
    CPU lock, which in turn allows to make BH disabled regions preemptible.
 
    Most of the softirq handling code is still shared. The RT/non-RT
    specific differences are addressed with a set of inline functions which
    provide the context specific functionality. The local_bh_disable() /
    local_bh_enable() mechanism are obviously seperate.
 
  - The usual set of small improvements and cleanups
 
 Driver changes:
 
  - New drivers for Nuvoton WPCM450 and DT 79rc3243x interrupt controllers
 
  - Extended functionality for MStar, STM32 and SC7280 irq chips
 
  - Enhanced robustness for ARM GICv3/4.1 drivers
 
  - The usual set of cleanups and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmCGh5wTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZ+/EACWBpQ/2ZHizEw1bzjaDzJrR8U228xu
 wNi7nSP92Y07nJ3cCX7a6TJ53mqd0n3RT+DprlsOuqSN0D7Ktr/x44V/aZtm0d3N
 GkFOlpeGCRnHusLaUTwk7a8289LuoQ7OhSxIB409n1I4nLI96ZK41D1tYonMYl6E
 nxDiGADASfjaciBWbjwJO/mlwmiW/VRpSTxswx0wzakFfbIx9iKyKv1bCJQZ5JK+
 lHmf0jxpDIs1EVK/ElJ9Ky6TMBlEmZyiX7n6rujtwJ1W+Jc/uL/y8pLJvGwooVmI
 yHTYsLMqzviCbAMhJiB3h1qs3GbCGlM78prgJTnOd0+xEUOCcopCRQlsTXVBq8Nb
 OS+HNkYmYXRfiSH6lINJsIok8Xis28bAw/qWz2Ho+8wLq0TI8crK38roD1fPndee
 FNJRhsPPOBkscpIldJ0Cr0X5lclkJFiAhAxORPHoseKvQSm7gBMB7H99xeGRffTn
 yB3XqeTJMvPNmAHNN4Brv6ey3OjwnEWBgwcnIM2LtbIlRtlmxTYuR+82OPOgEvzk
 fSrjFFJqu0LEMLEOXS4pYN824PawjV//UAy4IaG8AodmUUCSGHgw1gTVa4sIf72t
 tXY54HqWfRWRpujhVRgsZETqBUtZkL6yvpoe8f6H7P91W5tAfv3oj4ch9RkhUo+Z
 b0/u9T0+Fpbg+w==
 =id4G
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "The usual updates from the irq departement:

  Core changes:

   - Provide IRQF_NO_AUTOEN as a flag for request*_irq() so drivers can
     be cleaned up which either use a seperate mechanism to prevent
     auto-enable at request time or have a racy mechanism which disables
     the interrupt right after request.

   - Get rid of the last usage of irq_create_identity_mapping() and
     remove the interface.

   - An overhaul of tasklet_disable().

     Most usage sites of tasklet_disable() are in task context and
     usually in cleanup, teardown code pathes. tasklet_disable()
     spinwaits for a tasklet which is currently executed. That's not
     only a problem for PREEMPT_RT where this can lead to a live lock
     when the disabling task preempts the softirq thread. It's also
     problematic in context of virtualization when the vCPU which runs
     the tasklet is scheduled out and the disabling code has to spin
     wait until it's scheduled back in.

     There are a few code pathes which invoke tasklet_disable() from
     non-sleepable context. For these a new disable variant which still
     spinwaits is provided which allows to switch tasklet_disable() to a
     sleep wait mechanism. For the atomic use cases this does not solve
     the live lock issue on PREEMPT_RT. That is mitigated by blocking on
     the RT specific softirq lock.

   - The PREEMPT_RT specific implementation of softirq processing and
     local_bh_disable/enable().

     On RT enabled kernels soft interrupt processing happens always in
     task context and all interrupt handlers, which are not explicitly
     marked to be invoked in hard interrupt context are forced into task
     context as well. This allows to protect against softirq processing
     with a per CPU lock, which in turn allows to make BH disabled
     regions preemptible.

     Most of the softirq handling code is still shared. The RT/non-RT
     specific differences are addressed with a set of inline functions
     which provide the context specific functionality. The
     local_bh_disable() / local_bh_enable() mechanism are obviously
     seperate.

   - The usual set of small improvements and cleanups

  Driver changes:

   - New drivers for Nuvoton WPCM450 and DT 79rc3243x interrupt
     controllers

   - Extended functionality for MStar, STM32 and SC7280 irq chips

   - Enhanced robustness for ARM GICv3/4.1 drivers

   - The usual set of cleanups and improvements all over the place"

* tag 'irq-core-2021-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  irqchip/xilinx: Expose Kconfig option for Zynq/ZynqMP
  irqchip/gic-v3: Do not enable irqs when handling spurious interrups
  dt-bindings: interrupt-controller: Add IDT 79RC3243x Interrupt Controller
  irqchip: Add support for IDT 79rc3243x interrupt controller
  irqdomain: Drop references to recusive irqdomain setup
  irqdomain: Get rid of irq_create_strict_mappings()
  irqchip/jcore-aic: Kill use of irq_create_strict_mappings()
  ARM: PXA: Kill use of irq_create_strict_mappings()
  irqchip/gic-v4.1: Disable vSGI upon (GIC CPUIF < v4.1) detection
  irqchip/tb10x: Use 'fallthrough' to eliminate a warning
  genirq: Reduce irqdebug cacheline bouncing
  kernel: Initialize cpumask before parsing
  irqchip/wpcm450: Drop COMPILE_TEST
  irqchip/irq-mst: Support polarity configuration
  irqchip: Add driver for WPCM450 interrupt controller
  dt-bindings: interrupt-controller: Add nuvoton, wpcm450-aic
  dt-bindings: qcom,pdc: Add compatible for sc7280
  irqchip/stm32: Add usart instances exti direct event support
  irqchip/gic-v3: Fix OF_BAD_ADDR error handling
  irqchip/sifive-plic: Mark two global variables __ro_after_init
  ...
2021-04-26 09:43:16 -07:00
Michael Chan 1698d600b3 bnxt_en: Implement .ndo_features_check().
For UDP encapsultions, we only support the offloaded Vxlan port and
Geneve port.  All other ports included FOU and GUE are not supported so
we need to turn off TSO and checksum features.

v2: Reverse the check for supported UDP ports to be more straight forward.

Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:39 -07:00
Michael Chan dade5e15fa bnxt_en: Support IFF_SUPP_NOFCS feature to transmit without ethernet FCS.
If firmware is capable, set the IFF_SUPP_NOFCS flag to support the
sockets option to transmit packets without FCS.  This is mainly used
for testing.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:39 -07:00
Michael Chan 7fbf359bb2 bnxt_en: Add PCI IDs for Hyper-V VF devices.
Support VF device IDs used by the Hyper-V hypervisor.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Michael Chan 92923cc710 bnxt_en: Call bnxt_approve_mac() after the PF gives up control of the VF MAC.
When the PF is no longer enforcing an assigned MAC address on a VF, the
VF needs to call bnxt_approve_mac() to tell the PF what MAC address it is
now using.  Otherwise it gets out of sync and the PF won't know what
MAC address the VF wants to use.  Ultimately the VF will fail when it
tries to setup the L2 MAC filter for the vnic.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Michael Chan 7b3c8e27d6 bnxt_en: Move bnxt_approve_mac().
Move it before bnxt_update_vf_mac().  In the next patch, we need to call
bnxt_approve_mac() from bnxt_update_mac() under some conditions.  This
will avoid forward declaration.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Edwin Peer 6b70276898 bnxt_en: allow VF config ops when PF is closed
It is perfectly legal for the stack to query and configure VFs via PF
NDOs while the NIC is administratively down.  Remove the unnecessary
check for the PF to be in open state.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Edwin Peer dd85fc0ab5 bnxt_en: allow promiscuous mode for trusted VFs
Firmware previously only allowed promiscuous mode for VFs associated with
a default VLAN. It is now possible to enable promiscuous mode for a VF
having no VLAN configured provided that it is trusted. In such cases the
VF will see all packets received by the PF, irrespective of destination
MAC or VLAN.

Note, it is necessary to query firmware at the time of bnxt_promisc_ok()
instead of in bnxt_hwrm_func_qcfg() because the trusted status might be
altered by the PF after the VF has been configured. This check must now
also be deferred because the firmware call sleeps.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Michael Chan d5ca99054f bnxt_en: Add support for fw managed link down feature.
In the current code, the driver will not shutdown the link during
IFDOWN if there are still VFs sharing the port.  Newer firmware will
manage the link down decision when the port is shared by VFs, so
we can just call firmware to shutdown the port unconditionally and
let firmware make the final decision.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Michael Chan b0d28207ce bnxt_en: Add a new phy_flags field to the main driver structure.
Copy the phy related feature flags from the firmware call
HWRM_PORT_PHY_QCAPS to this new field.  We can also remove the flags
field in the bnxt_test_info structure.  It's cleaner to have all PHY
related flags in one location, directly copied from the firmware.

To keep the BNXT_PHY_CFG_ABLE() macro logic the same, we need to make
a slight adjustment to check that it is a PF.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
Edwin Peer 1d2deb61f0 bnxt_en: report signal mode in link up messages
Firmware reports link signalling mode for certain speeds. In these
cases, print the signalling modes in kernel log link up messages.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:37:38 -07:00
David S. Miller 1e5e4acb66 mlx5-updates-2021-04-21
devlink external port attribute for SF (Sub-Function) port flavour
 
 This adds the support to instantiate Sub-Functions on external hosts
 E.g when Eswitch manager is enabled on the ARM SmarNic SoC CPU, users
 are now able to spawn new Sub-Functions on the Host server CPU.
 
 Parav Pandit Says:
 ==================
 
 This series introduces and uses external attribute for the SF port to
 indicate that a SF port belongs to an external controller.
 
 This is needed to generate unique phys_port_name when PF and SF numbers
 are overlapping between local and external controllers.
 For example two controllers 0 and 1, both of these controller have a SF.
 having PF number 0, SF number 77. Here, phys_port_name has duplicate
 entry which doesn't have controller number in it.
 
 Hence, add controller number optionally when a SF port is for an
 external controller. This extension is similar to existing PF and VF
 eswitch ports of the external controller.
 
 When a SF is for external controller an example view of external SF
 port and config sequence:
 
 On eswitch system:
 $ devlink dev eswitch set pci/0033:01:00.0 mode switchdev
 
 $ devlink port show
 pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
 pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
   function:
     hw_addr 00:00:00:00:00:00
 
 $ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
 pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
   function:
     hw_addr 00:00:00:00:00:00 state inactive opstate detached
 
 phys_port_name construction:
 $ cat /sys/class/net/eth1/phys_port_name
 c1pf0sf77
 
 Patch summary:
 First 3 patches prepares the eswitch to handle vports in more generic
 way using xarray to lookup vport from its unique vport number.
 Patch-1 returns maximum eswitch ports only when eswitch is enabled
 Patch-2 prepares eswitch to return eswitch max ports from a struct
 Patch-3 uses xarray for vport and representor lookup
 Patch-4 considers SF for an additioanl range of SF vports
 Patch-5 relies on SF hw table to check SF support
 Patch-6 extends SF devlink port attribute for external flag
 Patch-7 stores the per controller SF allocation attributes
 Patch-8 uses SF function id for filtering events
 Patch-9 uses helper for allocation and free
 Patch-10 splits hw table into per controller table and generic one
 Patch-11 extends sf table for additional range
 
 ==================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmCDz80ACgkQSD+KveBX
 +j4LmggAwS9otYoo639Kmow/wlMZ6yyLsH02zVMFLEJ2AE4VbL73i4iiQ67ZWygL
 yQ8HawPAnythx4RsN/M6/WjSKRpdqTC27C9CpdM78zhXb1vnOrlzba7rYngqmo7N
 5fIkGyjsUGHNqq+15SftK7JYbXFTe1b5RdWawXkQoyBlXTTBamyxD7C5NMpoDots
 /e88Bs8Zy5nVPZqPchIId8TZEKKuO/heTz8ks6q6s/t1MGj7QP+ddxVMgNg00NR5
 OpNTr7YYdpHxpfLSUZgdHaptwwKOx+nou8LdJkIKWPs7SHX6HDggyZJjGBOEWtE7
 qG7oSip4olOTM0w9PZrAewLwSYhq7Q==
 =oqBr
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-04-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-21

devlink external port attribute for SF (Sub-Function) port flavour

This adds the support to instantiate Sub-Functions on external hosts
E.g when Eswitch manager is enabled on the ARM SmarNic SoC CPU, users
are now able to spawn new Sub-Functions on the Host server CPU.

Parav Pandit Says:
==================

This series introduces and uses external attribute for the SF port to
indicate that a SF port belongs to an external controller.

This is needed to generate unique phys_port_name when PF and SF numbers
are overlapping between local and external controllers.
For example two controllers 0 and 1, both of these controller have a SF.
having PF number 0, SF number 77. Here, phys_port_name has duplicate
entry which doesn't have controller number in it.

Hence, add controller number optionally when a SF port is for an
external controller. This extension is similar to existing PF and VF
eswitch ports of the external controller.

When a SF is for external controller an example view of external SF
port and config sequence:

On eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

phys_port_name construction:
$ cat /sys/class/net/eth1/phys_port_name
c1pf0sf77

Patch summary:
First 3 patches prepares the eswitch to handle vports in more generic
way using xarray to lookup vport from its unique vport number.
Patch-1 returns maximum eswitch ports only when eswitch is enabled
Patch-2 prepares eswitch to return eswitch max ports from a struct
Patch-3 uses xarray for vport and representor lookup
Patch-4 considers SF for an additioanl range of SF vports
Patch-5 relies on SF hw table to check SF support
Patch-6 extends SF devlink port attribute for external flag
Patch-7 stores the per controller SF allocation attributes
Patch-8 uses SF function id for filtering events
Patch-9 uses helper for allocation and free
Patch-10 splits hw table into per controller table and generic one
Patch-11 extends sf table for additional range

==================

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:31:35 -07:00
Linus Walleij 95aafe911d net: ethernet: ixp4xx: Support device tree probing
This adds device tree probing to the IXP4xx ethernet
driver.

Add a platform data bool to tell us whether to
register an MDIO bus for the device or not, as well
as the corresponding NPE.

We need to drop the memory region request as part of
this since the OF core will request the memory for the
device.

Cc: Zoltan HERPAI <wigyori@uid0.hu>
Cc: Raylynn Knight <rayknight@me.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:27:13 -07:00
Linus Walleij 3e8047a985 net: ethernet: ixp4xx: Retire ancient phy retrieveal
This driver was using a really dated way of obtaining the
phy by printing a string and using it with phy_connect().
Switch to using more reasonable modern interfaces.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:27:13 -07:00
Michael Chan bbd6f0a948 bnxt_en: Fix RX consumer index logic in the error path.
In bnxt_rx_pkt(), the RX buffers are expected to complete in order.
If the RX consumer index indicates an out of order buffer completion,
it means we are hitting a hardware bug and the driver will abort all
remaining RX packets and reset the RX ring.  The RX consumer index
that we pass to bnxt_discard_rx() is not correct.  We should be
passing the current index (tmp_raw_cons) instead of the old index
(raw_cons).  This bug can cause us to be at the wrong index when
trying to abort the next RX packet.  It can crash like this:

 #0 [ffff9bbcdf5c39a8] machine_kexec at ffffffff9b05e007
 #1 [ffff9bbcdf5c3a00] __crash_kexec at ffffffff9b111232
 #2 [ffff9bbcdf5c3ad0] panic at ffffffff9b07d61e
 #3 [ffff9bbcdf5c3b50] oops_end at ffffffff9b030978
 #4 [ffff9bbcdf5c3b78] no_context at ffffffff9b06aaf0
 #5 [ffff9bbcdf5c3bd8] __bad_area_nosemaphore at ffffffff9b06ae2e
 #6 [ffff9bbcdf5c3c28] bad_area_nosemaphore at ffffffff9b06af24
 #7 [ffff9bbcdf5c3c38] __do_page_fault at ffffffff9b06b67e
 #8 [ffff9bbcdf5c3cb0] do_page_fault at ffffffff9b06bb12
 #9 [ffff9bbcdf5c3ce0] page_fault at ffffffff9bc015c5
    [exception RIP: bnxt_rx_pkt+237]
    RIP: ffffffffc0259cdd  RSP: ffff9bbcdf5c3d98  RFLAGS: 00010213
    RAX: 000000005dd8097f  RBX: ffff9ba4cb11b7e0  RCX: ffffa923cf6e9000
    RDX: 0000000000000fff  RSI: 0000000000000627  RDI: 0000000000001000
    RBP: ffff9bbcdf5c3e60   R8: 0000000000420003   R9: 000000000000020d
    R10: ffffa923cf6ec138  R11: ffff9bbcdf5c3e83  R12: ffff9ba4d6f928c0
    R13: ffff9ba4cac28080  R14: ffff9ba4cb11b7f0  R15: ffff9ba4d5a30000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Fixes: a1b0e4e684 ("bnxt_en: Improve RX consumer index validity check.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:18:03 -07:00
Jiapeng Chong bf7d20cd51 ch_ktls: Remove redundant variable result
Variable result is being assigned a value from a calculation
however the variable is never read, so this redundant variable
can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1488:2:
warning: Value stored to 'pos' is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:876:3:
warning: Value stored to 'pos' is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:36:3:
warning: Value stored to 'start' is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-25 18:07:11 -07:00
Parav Pandit f1b9acd3a5 net/mlx5: SF, Extend SF table for additional SF id range
Extended the SF table to cover additioanl SF id range of external
controller.

A user optionallly provides the external controller number when user
wants to create SF on the external controller.

An example on eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:59:07 -07:00
Parav Pandit a3088f87d9 net/mlx5: SF, Split mlx5_sf_hw_table into two parts
Device has SF ids in two different contiguous ranges. One for the local
controller and second for the external controller's PF.

Each such range has its own maximum number of functions and base id.
To allocate SF from either of the range, prepare code to split into
range specific fields into its own structure.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:59:04 -07:00
Parav Pandit 01ed9550e8 net/mlx5: SF, Use helpers for allocation and free
Use helper routines for SF id and SF table allocation and free
so that subsequent patch can reuse it for multiple SF function
id range.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:59:01 -07:00
Parav Pandit 326c08a020 net/mlx5: SF, Consider own vhca events of SF devices
Vhca events on eswitch manager are received for all the functions on the
NIC, including for SFs of external host PF controllers.

While SF device handler is only interested in SF devices events related
to its own PF.
Hence, validate if the function belongs to self or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:59 -07:00
Parav Pandit 7e6ccbc187 net/mlx5: SF, Store and use start function id
SF ids in the device are in two different contiguous ranges. One for
the local controller and second for the external host controller.

Prepare code to handle multiple start function id by storing it in the
table.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:56 -07:00
Parav Pandit a1ab3e4554 devlink: Extend SF port attributes to have external attribute
Extended SF port attributes to have optional external flag similar to
PCI PF and VF port attributes.

External atttibute is required to generate unique phys_port_name when PF number
and SF number are overlapping between two controllers similar to SR-IOV
VFs.

When a SF is for external controller an example view of external SF
port and config sequence.

On eswitch system:
$ devlink dev eswitch set pci/0033:01:00.0 mode switchdev

$ devlink port show
pci/0033:01:00.0/196607: type eth netdev enP51p1s0f0np0 flavour physical port 0 splittable false
pci/0033:01:00.0/131072: type eth netdev eth0 flavour pcipf controller 1 pfnum 0 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port add pci/0033:01:00.0 flavour pcisf pfnum 0 sfnum 77 controller 1
pci/0033:01:00.0/163840: type eth netdev eth1 flavour pcisf controller 1 pfnum 0 sfnum 77 splittable false
  function:
    hw_addr 00:00:00:00:00:00 state inactive opstate detached

phys_port_name construction:
$ cat /sys/class/net/eth1/phys_port_name
c1pf0sf77

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:53 -07:00
Parav Pandit 1d7979352f net/mlx5: SF, Rely on hw table for SF devlink port allocation
Supporting SF allocation is currently checked at two places:
(a) SF devlink port allocator and
(b) SF HW table handler.

Both layers are using HCA CAP to identify it using helper routine
mlx5_sf_supported() and mlx5_sf_max_functions().

Instead, rely on the HW table handler to check if SF is supported
or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:51 -07:00
Parav Pandit 87bd418ea7 net/mlx5: E-Switch, Consider SF ports of host PF
Query SF vports count and base id of host PF from the firmware.

Account these ports in the total port calculation whenever it is non
zero.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:48 -07:00
Parav Pandit 47dd7e609f net/mlx5: E-Switch, Use xarray for vport number to vport and rep mapping
Currently vport number to vport and its representor are mapped using an
array and an index.

Vport numbers of different types of functions are not contiguous. Adding
new such discontiguous range using index and number mapping is increasingly
complex and hard to maintain.

Hence, maintain an xarray of vport and rep whose lookup is done based on
the vport number.
Each VF and SF entry is marked with a xarray mark to identify the function
type. Additionally PF and VF needs special handling for legacy inline
mode. They are additionally marked as host function using additional
HOST_FN mark.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:45 -07:00
Parav Pandit 9f8c7100c8 net/mlx5: E-Switch, Prepare to return total vports from eswitch struct
Total vports are already stored during eswitch initialization. Instead
of calculating everytime, read directly from eswitch.

Additionally, host PF's SF vport information is available using
QUERY_HCA_CAP command. It is not available through HCA_CAP of the
eswitch manager PF.
Hence, this patch prepares the return total eswitch vport count from the
existing eswitch struct.

This further helps to keep eswitch port counting macros and logic within
eswitch.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:43 -07:00
Parav Pandit 06ec5acc77 net/mlx5: E-Switch, Return eswitch max ports when eswitch is supported
mlx5_eswitch_get_total_vports() doesn't honor MLX5_ESWICH Kconfig flag.

When MLX5_ESWITCH is disabled, FS layer continues to initialize eswitch
specific ACL namespaces.
Instead, start honoring MLX5_ESWITCH flag and perform vport specific
initialization only when vport count is non zero.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-24 00:58:40 -07:00
David S. Miller b1ce98c70e Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2021-04-23

This series contains updates to i40e and iavf drivers.

Aleksandr adds support for VIRTCHNL_VF_CAP_ADV_LINK_SPEED in i40e which
allows for reporting link speed to VF as a value instead of using an
enum; helper functions are created to remove repeated code.

Coiby Xu reduces memory use of i40e when using kdump by reducing Tx, Rx,
and admin queue to minimum values. Current use causes failure of kdump.

Stefan Assmann removes duplicated free calls in iavf.

Haiyue cleans up a loop to return directly when if the value is found
and changes some magic numbers to defines for better maintainability
in iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 14:04:43 -07:00
Yangbo Lu 7ce9c3d363 enetc: fix locking for one-step timestamping packet transfer
The previous patch to support PTP Sync packet one-step timestamping
described one-step timestamping packet handling logic as below in
commit message:

- Trasmit packet immediately if no other one in transfer, or queue to
  skb queue if there is already one in transfer.
  The test_and_set_bit_lock() is used here to lock and check state.
- Start a work when complete transfer on hardware, to release the bit
  lock and to send one skb in skb queue if has.

There was not problem of the description, but there was a mistake in
implementation. The locking/test_and_set_bit_lock() should be put in
enetc_start_xmit() which may be called by worker, rather than in
enetc_xmit(). Otherwise, the worker calling enetc_start_xmit() after
bit lock released is not able to lock again for transfer.

Fixes: 7294380c52 ("enetc: support PTP Sync packet one-step timestamping")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:52:30 -07:00
Ilya Lipnitskiy 3bc8e0aff2 net: ethernet: mtk_eth_soc: use iopoll.h macro for DMA init
Replace a tight busy-wait loop without a pause with a standard
readx_poll_timeout_atomic routine with a 5 us poll period.

Tested by booting a MT7621 device to ensure the driver initializes
properly.

Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau fa817272c3 net: ethernet: mtk_eth_soc: set PPE flow hash as skb hash if present
This improves GRO performance

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[Ilya: Use MTK_RXD4_FOE_ENTRY instead of GENMASK(13, 0)]
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Ilya Lipnitskiy db2c7b353d net: ethernet: mtk_eth_soc: rework NAPI callbacks
Use napi_complete_done to communicate total TX and RX work done to NAPI.
Count total RX work up instead of remaining work down for clarity.
Remove unneeded local variables for clarity. Use do {} while instead of
goto for clarity.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 16769a8923 net: ethernet: mtk_eth_soc: reduce unnecessary interrupts
Avoid rearming interrupt if napi_complete returns false

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 816ac3e6e6 net: ethernet: mtk_eth_soc: only read the full RX descriptor if DMA is done
Uncached memory access is expensive, and there is no need to access all
descriptor words if we can't process them anyway

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 4e6bf60956 net: ethernet: mtk_eth_soc: cache HW pointer of last freed TX descriptor
The value is only updated by the CPU, so it is cheaper to access from the
ring data structure than from a hardware register.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau e9229ffd55 net: ethernet: mtk_eth_soc: implement dynamic interrupt moderation
Reduces the number of interrupts under load

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[Ilya: add documentation for new struct fields]
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 6b4423b258 net: ethernet: mtk_eth_soc: increase DMA ring sizes
256 descriptors is not enough for multi-gigabit traffic under load on
MT7622. Bump it to 512 to improve performance.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 59555a8d0d net: ethernet: mtk_eth_soc: use larger burst size for QDMA TX
Improves tx performance

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 16ef670789 net: ethernet: mtk_eth_soc: remove unnecessary TX queue stops
When running short on descriptors, only stop the queue for the netdev that
tx was attempted for. By the time something tries to send on the other
netdev, the ring might have some more room already.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau 3630d519d7 net: ethernet: mtk_eth_soc: reduce MDIO bus access latency
usleep_range often ends up sleeping much longer than the 10-20us provided
as a range here. This causes significant latency in mdio bus acceses,
which easily adds multiple seconds to the boot time on MT7621 when polling
DSA slave ports.
Use cond_resched instead of usleep_range, since the MDIO access does not
take much time

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:58 -07:00
Felix Fietkau c30c4a8273 net: ethernet: mtk_eth_soc: use napi_consume_skb
Should improve performance, since it can use bulk free

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:57 -07:00
Ilya Lipnitskiy 787082ab9f net: ethernet: mtk_eth_soc: fix build_skb cleanup
In case build_skb fails, call skb_free_frag on the correct pointer. Also
update the DMA structures with the new mapping before exiting, because
the mapping was successful

Suggested-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:57 -07:00
Felix Fietkau 5196c41785 net: ethernet: mtk_eth_soc: unmap RX data before calling build_skb
Since build_skb accesses the data area (for initializing shinfo), dma unmap
needs to happen before that call

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[Ilya: split build_skb cleanup fix into a separate commit]
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:57 -07:00
Felix Fietkau 3f57d8c40f net: ethernet: mtk_eth_soc: fix RX VLAN offload
The VLAN ID in the rx descriptor is only valid if the RX_DMA_VTAG bit is
set. Fixes frames wrongly marked with VLAN tags.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[Ilya: fix commit message]
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:31:57 -07:00
Dexuan Cui d90a94680b net: mana: Use int to check the return value of mana_gd_poll_cq()
mana_gd_poll_cq() may return -1 if an overflow error is detected (this
should never happen unless there is a bug in the driver or the hardware).

Fix the type of the variable "comp_read" by using int rather than u32.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-23 13:16:29 -07:00
Haiyue Wang 1f70dfc542 iavf: redefine the magic number for FDIR GTP-U header fields
The flex-byte for GTP-U protocol header fields uses the magic number,
which is hard to maintain and understand, define the interested fields
with meaningful macro name, based on the GTP-U protocol stack:

GTP-U header
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     | 0x1 |1|0|1|0|0|     0xff      |           Length              |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                           TEID = 1654                         |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |    Sequence Number = 0        |N-PDU Number=0 |NextExtHdr=0x85|
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

GTP-U Extension Header (PDU Session Container)
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |  ExtHdrLen=2  |Type=0 | Spare |0|0|   QFI     | PPI |  Spare  |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |                    Padding                    |NextExtHdr=0x0 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:34:19 -07:00
Haiyue Wang f3b9da31f0 iavf: enhance the duplicated FDIR list scan handling
When the FDIR entry is found, just return the result directly to break
the loop.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Haiyue Wang f995f95af6 iavf: change the flex-byte support number to macro definition
The maximum number (2) of flex-byte support is derived from ethtool
use-def data size (8 byte).

Change the magic number 2 to macro definition, and add the comment to
track the design thinking, so the code is clear and easily maintained.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Stefan Assmann 1a0e880b02 iavf: remove duplicate free resources calls
Both iavf_free_all_tx_resources() and iavf_free_all_rx_resources() have
already been called in the very same function.
Remove the duplicate calls.

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Coiby Xu 5c208e9f49 i40e: use minimal admin queue for kdump
The minimum size of admin send/receive queue is 1 and 2 respectively.
The admin send queue can't be set to 1 because in that case, the
firmware would fail to init.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Coiby Xu dcb75338f6 i40e: use minimal Rx and Tx ring buffers for kdump
Use the minimum of the number of descriptors thus we will allocate the
minimal ring buffers for kdump.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Coiby Xu 065aa694a7 i40e: use minimal Tx and Rx pairs for kdump
Set the number of the MSI-X vectors to 1. When MSI-X is enabled,
it's not allowed to use more TC queue pairs than MSI-X vectors
(pf->num_lan_msix) exist. Thus the number of Tx and Rx pairs
(vsi->num_queue_pairs) will be equal to the number of MSI-X vectors,
i.e., 1.

Signed-off-by: Coiby Xu <coxu@redhat.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Aleksandr Loktionov 6d2c322cce i40e: refactor repeated link state reporting code
Refactor repeated link state reporting code into a separate helper
functions: i40e_set_vf_link_state() i40e_vc_link_speed2mbps().
Add support of VIRTCHNL_VF_CAP_ADV_LINK_SPEED;

Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-23 09:03:56 -07:00
Mohammad Athari Bin Ismail 676b7ec67d stmmac: intel: Enable HW descriptor prefetch by default
Enable HW descriptor prefetch by default by setting plat->dma_cfg->dche =
true in intel_mgbe_common_data(). Need to be noted that this capability
only be supported in DWMAC core version 5.20 onwards. In stmmac, there is
a checking to check the core version. If the core version is below 5.20,
this capability wouldn`t be configured.

Below is the iperf result comparison between HW descriptor prefetch
disabled(DCHE=0b) and enabled(DCHE=1b). Tested on Intel Elkhartlake
platform with DWMAC Core 5.20. Observed line rate performance
improvement with HW descriptor prefetch enabled.

DCHE = 0b
[  5] local 169.254.1.162 port 42123 connected to 169.254.244.142 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  96.7 MBytes   811 Mbits/sec  70050
[  5]   1.00-2.00   sec  96.5 MBytes   809 Mbits/sec  69850
[  5]   2.00-3.00   sec  96.3 MBytes   808 Mbits/sec  69720
[  5]   3.00-4.00   sec  95.9 MBytes   804 Mbits/sec  69450
[  5]   4.00-5.00   sec  96.0 MBytes   806 Mbits/sec  69530
[  5]   5.00-6.00   sec  96.8 MBytes   812 Mbits/sec  70080
[  5]   6.00-7.00   sec  96.9 MBytes   813 Mbits/sec  70140
[  5]   7.00-8.00   sec  96.8 MBytes   812 Mbits/sec  70080
[  5]   8.00-9.00   sec  97.0 MBytes   814 Mbits/sec  70230
[  5]   9.00-10.00  sec  96.9 MBytes   813 Mbits/sec  70170
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   966 MBytes   810 Mbits/sec  0.000 ms  0/699300 (0%)  sender
[  5]   0.00-10.00  sec   966 MBytes   810 Mbits/sec  0.011 ms  0/699265 (0%)  receiver

DCHE = 1b
[  5] local 169.254.1.162 port 49740 connected to 169.254.244.142 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  97.9 MBytes   821 Mbits/sec  70880
[  5]   1.00-2.00   sec  98.1 MBytes   823 Mbits/sec  71060
[  5]   2.00-3.00   sec  98.2 MBytes   824 Mbits/sec  71140
[  5]   3.00-4.00   sec  98.2 MBytes   824 Mbits/sec  71090
[  5]   4.00-5.00   sec  98.1 MBytes   823 Mbits/sec  71050
[  5]   5.00-6.00   sec  98.1 MBytes   823 Mbits/sec  71040
[  5]   6.00-7.00   sec  98.1 MBytes   823 Mbits/sec  71050
[  5]   7.00-8.00   sec  98.2 MBytes   824 Mbits/sec  71140
[  5]   8.00-9.00   sec  98.2 MBytes   824 Mbits/sec  71120
[  5]   9.00-10.00  sec  98.3 MBytes   824 Mbits/sec  71150
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   981 MBytes   823 Mbits/sec  0.000 ms  0/710720 (0%)  sender
[  5]   0.00-10.00  sec   981 MBytes   823 Mbits/sec  0.041 ms  0/710650 (0%) receiver

Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 15:02:40 -07:00
Mohammad Athari Bin Ismail 96874c619c net: stmmac: Add HW descriptor prefetch setting for DWMAC Core 5.20 onwards
DWMAC Core 5.20 onwards supports HW descriptor prefetching.
Additionally, it also depends on platform specific RTL configuration.
This capability could be enabled by setting DMA_Mode bit-19 (DCHE).

So, to enable this cability, platform must set plat->dma_cfg->dche = true
and the DWMAC core version must be 5.20 onwards. Else, this capability
wouldn`t be configured

Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 15:02:40 -07:00
Hans Westgaard Ry 79ebfb11fe net/mlx4: Treat VFs fair when handling comm_channel_events
Handling comm_channel_event in mlx4_master_comm_channel uses a double
loop to determine which slaves have requested work. The search is
always started at lowest slave. This leads to unfairness; lower VFs
tends to be prioritized over higher VFs.

The patch uses find_next_bit to determine which slaves to handle.
Fairness is implemented by always starting at the next to the last
start.

An MPI program has been used to measure improvements. It runs 500
ibv_reg_mr, synchronizes with all other instances and then runs 500
ibv_dereg_mr.

The results running 500 processes, time reported is for running 500
calls:

ibv_reg_mr:
             Mod.   Org.
mlx4_1    403.356ms 424.674ms
mlx4_2    403.355ms 424.674ms
mlx4_3    403.354ms 424.674ms
mlx4_4    403.355ms 424.674ms
mlx4_5    403.357ms 424.677ms
mlx4_6    403.354ms 424.676ms
mlx4_7    403.357ms 424.675ms
mlx4_8    403.355ms 424.675ms

ibv_dereg_mr:
             Mod.   Org.
mlx4_1    116.408ms 142.818ms
mlx4_2    116.434ms 142.793ms
mlx4_3    116.488ms 143.247ms
mlx4_4    116.679ms 143.230ms
mlx4_5    112.017ms 107.204ms
mlx4_6    112.032ms 107.516ms
mlx4_7    112.083ms 184.195ms
mlx4_8    115.089ms 190.618ms

Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 14:59:26 -07:00
Dan Carpenter 27537929f3 bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
The problem is that bnxt_show_temp() returns long but "rc" is an int
and "len" is a u32.  With ternary operations the type promotion is quite
tricky.  The negative "rc" is first promoted to u32 and then to long so
it ends up being a high positive value instead of a a negative as we
intended.

Fix this by removing the ternary.

Fixes: d69753fa1e ("bnxt_en: return proper error codes in bnxt_show_temp")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 14:56:38 -07:00
David S. Miller 9904e1ee96 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-04-22

This series contains updates to virtchnl header file, ice, and iavf
drivers.

Vignesh adds support to warn about potentially malicious VFs; those that
are overflowing the mailbox for the ice driver.

Michal adds support for an allowlist/denylist of VF commands based on
supported capabilities for the ice driver.

Brett adds support for iavf UDP segmentation offload by adding the
capability bit to virtchnl, advertising support in the ice driver, and
enabling it in the iavf driver. He also adds a helper function for
getting the VF VSI for ice.

Colin Ian King removes an unneeded pointer assignment.

Qi enables support in the ice driver to support virtchnl requests from
the iavf to configure its own RSS input set. This includes adding new
capability bits, structures, and commands to virtchnl header file.

Haiyue enables configuring RSS flow hash via ethtool to support TCP, UDP
and SCTP protocols in iavf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 13:57:21 -07:00
Arnd Bergmann 3197a98c70 vxge: avoid -Wemtpy-body warnings
There are a few warnings about empty debug macros in this driver:

drivers/net/ethernet/neterion/vxge/vxge-main.c: In function 'vxge_probe':
drivers/net/ethernet/neterion/vxge/vxge-main.c:4480:76: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
 4480 |                                 "Failed in enabling SRIOV mode: %d\n", ret);

Change them to proper 'do { } while (0)' expressions to make the
code a little more robust and avoid the warnings.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 13:27:36 -07:00
Arnd Bergmann 74c97ea3b6 net: enetc: fix link error again
A link time bug that I had fixed before has come back now that
another sub-module was added to the enetc driver:

ERROR: modpost: "enetc_ierb_register_pf" [drivers/net/ethernet/freescale/enetc/fsl-enetc.ko] undefined!

The problem is that the enetc Makefile is not actually used for
the ierb module if that is the only built-in driver in there
and everything else is a loadable module.

Fix it by always entering the directory this time, regardless
of which symbols are configured. This should reliably fix the
problem and prevent it from coming back another time.

Fixes: 112463ddbe ("net: dsa: felix: fix link error")
Fixes: e7d48e5fbf ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 13:23:07 -07:00
Arnd Bergmann 45b102dd81 net: mana: fix PCI_HYPERV dependency
The MANA driver causes a build failure in some configurations when
it selects an unavailable symbol:

WARNING: unmet direct dependencies detected for PCI_HYPERV
  Depends on [n]: PCI [=y] && X86_64 [=y] && HYPERV [=n] && PCI_MSI [=y] && PCI_MSI_IRQ_DOMAIN [=y] && SYSFS [=y]
  Selected by [y]:
  - MICROSOFT_MANA [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROSOFT [=y] && PCI_MSI [=y] && X86_64 [=y]
drivers/pci/controller/pci-hyperv.c: In function 'hv_irq_unmask':
drivers/pci/controller/pci-hyperv.c:1217:9: error: implicit declaration of function 'hv_set_msi_entry_from_desc' [-Werror=implicit-function-declaration]
 1217 |         hv_set_msi_entry_from_desc(&params->int_entry.msi_entry, msi_desc);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~

A PCI driver should never depend on a particular host bridge
implementation in the first place, but if we have this dependency
it's better to express it as a 'depends on' rather than 'select'.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-22 13:22:28 -07:00
Haiyue Wang e41985f0fe iavf: Support for modifying SCTP RSS flow hashing
Provide the ability to enable SCTP RSS hashing by ethtool.

It gives users option of generating RSS hash based on the SCTP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:23 -07:00
Haiyue Wang 7b8f3f957b iavf: Support for modifying UDP RSS flow hashing
Provides the ability to enable UDP RSS hashing by ethtool.

It gives users option of generating RSS hash based on the UDP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:23 -07:00
Haiyue Wang 5ab91e0593 iavf: Support for modifying TCP RSS flow hashing
Provides the ability to enable TCP RSS hashing by ethtool.

It gives users option of generating RSS hash based on the TCP source
and destination ports numbers, IPv4 or IPv6 source and destination
addresses.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:23 -07:00
Haiyue Wang 0aaeb4fbc8 iavf: Add framework to enable ethtool RSS config
Add the virtchnl message interface to VF, so that VF can request RSS
input set(s) based on PF's capability.

This framework allows ethtool RSS config support on the VF driver.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:23 -07:00
Qi Zhang ddd1f3cfed ice: Support RSS configure removal for AVF
Add the handler for virtchnl message VIRTCHNL_OP_DEL_RSS_CFG to remove
an existing RSS configuration with matching hashed fields.

Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Co-developed-by: Jia Guo <jia.guo@intel.com>
Signed-off-by: Jia Guo <jia.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Bo Chen <BoX.C.Chen@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Qi Zhang 222a8ab016 ice: Enable RSS configure for AVF
Currently, RSS hash input is not available to AVF by ethtool, it is set
by the PF directly.

Add the RSS configure support for AVF through new virtchnl message, and
define the capability flag VIRTCHNL_VF_OFFLOAD_ADV_RSS_PF to query this
new RSS offload support.

Signed-off-by: Jia Guo <jia.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Tested-by: Bo Chen <BoX.C.Chen@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Brett Creeley c5afbe99b7 ice: Add helper function to get the VF's VSI
Currently, the driver gets the VF's VSI by using a long string of
dereferences (i.e. vf->pf->vsi[vf->lan_vsi_idx]). If the method to get
the VF's VSI were to change the driver would have to change it in every
location. Fix this by adding the helper ice_get_vf_vsi().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Colin Ian King c9b5f681fe ice: remove redundant assignment to pointer vsi
Pointer vsi is being re-assigned a value that is never read,
the assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Brett Creeley c91a4f9feb iavf: add support for UDP Segmentation Offload
Add code to support UDP segmentation offload (USO) for
hardware that supports it.

Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Brett Creeley 142da08c4d ice: Advertise virtchnl UDP segmentation offload capability
As the hardware is capable of supporting UDP segmentation offload, add a
capability bit to virtchnl.h to communicate this and have the driver
advertise its support.

Suggested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Michal Swiatkowski c0dcaa55f9 ice: Allow ignoring opcodes on specific VF
Declare bitmap of allowed commands on VF. Initialize default
opcodes list that should be always supported. Declare array of
supported opcodes for each caps used in virtchnl code.

Change allowed bitmap by setting or clearing corresponding
bit to allowlist (bit set) or denylist (bit clear).

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Vignesh Sridhar 0891c89674 ice: warn about potentially malicious VFs
Attempt to detect malicious VFs and, if suspected, log the information but
keep going to allow the user to take any desired actions.

Potentially malicious VFs are identified by checking if the VFs are
transmitting too many messages via the PF-VF mailbox which could cause an
overflow of this channel resulting in denial of service. This is done by
creating a snapshot or static capture of the mailbox buffer which can be
traversed and in which the messages sent by VFs are tracked.

Co-developed-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com>
Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com>
Co-developed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Co-developed-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-22 09:26:22 -07:00
Adam Ford 36e69da892 net: ethernet: ravb: Fix release of refclk
The call to clk_disable_unprepare() can happen before priv is
initialized. This means moving clk_disable_unprepare out of
out_release into a new label.

Fixes: 8ef7adc6be ("net: ethernet: ravb: Enable optional refclk")
Signed-off-by: Adam Ford <aford173@gmail.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 11:17:09 -07:00
Yoshihiro Shimoda 5718458b09 net: renesas: ravb: Fix a stuck issue when a lot of frames are received
When a lot of frames were received in the short term, the driver
caused a stuck of receiving until a new frame was received. For example,
the following command from other device could cause this issue.

    $ sudo ping -f -l 1000 -c 1000 <this driver's ipaddress>

The previous code always cleared the interrupt flag of RX but checks
the interrupt flags in ravb_poll(). So, ravb_poll() could not call
ravb_rx() in the next time until a new RX frame was received if
ravb_rx() returned true. To fix the issue, always calls ravb_rx()
regardless the interrupt flags condition.

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:56:37 -07:00
Ong Boon Leong 5e6038b88a net: stmmac: fix TSO and TBS feature enabling during driver open
TSO and TBS cannot co-exist and current implementation requires two
fixes:

 1) stmmac_open() does not need to call stmmac_enable_tbs() because
    the MAC is reset in stmmac_init_dma_engine() anyway.
 2) Inside stmmac_hw_setup(), we should call stmmac_enable_tso() for
    TX Q that is _not_ configured for TBS.

Fixes: 579a25a854 ("net: stmmac: Initial support for TBS")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:51:26 -07:00
Ong Boon Leong 17cb00704c stmmac: intel: set TSO/TBS TX Queues default settings
TSO and TBS cannot coexist, for now we set Intel mGbE controller to use
below TX Queue mapping: TxQ0 uses TSO and the rest of TXQs supports TBS.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:49:02 -07:00
Dan Carpenter 53e35ebb9a stmmac: intel: unlock on error path in intel_crosststamp()
We recently added some new locking to this function but one error path
was overlooked.  We need to drop the lock before returning.

Fixes: f4da56529d ("net: stmmac: Add support for external trigger timestamping")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:27:55 -07:00
Yinjun Zhang 90b669d65d nfp: devlink: initialize the devlink port attribute "lanes"
The number of lanes of devlink port should be correctly initialized
when registering the port, so that the input check when running
"devlink port split <port> count <N>" can pass.

Fixes: a21cf0a833 ("devlink: Add a new devlink port lanes attribute and pass to netlink")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:26:35 -07:00
Srujana Challa 2e2ee4cd0a octeontx2-af: Add mailbox for CPT stats
Adds a new mailbox to get CPT stats, includes performance
counters, CPT engines status and RXC status.

Signed-off-by: Narayana Prasad Raju Atherya <pathreya@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:23:17 -07:00
Srujana Challa ecad2ce8c4 octeontx2-af: cn10k: Add mailbox to configure reassembly timeout
CN10K CPT coprocessor includes a component named RXC which
is responsible for reassembly of inner IP packets. RXC has
the feature to evict oldest entries based on age/threshold.
This patch adds a new mailbox to configure reassembly age
or threshold.

Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:23:17 -07:00
Srujana Challa e4bbc5c53a octeontx2-af: cn10k: Mailbox changes for CN10K CPT
Adds changes to existing CPT mailbox messages to support
CN10K CPT block. This patch also adds new register defines
for CN10K CPT.

Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-21 10:23:17 -07:00
Colin Ian King d83b8aa520 net: davinci_emac: Fix incorrect masking of tx and rx error channel
The bit-masks used for the TXERRCH and RXERRCH (tx and rx error channels)
are incorrect and always lead to a zero result. The mask values are
currently the incorrect post-right shifted values, fix this by setting
them to the currect values.

(I double checked these against the TMS320TCI6482 data sheet, section
5.30, page 127 to ensure I had the correct mask values for the TXERRCH
and RXERRCH fields in the MACSTATUS register).

Addresses-Coverity: ("Operands don't affect result")
Fixes: a6286ee630 ("net: Add TI DaVinci EMAC driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:12:18 -07:00
Vadym Kochan 333980481b net: marvell: prestera: fix port event handling on init
For some reason there might be a crash during ports creation if port
events are handling at the same time  because fw may send initial
port event with down state.

The crash points to cancel_delayed_work() which is called when port went
is down.  Currently I did not find out the real cause of the issue, so
fixed it by cancel port stats work only if previous port's state was up
& runnig.

The following is the crash which can be triggered:

[   28.311104] Unable to handle kernel paging request at virtual address
000071775f776600
[   28.319097] Mem abort info:
[   28.321914]   ESR = 0x96000004
[   28.324996]   EC = 0x25: DABT (current EL), IL = 32 bits
[   28.330350]   SET = 0, FnV = 0
[   28.333430]   EA = 0, S1PTW = 0
[   28.336597] Data abort info:
[   28.339499]   ISV = 0, ISS = 0x00000004
[   28.343362]   CM = 0, WnR = 0
[   28.346354] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000100bf7000
[   28.352842] [000071775f776600] pgd=0000000000000000,
p4d=0000000000000000
[   28.359695] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   28.365310] Modules linked in: prestera_pci(+) prestera
uio_pdrv_genirq
[   28.372005] CPU: 0 PID: 1291 Comm: kworker/0:1H Not tainted
5.11.0-rc4 #1
[   28.378846] Hardware name: DNI AmazonGo1 A7040 board (DT)
[   28.384283] Workqueue: prestera_fw_wq prestera_fw_evt_work_fn
[prestera_pci]
[   28.391413] pstate: 60000085 (nZCv daIf -PAN -UAO -TCO BTYPE=--)
[   28.397468] pc : get_work_pool+0x48/0x60
[   28.401442] lr : try_to_grab_pending+0x6c/0x1b0
[   28.406018] sp : ffff80001391bc60
[   28.409358] x29: ffff80001391bc60 x28: 0000000000000000
[   28.414725] x27: ffff000104fc8b40 x26: ffff80001127de88
[   28.420089] x25: 0000000000000000 x24: ffff000106119760
[   28.425452] x23: ffff00010775dd60 x22: ffff00010567e000
[   28.430814] x21: 0000000000000000 x20: ffff80001391bcb0
[   28.436175] x19: ffff00010775deb8 x18: 00000000000000c0
[   28.441537] x17: 0000000000000000 x16: 000000008d9b0e88
[   28.446898] x15: 0000000000000001 x14: 00000000000002ba
[   28.452261] x13: 80a3002c00000002 x12: 00000000000005f4
[   28.457622] x11: 0000000000000030 x10: 000000000000000c
[   28.462985] x9 : 000000000000000c x8 : 0000000000000030
[   28.468346] x7 : ffff800014400000 x6 : ffff000106119758
[   28.473708] x5 : 0000000000000003 x4 : ffff00010775dc60
[   28.479068] x3 : 0000000000000000 x2 : 0000000000000060
[   28.484429] x1 : 000071775f776600 x0 : ffff00010775deb8
[   28.489791] Call trace:
[   28.492259]  get_work_pool+0x48/0x60
[   28.495874]  cancel_delayed_work+0x38/0xb0
[   28.500011]  prestera_port_handle_event+0x90/0xa0 [prestera]
[   28.505743]  prestera_evt_recv+0x98/0xe0 [prestera]
[   28.510683]  prestera_fw_evt_work_fn+0x180/0x228 [prestera_pci]
[   28.516660]  process_one_work+0x1e8/0x360
[   28.520710]  worker_thread+0x44/0x480
[   28.524412]  kthread+0x154/0x160
[   28.527670]  ret_from_fork+0x10/0x38
[   28.531290] Code: a8c17bfd d50323bf d65f03c0 9278dc21 (f9400020)
[   28.537429] ---[ end trace 5eced933df3a080b ]---

Fixes: 501ef3066c ("net: marvell: prestera: Add driver for Prestera family ASIC devices")
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:11:30 -07:00
Colin Ian King 55cdc26a91 net: mana: remove redundant initialization of variable err
The variable err is being initialized with a value that is
never read and it is being updated later with a new value.  The
initialization is redundant and can be removed

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:05:19 -07:00
Edward Cree 172e269edf sfc: ef10: fix TX queue lookup in TX event handling
We're starting from a TXQ label, not a TXQ type, so
 efx_channel_get_tx_queue() is inappropriate.  This worked by chance,
 because labels and types currently match on EF10, but we shouldn't
 rely on that.

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:03:53 -07:00
Edward Cree 83b09a1807 sfc: farch: fix TX queue lookup in TX event handling
We're starting from a TXQ label, not a TXQ type, so
 efx_channel_get_tx_queue() is inappropriate (and could return NULL,
 leading to panics).

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:03:53 -07:00
Edward Cree 5b1faa9228 sfc: farch: fix TX queue lookup in TX flush done handling
We're starting from a TXQ instance number ('qid'), not a TXQ type, so
 efx_get_tx_queue() is inappropriate (and could return NULL, leading
 to panics).

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Trevor Hemsley <themsley@voiceflex.com>
Cc: stable@vger.kernel.org
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 17:03:53 -07:00
Michael Walle 1b8caefaf4 net: enetc: automatically select IERB module
Now that enetc supports flow control we have to make sure the settings in
the IERB are correct. Therefore, we actually depend on the enetc-ierb
module. Previously it was possible that this module was disabled while the
enetc was enabled. Fix it by automatically select the enetc-ierb module.

Fixes: e7d48e5fbf ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:56:32 -07:00
Petr Machata 7de85b0431 mlxsw: spectrum_qdisc: Index future FIFOs by band number
mlxsw used to hold an array of qdiscs indexed by the TC number. In the
previous patch, it was changed to allocate child qdiscs dynamically, and
they are now indexed by band number. Follow suit with the array of future
FIFOs.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 5cbd960253 mlxsw: spectrum_qdisc: Allocate child qdiscs dynamically
Instead of keeping qdiscs in globally-preallocated arrays, introduce a
per-qdisc-kind value num_classes, and then allocate the necessary child
qdiscs (if any) based on that value. Since now dynamic allocation is
involved, mlxsw_sp_qdisc_replace() gets messy enough that it is worth it to
split it to two cases: a new qdisc allocation and a change of existing
qdisc. (Note that the change also includes what TC formally calls replace,
if the qdisc kind is the same.)

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata cff99e2045 mlxsw: spectrum_qdisc: Guard all qdisc accesses with a lock
The FIFO handler currently guards accesses to the future FIFO tracking by
asserting RTNL. In the future, the changes to the qdisc state will be more
thorough, so other qdiscs will need this guarding is as well. In order
to not further the RTNL infestation, instead convert to a custom lock that
will guard accesses to the qdisc state.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 51d52ed955 mlxsw: spectrum_qdisc: Track children per qdisc
mlxsw currently allows a two-level structure of qdiscs: the root and
possibly a number of children. In order to support offloading more general
qdisc trees, introduce to struct mlxsw_sp_qdisc a pointer to child qdiscs.
Refer to the child qdiscs through this pointer, instead of going through
the tclass_qdiscs in qdisc_state. Additionally introduce a field
num_classes, which holds number of given qdisc's children.

Also introduce a generic function for walking qdisc trees. Rewrite
mlxsw_sp_qdisc_find() and _find_by_handle() to use the generic walker.

For now, keep the qdisc_state.tclass_qdisc, and just point root_qdiscs's
children to this array. Following patches will make the allocation dynamic.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata b21832b568 mlxsw: spectrum_qdisc: Promote backlog reduction to mlxsw_sp_qdisc_destroy()
When a qdisc is removed, it is necessary to update the backlog value at its
parent--unless the qdisc is at root position. RED, TBF and FIFO all do
that, each separately. Since all of them need to do this, just promote the
operation directly to mlxsw_sp_qdisc_destroy(), instead of deferring it to
individual destructors. Since FIFO dtor thus becomes trivial, remove it.

Add struct mlxsw_sp_qdisc.parent to point at the parent qdisc. This will be
handy later as deeper structures are offloaded. Use the parent qdisc to
find the chain of parents whose backlog value needs to be updated.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 017a131cde mlxsw: spectrum_qdisc: Track tclass_num as int, not u8
tclass_num is just a number, a value that would be ordinarily passed around
as an int. (Which is unlike a u8 prio_bitmap.) In several places,
tclass_num already is an int. Convert the remaining instances.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 549f2aae84 mlxsw: spectrum_qdisc: Drop an always-true condition
The function mlxsw_sp_qdisc_compare() is invoked a couple lines above this
check, which will bounce any requests where this condition does not hold.
Therefore drop it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 290fe2c595 mlxsw: spectrum_qdisc: Simplify mlxsw_sp_qdisc_compare()
The purpose of this function is to filter out events that are related to
qdiscs that are not offloaded, or are not offloaded anymore. But the
function is unnecessarily thorough:

- mlxsw_sp_qdisc pointer is never NULL in the context where it is called
- Two qdiscs with the same handle will never have different types. Even
  when replacing one qdisc with another in the same class, Linux will not
  permit handle reuse unless the qdisc type also matches.

Simplify the function by omitting these two unnecessary conditions.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata 17c0e6d175 mlxsw: spectrum_qdisc: Drop one argument from check_params callback
The mlxsw_sp_qdisc argument is not used in any of the actual callbacks.
Drop it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
David S. Miller 790aad0ecc korina: Fix build.
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:40:08 -07:00
David S. Miller 56e2e5de44 korina: Fix conflict with global symbol desc_empty on x86.
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:24:26 -07:00
David S. Miller ff254dad0e mlx5-updates-2021-04-19
This patchset provides some updates to mlx5e and mlx5 SW steering drivers:
 
 1) Tariq and Vladyslav they both provide some trivial update to mlx5e netdev.
 
 The next 12 patches in the patchset are focused toward mlx5 SW steering:
 2) 3 trivial cleanup patches
 
 3) Dynamic Flex parser support:
    Flex parser is a HW parser that can support protocols that are not
     natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
     There are 8 such parsers, and each of them can be assigned to parse a
     specific set of protocols.
 
 4) Enable matching on Geneve TLV options
 
 5) Use Flex parser for MPLS over UDP/GRE
 
 6) Enable matching on tunnel GTP-U and GTP-U first extension
    header using
 
 7) Improved QoS for SW steering internal QPair for a better insertion rate
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmB+R90ACgkQSD+KveBX
 +j5UKAf+OODHHlYUxp3k4uSRGPNHuVZsUw4DjoCY6b0E4uMup9bP0YF7/B1I8bpC
 xTbVK9SzYTVOt0pxBu3aJ1Qom5hpJt5iT7QG9m5LlhEn/ZD3KqpnenGuDMIlyOa5
 EvLIdeOoWxJ+7Za6pULy4hsbUcu8hupsBBN+poC3dN4akQu1NyvFE4mdHVTP/c7n
 DB0mZWskoDyXm1dQiZ4+cDWoltrrpFLo5n7N08QbS+AvJ7jsRrT5myBU4IPMEfP6
 peRecTpKZEOwBwTzxi41ao5XZYnKTROD3zax30v6DXxw5K41SQKwvCjgvpc2/V1J
 jymdJzdYa17mxu5XMC0aaQoFo2VBDg==
 =wwzX
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-19

This patchset provides some updates to mlx5e and mlx5 SW steering drivers:

1) Tariq and Vladyslav they both provide some trivial update to mlx5e netdev.

The next 12 patches in the patchset are focused toward mlx5 SW steering:
2) 3 trivial cleanup patches

3) Dynamic Flex parser support:
   Flex parser is a HW parser that can support protocols that are not
    natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
    There are 8 such parsers, and each of them can be assigned to parse a
    specific set of protocols.

4) Enable matching on Geneve TLV options

5) Use Flex parser for MPLS over UDP/GRE

6) Enable matching on tunnel GTP-U and GTP-U first extension
   header using

7) Improved QoS for SW steering internal QPair for a better insertion rate
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:14:02 -07:00
Oleksij Rempel b62a12fc04 net: ag71xx: make use of generic NET_SELFTESTS library
With this patch the ag71xx on Atheros AR9331 will able to run generic net
selftests.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:08:02 -07:00
Oleksij Rempel 6016ba345f net: fec: make use of generic NET_SELFTESTS library
With this patch FEC on iMX will able to run generic net selftests

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:08:02 -07:00
Parav Pandit dedbc2d358 IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
Currently when GID is deleted, it zero out all the fields of the RoCE
address in the SET_ROCE_ADDRESS command for a specified index.

roce_version = 0 means RoCEv1 in the SET_ROCE_ADDRESS command.

This assumes that device has RoCEv1 always enabled which is not always
correct. For example Subfunction does not support RoCEv1.

Due to this assumption a previously added RoCEv2 GID is always deleted as
RoCEv1 GID. This results in a below syndrome:

   mlx5_core.sf mlx5_core.sf.4: mlx5_cmd_check:777:(pid 4256): SET_ROCE_ADDRESS(0x761) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x12822d)

Hence set the right RoCE version during GID deletion provided by the core.

Link: https://lore.kernel.org/r/d3f54129c90ca329caf438dbe31875d8ad08d91a.1618753425.git.leonro@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-20 09:41:10 -03:00
Yevgeny Kliteynik aeacb52a8d net/mlx5: DR, Add support for isolate_vl_tc QP
When using SW steering, rule insertion rate depends on the RDMA RC QP
performance used for writing to the ICM. During stress this QP is competing
on the HW resources with all the other QPs that are used to send data.
To protect SW steering QP's performance in such cases, we set this QP to
use isolated VL. The VL number is reserved by FW and is not exposed to the
driver.
Support for this QP on isolated VL exists only when both force-loopback and
isolate_vl_tc capabilities are set.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:46 -07:00
Yevgeny Kliteynik 7304d603a5 net/mlx5: DR, Add support for force-loopback QP
When supported by the device, SW steering RoCE RC QP that is used to
write/read to/from ICM will be created with force-loopback attribute.
Such QP doesn't require GID index upon creation.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:43 -07:00
Yevgeny Kliteynik df9dd15ae1 net/mlx5: DR, Add support for matching tunnel GTP-U
Enable matching on tunnel GTP-U and GTP-U first extension
header using dynamic flex parser.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:40 -07:00
Yevgeny Kliteynik 35ba005d82 net/mlx5: DR, Set flex parser for TNL_MPLS dynamically
Query the flex_parser id that's intended for TNL_MPLS
and use an appropriate flex parser for MPLS over UDP/GRE.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:37 -07:00
Yevgeny Kliteynik 3442e0335e net/mlx5: DR, Add support for matching on geneve TLV option
Enable matching on tunnel geneve TLV option using the flex parser.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:34 -07:00
Yevgeny Kliteynik 4923938d2f net/mlx5: DR, Set STEv0 ICMP flex parser dynamically
Set the flex parser ID dynamicly for ICMP instead of relying
on hardcoded values.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:31 -07:00
Yevgeny Kliteynik 160e9cb37a net/mlx5: DR, Add support for dynamic flex parser
Flex parser is a HW parser that can support protocols that are not
natively supported by the HCA, such as Geneve (TLV options) and GTP-U.
There are 8 such parsers, and each of them can be assigned to parse a
specific set of protocols.
This patch adds misc4 match params which allows using a correct flex parser
that was programmed to the required protocol.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:28 -07:00
Muhammad Sammar 323b91acc1 net/mlx5: DR, Remove protocol-specific flex_parser_3 definitions
Remove MPLS specific fields from flex parser 3 layout.
Flex parser can be used for multiple protocols and should
not be hardcoded to a specific type.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:24 -07:00
Yevgeny Kliteynik 25cb317680 net/mlx5: E-Switch, Improve error messages in term table creation
Add error code to the error messages and removed duplicated message:
if termination table creation failed, we already get an error message
in mlx5_eswitch_termtbl_create, so no need for the additional error print
in the calling function.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:18 -07:00
Yevgeny Kliteynik ff1925bb0d net/mlx5: DR, Fix SQ/RQ in doorbell bitmask
QP doorbell size is 16 bits.
Fixing sw steering's QP doorbel bitmask, which had 20 bits.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:15 -07:00
Yevgeny Kliteynik 7d22ad732d net/mlx5: DR, Rename an argument in dr_rdma_segments
Rename the argument to better reflect that the meaning is
not number of records, but wheather or not we should
ring the dorbell.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:12 -07:00
Tariq Toukan 6980ffa0c5 net/mlx5e: RX, Add checks for calculated Striding RQ attributes
Striding RQ attributes below are mutually dependent. An unaware
change to one might take the others out of the valid range derived
by the HW caps:
- The MPWQE size in bytes
- The number of strides in a MPWQE
- The stride size

Add checks to verify they are valid and comply to the HW spec
and SW assumptions/requirements.
This is not a fix, no particular issue exists today.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:09 -07:00
Vladyslav Tarasiuk 6a5689ba02 net/mlx5e: Fix possible non-initialized struct usage
If mlx5e_devlink_port_register() fails, driver may try to register
devlink health TX and RX reporters on non-registered devlink port.

Instead, create health reporters only if mlx5e_devlink_port_register()
does not fail. And destroy reporters only if devlink_port is registered.

Also, change mlx5e_get_devlink_port() behavior and return NULL in case
port is not registered to replicate devlink's wrapper when ndo is not
implemented.

Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:06 -07:00
Tariq Toukan d408c01cae net/mlx5e: Fix lost changes during code movements
The changes done in commit [1] were missed by the code movements
done in [2], as they were developed in ~parallel.
Here we re-apply them.

[1] commit e4484d9df5 ("net/mlx5e: Enable striding RQ for Connect-X IPsec capable devices")
[2] commit b3a131c2a1 ("net/mlx5e: Move params logic into its dedicated file")

Fixes: b3a131c2a1 ("net/mlx5e: Move params logic into its dedicated file")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-19 20:17:03 -07:00
Jakub Kicinski 37434782d6 bnxt: add more ethtool standard stats
Michael suggest a few more stats we can expose.

$ ethtool -S eth0 --groups eth-mac
Standard stats for eth0:
eth-mac-FramesTransmittedOK: 902623288966
eth-mac-FramesReceivedOK: 28727667047
eth-mac-FrameCheckSequenceErrors: 1
eth-mac-AlignmentErrors: 0
eth-mac-OutOfRangeLengthField: 0
$ ethtool -S eth0 | grep '\(fcs\|align\|oor\)'
     rx_fcs_err_frames: 1
     rx_align_err_frames: 0
     tx_fcs_err_frames: 0

Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 16:22:23 -07:00
Vadym Kochan ced97eea39 net: marvell: prestera: add support for AC3X 98DX3265 device
Add PCI match for AC3X 98DX3265 device which is supported by the current
driver and firmware.

Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 16:13:58 -07:00
Wong Vee Khee d7f576dc98 net: stmmac: fix memory leak during driver probe
On driver probe, kmemleak reported the following memory leak which was
due to allocated bitmap that was not being freed in stmmac_dvr_probe().

unreferenced object 0xffff9276014b13c0 (size 8):
  comm "systemd-udevd", pid 2143, jiffies 4294681112 (age 116.720s)
  hex dump (first 8 bytes):
    00 00 00 00 00 00 00 00                          ........
  backtrace:
    [<00000000c51e34b2>] stmmac_dvr_probe+0x1c0/0x440 [stmmac]
    [<00000000b530eb41>] intel_eth_pci_probe.cold+0x2b/0x14e [dwmac_intel]
    [<00000000b10f8929>] pci_device_probe+0xd2/0x150
    [<00000000fb254c74>] really_probe+0xf8/0x410
    [<0000000034128a59>] driver_probe_device+0x5d/0x150
    [<00000000016104d5>] device_driver_attach+0x53/0x60
    [<00000000cb18cd07>] __driver_attach+0x96/0x140
    [<00000000da9ffd5c>] bus_for_each_dev+0x7a/0xc0
    [<00000000af061a88>] bus_add_driver+0x184/0x1f0
    [<000000008be5c1c5>] driver_register+0x6c/0xc0
    [<0000000052b18a9e>] do_one_initcall+0x4d/0x210
    [<00000000154d4f07>] do_init_module+0x5c/0x230
    [<000000009b648d09>] load_module+0x2a5a/0x2d40
    [<000000000d86b76d>] __do_sys_finit_module+0xb5/0x120
    [<000000002b0cef95>] do_syscall_64+0x33/0x40
    [<0000000067b45bbb>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: bba2556efa ("net: stmmac: Enable RX via AF_XDP zero-copy")
Cc: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 16:02:27 -07:00
Thomas Bogendoerfer 6ef92063bf net: korina: Make driver COMPILE_TESTable
Move structs/defines for ethernet/dma register into driver, since they
are only used for this driver and remove any MIPS specific includes.
This makes it possible to COMPILE_TEST the driver.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer e4cd854ec4 net: korina: Get mdio input clock via common clock framework
With device tree clock is provided via CCF. For non device tree
use a maximum clock value to not overclock the PHY. The non device
tree usage will go away after platform is converted to DT.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer 10b26f0781 net: korina: Add support for device tree
If there is no mac address passed via platform data try to get it via
device tree and fall back to a random mac address, if all fail.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer af80425e05 net: korina: Only pass mac address via platform data
Get rid of access to struct korina_device by just passing the mac
address via platform data and use drvdata for passing netdev to remove
function.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer 0fc96939a9 net: korina: Use DMA API
Instead of messing with MIPS specific macros use DMA API for mapping
descriptors and skbs.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer 0fe632471a net: korina: Remove nested helpers
Remove helpers, which are only used in one call site.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer e42f10533d net: korina: Remove not needed cache flushes
Descriptors are mapped uncached so there is no need to do any cache
handling for them.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer b4cd249a8c net: korina: Use devres functions
Simplify probe/remove code by using devm_ functions.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Thomas Bogendoerfer 89f9d5400b net: korina: Fix MDIO functions
Fixed MDIO functions to work reliable and not just by accident.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:58:14 -07:00
Pablo Neira Ayuso f5c2cb583a net: ethernet: mtk_eth_soc: handle VLAN pop action
Do not hit EOPNOTSUPP when flowtable offload provides a VLAN pop action.

Fixes: efce49dfe6 ("netfilter: flowtable: add vlan pop action offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:05 -07:00
Pablo Neira Ayuso 014d029876 net: ethernet: mtk_eth_soc: missing mutex
Patch 2ed37183ab ("netfilter: flowtable: separate replace, destroy and
stats to different workqueues") splits the workqueue per event type. Add
a mutex to serialize updates.

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:05 -07:00
Pablo Neira Ayuso 0e389028ad net: ethernet: mtk_eth_soc: fix undefined reference to `dsa_port_from_netdev'
Caused by:

 CONFIG_NET_DSA=m
 CONFIG_NET_MEDIATEK_SOC=y

mtk_ppe_offload.c:undefined reference to `dsa_port_from_netdev'

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:55:04 -07:00
Linus Walleij 8d892d6094 net: ethernet: ixp4xx: Set the DMA masks explicitly
The former fix only papered over the actual problem: the
ethernet core expects the netdev .dev member to have the
proper DMA masks set, or there will be BUG_ON() triggered
in kernel/dma/mapping.c.

Fix this by simply copying dma_mask and dma_mask_coherent
from the parent device.

Fixes: e45d0fad4a ("net: ethernet: ixp4xx: Use parent dev for DMA pool")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:47:56 -07:00
DENG Qingfang 6ecaf81d4a net: ethernet: mediatek: fix a typo bug in flow offloading
Issue was traffic problems after a while with increased ping times if
flow offload is active. It turns out that key_offset with cookie is
needed in rhashtable_params but was re-assigned to head_offset.
Fix the assignment.

Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:39:52 -07:00
Huazhong Tan e407efdd94 net: hns3: change the value of the SEPARATOR_VALUE macro in hclgevf_main.c
The SEPARATOR_VALUE macro is used as separator when getting
the register value, but the value of this macro is different
between pf and vf, it is a bit confusing for the user, so
synchronize the value of vf with pf.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Huazhong Tan 8ed64dbe0b net: hns3: cleanup inappropriate spaces in struct hlcgevf_tqp_stats
Modify some inappropriate spaces in comments of struct
hlcgevf_tqp_stats.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Huazhong Tan 1c5a2ba679 net: hns3: remove a duplicate pf reset counting
When enter suspend mode the counter of pf reset will be increased
twice, since both hclge_prepare_general() and hclge_prepare_wait()
increase this counter. So remove the duplicate counting in
hclge_prepare_general().

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:38:05 -07:00
Randy Dunlap 46fd447161 net: xilinx: drivers need/depend on HAS_IOMEM
kernel test robot reports build errors in 3 Xilinx ethernet drivers.
They all use ioremap functions that are only available when HAS_IOMEM
is set/enabled. If it is not enabled, they all have build errors,
so make these 3 drivers depend on HAS_IOMEM.

ld: drivers/net/ethernet/xilinx/xilinx_emaclite.o: in function `xemaclite_of_probe':
xilinx_emaclite.c:(.text+0x9fc): undefined reference to `devm_ioremap_resource'

ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function `axienet_probe':
xilinx_axienet_main.c:(.text+0x942): undefined reference to `devm_ioremap_resource'

ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function `temac_probe':
ll_temac_main.c:(.text+0x1283): undefined reference to `devm_platform_ioremap_resource_byname'
ld: ll_temac_main.c:(.text+0x13ad): undefined reference to `devm_of_iomap'
ld: ll_temac_main.c:(.text+0x162e): undefined reference to `devm_platform_ioremap_resource'

Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Zhang Changzhong <zhangchangzhong@huawei.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: stable@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:36:00 -07:00
Vladimir Oltean a864888788 net: enetc: add support for flow control
In the ENETC receive path, a frame received by the MAC is first stored
in a 256KB 'FIFO' memory, then transferred to DRAM when enqueuing it to
the RX ring. The FIFO is a shared resource for all ENETC ports, but
every port keeps track of its own memory utilization, on RX and on TX.

There is a setting for RX rings through which they can either operate in
'lossy' mode (where the lack of a free buffer causes an immediate
discard of the frame) or in 'lossless' mode (where the lack of a free
buffer in the ring makes the frame stay longer in the FIFO).

In turn, when the memory utilization of the FIFO exceeds a certain
margin, the MAC can be configured to emit PAUSE frames.

There is enough FIFO memory to buffer up to 3 MTU-sized frames per RX
port while not jeopardizing the other use cases (jumbo frames), and
also not consume bytes from the port TX allocations. Also, 3 MTU-sized
frames worth of memory is enough to ensure zero loss for 64 byte packets
at 1G line rate.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean e7d48e5fbf net: enetc: add a mini driver for the Integrated Endpoint Register Block
The NXP ENETC is a 4-port Ethernet controller which 'smells' to
operating systems like 4 distinct PCIe PFs with SR-IOV, each PF having
its own driver instance, but in fact there are some hardware resources
which are shared between all ports, like for example the 256 KB SRAM
FIFO between the MACs and the Host Transfer Agent which DMAs frames to
DRAM.

To hide the stuff that cannot be neatly exposed per port, the hardware
designers came up with this idea of having a dedicated register block
which is supposed to be populated by the bootloader, and contains
everything configuration-related: MAC addresses, FIFO partitioning, etc.

When a port is reset using PCIe Function Level Reset, its defaults are
transferred from the IERB configuration. Most of the time, the settings
made through the IERB are read-only in the port's memory space (if they
are even visible), so they cannot be modified at runtime.

Linux doesn't have any advanced FIFO partitioning requirements at all,
but when reading through the hardware manual, it became clear that, even
though there are many good 'recommendations' for default values, many of
them were not actually put in practice on LS1028A. So we end up with a
default configuration that:

(a) does not have enough TX and RX byte credits to support the max MTU
    of 9600 (which the Linux driver claims already) properly (at full speed)
(b) allows the FIFO to be overrun with RX traffic, potentially
    overwriting internal data structures.

The last part sounds a bit catastrophic, but it isn't. Frames are
supposed to transit the FIFO for a very short time, but they can
actually accumulate there under 2 conditions:

(a) there is very severe congestion on DRAM memory, or
(b) the RX rings visible to the operating system were configured for
    lossless operation, and they just ran out of free buffers to copy
    the frame to. This is what is used to put backpressure onto the MAC
    with flow control.

So since ENETC has not supported flow control thus far, RX FIFO overruns
were never seen with Linux. But with the addition of flow control, we
should configure some registers to prevent this from happening. What we
are trying to protect against are bad actors which continue to send us
traffic despite the fact that we have signaled a PAUSE condition. Of
course we can't be lossless in that case, but it is best to configure
the FIFO to do tail dropping rather than letting it overrun.

So in a nutshell, this driver is a fixup for all the IERB default values
that should have been but aren't.

The IERB configuration needs to be done _before_ the PFs are enabled.
So every PF searches for the presence of the "fsl,ls1028a-enetc-ierb"
node in the device tree, and if it finds it, it "registers" with the
IERB, which means that it requests the IERB to fix up its default
values. This is done through -EPROBE_DEFER. The IERB driver is part of
the fsl_enetc module, but is technically a platform driver, since the
IERB is a good old fashioned MMIO region, as opposed to ENETC ports
which pretend to be PCIe devices.

The driver was already configuring ENETC_PTXMBAR (FIFO allocation for
TX) because due to an omission, TXMBAR is a read/write register in the
PF memory space. But the manual is quite clear that the formula for this
should depend upon the TX byte credits (TXBCR). In turn, the TX byte
credits are only readable/writable through the IERB. So if we want to
ensure that the TXBCR register also has a value that is correct and in
line with TXMBAR, there is simply no way this can be done from the PF
driver, access to the IERB is needed.

I could have modified U-Boot to fix up the IERB values, but that is
quite undesirable, as old U-Boot versions are likely to be floating
around for quite some time from now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Vladimir Oltean 87614b931c net: enetc: create a common enetc_pf_to_port helper
Even though ENETC interfaces are exposed as individual PCIe PFs with
their own driver instances, the ENETC is still fundamentally a
multi-port Ethernet controller, and some parts of the IP take a port
number (as can be seen in the PSFP implementation).

Create a common helper that can be used outside of the TSN code for
retrieving the ENETC port number based on the PF number. This is only
correct for LS1028A, the only Linux-capable instantiation of ENETC thus
far.

Note that ENETC port 3 is PF 6. The TSN code did not care about this
because ENETC port 3 does not support TSN, so the wrong mapping done by
enetc_get_port for PF 6 could have never been hit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:31:45 -07:00
Dexuan Cui ca9c54d2d6 net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)
Add a VF driver for Microsoft Azure Network Adapter (MANA) that will be
available in the future.

Co-developed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Co-developed-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Shachar Raindel <shacharr@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-19 15:24:25 -07:00
Leon Romanovsky bcf9ee0520 RDMA/bnxt_re: Create direct symbol link between bnxt modules
Convert indirect probe call to its direct equivalent to create a symbol
link between RDMA and netdev modules. This will give us an ability to
remove custom module reference counting that doesn't belong to the driver.

Link: https://lore.kernel.org/r/20210401065715.565226-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-19 14:57:03 -03:00
Heiner Kallweit 11ac4e668a r8169: keep pause settings on interface down/up cycle
Currently, if the user changes the pause settings, the default settings
will be restored after an interface down/up cycle, and also when
resuming from suspend. This doesn't seem to provide the best user
experience. Change this to keep user settings, and just ensure that in
jumbo mode pause is disabled.
Small drawback: When switching back mtu from jumbo to non-jumbo then
pause remains disabled (but user can enable it using ethtool).
I think that's a not too common scenario and acceptable.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-17 14:02:48 -07:00
Jakub Kicinski 8203c7ce4e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
 - keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
 - fix build after move to net_generic

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-17 11:08:07 -07:00
Vladimir Oltean 24e3930971 net: enetc: apply the MDIO workaround for XDP_REDIRECT too
Described in fd5736bf9f ("enetc: Workaround for MDIO register access
issue") is a workaround for a hardware bug that requires a register
access of the MDIO controller to never happen concurrently with a
register access of a port PF. To avoid that, a mutual exclusion scheme
with rwlocks was implemented - the port PF accessors are the 'read'
side, and the MDIO accessors are the 'write' side.

When we do XDP_REDIRECT between two ENETC interfaces, all is fine
because the MDIO lock is already taken from the NAPI poll loop.

But when the ingress interface is not ENETC, just the egress is, the
MDIO lock is not taken, so we might access the port PF registers
concurrently with MDIO, which will make the link flap due to wrong
values returned from the PHY.

To avoid this, let's just slap an enetc_lock_mdio/enetc_unlock_mdio at
the beginning and ending of enetc_xdp_xmit. The fact that the MDIO lock
is designed as a rwlock is important here, because the read side is
reentrant (that is one of the main reasons why we chose it). Usually,
the way we benefit of its reentrancy is by running the data path
concurrently on both CPUs, but in this case, we benefit from the
reentrancy by taking the lock even when the lock is already taken
(and that's the situation where ENETC is both the ingress and the egress
interface for XDP_REDIRECT, which was fine before and still is fine now).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:40 -07:00
Vladimir Oltean 92ff9a6e57 net: enetc: fix buffer leaks with XDP_TX enqueue rejections
If the TX ring is congested, enetc_xdp_tx() returns false for the
current XDP frame (represented as an array of software BDs).

This array of software TX BDs is constructed in enetc_rx_swbd_to_xdp_tx_swbd
from software BDs freshly cleaned from the RX ring. The issue is that we
scrub the RX software BDs too soon, more precisely before we know that
we can enqueue the TX BDs successfully into the TX ring.

If we can't enqueue them (and enetc_xdp_tx returns false), we call
enetc_xdp_drop which attempts to recycle the buffers held by the RX
software BDs. But because we scrubbed those RX BDs already, two things
happen:

(a) we leak their memory
(b) we populate the RX software BD ring with an all-zero rx_swbd
    structure, which makes the buffer refill path allocate more memory.

enetc_refill_rx_ring
-> if (unlikely(!rx_swbd->page))
   -> enetc_new_page

That is a recipe for fast OOM.

Fixes: 7ed2bc8007 ("net: enetc: add support for XDP_TX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:40 -07:00
Vladimir Oltean 975acc833c net: enetc: handle the invalid XDP action the same way as XDP_DROP
When the XDP program returns an invalid action, we should free the RX
buffer.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:40 -07:00
Vladimir Oltean 7eab503b11 net: enetc: use dedicated TX rings for XDP
It is possible for one CPU to perform TX hashing (see netdev_pick_tx)
between the 8 ENETC TX rings, and the TX hashing to select TX queue 1.

At the same time, it is possible for the other CPU to already use TX
ring 1 for XDP (either XDP_TX or XDP_REDIRECT). Since there is no mutual
exclusion between XDP and the network stack, we run into an issue
because the ENETC TX procedure is not reentrant.

The obvious approach would be to just make XDP take the lock of the
network stack's TX queue corresponding to the ring it's about to enqueue
in.

For XDP_REDIRECT, this is quite straightforward, a lock at the beginning
and end of enetc_xdp_xmit() should do the trick.

But for XDP_TX, it's a bit more complicated. For one, we do TX batching
all by ourselves for frames with the XDP_TX verdict. This is something
we would like to keep the way it is, for performance reasons. But
batching means that the network stack's lock should be kept from the
first enqueued XDP_TX frame and until we ring the doorbell. That is
mostly fine, except for cases when in the same NAPI loop we have mixed
XDP_TX and XDP_REDIRECT frames. So if enetc_xdp_xmit() gets called while
we are holding the lock from the RX NAPI, then bam, deadlock. The naive
answer could be 'just flush the XDP_TX frames first, then release the
network stack's TX queue lock, then call xdp_do_flush_map()'. But even
xdp_do_redirect() is capable of flushing the batched XDP_REDIRECT
frames, so unless we unlock/relock the TX queue around xdp_do_redirect(),
there simply isn't any clean way to protect XDP_TX from concurrent
network stack .ndo_start_xmit() on another CPU.

So we need to take a different approach, and that is to reserve two
rings for the sole use of XDP. We leave TX rings
0..ndev->real_num_tx_queues-1 to be handled by the network stack, and we
pick them from the end of the priv->tx_ring array.

We make an effort to keep the mapping done by enetc_alloc_msix() which
decides which CPU handles the TX completions of which TX ring in its
NAPI poll. So the XDP TX ring of CPU 0 is handled by TX ring 6, and the
XDP TX ring of CPU 1 is handled by TX ring 7.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:40 -07:00
Vladimir Oltean ee3e875f10 net: enetc: increase TX ring size
Now that commit d6a2829e82 ("net: enetc: increase RX ring default
size") has increased the RX ring size, it is quite easy to congest the
TX rings when the traffic is predominantly XDP_TX, as the RX ring is
quite a bit larger than the TX one.

Since we bit the bullet and did the expensive thing already (larger RX
rings consume more memory pages), it seems quite foolish to keep the TX
rings small. So make them equally sized with TX.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
Vladimir Oltean a6369fe6e0 net: enetc: remove unneeded xdp_do_flush_map()
xdp_do_redirect already contains:
-> dev_map_enqueue
   -> __xdp_enqueue
      -> bq_enqueue
         -> bq_xmit_all // if we have more than 16 frames

So the logic from enetc will never be hit, because ENETC_DEFAULT_TX_WORK
is 128.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
Vladimir Oltean 8f50d8bb3f net: enetc: stop XDP NAPI processing when build_skb() fails
When the code path below fails:

enetc_clean_rx_ring_xdp // XDP_PASS
-> enetc_build_skb
   -> enetc_map_rx_buff_to_skb
      -> build_skb

enetc_clean_rx_ring_xdp will 'break', but that 'break' instruction isn't
strong enough to actually break the NAPI poll loop, just the switch/case
statement for XDP actions. So we increment rx_frm_cnt and go to the next
frames minding our own business.

Instead let's do what the skb NAPI poll function does, and break the
loop now, waiting for the memory pressure to go away. Otherwise the next
calls to build_skb() are likely to fail too.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
Vladimir Oltean 672f9a2198 net: enetc: recycle buffers for frames with RX errors
When receiving a frame with errors, currently we do nothing with it (we
don't construct an skb or an xdp_buff), we just exit the NAPI poll loop.

Let's put the buffer back into the RX ring (similar to XDP_DROP).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
Vladimir Oltean 6b04830d5e net: enetc: rename the buffer reuse helpers
enetc_put_xdp_buff has nothing to do with XDP, frankly, it is just a
helper to populate the recycle end of the shadow RX BD ring
(next_to_alloc) with a given buffer.

On the other hand, enetc_put_rx_buff plays more tricks than its name
would suggest.

So let's rename enetc_put_rx_buff into enetc_flip_rx_buff to reflect the
half-page buffer reuse tricks that it employs, and enetc_put_xdp_buff
into enetc_put_rx_buff which suggests a more garden-variety operation.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
Vladimir Oltean e9e49ae88e net: enetc: remove redundant clearing of skb/xdp_frame pointer in TX conf path
Later in enetc_clean_tx_ring we have:

		/* Scrub the swbd here so we don't have to do that
		 * when we reuse it during xmit
		 */
		memset(tx_swbd, 0, sizeof(*tx_swbd));

So these assignments are unnecessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:08:39 -07:00
David S. Miller bc45f524d9 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-04-16

This series contains updates to igb and igc drivers.

Ederson adjusts Tx buffer distributions in Qav mode to improve
TSN-aware traffic for igb. He also enable PPS support and auxiliary PHC
functions for igc.

Grzegorz checks that the MTA register was properly written and
retries if not for igb.

Sasha adds reporting of EEE low power idle counters to ethtool and fixes
a return value being overwritten through looping for igc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 17:06:14 -07:00
Jakub Kicinski b572ec9ff0 mlx5: implement ethtool standard stats
Add support for PHY/MAC/Ctrl/RMON stats.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 16:59:47 -07:00
Jakub Kicinski 782bc00aff bnxt: implement ethtool standard stats
Most of the names seem to strongly correlate with names from
the standard and RFC. Whether ..+good_frames are indeed Frames..OK
I'm the least sure of.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 16:59:20 -07:00
Jakub Kicinski c1912ab0ee mlxsw: implement ethtool standard stats
mlxsw has nicely grouped stats, add support for standard uAPI.
I'm guessing the register access part. Compile tested only.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 16:59:20 -07:00
David S. Miller 03e481e88b mlx5-updates-2021-04-16
This patchset introduces updates to mlx5e netdev driver.
 
 1) Tariq refactors TLS offloads and adds resiliency against RX resync
    failures
 
 2) Maxim reduces code duplications by unifying channels reset flow
    regardless if channels are closed or open
 
 3) Aya Enhances TX/RX health reporters diagnostics to expose the
    internal clock time-stamping format
 
 4) Moshe adds support for ethtool extended link state, to show the reason
    for link down
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmB53AUACgkQSD+KveBX
 +j6rzAf+JwJG9G7GSj3a/xird4dlgt4xPbRLB19pTw19ZyHZyujDxdN4QM3r5hTk
 5ua1PnhYYaUcyPFvdgR9J0cIJ3QRaxZ+q/XnkE9Yo0eZ1DJ0SL/n6rxEQpcxpee1
 XP7qjJu3leVwh5mVW2uOx/ClrL9vYb/fG3Q00j59rUB+i9bZszXZgZ99hJvYBFTB
 k7W/9X6BNxuLlEg/Ui9L499aDWHRcIY5J2ku+1v/8paJZltk+IFv5glYszylE++M
 l68drIy3dIjl/Sxj6WR2rHTBus6AIFxWFH8C2L7uqGl97BPjS80snMPIefLJhW+y
 bQvzMDtfKDmIpvEIdzHPuEhEdqqteg==
 =YCy6
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2021-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-16

This patchset introduces updates to mlx5e netdev driver.

1) Tariq refactors TLS offloads and adds resiliency against RX resync
   failures

2) Maxim reduces code duplications by unifying channels reset flow
   regardless if channels are closed or open

3) Aya Enhances TX/RX health reporters diagnostics to expose the
   internal clock time-stamping format

4) Moshe adds support for ethtool extended link state, to show the reason
   for link down
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 16:53:52 -07:00
Claudiu Manoil 8eda54c5e6 gianfar: Drop GFAR_MQ_POLLING support
Gianfar used to enable all 8 Rx queues (DMA rings) per
ethernet device, even though the controller can only
support 2 interrupt lines at most.  This meant that
multiple Rx queues would have to be grouped per NAPI poll
routine, and the CPU would have to split the budget and
service them in a round robin manner.  The overhead of
this scheme proved to outweight the potential benefits.
The alternative was to introduce the "Single Queue" polling
mode, supporting one Rx queue per NAPI, which became the
default packet processing option and helped improve the
performance of the driver.
MQ_POLLING also relies on undocumeted device tree properties
to specify how to map the 8 Rx and Tx queues to a given
interrupt line (aka "interrupt group").  Using module parameters
to enable this mode wasn't an option either.  Long story short,
MQ_POLLING became obsolete, now it is just dead code, and no
one asked for it so far.
For the Tx queues, multi-queue support (more than 1 Tx queue
per CPU) could be revisited by adding tc MQPRIO support, but
again, one has to consider that there are only 2 interrupt lines.
So the NAPI poll routine would have to service multiple Tx rings.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 15:46:15 -07:00
Stefan Chulski 4ad29b1a48 net: mvpp2: Add parsing support for different IPv4 IHL values
Add parser entries for different IPv4 IHL values.
Each entry will set the L4 header offset according to the IPv4 IHL field.
L3 header offset will set during the parsing of the IPv4 protocol.

Because of missed parser support for IP header length > 20, RX IPv4 checksum HW offload fails
and skb->ip_summed set to CHECKSUM_NONE(checksum done by Network stack).
This patch adds RX IPv4 checksum HW offload capability for frames with IP header length > 20.

v1 --> v2
- Improve commit message.

Suggested-by: Dana Vardi <danat@marvell.com>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 15:29:40 -07:00
Ilya Lipnitskiy c5d66587b8 net: ethernet: mediatek: ppe: fix busy wait loop
The intention is for the loop to timeout if the body does not succeed.
The current logic calls time_is_before_jiffies(timeout) which is false
until after the timeout, so the loop body never executes.

Fix by using readl_poll_timeout as a more standard and less error-prone
solution.

Fixes: ba37b7caf1 ("net: ethernet: mtk_eth_soc: add support for initializing the PPE")
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Cc: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 15:24:18 -07:00
Gatis Peisenieks a1150a04b7 atl1c: move tx cleanup processing out of interrupt
Tx queue cleanup happens in interrupt handler on same core as rx queue
processing. Both can take considerable amount of processing in high
packet-per-second scenarios.

Sending big amounts of packets can stall the rx processing which is
unfair and also can lead to out-of-memory condition since
__dev_kfree_skb_irq queues the skbs for later kfree in softirq which
is not allowed to happen with heavy load in interrupt handler.

This puts tx cleanup in its own napi and enables threaded napi to
allow the rx/tx queue processing to happen on different cores.

The ability to sustain equal amounts of tx/rx traffic increased:
from 280Kpps to 1130Kpps on Threadripper 3960X with upcoming
Mikrotik 10/25G NIC,
from 520Kpps to 850Kpps on Intel i3-3320 with Mikrotik RB44Ge adapter.

Signed-off-by: Gatis Peisenieks <gatis@mikrotik.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 15:16:54 -07:00
Vladimir Oltean 2c4eca3ef7 net: bridge: switchdev: include local flag in FDB notifications
As explained in bugfix commit 6ab4c3117a ("net: bridge: don't notify
switchdev for local FDB addresses") as well as in this discussion:
https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/

the switchdev notifiers for FDB entries managed to have a zero-day bug,
which was that drivers would not know what to do with local FDB entries,
because they were not told that they are local. The bug fix was to
simply not notify them of those addresses.

Let us now add the 'is_local' bit to bridge FDB entries, and make all
drivers ignore these entries by their own choice.

Co-developed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16 15:15:45 -07:00
Sasha Neftin 1feaf60ff2 igc: Expose LPI counters
Expose EEE Tx and Rx low power idle counters via ethtool
A EEE TX or RX LPI event occurs when the transmitter or the receiver
enters EEE (IEEE802.3az) LPI state.
ethtool --statistics <iface>

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-16 13:15:45 -07:00
Sasha Neftin b3d4f40562 igc: Fix overwrites return value
drivers/net/ethernet/intel/igc/igc_i225.c:235 igc_write_nvm_srwr()
warn: loop overwrites return value 'ret_val'

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-16 13:15:45 -07:00
Ederson de Souza 87938851b6 igc: enable auxiliary PHC functions for the i225
The i225 device offers a number of special PTP Hardware Clock features on
the Software Defined Pins (SDPs) - much like i210, which is used as
inspiration for this patch. It enables two possible functions, namely
time stamping external events and periodic output signals.

The assignment of PHC functions to the four SDP can be freely chosen by
the user.

For the external events time stamping, when the SDP (configured as input
by user) level changes, an interrupt is generated and the kernel
Precision Time Protocol (PTP) is informed.

For the periodic output signals, the i225 is configured to generate them
(so the SDP level will change periodically) and the driver also has to
keep updating the time of the next level change. However, this work is
not necessary for some frequencies as the i225 takes care of them
(namely, anything with a half-cycle of 500ms, 250ms, 125ms or < 70ms).

While i225 allows up to four timers to be used to source the time used
on the external events or output signals, this patch uses only one of
those timers. Main reason is to keep it simple, as it's not clear how
these extra timers would be exposed to users. Note that currently a NIC
can expose a single PTP device.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-16 13:15:45 -07:00
Ederson de Souza 64433e5bf4 igc: Enable internal i225 PPS
The i225 device can produce one interrupt on the full second, much
like i210 - from where this patch is inspired.

This patch sets up the full second interruption on the i225 and when
receiving it, it sends a PPS event to PTP (Precision Time Protocol)
kernel subsystem.

The PTP subsystem exposes the PPS events via ioctl and sysfs, and one
can use the `testptp` tool (tools/testing/selftests/ptp) to check that
the events are being generated.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-16 13:15:45 -07:00
Grzegorz Siwik 1d3cb90cb0 igb: Add double-check MTA_REGISTER for i210 and i211
Add new function which checks MTA_REGISTER if its filled correctly.
If not then writes again to same register.
There is possibility that i210 and i211 could not accept
MTA_REGISTER settings, specially when you add and remove
many of multicast addresses in short time.
Without this patch there is possibility that multicast settings will be
not always set correctly in hardware.

Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-04-16 13:15:39 -07:00
Aya Levin 95742c1cc5 net/mlx5: Enhance diagnostics info for TX/RX reporters
Add ts_format to 'Common Config' section of the TX/RX devlink reporters
diagnostics info. Possible values for ts_format: 'RT' or 'FRC'
which stands for: Real Time and Free Running Counters correspondingly.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:34 -07:00
Aya Levin 302522e67c net/mlx5: Add helper to initialize 1PPS
Wrap 1PPS initialization in a helper for a cleaner init flow.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:31 -07:00
Moshe Tal b3446acb2b net/mlx5e: Add ethtool extended link state
In case the interface was set up but cannot establish the link, ethtool
will print more information to help the user troubleshoot the state.

For example, no link due to missing cable:
$ ethtool eth1
...
Link detected: no (No cable)

Beside the general extended state, drivers can pass additional
information about the link state using the sub-state field. For example:

$ ethtool eth1
...
Link detected: no (Autoneg, No partner detected)

The extended state is available only for specific cases, in other cases
ethtool with print only "Link detected: no" as before

Signed-off-by: Moshe Tal <moshet@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:28 -07:00
Maor Dickman 5cec6de0ae net/mlx5: Allocate FC bulk structs with kvzalloc() instead of kzalloc()
The bulk size is larger than 16K so use kvzalloc().
The bulk bitmask upper size limit is 16K so use kvcalloc().

Signed-off-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:22 -07:00
Maxim Mikityanskiy 94872d4ef9 net/mlx5e: Cleanup safe switch channels API by passing params
mlx5e_safe_switch_channels accepts new_chs as a parameter and opens new
channels in place, then copying them to priv->channels. It requires all
the callers to allocate space for this temporary storage of the new
channels.

This commit cleans up the API by replacing new_chs with new_params, a
meaningful subset of new_chs to be filled by the caller. The temporary
space for the new channels is allocated inside mlx5e_safe_switch_params
(a new name for mlx5e_safe_switch_channels). An extra copy of params is
made, but since it's control flow, it's not critical.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:20 -07:00
Maxim Mikityanskiy b3b886cf96 net/mlx5e: Refactor on-the-fly configuration changes
This commit extends mlx5e_safe_switch_channels() to support on-the-fly
configuration changes, when the channels are open, but don't need to be
recreated. Such flows exist when a parameter being changed doesn't
affect how the queues are created, or when the queues can be modified
while remaining active.

Before this commit, such flows were handled as special cases on the
caller site. This commit adds this functionality to
mlx5e_safe_switch_channels(), allowing the caller to pass a boolean
indicating whether it's required to recreate the channels or it's
allowed to skip it. The logic of switching channel parameters is now
completely encapsulated into mlx5e_safe_switch_channels().

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:17 -07:00
Maxim Mikityanskiy 69cc4185dc net/mlx5e: Use mlx5e_safe_switch_channels when channels are closed
This commit uses new functionality of mlx5e_safe_switch_channels
introduced by the previous commit to reduce the amount of repeating
similar code all over the driver.

It's very common in mlx5e to call mlx5e_safe_switch_channels when the
channels are open, but assign parameters and run hardware commands
manually when the channels are closed.

After the previous commit it's no longer needed to do such manual things
every time, so this commit removes unneeded code and relies on the new
functionality of mlx5e_safe_switch_channels. Some of the places are
refactored and simplified, where more complex flows are used to change
configuration on the fly, without recreating the channels (the logic is
rewritten in a more robust way, with a reset required by default and a
list of exceptions).

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:14 -07:00
Maxim Mikityanskiy 6cad120d9e net/mlx5e: Allow mlx5e_safe_switch_channels to work with channels closed
mlx5e_safe_switch_channels is used to modify channel parameters and/or
hardware configuration in a safe way, so that if anything goes wrong,
everything reverts to the old configuration and remains in a consistent
state.

However, this function only works when the channels are open. When the
caller needs to modify some parameters, first it has to check that the
channels are open, otherwise it has to assign parameters directly, and
such boilerplate repeats in many different places.

This commit prepares for the refactoring of such places by allowing
mlx5e_safe_switch_channels to work when the channels are closed. In this
case it will assign the new parameters and run the preactivate hook.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16 11:48:11 -07:00