Recent changes to netdevsim moved creating and destroying
devices from netlink to sysfs. The sysfs writes have been
implemented as direct writes, without shelling out. This
is faster, but leaves no trace in the logs. Add explicit
logs to make debugging possible.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looks like the port adding loop makes a re-appearance on net-next
after net was merged back into it (even though it doesn't feature
in the merge diff).
The ports are already added in nsim_dev_create() so when we try
to add them again get EEXIST, and see:
netdevsim: probe of netdevsim0 failed with error -17
in the logs. When we remove the loop again the nsim_dev_probe()
and nsim_dev_remove() become a wrapper of nsim_dev_create() and
nsim_dev_destroy(). Remove this layer of indirection.
Fixes: d31e95585c ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First set of patches for 5.5. The most active driver here clearly is
rtw88, lots of patches for it. More quiet on other drivers, smaller
fixes and cleanups all over.
This pull request also has a trivial conflict, the report and example
resolution here:
https://lkml.kernel.org/r/20191031111242.50ab1eca@canb.auug.org.au
Major changes:
rtw88
* add deep power save support
* add mac80211 software tx queue (wake_tx_queue) support
* enable hardware rate control
* add TX-AMSDU support
* add NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 support
* add power tracking support
* add 802.11ac beamformee support
* add set_bitrate_mask support
* add phy_info debugfs to show Tx/Rx physical status
* add RFE type 3 support for 8822b
ath10k
* add support for hardware rfkill on devices where firmware supports it
rtl8xxxu
* add bluetooth co-existence support for single antenna
iwlwifi
* Revamp the debugging infrastructure
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJdwYyqAAoJEG4XJFUm622b/jMH/0KUcGz8y4gkk2B2lMRyUOTu
t84LiSHxcsq6letlr/vak4S6NrxMLP8Z/ByyoKC8o3yeVsdyMNMSLZAztFFhxdXr
Haky2SM10q6vnn9s1skXS/qkHSd2WdUFT2DgYxyOPCtJUazVKjcboJ4YX/TUg99a
5eqPpZ4RXtW6uOmWHS7JXtLcCFPywKPBtMAjLEDMYOUSSBWExBNyNZNhznSS3ywY
4VKvc675gXE+WD3qXRhL8EJjyed94yuS3wYbKWp8iTaIRyluDmc5lVhjWH1A0HLE
Qb62QL8XLtbX5fcTnaupdAIXwxeIBylOBe8QtW7QUbAnGFf8bexLxfnQM+To4wI=
=24zD
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-2019-11-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 5.5
First set of patches for 5.5. The most active driver here clearly is
rtw88, lots of patches for it. More quiet on other drivers, smaller
fixes and cleanups all over.
This pull request also has a trivial conflict, the report and example
resolution here:
https://lkml.kernel.org/r/20191031111242.50ab1eca@canb.auug.org.au
Major changes:
rtw88
* add deep power save support
* add mac80211 software tx queue (wake_tx_queue) support
* enable hardware rate control
* add TX-AMSDU support
* add NL80211_EXT_FEATURE_CAN_REPLACE_PTK0 support
* add power tracking support
* add 802.11ac beamformee support
* add set_bitrate_mask support
* add phy_info debugfs to show Tx/Rx physical status
* add RFE type 3 support for 8822b
ath10k
* add support for hardware rfkill on devices where firmware supports it
rtl8xxxu
* add bluetooth co-existence support for single antenna
iwlwifi
* Revamp the debugging infrastructure
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
- bump version strings, by Simon Wunderlich
- Simplify batadv_v_ogm_aggr_list_free using skb_queue_purge,
by Christophe Jaillet
- Replace aggr_list_lock with lock free skb handlers,
by Christophe Jaillet
- explicitly mark fallthrough cases, by Sven Eckelmann
- Drop lockdep.h include from soft-interface.c, by Sven Eckelmann
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl3BQgwWHHN3QHNpbW9u
d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoYaMD/93QapwlEvS+C/BQXk7B8nmrOUM
a/Q2E/Pgp4VX8EjJ031urkWNsqtOOdgi3fx0iUassTDU8ei0bK6vkPN+XYrNoAPO
G3Y4kBcyh2gViKC8ClNM2eUkrOpMJD31n7QiVmCPgyxwcK3pNoxvHSLVPgXx65BQ
kE+ZEKwfQTe2wATzzYZQbFl8tSj245euBjGKqrV8uVhXIu1EvNsyHKDWZWKZs1v/
igMDKssIn9JBO/AHqxcVqykps6uR8SD2ZPa3n72nDcJcUz5SMgFXuDU2b/QouvUj
bjlUxzGfD2V+J6ZcJyQmY3XNq4Ugs1xrqLWkvHlCeisyG3E98pWt1vF2fDl73jyd
jjamv6tjdvePYH4xALcxRD5PW+INaIrOjIse3j7gezZPdnVTeUiwpgrixV/eQoeH
Ku4RjFiE2eGTFpBHeWvDbtz6KU9f/p0RDlgxBr5ZU+cWvMR2Mqd7ZW6ICvx4UVB8
I/1GwKMsjJcO2FlsVbDXrFvOGHBirzk2BWG8/hLnrQe+E8QAvUaA7pJoqs1YFnSj
KnU2pVS1TV8PQTrxIe3CIwBL9uLPD9DjJ1wS7zdvswVQoZxXPof3Ebwe31v+xjsX
GmZMeC2nfVxmPojPcXCKf40Ffa8D1PP9YoAQfbhcuEgkeo98uybcfGqBWB7pWQhu
8t7HjEAS8VMx7btLfA==
=h5/d
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20191105' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- Simplify batadv_v_ogm_aggr_list_free using skb_queue_purge,
by Christophe Jaillet
- Replace aggr_list_lock with lock free skb handlers,
by Christophe Jaillet
- explicitly mark fallthrough cases, by Sven Eckelmann
- Drop lockdep.h include from soft-interface.c, by Sven Eckelmann
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ThinkPad Thunderbolt 3 Dock Gen 2 is another docking station that uses
RTL8153 based USB ethernet.
The device supports macpassthru, but it failed to pass the test of -AD,
-BND and -BD. Simply bypass these tests since the device supports this
feature just fine.
Also the ACPI objects have some differences between Dell's and Lenovo's,
so make those ACPI infos no longer hardcoded.
BugLink: https://bugs.launchpad.net/bugs/1827961
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements reset_prepare and reset_done, which are used
for handling FLR.
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru says:
====================
bnx2x/cnic: Enable Multi-Cos.
The patch series enables Multi-cos feature in the driver. This require
the use of new firmware 7.13.15.0.
Patch (1) adds driver changes to use new FW.
Patches (2) - (3) enables multi-cos functionality in bnx2x driver.
Patch (4) adds cnic driver change as required by new FW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The new FW has added extra validation for HSI version to
make FW backward compatible with older VF drivers. Hence
set fp_hsi_ver to Fast Path HSI version of the FW in use.
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PF driver doesn't enable tx-switching for all cos queues/clients,
which causes packets drop from PF to VF. Fix this by enabling
tx-switching on all cos queues/clients.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW version 7.13.15 addresses the issue in Multi-cos implementation.
This patch re-enables the Multi-Cos support in the driver.
Fixes: d1f0b5dce8 ("bnx2x: Disable multi-cos feature.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 97a27d6d6e8d "bnx2x: Add FW 7.13.15.0" added said .bin FW to
linux-firmware tree. This FW addresses few important issues in the earlier
FW release.
This patch incorporates FW 7.13.15.0 in the bnx2x driver.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass a phy_interface_t to of_get_phy_mode(), by changing the type of
phy_mode in the device structure. This then requires that
zmii_attach() is also changes, since it takes a pointer to phy_mode.
Fixes: 0c65b2b90d ("net: of_get_phy_mode: Change API to solve int/unit warnings")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
net_sched: convert packet counters to 64bit
This small patch series add 64bit support for packet counts.
Fact that the counters were still 32bit has been quite painful.
tc -s -d qd sh dev eth0 | head -3
qdisc mq 1: root
Sent 665706335338 bytes 6526520373 pkt (dropped 2441, overlimits 0 requeues 91)
backlog 0b 0p requeues 91
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now the kernel uses 64bit packet counters in scheduler layer,
we want to export these counters to user space.
Instead risking breaking user space by adding fields
to struct gnet_stats_basic, add a new TCA_STATS_PKT64.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After this change, qdisc packet counter is no longer
a 32bit quantity. We still export 32bit values to user.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gnet_stats_basic_packed was really meant to be private kernel structure.
If this proves to be a problem, we will have to rename the in-kernel
version.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn says:
====================
mv88e6xxx ATU occupancy as devlink resource
This patchset add generic support to DSA for devlink resources. The
Marvell switch Address Translation Unit occupancy is then exported as
a resource. In order to do this, the number of ATU entries is added to
the per switch info structure. Helpers are added, and then the
resource itself is then added.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ATU can report how many entries it contains. It does this per bin,
there being 4 bins in total. Export the ATU as a devlink resource, and
provide a method the needed callback to get the resource occupancy.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When retrieving the ATU statistics, and ATU get next has to be
performed to trigger the ATU to collect the statistics. Export a
helper from global1_atu to perform this.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
For each supported switch, add an entry to the info structure for the
number of MACs which can be stored in the ATU. This will later be used
to export the ATU as a devlink resource, and indicate its occupancy,
how full the ATU is.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add wrappers around the devlink resource API, so that DSA drivers can
register and unregister devlink resources.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli says:
====================
net: dsa: bcm_sf2: Add support for optional reset controller line
This patch series definest the optional reset controller line for the
BCM7445/BCM7278 integrated Ethernet switches and updates the driver to
drive that reset line in lieu of the internal watchdog based reset since
it does not work on BCM7278.
Changes in v2:
- make the reset_control_assert() conditional to BCM7278 in the remove
function as well
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Grab an optional and exclusive reset controller line for the switch and
manage it during probe/remove functions accordingly. For 7278 devices we
change bcm_sf2_sw_rst() to use the reset controller line since the
WATCHDOG_CTRL register does not reset the switch contrary to stated
documentation.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The BCM7445/BCM7278 built-in Ethernet switch have an optional reset line
to the SoC's reset controller, describe the 'resets' and 'reset-names'
properties as optional.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The openvswitch was supporting a MPLS label depth of 1 in the ingress
direction though the userspace OVS supports a max depth of 3 labels.
This change enables openvswitch module to support a max depth of
3 labels in the ingress.
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macros HCLGE_MPF_ENBALE and HCLGEVF_MPF_ENBALE are defined but never
used. I was going to fix the spelling mistake "ENBALE" -> "ENABLE" but
found these macros are not used, so they can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use 'skb_queue_purge()' instead of re-implementing it.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The order in which the ports are deleted from the list and freed and the
call to dsa_switch_remove() is done is reversed, which leads to an
use after free condition. Reverse the two: first tear down the ports and
switch from the fabric, then free the ports associated with that switch
fabric.
Fixes: 05f294a852 ("net: dsa: allocate ports on touch")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce says:
====================
icmp: move duplicate code in helper functions
Remove some duplicate code by moving it in two helper functions.
First patch adds the helpers, the second one uses it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The same code which recognizes ICMP error packets is duplicated several
times. Use the icmp_is_err() and icmpv6_is_err() helpers instead, which
do the same thing.
ip_multipath_l3_keys() and tcf_nat_act() didn't check for all the error types,
assume that they should instead.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two helper functions, one for IPv4 and one for IPv6, to recognize
the ICMP packets which are error responses.
This packets are special because they have as payload the original
header of the packet which generated it (RFC 792 says at least 8 bytes,
but Linux actually includes much more than that).
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger says:
====================
netvsc: RSS related patches
Address a couple of issues related to recording RSS hash
value in skb. These were found by reviewing RSS support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since RSS hash is available from the host, record it in
the skb.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the driver needs to create a hash value because it
was not done at higher level, then the hash should be marked
as a software not hardware hash.
Fixes: f72860afa2 ("hv_netvsc: Exclude non-TCP port numbers from vRSS hashing")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-11-04
This series contains updates to the ice driver only.
Anirudh refactors the code to reduce the kernel configuration flags and
introduces ice_base.c file.
Maciej does additional refactoring on the configuring of transmit
rings so that we are not configuring per each traffic class flow.
Added support for XDP in the ice driver. Provides additional
re-organizing of the code in preparation for adding build_skb() support
in the driver. Adjusted the computational padding logic for headroom
and tailroom to better support build_skb(), which also aligns with the
logic in other Intel LAN drivers. Added build_skb support and make use
of the XDP's data_meta.
Krzysztof refactors the driver to prepare for AF_XDP support in the
driver and then adds support for AF_XDP.
v2: Updated patch 3 of the series based on community feedback with the
following changes...
- return -EOPNOTSUPP instead of ENOTSUPP for too large MTU which makes
it impossible to attach XDP prog
- don't check for case when there's no XDP prog currently on interface
and ice_xdp() is called with NULL bpf_prog; this happens when user
does "ip link set eth0 xdp off" and no prog is present on VSI; no need
for that as it is handled by higher layer
- drop the extack message for unknown xdp->command
- use the smp_processor_id() for accessing the XDP Tx ring for XDP_TX
action
- don't leave the interface in downed state in case of any failure
during the XDP Tx resources handling
- undo rename of ice_build_ctob
The above changes caused a ripple effect in patches 4 & 5 to update
references to ice_build_ctob() which are now build_ctob()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2019-11-04
This series contains old Halloween candy updates, yet still sweet, to
fm10k, ixgbe and i40e.
Jake adds the missing initializers for a couple of the TLV attribute
macros. Added support for capturing and reporting statistics for all of
the VFs in a given PF. Lastly, bump the version of the fm10k driver to
reflect the recent changes.
Alex addresses locality issues in the ixgbe driver when it is loaded on
a system supporting multiple NUMA nodes.
Manjunath Patil provides changes to the ixgbe driver, similar to those
made to igb, to prevent transmit packets to request a hardware timestamp
when the NIC has not been setup via the SIOCSHWTSTAMP ioctl.
Alice adds support for x710 by adding the missing device id's in the
appropriate places to ensure all the features are enabled in i40e.
Jesse adds support for VF stats gathering in the i40e via the kernel
via ndo_get_vf_stats function.
v2: Fixed up commit id references in patch 5's description to align with
how commit id's should be referenced.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the VF stats gathering via the kernel via ndo_get_vf_stats().
The driver will show per-VF stats in the output of the command:
ip -s link show dev <PF>
Testing Hints:
ip -s link show dev eth0
will return non-zero VF stats.
...
vf 0 MAC 00:55:aa:00:55:aa, spoof checking on, link-state enable, trust off
RX: bytes packets mcast bcast
128000 1000 104 104
TX: bytes packets
128000 1000
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The I40E_DEV_ID_10G_BASE_T_BC device id was added previously,
but was not enabled in all the appropriate places. Adding it
to enable it's use.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
HW timestamping can only be requested for a packet if the NIC is first
setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the ixgbe
driver still allowed TX packets to request HW timestamping. In this
situation, we see 'clearing Tx Timestamp hang' noise in the log.
Fix this by checking that the NIC is configured for HW TX timestamping
before accepting a HW TX timestamping request.
Similar-to:
commit 26bd4e2db0 ("igb: protect TX timestamping from API misuse")
commit 0a6f2f05a2 ("igb: Fix a test with HWTSTAMP_TX_ON")
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
An upcoming out-of-tree release will be occurring which will include the
recent functionality to support virtual function statistics. Update the
kernel driver version to match this.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to address locality issues present in the ixgbe driver
when it is loaded on a system supporting multiple NUMA nodes and more CPUs
then the device can map in a 1:1 fashion. Instead of just arbitrarily
mapping itself to CPUs 0-62 it would make much more sense to map itself to
the local CPUs first, and then map itself to any remaining CPUs that might
be used.
The first effect of this is that queue 0 should always be allocated on the
local CPU/NUMA node. This is important as it is the default destination if
a packet doesn't match any existing flow director filter or RSS rule and as
such having it local should help to reduce QPI cross-talk in the event of
an unrecognized traffic type.
In addition this should increase the likelihood of the RSS queues being
allocated and used on CPUs local to the device while the ATR/Flow Director
queues would be able to route traffic directly to the CPU that is likely to
be processing it.
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Support capturing and reporting statistics for all of the VFs associated
with a given PF device via the ndo_get_vf_stats callback.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add the missing field initializers for a couple of the TLV attribute
macros. This resolves the last few -Wmissing-field-initializers warnings
for the fm10k Linux driver.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
At this point ice driver is able to work on order 1 pages that are split
onto two 3k buffers. Let's reflect that when user is setting new MTU
size and XDP is present on interface.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver is now prepared for building the skb around the existing Rx
buffer, so introduce the ice_build_skb responsible for it. Make use of
XDP's data_meta as well.
I've observed around 30% less CPU consumption with build_skb Rx path, in
comparison to legacy Rx. What stands behind such result is the avoidance
of flow_dissector (which we were diving into via eth_get_headlen) and no
memcpy calls.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Take into account the underlying architecture specific settings and
based on that calculate the possible padding that can be supplied.
Typically, for x86 and standard MTU size we will end up with 192 bytes
of headroom. This is the same behavior as our other drivers have and we
can dedicate it for XDP purposes.
Furthermore, introduce the Rx ring flag for indicating whether build_skb
is used on particular. Based on that invoke the routines for padding
calculation.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add an ethtool "legacy-rx" priv flag for toggling the Rx path. This
control knob will be mainly used for build_skb usage as well as buffer
size/MTU manipulation.
In preparation for adding build_skb support in a way that it takes
care of how we set the values of max_frame and rx_buf_len fields of
struct ice_vsi. Specifically, in this patch mentioned fields are set to
values that will allow us to provide headroom and tailroom in-place.
This can be mostly broken down onto following:
- for legacy-rx "on" ethtool control knob, old behaviour is kept;
- for standard 1500 MTU size configure the buffer of size 1536, as
network stack is expecting the NET_SKB_PAD to be provided and
NET_IP_ALIGN can have a non-zero value (these can be typically equal
to 32 and 2, respectively);
- for larger MTUs go with max_frame set to 9k and configure the 3k
buffer in case when PAGE_SIZE of underlying arch is less than 8k; 3k
buffer is implying the need for order 1 page, so that our page
recycling scheme can still be applied;
With that said, substitute the hardcoded ICE_RXBUF_2048 and PAGE_SIZE
values in DMA API that we're making use of with rx_ring->rx_buf_len and
ice_rx_pg_size(rx_ring). The latter is an introduced helper for
determining the page size based on its order (which was figured out via
ice_rx_pg_order). Last but not least, take care of truesize calculation.
In the followup patch the headroom/tailroom computation logic will be
introduced.
This change aligns the buffer and frame configuration with other Intel
drivers, most importantly with iavf.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add zero copy AF_XDP support. This patch adds zero copy support for
Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c.
For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse
existing XDP Tx queues for this task, since XDP_REDIRECT guarantees
mutual exclusion between different NAPI contexts based on CPU ID. In
turn, a netdev can XDP_REDIRECT to another netdev with a different
NAPI context, since the operation is bound to a specific core and each
core has its own hardware ring.
For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues that AF_XDP is
enabled.
Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>