Included nic tls statistics in ethtool stats.
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fixed-link nodes are treated as PHY nodes by of_mdiobus_child_is_phy().
We must check if the interface is a fixed-link before looking up for PHY
nodes.
Fixes: 7897b071ac ("net: macb: convert to phylink")
Tested-by: Cristian Birsan <cristian.birsan@microchip.com>
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add EHL PSE0/1 RGMII & SGMII 1Gbps PCI info and PCI ID
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As stmmac_pci.c file is getting bigger and more complex, it is reasonable
to separate all the Intel specific dwmac pci device to a different file.
This move includes Intel Quark, TGL and EHL. A new kernel config
CONFIG_DWMAC_INTEL is introduced and depends on X86. For this initial
patch, all the necessary function such as probe() and exit() are identical
besides the function name.
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement support for setting of packet trap group parameters by
invoking the trap_group_init() callback with the new parameters.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some packet traps are currently exposed to user space as being member of
"l3_drops" trap group, but internally they are member of a different
group.
Switch these traps to use the correct group so that they are all subject
to the same policer, as exposed to user space.
Set the trap priority of packets trapped due to loopback error during
routing to the lowest priority. Such packets are not routed again by the
kernel and therefore should not mask other traps (e.g., host miss) that
should be routed.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The policer is now initialized as part of the registration with devlink,
so there is no need to initialize it before the registration.
Remove the initialization.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register supported packet trap policers with devlink and implement
callbacks to change their parameters and read their counters.
Prevent user space from passing invalid policer parameters down to the
device by checking their validity and communicating the failure via an
appropriate extack message.
v2:
* Remove the max/min validity checks from __mlxsw_sp_trap_policer_set()
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prepare an array of policer IDs to register with devlink and their
associated parameters.
The array is composed from both policers that are currently bound to
exposed trap groups and policers that are not bound to any trap group.
v2:
* Provide max/min rate/burst size when registering policers
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During initialization the driver configures various packet trap groups
and binds policers to them.
Currently, most of these groups are not exposed to user space and
therefore their policers should not be exposed as well. Otherwise, user
space will be able to alter policer parameters without knowing which
packet traps are policed by the policer.
Use a bitmap to track the used policer IDs so that these policers will
not be registered with devlink in a subsequent patch.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QoS Policer Configuration Register (QPCR) is used to configure
hardware policers. Extend this register with following fields and
defines which will be used by subsequent patches:
1. Violate counter: reads number of packets dropped by the policer
2. Clear counter: to ensure we start counting from 0
3. Rate and burst size limits
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packet trap groups are used to aggregate logically related packet traps.
Currently, these groups allow user space to batch operations such as
setting the trap action of all member traps.
In order to prevent the CPU from being overwhelmed by too many trapped
packets, it is desirable to bind a packet trap policer to these groups.
For example, to limit all the packets that encountered an exception
during routing to 10Kpps.
Allow device drivers to bind default packet trap policers to packet trap
groups when the latter are registered with devlink.
The next patch will enable user space to change this default binding.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Clean up and rework the PM QoS API to simplify the code and
reduce the size of it (Rafael Wysocki).
- Fix a suspend-to-idle wakeup regression on Dell XPS13 9370
and similar platforms where the USB plug/unplug events are
handled by the EC (Rafael Wysocki).
- CLean up the intel_idle and PSCI cpuidle drivers (Rafael Wysocki,
Ulf Hansson).
- Extend the haltpoll cpuidle driver so that it can be forced to
run on some systems where it refused to load (Maciej Szmigiero).
- Convert several cpufreq documents to the .rst format and move the
legacy driver documentation into one common file (Mauro Carvalho
Chehab, Rafael Wysocki).
- Update several cpufreq drivers:
* Extend and fix the imx-cpufreq-dt driver (Anson Huang).
* Improve the -EPROBE_DEFER handling and fix unwanted CPU
overclocking on i.MX6ULL in imx6q-cpufreq (Anson Huang,
Christoph Niedermaier).
* Add support for Krait based SoCs to the qcom driver (Ansuel
Smith).
* Add support for OPP_PLUS to ti-cpufreq (Lokesh Vutla).
* Add platform specific intermediate callbacks support to
cpufreq-dt and update the imx6q driver (Peng Fan).
* Simplify and consolidate some pieces of the intel_pstate driver
and update its documentation (Rafael Wysocki, Alex Hung).
- Fix several devfreq issues:
* Remove unneeded extern keyword from a devfreq header file
and use the DEVFREQ_GOV_UPDATE_INTERNAL event name instead of
DEVFREQ_GOV_INTERNAL (Chanwoo Choi).
* Fix the handling of dev_pm_qos_remove_request() result (Leonard
Crestez).
* Use constant name for userspace governor (Pierre Kuo).
* Get rid of doc warnings and fix a typo (Christophe JAILLET).
- Use built-in RCU list checking in some places in the PM core to
avoid false-positive RCU usage warnings (Madhuparna Bhowmik).
- Add explicit READ_ONCE()/WRITE_ONCE() annotations to low-level
PM QoS routines (Qian Cai).
- Fix removal of wakeup sources to avoid NULL pointer dereferences
in a corner case (Neeraj Upadhyay).
- Clean up the handling of hibernate compat ioctls and fix the
related documentation (Eric Biggers).
- Update the idle_inject power capping driver to use variable-length
arrays instead of zero-length arrays (Gustavo Silva).
- Fix list format in a PM QoS document (Randy Dunlap).
- Make the cpufreq stats module use scnprintf() to avoid potential
buffer overflows (Takashi Iwai).
- Add pm_runtime_get_if_active() to PM-runtime API (Sakari Ailus).
- Allow no domain-idle-states DT property in generic PM domains (Ulf
Hansson).
- Fix a broken y-axis scale in the intel_pstate_tracer utility (Doug
Smythies).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl6B/YkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxEjIP/jXoO1pAxq7BMx7naZnZL7pzemJfAGR7
HVnRLDo0IlsSwI7Jvuy13a0eI+EcGPA6pRo5qnBM4TZCIFsHoO5Yle47ndNGsi8r
Jd3T89oT3I+fXI4KTfWO0n+K/F6mv8/CTZDz/E7Z6zirpFxyyZQxgIsAT76RcZom
xhWna9vygOlBnFsQaAeph+GzoXBWIylaMZfylUeT3v4c4DLH6FzcbnINPkgJsZCw
Ayt1bmE0L9yiqCizEto91eaDObkxTHVFGr2OVNa/Y/SVW+VUThUJrXqV28opQxPZ
h4TiQorpTX1CwMmiXZwmoeqqsiVXrm0KyhK0lwc5tZ9FnZWiW4qjJ487Eu6TjOmh
gecT+M2Yexy0BvUGN0wIdaCLtfmf2Hjxk0trxM2blAh3uoFjf3UJ9SLNkRjlu2/b
QqWmIRRPljD5fEUid5lVV4EAXuITUzWMJeia+FiAsgx1SF3pZPar80f+FGrYfaJN
wL2BTwBx1aXpPpAkEX0kM9Rkf6oJsFATR3p7DNzyZ1bMrQUxiToWRlQBID5H6G4v
/kAkSTQjNQVwkkylUzTLOlcmL56sCvc0YPdybH62OsLXs9K4gyC8v6tEdtdA5qtw
0Up9DrYbNKKv6GrSXf8eyk2Q2CEqfRXHv2ACNnkLRXZ6fWnFiTfMgNj7zqtrfna7
tJBvrV9/ACXE
=cBQd
-----END PGP SIGNATURE-----
Merge tag 'pm-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These clean up and rework the PM QoS API, address a suspend-to-idle
wakeup regression on some ACPI-based platforms, clean up and extend a
few cpuidle drivers, update multiple cpufreq drivers and cpufreq
documentation, and fix a number of issues in devfreq and several other
things all over.
Specifics:
- Clean up and rework the PM QoS API to simplify the code and reduce
the size of it (Rafael Wysocki).
- Fix a suspend-to-idle wakeup regression on Dell XPS13 9370 and
similar platforms where the USB plug/unplug events are handled by
the EC (Rafael Wysocki).
- CLean up the intel_idle and PSCI cpuidle drivers (Rafael Wysocki,
Ulf Hansson).
- Extend the haltpoll cpuidle driver so that it can be forced to run
on some systems where it refused to load (Maciej Szmigiero).
- Convert several cpufreq documents to the .rst format and move the
legacy driver documentation into one common file (Mauro Carvalho
Chehab, Rafael Wysocki).
- Update several cpufreq drivers:
* Extend and fix the imx-cpufreq-dt driver (Anson Huang).
* Improve the -EPROBE_DEFER handling and fix unwanted CPU
overclocking on i.MX6ULL in imx6q-cpufreq (Anson Huang,
Christoph Niedermaier).
* Add support for Krait based SoCs to the qcom driver (Ansuel
Smith).
* Add support for OPP_PLUS to ti-cpufreq (Lokesh Vutla).
* Add platform specific intermediate callbacks support to
cpufreq-dt and update the imx6q driver (Peng Fan).
* Simplify and consolidate some pieces of the intel_pstate
driver and update its documentation (Rafael Wysocki, Alex
Hung).
- Fix several devfreq issues:
* Remove unneeded extern keyword from a devfreq header file and
use the DEVFREQ_GOV_UPDATE_INTERNAL event name instead of
DEVFREQ_GOV_INTERNAL (Chanwoo Choi).
* Fix the handling of dev_pm_qos_remove_request() result
(Leonard Crestez).
* Use constant name for userspace governor (Pierre Kuo).
* Get rid of doc warnings and fix a typo (Christophe JAILLET).
- Use built-in RCU list checking in some places in the PM core to
avoid false-positive RCU usage warnings (Madhuparna Bhowmik).
- Add explicit READ_ONCE()/WRITE_ONCE() annotations to low-level PM
QoS routines (Qian Cai).
- Fix removal of wakeup sources to avoid NULL pointer dereferences in
a corner case (Neeraj Upadhyay).
- Clean up the handling of hibernate compat ioctls and fix the
related documentation (Eric Biggers).
- Update the idle_inject power capping driver to use variable-length
arrays instead of zero-length arrays (Gustavo Silva).
- Fix list format in a PM QoS document (Randy Dunlap).
- Make the cpufreq stats module use scnprintf() to avoid potential
buffer overflows (Takashi Iwai).
- Add pm_runtime_get_if_active() to PM-runtime API (Sakari Ailus).
- Allow no domain-idle-states DT property in generic PM domains (Ulf
Hansson).
- Fix a broken y-axis scale in the intel_pstate_tracer utility (Doug
Smythies)"
* tag 'pm-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (78 commits)
cpufreq: intel_pstate: Simplify intel_pstate_cpu_init()
tools/power/x86/intel_pstate_tracer: fix a broken y-axis scale
ACPI: PM: s2idle: Refine active GPEs check
ACPICA: Allow acpi_any_gpe_status_set() to skip one GPE
PM: sleep: wakeup: Skip wakeup_source_sysfs_remove() if device is not there
PM / devfreq: Get rid of some doc warnings
PM / devfreq: Fix handling dev_pm_qos_remove_request result
PM / devfreq: Fix a typo in a comment
PM / devfreq: Change to DEVFREQ_GOV_UPDATE_INTERVAL event name
PM / devfreq: Remove unneeded extern keyword
PM / devfreq: Use constant name of userspace governor
ACPI: PM: s2idle: Fix comment in acpi_s2idle_prepare_late()
cpufreq: qcom: Add support for krait based socs
cpufreq: imx6q-cpufreq: Improve the logic of -EPROBE_DEFER handling
cpufreq: Use scnprintf() for avoiding potential buffer overflow
cpuidle: psci: Split psci_dt_cpu_init_idle()
PM / Domains: Allow no domain-idle-states DT property in genpd when parsing
PM / hibernate: Remove unnecessary compat ioctl overrides
PM: hibernate: fix docs for ioctls that return loff_t via pointer
Documentation: intel_pstate: update links for references
...
Factor out mapping the tx skb to a new function rtl8169_tx_map(). This
allows to remove redundancies, and rtl8169_get_txd_opts1() has only
one user left, so it can be inlined.
As a result rtl8169_xmit_frags() is significantly simplified, and in
rtl8169_start_xmit() the code is simplified and better readable.
No functional change intended.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The qed_chain data structure was modified in
commit 1a4a69751f ("qed: Chain support for external PBL") to support
receiving an external pbl (due to iWARP FW requirements).
The pages pointed to by the pbl are allocated in qed_chain_alloc
and their virtual address are stored in an virtual addresses array to
enable accessing and freeing the data. The physical addresses however
weren't stored and were accessed directly from the external-pbl
during free.
Destroy-qp flow, leads to freeing the external pbl before the chain is
freed, when the chain is freed it tries accessing the already freed
external pbl, leading to a use-after-free. Therefore we need to store
the physical addresses in additional to the virtual addresses in a
new data structure.
Fixes: 1a4a69751f ("qed: Chain support for external PBL")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Bason <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the mtu is greater than TD_MSS_MAX, then TSO is disabled, see
rtl8169_fix_features(). Because mss is less than mtu, we can't have
the case mss > TD_MSS_MAX in the TSO path.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a trivial passthrough towards the ocelot library, which
support port policers since commit 2c1d029a01 ("net: mscc: ocelot:
Implement port policers via tc command").
Some data structure conversion between the DSA core and the Ocelot
library is necessary, for policer parameters.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ocelot has 384 policers that can be allocated to ingress ports,
QoS classes per port, and VCAP IS2 entries. ocelot_police.c
supports to set policers which can be allocated to police action
of VCAP IS2. We allocate policers from maximum pol_id, and
decrease the pol_id when add a new vcap_is2 entry which is
police action.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the FW RESET event comes to the driver from the firmware,
or the fw_status goes to 0 (stopped) or to 0xff (no PCI
connection), then shut down the driver activity. This event
signals a FW upgrade where we need to quiesce all operations and
wait for the FW to restart. The FW will continue the update
process once it sees all the LIFs are reset. When the update
process is done it will set the fw_status back to RUNNING.
Meanwhile, the heartbeat check continues and when the fw_status
is seen as set to running we can restart the driver operations.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the link goes down, we need to disable the queues on the
NIC in addition to stopping the netdev stack. This lets the
FW know that the driver has stopped queue activity, and then
the FW can do internal reconfiguration work, whether actually
Link related, or for other internal FW needs. To do this,
we pull out the queue enable and disable from ionic_open()
and ionic_stop() so they can be used by other routines.
To help keep things sane, we swap the queue enables so that
the rx queue and its napi are enabled before the tx queue
which rides on the rx queues napi.
We also drop the ionic_lif_quiesce() as it doesn't do anything
more than what the queue disable has already taken care of.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure the queue structures exist before trying
to delete them. This addresses a couple of error
recovery issues.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean out tx requests that didn't get finished before
shutting down the queue.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the irq request and free out of the qcq_init and deinit
and into the alloc and free routines where they belong for
better resource management.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the qcq debugfs add to the queue alloc, and likewise move
the debugfs delete to the queue free. The LIF debugfs add
also needs to be moved, but the del is already in the LIF free.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a link_status_check to the heartbeat watchdog.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rearrange the link_up/link_down messages so that we announce
link up when we first notice that the link is up when the
driver loads, and decouple the link_up/link_down messages from
the UP and DOWN netdev state.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit extended the enums 'hwtstamp_tx_types' and
'hwtstamp_rx_filters' with values that were not accounted for in the
switch statements, resulting in the build warnings below.
Fix by adding a default case.
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c: In function ‘mlxsw_sp_ptp_get_message_types’:
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:915:2: warning: enumeration value ‘__HWTSTAMP_TX_CNT’ not handled in switch [-Wswitch]
915 | switch (tx_type) {
| ^~~~~~
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:927:2: warning: enumeration value ‘__HWTSTAMP_FILTER_CNT’ not handled in switch [-Wswitch]
927 | switch (rx_filter) {
| ^~~~~~
Fixes: f76510b458 ("ethtool: add timestamping related string sets")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When health reporter is registered to devlink, devlink will implicitly set
auto recover if and only if the reporter has a recover method. No reason
to explicitly get the auto recover flag from the driver.
Remove this flag from all drivers that called
devlink_health_reporter_create.
All existing health reporters set auto recovery to true if they have a
recover method.
Yet, administrator can unset auto recover via netlink command as prior to
this patch.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It may be up to the driver (in case ANY HW stats is passed) to select
which type of HW stats he is going to use. Add an infrastructure to
expose this information to user.
$ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp3s0np1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
dst_ip 192.168.1.1
in_hw in_hw_count 2
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 10 sec used 10 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
used_hw_stats immediate <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When device is not open, the service task which update the port
information per second is not running. In this case, the port
capabilities, including speed ability, autoneg ability, media type,
may be incorrect. Then get/set link ksetting may fail.
This patch fixes it by updating the port information before getting/
setting link ksettings when device is not open, and start timer
task immediately by setting delay time to 0 when device opens.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, VF's RSS configuration would be set to default
after VF reset, the the user's one will loss.
To fix it, this patch separates hclgevf_rss_init_hw() into
two parts, one sets up the default RSS configuration and
just be called when driver loading, one configures the hardware
and be called by driver loading or reset.
Fixes: d97b307213 ("net: hns3: Add RSS tuples support for VF")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the fraglist SKB headlen is larger than zero, current code
still handle the fraglist SKB linear data as frag data, which may
cause TX error.
This patch adds a new DESC_TYPE_FRAGLIST_SKB type to handle the
mapping and unmapping of the fraglist SKB linear data buffer.
Fixes: 8ae10cfb50 ("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/ethernet/amazon/ena/ena_netdev.c:460:6: warning: symbol 'ena_xdp_exchange_program_rx_in_range' was not declared. Should it be static?
drivers/net/ethernet/amazon/ena/ena_netdev.c:481:6: warning: symbol 'ena_xdp_exchange_program' was not declared. Should it be static?
drivers/net/ethernet/amazon/ena/ena_netdev.c:1555:5: warning: symbol 'ena_xdp_handle_buff' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warning:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:2065:5:
warning: symbol 'dpaa_a050385_wa' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for VLAN ID-based filtering by the MAC controller for MAC
drivers that support it. Only the 12-bit VID field is used.
Signed-off-by: Chuah Kim Tatt <kim.tatt.chuah@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a spelling mistake in a dev_err error message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add mlx5e_rep_indr_setup_ft_cb to support indr block setup
in FT mode.
Both tc rules and flow table rules are of the same format,
It can re-use tc parsing for that, and move the flow table rules
to their steering domain(the specific chain_index), the indr
block offload in FT also follow this scenario.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor indr setup block for support ft indr setup in the
next patch. The function mlx5e_rep_indr_offload exposes
'flags' in order set additional flag for FT in next patch.
Rename mlx5e_rep_indr_setup_tc_block to mlx5e_rep_indr_setup_block
and add flow_setup_cb_t callback parameters in order set the
specific callback for FT in next patch.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
eswitch_offloads_chains.{c,h} were just introduced this kernel release
cycle, eswitch is in high development demand right now and many
features are planned to be added to it. eswitch deserves its own
directory and here we move these new files to there, in preparation for
upcoming eswitch features and new files.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
In VF lag mode when remove the bonding module without bring down the
bond device first, we could potentially have circular dependency when we
unload IB devices and also handle fib events:
1. The bond work starts first;
2. The "modprobe -rv bonding" process tries to release the bond device,
with the "pernet_ops_rwsem" lock hold;
3. The bond work blocks in unregister_netdevice_notifier() and waits for
the lock because fib event came right before;
4. The kernel fib module tries to free all the fib entries by broadcasting
the "FIB_EVENT_NH_DEL" event;
5. Upon the fib event this lag_mp module holds the fib lock and queue a
fib work.
So:
bond work -> modprobe task -> kernel fib module -> lag_mp -> bond work
Today we either reload IB devices in roce lag in nic mode or either handle
fib events in switchdev mode, but a new feature could change that we'll
need to reload IB devices also in switchdev mode so this is a future proof
fix as one may not notice this later.
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
mlx5: Remove uninitialized use of key in mlx5_core_create_mkey
{IB,net}/mlx5: Move asynchronous mkey creation to mlx5_ib
{IB,net}/mlx5: Assign mkey variant in mlx5_ib only
{IB,net}/mlx5: Setup mkey variant before mr create command invocation
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
A recent commit e893768179 ("devlink: prepare to support region
operations") used the region_cr_space_str and region_fw_health_str
variables as initializers for the devlink_region_ops structures.
This can result in compiler errors:
drivers/net/ethernet/mellanox//mlx4/crdump.c:45:10: error: initializer
element is not constant
.name = region_cr_space_str,
^
drivers/net/ethernet/mellanox//mlx4/crdump.c:45:10: note: (near
initialization for ‘region_cr_space_ops.name’)
drivers/net/ethernet/mellanox//mlx4/crdump.c:50:10: error: initializer
element is not constant
.name = region_fw_health_str,
The variables were made to be "const char * const", indicating that both
the pointer and data were constant. This was enough to resolve this on
recent GCC (gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1) for this author).
Unfortunately this is not enough for older compilers to realize that the
variable can be treated as a constant expression.
Fix this by introducing macros for the string and use those instead of
the variable name in the region ops structures.
Reported-by: tanhuazhong <tanhuazhong@huawei.com>
Fixes: e893768179 ("devlink: prepare to support region operations")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling queue_delayed_work concurrently with
destroy_workqueue might race to an unexpected outcome -
scheduled task after wq is destroyed or other resources
(like ptt_pool) are freed (yields NULL pointer dereference).
cancel_delayed_work prevents the race by cancelling
the timer triggered for scheduling a new task.
Fixes: 59ccf86fe ("qed: Add driver infrastucture for handling mfw requests")
Signed-off-by: Denis Bolotin <dbolotin@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AER interfaces to clear error status registers were a confusing mess:
- pci_cleanup_aer_uncorrect_error_status() cleared non-fatal errors
from the Uncorrectable Error Status register.
- pci_aer_clear_fatal_status() cleared fatal errors from the
Uncorrectable Error Status register.
- pci_cleanup_aer_error_status_regs() cleared the Root Error Status
register (for Root Ports), the Uncorrectable Error Status register,
and the Correctable Error Status register.
Rename them to make them consistent:
From To
---------------------------------------- -------------------------------
pci_cleanup_aer_uncorrect_error_status() pci_aer_clear_nonfatal_status()
pci_aer_clear_fatal_status() pci_aer_clear_fatal_status()
pci_cleanup_aer_error_status_regs() pci_aer_clear_status()
Since pci_cleanup_aer_error_status_regs() (renamed to
pci_aer_clear_status()) is only used within drivers/pci/, move the
declaration from <linux/aer.h> to drivers/pci/pci.h.
[bhelgaas: commit log, add renames]
Link: https://lore.kernel.org/r/d1310a75dc3d28f7e8da4e99c45fbd3e60fe238e.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Changing the MTU for this switch means altering the
DEV_GMII:MAC_CFG_STATUS:MAC_MAXLEN_CFG field MAX_LEN, which in turn
limits the size of frames that can be received.
Special accounting needs to be done for the DSA CPU port (NPI port in
hardware terms). The NPI port configuration needs to be held inside the
private ocelot structure, since it is now accessed from multiple places.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change DMA descriptor length to handle jumbo frames beyond 8192 bytes.
Also update jumbo frame max size to include FCS, the DMA packet length
received includes FCS.
Signed-off-by: Murali Krishna Policharla <murali.policharla@broadcom.com>
Reviewed-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On Android/x86 the module loading infrastructure can't deal with
softdeps. Therefore the check for presence of the Realtek PHY driver
module fails. mdiobus_register() will try to load the PHY driver
module, therefore move the check to after this call and explicitly
check that a dedicated PHY driver is bound to the PHY device.
Fixes: f325937735 ("r8169: check that Realtek PHY driver module is loaded")
Reported-by: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix macro names to report fw.mgmt and fw.ncsi versions to match the
devlink documentation.
Example display after fixes:
$ devlink dev info pci/0000:af:00.0
pci/0000:af:00.0:
driver bnxt_en
serial_number B0-26-28-FF-FE-25-84-20
versions:
fixed:
board.id BCM957454A4540
asic.id C454
asic.rev 1
running:
fw 216.1.154.0
fw.psid 0.0.0
fw.mgmt 216.1.146.0
fw.mgmt.api 1.10.1
fw.ncsi 864.0.44.0
fw.roce 216.1.16.0
Fixes: 9599e036b1 ("bnxt_en: Add support for devlink info command")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add part number info from the vital product data to info_get command
via devlink tool. Update bnxt.rst documentation as well.
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the part number and serial number information from VPD in
the bnxt structure. Follow up patch will add the support to display
the information via devlink command.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Display the minimum version of firmware interface spec supported
between driver and firmware. Also update bnxt.rst documentation file.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppress the following smatch errors. None of these are actually
possible with current code paths.
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1220
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol 'saddrp'.
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1220
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol
'saddr_len'.
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1221
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol
'saddr_prefix_len'.
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1390
mlxsw_sp_netdevice_ipip_ol_reg_event() error: uninitialized symbol
'ipipt'.
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:3255
mlxsw_sp_nexthop_group_update() error: uninitialized symbol 'err'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppress following warning from coccinelle:
drivers/net/ethernet/mellanox/mlxsw//switchx2.c:183:63-68: WARNING:
conversion to bool not needed here
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The static array 'mlxsw_afk_element_infos' in 'core_acl_flex_keys.h' is
copied to each file that includes the header, but not all use it. This
results in the following warnings when compiling with W=1:
drivers/net/ethernet/mellanox/mlxsw//core_acl_flex_keys.h:76:44:
warning: ‘mlxsw_afk_element_infos’ defined but not used
[-Wunused-const-variable=]
One way to suppress the warning is to mark the array with
'__maybe_unused', but another option is to remove it from the header
file entirely.
Change 'struct mlxsw_afk_element_inst' to store the key to the array
('element') instead of the array value keyed by 'element'. Adjust the
different users accordingly.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In merge commit 50853808ff ("Merge branch
'mlxsw-Prepare-for-VLAN-aware-bridge-w-VxLAN'") I flipped mlxsw to use
emulated 802.1Q FIDs and correspondingly emulated VLAN RIFs. This means
that the non-emulated variants are no longer used. Remove them and
suppress the following warnings when compiling with W=1:
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:7572:38: warning:
‘mlxsw_sp_rif_vlan_ops’ defined but not used [-Wunused-const-variable=]
drivers/net/ethernet/mellanox/mlxsw//spectrum_fid.c:584:41: warning:
‘mlxsw_sp_fid_8021q_family’ defined but not used
[-Wunused-const-variable=]
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppress following warnings when compiling with W=1:
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'mlxsw_sp' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'ipip_entry' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'extack' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suppress following warning when compiling with W=1:
drivers/net/ethernet/mellanox/mlxsw//i2c.c:78: warning: Function
parameter or member 'cmd' not described in 'mlxsw_i2c'
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leon Romanovsky says:
====================
Those two patches from Michael extends mlx5_core and mlx5_ib flow steering
to support RDMA TX in similar way to already supported RDMA RX.
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Due to dependencies
* branch 'mlx5_tx_steering':
RDMA/mlx5: Add support for RDMA TX flow table
net/mlx5: Add support for RDMA TX steering
Add new RDMA TX flow steering namespace. Flow steering rules in
this namespace are used to filter transmitted RDMA traffic.
Link: https://lore.kernel.org/r/20200324061425.1570190-2-leon@kernel.org
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This patch reverts 5829210483 ("net: ks8851-ml: Fix 16-bit IO operation")
and edacb098ea ("net: ks8851-ml: Fix 16-bit data access"), because it
turns out these were only necessary due to buggy hardware. This patch adds
a check for such a buggy hardware to prevent any such mistakes again.
While working further on the KS8851 driver, it came to light that the
KS8851-16MLL is capable of switching bus endianness by a hardware strap,
EESK pin. If this strap is incorrect, the IO accesses require such endian
swapping as is being reverted by this patch. Such swapping also impacts
the performance significantly.
Hence, in addition to removing it, detect that the hardware is broken,
report to user, and fail to bind with such hardware.
Fixes: 5829210483 ("net: ks8851-ml: Fix 16-bit IO operation")
Fixes: edacb098ea ("net: ks8851-ml: Fix 16-bit data access")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Petr Stetiar <ynezz@true.cz>
Cc: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds XPN handling.
Our driver doesn't support XPN, but we should still update a couple
of places in the code, because the size of 'next_pn' field has
changed.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for MACSec statistics on Atlantic network cards.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Atlantic HW-specific bindings for MACSec statistics,
e.g. register addresses / structs, helper function, etc, which will be
used by actual callback implementations.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for MACSec ingress HW offloading on Atlantic
network cards.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Atlantic HW-specific bindings for MACSec ingress, e.g.
register addresses / structs, helper function, etc, which will be used by
actual callback implementations.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for MACSec egress HW offloading on Atlantic
network cards.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Atlantic HW-specific bindings for MACSec egress, e.g.
register addresses / structs, helper function, etc, which will be used by
actual callback implementations.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds basic functionality for MACSec offloading for Atlantic
NICs.
MACSec offloading functionality is enabled if network card has
appropriate FW that has MACSec offloading enabled in config.
Actual functionality (ingress, egress, etc) will be added in follow-up
patches.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TI AM65x/J721E SoCs Gigabit Ethernet Switch subsystem (CPSW2G NUSS) has
two ports - One Ethernet port (port 1) with selectable RGMII and RMII
interfaces and an internal Communications Port Programming Interface (CPPI)
port (Host port 0) and with ALE in between. It also contains
- Management Data Input/Output (MDIO) interface for physical layer device
(PHY) management;
- Updated Address Lookup Engine (ALE) module;
- (TBD) New version of Common platform time sync (CPTS) module.
On the TI am65x/J721E SoCs CPSW NUSS Ethernet subsystem into device MCU
domain named MCU_CPSW0.
Host Port 0 CPPI Packet Streaming Interface interface supports 8 TX
channels and one RX channels operating by TI am654 NAVSS Unified DMA
Peripheral Root Complex (UDMA-P) controller.
Introduced driver provides standard Linux net_device to user space and supports:
- ifconfig up/down
- MAC address configuration
- ethtool operation:
--driver
--change
--register-dump
--negotiate phy
--statistics
--set-eee phy
--show-ring
--show-channels
--set-channels
- net_device ioctl mii-control
- promisc mode
- rx checksum offload for non-fragmented IPv4/IPv6 TCP/UDP packets.
The CPSW NUSS can verify IPv4/IPv6 TCP/UDP packets checksum and fills
csum information for each packet in psdata[2] word:
- BIT(16) CHECKSUM_ERROR - indicates csum error
- BIT(17) FRAGMENT - indicates fragmented packet
- BIT(18) TCP_UDP_N - Indicates TCP packet was detected
- BIT(19) IPV6_VALID, BIT(20) IPV4_VALID - indicates IPv6/IPv4 packet
- BIT(15, 0) CHECKSUM_ADD - This is the value that was summed
during the checksum computation. This value is FFFFh for non fragmented
IPV4/6 UDP/TCP packets with no checksum error.
RX csum offload can be disabled:
ethtool -K <dev> rx-checksum on|off
- tx checksum offload support for IPv4/IPv6 TCP/UDP packets (J721E only).
TX csum HW offload can be enabled/disabled:
ethtool -K <dev> tx-checksum-ip-generic on|off
- multiq and switch between round robin/prio modes for cppi tx queues by
using Netdev private flag "p0-rx-ptype-rrobin" to switch between
Round Robin and Fixed priority modes:
# ethtool --show-priv-flags eth0
Private flags for eth0:
p0-rx-ptype-rrobin: on
# ethtool --set-priv-flags eth0 p0-rx-ptype-rrobin off
Number of TX DMA channels can be changed using "ethtool -L eth0 tx <N>".
- GRO support: the napi_gro_receive() and napi_complete_done() are used.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for default thread configuration for AM65x CPSW NUSS ALE to
allow route all ingress packets to one default RX UDMA flow.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new CPSW ALE version, available on TI K3 AM654/J721E SoCs family,
allows to switch any external port to MAC only mode. When MAC only mode
enabled this port be treated like a MAC port for the host. All traffic
received is only sent to the host. The host must direct traffic to this
port as the lookup engine will not send traffic to the ports with the
p0_maconly bit set and the p0_no_learn also set. If p0_maconly bit is set
and the p0_no_learn is not set, the host can send non-directed packets that
can be sent to the destination of a MacOnly port. It is also possible that
The host can broadcast to all ports including MacOnly ports in this mode.
This patch add ALE supprt for MAC only mode.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS the unregistered multicast
packets are still can be received with promisc and allmulti disabled.
This happens, because ALE VLAN entries on these SoCs do not contain port
masks for reg/unreg mcast packets, but instead store indexes of
ALE_VLAN_MASK_MUXx_REG registers which intended for store port masks for
reg/unreg mcast packets.
ALE VLAN entry:UNREG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXx
ALE VLAN entry:REG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXy
The commit b361da8373 ("net: netcp: ale: add proper ale entry mask bits
for netcp switch ALE") update ALE code to support such ALE entries, it is
always used ALE_VLAN_MASK_MUX0_REG index in ALE VLAN entry for unreg mcast
packets mask configuration, which is read-only, at least for AM65xx MCU
CPSW2G NUSS and 66AK2E/L NUSS. As result unreg mcast packets are allowed
always.
Hence, update ALE code to use ALE_VLAN_MASK_MUX1_REG index for ALE VLAN
entries to configure unreg mcast port mask.
Fixes: b361da8373 ("net: netcp: ale: add proper ale entry mask bits for netcp switch ALE")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy-gmii-sel can be only auto selected in Kconfig and now the pretty
complex Kconfig dependencies are defined for phy-gmii-sel driver, which
also need to be updated every time phy-gmii-sel is re-used for any new
networking driver.
Simplify Kconfig definition for phy-gmii-sel PHY driver - drop all
dependencies and from networking drivers and rely on using 'imply
PHY_TI_GMII_SEL' in Kconfig definitions for networking drivers instead.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a devlink region for exposing the device's Non Volatime Memory flash
contents.
Support the recently added .snapshot operation, enabling userspace to
request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each snapshot created for a devlink region must have an id. These ids
are supposed to be unique per "event" that caused the snapshot to be
created. Drivers call devlink_region_snapshot_id_get to obtain a new id
to use for a new event trigger. The id values are tracked per devlink,
so that the same id number can be used if a triggering event creates
multiple snapshots on different regions.
There is no mechanism for snapshot ids to ever be reused. Introduce an
xarray to store the count of how many snapshots are using a given id,
replacing the snapshot_id field previously used for picking the next id.
The devlink_region_snapshot_id_get() function will use xa_alloc to
insert an initial value of 1 value at an available slot between 0 and
U32_MAX.
The new __devlink_snapshot_id_increment() and
__devlink_snapshot_id_decrement() functions will be used to track how
many snapshots currently use an id.
Drivers must now call devlink_snapshot_id_put() in order to release
their reference of the snapshot id after adding region snapshots.
By tracking the total number of snapshots using a given id, it is
possible for the decrement() function to erase the id from the xarray
when it is not in use.
With this method, a snapshot id can become reused again once all
snapshots that referred to it have been deleted via
DEVLINK_CMD_REGION_DEL, and the driver has finished adding snapshots.
This work also paves the way to introduce a mechanism for userspace to
request a snapshot.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The devlink_snapshot_id_get() function returns a snapshot id. The
snapshot id is a u32, so there is no way to indicate an error code.
A future change is going to possibly add additional cases where this
function could fail. Refactor the function to return the snapshot id in
an argument, so that it can return zero or an error value.
This ensures that snapshot ids cannot be confused with error values, and
aids in the future refactor of snapshot id allocation management.
Because there is no current way to release previously used snapshot ids,
add a simple check ensuring that an error is reported in case the
snapshot_id would over flow.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It does not makes sense that two snapshots for a given region would use
different destructors. Simplify snapshot creation by adding
a .destructor op for regions.
This operation will replace the data_destructor for the snapshot
creation, and makes snapshot creation easier.
Noticed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify the devlink region code in preparation for adding new operations
on regions.
Create a devlink_region_ops structure, and move the name pointer from
within the devlink_region structure into the ops structure (similar to
the devlink_health_reporter_ops).
This prepares the regions to enable support of additional operations in
the future such as requesting snapshots, or accessing the region
directly without a snapshot.
In order to re-use the constant strings in the mlx4 driver their
declaration must be changed to 'const char * const' to ensure the
compiler realizes that both the data and the pointer cannot change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move away from the deprecated API and return the shiny new ERRPTR where
useful.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move away from the deprecated API and return the shiny new ERRPTR where
useful.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
list_for_each_entry_from_reverse() iterates backwards over the list from
the current position, but in the error path we should start from the
previous position.
Fix this by using list_for_each_entry_continue_reverse() instead.
This suppresses the following error from coccinelle:
drivers/net/ethernet/mellanox/mlxsw//spectrum_mr.c:655:34-38: ERROR:
invalid reference to the index variable of the iterator on line 636
Fixes: c011ec1bbf ("mlxsw: spectrum: Add the multicast routing offloading logic")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload action pedit ex munge when used with a flower classifier. Only
allow setting of DSCP, ECN, or the whole DSField in IPv4 and IPv6 packets.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QOS_ACTION is used for manipulating the QOS attributes of the packet.
Add the defines and helpers related to DSCP and ECN fields, and dscp_rw.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original idea was to reuse this set of actions for ECN rewrite as well,
but on second look, it's not such a great idea. These two items should each
have its own command. Rename the existing enum to make it obvious that it
belongs to switch_prio_cmd.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In qlcnic_83xx_get_reset_instruction_template, the variable
of null test is bad, so correct it.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Cleanups from Dan Carpenter and wenxu.
2) Paul and Roi, Some minor updates and fixes to E-Switch to address
issues introduced in the previous reg_c0 updates series.
3) Eli Cohen simplifies and improves flow steering matching group searches
and flow table entries version management.
4) Parav Pandit, improves devlink eswitch mode changes thread safety.
By making devlink rely on driver for thread safety and introducing mlx5
eswitch mode change protection.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl58SW8ACgkQSD+KveBX
+j4AxQf8DdrFrBD0NFTcAILS4bnTJC0I3xKRPb/2oYtWLVyJ9G5XAZqHC0DAG7xs
jy8xhIFbeUxgLEdcx0la5vdR1mPlzs4XBHTe99YwzwK/jojrA7YXrlb3kv+RXWVY
uNVAby68wh4EnO61R51ahIBXLPNbiYpo/wAWKvvBKRkOcYMVTKIFiP157AnJWObY
fxnt06I0NFaIX8Va4MEqkrmUYrI4jJcqOJC9FwRBLDhFHcFkLh0Gav3vJJ7M4BCB
ggPJpuZ4pu43qX9TtSOm8V/GlWWN0RB7PdbvliFBEHYG21hf9MfE8bPPKRlB7CO+
B5+9ULhpvbjX7yRrkZ7fd4zlQ1siew==
=Flln
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-03-25
1) Cleanups from Dan Carpenter and wenxu.
2) Paul and Roi, Some minor updates and fixes to E-Switch to address
issues introduced in the previous reg_c0 updates series.
3) Eli Cohen simplifies and improves flow steering matching group searches
and flow table entries version management.
4) Parav Pandit, improves devlink eswitch mode changes thread safety.
By making devlink rely on driver for thread safety and introducing mlx5
eswitch mode change protection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/atheros/atlx/atl2.c:40:19: warning: ‘atl2_driver_string’ defined but not used [-Wunused-const-variable=]
static const char atl2_driver_string[] = "Atheros(R) L2 Ethernet Driver";
^~~~~~~~~~~~~~~~~~
commit ea97374214 ("net/atheros: Clean atheros code from driver version")
left behind this, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently eswitch mode change is occurring from 2 different execution
contexts as below.
1. sriov sysfs enable/disable
2. devlink eswitch set commands
Both of them need to access eswitch related data structures in
synchronized manner.
Without any synchronization below race condition exist.
SR-IOV enable/disable with devlink eswitch mode change:
cpu-0 cpu-1
----- -----
mlx5_device_disable_sriov() mlx5_devlink_eswitch_mode_set()
mlx5_eswitch_disable() esw_offloads_stop()
esw_offloads_disable() mlx5_eswitch_disable()
esw_offloads_disable()
Hence, they are synchronized using a new mode_lock.
eswitch's state_lock is not used as it can lead to a deadlock scenario
below and state_lock is only for vport and fdb exclusive access.
ip link set vf <param>
netlink rcv_msg() - Lock A
rtnl_lock
vfinfo()
esw->state_lock() - Lock B
devlink eswitch_set
devlink_mutex
esw->state_lock() - Lock B
attach_netdev()
register_netdev()
rtnl_lock - Lock A
Alternatives considered:
1. Acquiring rtnl lock before taking esw->state_lock to follow similar
locking sequence as ip link flow during eswitch mode set.
rtnl lock is not good idea for two reasons.
(a) Holding rtnl lock for several hundred device commands is not good
idea.
(b) It leads to below and more similar deadlocks.
devlink eswitch_set
devlink_mutex
rtnl_lock - Lock A
esw->state_lock() - Lock B
eswitch_disable()
reload()
ib_register_device()
ib_cache_setup_one()
rtnl_lock()
2. Exporting devlink lock may lead to undesired use of it in vendor
driver(s) in future.
3. Unloading representors outside of the mode_lock requires
serialization with other process trying to enable the eswitch.
4. Differing the representors life cycle to a different workqueue
requires synchronization with func_change_handler workqueue.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Subsequent patch protects eswitch mode changes across sriov and devlink
interfaces. It is desirable for eswitch to provide thread safe eswitch
enable and disable APIs.
Hence, extend eswitch enable API to optionally update num_vfs when
requested.
In subsequent patch, eswitch num_vfs are updated after all the eswitch
users eswitch drops its reference count.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to check eswitch state under a lock, prepare code to split
capability check and eswitch state check into two helper functions.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5_register_device() doesn't check for any error and always returns 0.
Simplify mlx5_register_device() to return void and its caller, remove
dead code related to it.
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Group version is used when modifying a rule is allowed
(FLOW_ACT_NO_APPEND is clear) to detect a case where the rule was found
but while the groups where unlocked a new FTE was added. In this case,
the added FTE could be one with identical match value so we need to
attempt again with group lock held.
Change the code so version is retrieved only when FLOW_ACT_NO_APPEND is
cleared. As result, later compare can also be avoided if FLOW_ACT_NO_APPEND
is cleared.
Also improve comments text.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
FTE version is not used anywhere in the code so avoid incrementing it.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When adding a rule to a flow group we need increment the version of the
group. Current code fails to do that and as a result, when trying to add
a rule, we will fail to discover a case where an FTE with the same match
value was added while we scanned the groups of the same match criteria,
thus we may try to add an FTE that was already added.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Instead of using two different structs for searching groups with the
same match, use a single struct and thus simplify the code, make it more
readable and smaller size which means less code cache misses.
text data bss dec hex
before: 35524 2744 0 38268 957c
after: 35038 2744 0 37782 9396
When testing add 70000 rules, delete all the rules, and repeat three
times taking the average, we get (time in seconds):
Before the change: insert 16.80, delete 11.02
After the change: insert 16.55, delete 10.95
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The correct type is u32.
Fixes: d18296ffd9 ("net/mlx5: E-Switch, Introduce global tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We allocate a temporary memory but forget to free it.
Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Register c0 loopback is needed to fully support chains and prios.
Enable chains and prio only if loopback (of reg c1 which came together
with c0), is enabled. To be able to check that, move enabling of loopback
before eswitch chains init.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B
is needed for the restore table to function, might not be supported by
firmware, and creation of the restore table or the copy header will
fail.
Check reg_c1 loopback support, as firmware which supports this,
should have all of the above.
Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The function mlx5e_rep_setup_ft_cb check chain_index is zero twice.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The actions_match_supported() function returns a bool, true for success
and false for failure. This error path is returning a negative which
is cast to true but it should return false.
Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Overlapping header include additions in macsec.c
A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c
Overlapping test additions in selftests Makefile
Overlapping PCI ID table adjustments in iwlwifi driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds support to catch any bits set in SGE_INT_CAUSE5 for Parity Errors.
F_ERR_T_RXCRC flag is used to ignore that particular bit as it is not considered as fatal.
So, we clear out the bit before looking for error.
This patch now read and report separately all three registers(Cause1, Cause2, Cause5).
Also, checks for errors if any.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since set_rx_mode takes a mutex lock for sending mailbox
message to admin function to set the mode, moved logic
to a workqueue.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed an issue wherein while refilling receive buffers
for the last page allocated, recount is not being updated.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ENA only provides the PCI remove() handler, used during rmmod
for example. This is not called on shutdown/kexec path; we are potentially
creating a failure scenario on kexec:
(a) Kexec is triggered, no shutdown() / remove() handler is called for ENA;
instead pci_device_shutdown() clears the master bit of the PCI device,
stopping all DMA transactions;
(b) Kexec reboot happens and the device gets enabled again, likely having
its FW with that DMA transaction buffered; then it may trigger the (now
invalid) memory operation in the new kernel, corrupting kernel memory area.
This patch aims to prevent this, by implementing a shutdown() handler
quite similar to the remove() one - the difference being the handling
of the netdev, which is unregistered on remove(), but following the
convention observed in other drivers, it's only detached on shutdown().
This prevents an odd issue in AWS Nitro instances, in which after the 2nd
kexec the next one will fail with an initrd corruption, caused by a wild
DMA write to invalid kernel memory. The lspci output for the adapter
present in my instance is:
00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network
Adapter (ENA) [1d0f:ec20]
Suggested-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Acked-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c: In function cxgb4_get_free_ftid:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c:547:23:
warning: variable tab set but not used [-Wunused-but-set-variable]
commit 8d174351f2 ("cxgb4: rework TC filter rule insertion across regions")
involved this, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl56fu0ACgkQSD+KveBX
+j4g1Qf/edWPTCMGK4eb0jBPUvnxkoGPYj4cq0tfeUaY7Q4r95RzNjuW3gTotzGV
JZymoC2OoWQxUR2Ye0FkM1C/RQFIAHinEX/KFOMJ6PL+k4+micXeIGNfVo3aflO0
kaTcBdgZKqFS5hpRtWZc/DVRWckqJYtaAJEFliQbYGwmfiZNoNr0/ZeU+/DX2dHn
bQkRHQZ3Zq43P4FhVBSyrfmsxUI71k7GtCdJ5G4i80e8qCCARKZDx7q1FRC0k6fh
a84+7NxpFRkl+kT+se/bxcQaFht49YSJVauGMKK8Ae+pz0XEaNrYsFz9zQY/s7W6
4a62hzlHuVmUwteZfH+secZzCnOkUw==
=rO1d
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-03-24
This series introduces some fixes to mlx5 driver.
From Aya, Fixes to the RX error recovery flows
From Leon, Fix IB capability mask
Please pull and let me know if there is any problem.
For -stable v5.5
('net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure')
For -stable v5.4
('net/mlx5e: Fix ICOSQ recovery flow with Striding RQ')
('net/mlx5e: Do not recover from a non-fatal syndrome')
('net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset')
('net/mlx5e: Enhance ICOSQ WQE info fields')
The above patch ('net/mlx5e: Enhance ICOSQ WQE info fields')
will fail to apply cleanly on v5.4 due to a trivial contextual conflict,
but it is an important fix, do I need to do something about it or just
assume Greg will know how to handle this ?
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The original change fixed an issue on RTL8168b by mimicking the vendor
driver behavior to disable MSI on chip versions before RTL8168d.
This however now caused an issue on a system with RTL8168c, see [0].
Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839
Fixes: 003bd5b4a7 ("r8169: don't use MSI before RTL8168d")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With all DMA address accesses wrapped, we can actually support 64-bit
DMA if this option was chosen at IP integration time.
If the IP has been configured for an address width greater than 32 bits,
we assume the full 64 bit DMA width is working. In practise this will be
limited by the actual system address bus width, which will ideally be the
same as the DMA IP address width.
If this is not the case, the actual width can still be configured using a
dma-ranges property in the parent of the MAC node.
This increases the DMA mask on those systems to let the kernel choose
buffers from memory at higher addresses.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When newer revisions of the Axienet IP are configured for a 64-bit bus,
we *need* to write to the MSB part of the an address registers,
otherwise the IP won't recognise this as a DMA start condition.
This is even true when the actual DMA address comes from the lower 4 GB.
To autodetect this configuration, at probe time we write all 1's to such
an MSB register, and see if any bits stick. If this is configured for a
32-bit bus, those MSB registers are RES0, so reading back 0 indicates
that no MSB writes are necessary.
On the other hands reading anything other than 0 indicated the need to
write the MSB registers, so we set the respective flag.
The actual DMA mask stays at 32-bit for now. To help bisecting, a
separate patch will enable allocations from higher addresses.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer revisions of the AXI DMA IP (>= v7.1) support 64-bit addresses,
both for the descriptors itself, as well as for the buffers they are
pointing to.
This is realised by adding "MSB" words for the next and phys pointer
right behind the existing address word, now named "LSB". These MSB words
live in formerly reserved areas of the descriptor.
If the hardware supports it, write both words when setting an address.
The buffer address is handled by two wrapper functions, the two
occasions where we set the next pointers are open coded.
For now this is guarded by a flag which we don't set yet.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer versions of the Xilink DMA IP support busses with more than 32
address bits, by introducing an MSB word for the registers holding DMA
pointers (tail/current, RX/TX descriptor addresses).
On IP configured for more than 32 bits, it is also *required* to write
both words, to let the IP recognise this as a start condition for an
MM2S request, for instance.
Wrap the DMA pointer writes with a separate function, to add this
functionality later. For now we stick to the lower 32 bits.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mii-tool is useful for debugging, and all it requires to work is to wire
up the ioctl ops function pointer.
Add this to the axienet driver to enable mii-tool.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer revisions of the IP don't have these registers. Since we don't
really use them, just drop them from the ethtools dump.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the DT binding, the Ethernet core interrupt is optional.
Use platform_get_irq_optional() to avoid the error message when the
IRQ is not specified.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Especially with the default 32-bit DMA mask, DMA buffers are a limited
resource, so their allocation can fail.
So as the DMA API documentation requires, add error checking code after
dma_map_single() calls to catch the case where we run out of "low" memory.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out the code that cleans up a number of connected TX descriptors,
as we will need it to properly roll back a failed _xmit() call.
There are subtle differences between cleaning up a successfully sent
chain (unknown number of involved descriptors, total data size needed)
and a chain that was about to set up (number of descriptors known), so
cater for those variations with some extra parameters.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 0 is a valid DMA address, we cannot use the physical address to
check whether a TX descriptor is valid and is holding a DMA mapping.
Use the "cntrl" member of the descriptor to make this decision, as it
contains at least the length of the buffer, so 0 points to an
uninitialised buffer.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When axienet_dma_bd_init() bails out during the initialisation process,
it might do so with parts of the structure already allocated and
initialised, while other parts have not been touched yet. Before
returning in this case, we call axienet_dma_bd_release(), which does not
take care of this corner case.
This is most obvious by the first loop happily dereferencing
lp->rx_bd_v, which we actually check to be non NULL *afterwards*.
Make sure we only unmap or free already allocated structures, by:
- directly returning with -ENOMEM if nothing has been allocated at all
- checking for lp->rx_bd_v to be non-NULL *before* using it
- only unmapping allocated DMA RX regions
This avoids NULL pointer dereferences when initialisation fails.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we fail allocating the DMA buffers in axienet_dma_bd_init(), we
report this error, but carry on with initialisation nevertheless.
This leads to a kernel panic when the driver later wants to send a
packet, as it uses uninitialised data structures.
Make the axienet_device_reset() routine return an error value, as it
contains the DMA buffer initialisation. Make sure we propagate the error
up the chain and eventually fail the driver initialisation, to avoid
relying on non-initialised buffers.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DMA error handler routine is currently a tasklet, scheduled to run
after the DMA error IRQ was handled.
However it needs to take the MDIO mutex, which is not allowed to do in a
tasklet. A kernel (with debug options) complains consequently:
[ 614.050361] net eth0: DMA Tx error 0x174019
[ 614.064002] net eth0: Current BD is at: 0x8f84aa0ce
[ 614.080195] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[ 614.109484] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 40, name: kworker/u4:4
[ 614.135428] 3 locks held by kworker/u4:4/40:
[ 614.149075] #0: ffff000879863328 ((wq_completion)rpciod){....}, at: process_one_work+0x1f0/0x6a8
[ 614.177528] #1: ffff80001251bdf8 ((work_completion)(&task->u.tk_work)){....}, at: process_one_work+0x1f0/0x6a8
[ 614.209033] #2: ffff0008784e0110 (sk_lock-AF_INET-RPC){....}, at: tcp_sendmsg+0x24/0x58
[ 614.235429] CPU: 0 PID: 40 Comm: kworker/u4:4 Not tainted 5.6.0-rc3-00926-g4a165a9d5921 #26
[ 614.260854] Hardware name: ARM Test FPGA (DT)
[ 614.274734] Workqueue: rpciod rpc_async_schedule
[ 614.289022] Call trace:
[ 614.296871] dump_backtrace+0x0/0x1a0
[ 614.308311] show_stack+0x14/0x20
[ 614.318751] dump_stack+0xbc/0x100
[ 614.329403] ___might_sleep+0xf0/0x140
[ 614.341018] __might_sleep+0x4c/0x80
[ 614.352201] __mutex_lock+0x5c/0x8a8
[ 614.363348] mutex_lock_nested+0x1c/0x28
[ 614.375654] axienet_dma_err_handler+0x38/0x388
[ 614.389999] tasklet_action_common.isra.15+0x160/0x1a8
[ 614.405894] tasklet_action+0x24/0x30
[ 614.417297] efi_header_end+0xe0/0x494
[ 614.429020] irq_exit+0xd0/0xd8
[ 614.439047] __handle_domain_irq+0x60/0xb0
[ 614.451877] gic_handle_irq+0xdc/0x2d0
[ 614.463486] el1_irq+0xcc/0x180
[ 614.473451] __tcp_transmit_skb+0x41c/0xb58
[ 614.486513] tcp_write_xmit+0x224/0x10a0
[ 614.498792] __tcp_push_pending_frames+0x38/0xc8
[ 614.513126] tcp_rcv_established+0x41c/0x820
[ 614.526301] tcp_v4_do_rcv+0x8c/0x218
[ 614.537784] __release_sock+0x5c/0x108
[ 614.549466] release_sock+0x34/0xa0
[ 614.560318] tcp_sendmsg+0x40/0x58
[ 614.571053] inet_sendmsg+0x40/0x68
[ 614.582061] sock_sendmsg+0x18/0x30
[ 614.593074] xs_sendpages+0x218/0x328
[ 614.604506] xs_tcp_send_request+0xa0/0x1b8
[ 614.617461] xprt_transmit+0xc8/0x4f0
[ 614.628943] call_transmit+0x8c/0xa0
[ 614.640028] __rpc_execute+0xbc/0x6f8
[ 614.651380] rpc_async_schedule+0x28/0x48
[ 614.663846] process_one_work+0x298/0x6a8
[ 614.676299] worker_thread+0x40/0x490
[ 614.687687] kthread+0x134/0x138
[ 614.697804] ret_from_fork+0x10/0x18
[ 614.717319] xilinx_axienet 7fe00000.ethernet eth0: Link is Down
[ 615.748343] xilinx_axienet 7fe00000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
Since tasklets are not really popular anymore anyway, lets convert this
over to a work queue, which can sleep and thus can take the MDIO mutex.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to axienet, the temac driver is now architecture agnostic, and
can be at least compiled for several architectures.
Especially the fact that this is a soft IP for implementing in FPGAs
makes the current restriction rather pointless, as it could literally
appear on any architecture, as long as an FPGA is connected to the bus.
The driver hasn't been actually tried on any hardware, it is just a
drive-by patch when doing the same for axienet (a similar patch for
axienet is already merged).
This (temac and axienet) have been compile-tested for:
alpha hppa64 microblaze mips64 powerpc powerpc64 riscv64 s390 sparc64
(using kernel.org cross compilers).
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4_ptp_fineadjtime() doesn't pass the signedness of offset delta
in FW_PTP_CMD. Fix it by passing correct sign.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For non-fatal syndromes like LOCAL_LENGTH_ERR, recovery shouldn't be
triggered. In these scenarios, the RQ is not actually in ERR state.
This misleads the recovery flow which assumes that the RQ is really in
error state and no more completions arrive, causing crashes on bad page
state.
Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In striding RQ mode, the buffers of an RX WQE are first
prepared and posted to the HW using a UMR WQEs via the ICOSQ.
We maintain the state of these in-progress WQEs in the RQ
SW struct.
In the flow of ICOSQ recovery, the corresponding RQ is not
in error state, hence:
- The buffers of the in-progress WQEs must be released
and the RQ metadata should reflect it.
- Existing RX WQEs in the RQ should not be affected.
For this, wrap the dealloc of the in-progress WQEs in
a function, and use it in the ICOSQ recovery flow
instead of mlx5e_free_rx_descs().
Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When resetting the RQ (moving RQ state from RST to RDY), the driver
resets the WQ's SW metadata.
In striding RQ mode, we maintain a field that reflects the actual
expected WQ head (including in progress WQEs posted to the ICOSQ).
It was mistakenly not reset together with the WQ. Fix this here.
Fixes: 8276ea1353 ("net/mlx5e: Report and recover from CQE with error on RQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add number of WQEBBs (WQE's Basic Block) to WQE info struct. Set the
number of WQEBBs on WQE post, and increment the consumer counter (cc)
on completion.
In case of error completions, the cc was mistakenly not incremented,
keeping a gap between cc and pc (producer counter). This failed the
recovery flow on the ICOSQ from a CQE error which timed-out waiting for
the cc and pc to meet.
Fixes: be5323c837 ("net/mlx5e: Report and recover from CQE error on ICOSQ")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The cap_mask1 isn't protected by field_select and not listed among RW
fields, but it is required to be written to properly initialize ports
in IB virtualization mode.
Link: https://lore.kernel.org/linux-rdma/88bab94d2fd72f3145835b4518bc63dda587add6.camel@redhat.com
Fixes: ab118da4c1 ("net/mlx5: Don't write read-only fields in MODIFY_HCA_VPORT_CONTEXT command")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
checkpatch found a lack of appropriate whitespace after certain keywords
as per the style guide. Add it in.
Signed-off-by: Logan Magee <mageelog@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/ethernet/cavium/thunder/nicvf_queues.c: In function nicvf_sq_free_used_descs:
drivers/net/ethernet/cavium/thunder/nicvf_queues.c:1182:12: warning:
variable tail set but not used [-Wunused-but-set-variable]
It's not used since commit 4863dea3fab01("net: Adding support for Cavium ThunderX network controller"),
so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng zengkai <zhengzengkai@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ring counts are not reset when ring reservation fails,
bnxt_init_dflt_ring_mode() will not be called again to reinitialise
IRQs when open() is called and results in system crash as napi will
also be not initialised. This patch fixes it by resetting the ring
counts.
Fixes: 47558acd56 ("bnxt_en: Reserve rings at driver open if none was reserved at probe time.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Other shutdown code paths will always disable PCI first to shutdown DMA
before freeing context memory. Do the same sequence in the error path
of probe to be safe and consistent.
Fixes: c20dc142dd ("bnxt_en: Disable bus master during PCI shutdown and driver unload.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code ignores the return value from
bnxt_hwrm_func_backing_store_cfg(), causing the driver to proceed in
the init path even when this vital firmware call has failed. Fix it
by propagating the error code to the caller.
Fixes: 1b9394e5a2 ("bnxt_en: Configure context memory on new devices.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allocated ieee_ets structure goes out of scope without being freed,
leaking memory. Appropriate result codes should be returned so that
callers do not rely on invalid data passed by reference.
Also cache the ETS config retrieved from the device so that it doesn't
need to be freed. The balance of the code was clearly written with the
intent of having the results of querying the hardware cached in the
device structure. The commensurate store was evidently missed though.
Fixes: 7df4ae9fe8 ("bnxt_en: Implement DCBNL to support host-based DCBX.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an indexing bug in determining these ethtool priority
counters. Instead of using the queue ID to index, we need to
normalize by modulo 10 to get the index. This index is then used
to obtain the proper CoS queue counter. Rename bp->pri2cos to
bp->pri2cos_idx to make this more clear.
Fixes: e37fed7903 ("bnxt_en: Add ethtool -S priority counters.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packet trap groups are now explicitly registered by drivers and not
implicitly registered when the packet traps are registered. Therefore,
there is no need to encode entire group structure the trap is associated
with inside the trap structure.
Instead, only pass the group identifier. Refer to it as initial group
identifier, as future patches will allow user space to move traps
between groups.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the previously added API to explicitly register / unregister
supported packet trap groups. This is in preparation for future patches
that will enable drivers to pass additional group attributes, such as
associated policer identifier.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far only the reset bit it set, but the handler executing the reset
is not scheduled. Therefore nothing will happen until some other action
schedules the handler. Improve this by ensuring that the handler is
scheduled.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation makes the implicit assumption that if a bit
is set, then the work is scheduled already. Remove the need for this
implicit assumption and call schedule_work() always. It will check
internally whether the work is scheduled already.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently rtl_task() is designed to handle a large number of tasks.
However we have just one, so we can remove some overhead.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out setting GPHY 10M to new helper rtl8168g_enable_gphy_10m.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2020-03-21
Implement basic support for the devlink interface in the ice driver.
Additionally pave some necessary changes for adding a devlink region that
exposes the NVM contents.
This series first contains 5 patches for enabling and implementing full NVM
read access via the ETHTOOL_GEEPROM interface. This includes some cleanup of
endian-types, a new function for reading from the NVM and Shadow RAM as a flat
addressable space, a function to calculate the available flash size during
load, and a change to how some of the NVM version fields are stored in the
ice_nvm_info structure.
Following this is 3 patches for implementing devlink support. First, one patch
which implements the basic framework and introduces the ice_devlink.c file.
Second, a patch to implement basic .info_get support. Finally, a patch which
reads the device PBA identifier and reports it as the `board.id` value in the
.info_get response.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes wrapper fn()s around mutex_init/lock/unlock.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>