When CONFIG_GENEVE is built as a loadable module, and bnx2x is built-in,
we get this link error:
drivers/net/built-in.o: In function `bnx2x_open':
:(.text+0x33322): undefined reference to `geneve_get_rx_port'
drivers/net/built-in.o: In function `bnx2x_sp_rtnl_task':
:(.text+0x3e632): undefined reference to `geneve_get_rx_port'
This avoids the problem by adding a separate Kconfig symbol named
CONFIG_BNX2X_GENEVE that is only enabled when the code is
reachable from the driver.
This is the same trick that BNX2X does for VXLAN support, and
is similar to how I40E handles both.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 883ce97d25 ("bnx2x: Add Geneve inner-RSS support")
Acked-By: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/phy/bcm7xxx.c
drivers/net/phy/marvell.c
drivers/net/vxlan.c
All three conflicts were cases of simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Current initialization sequence is lacking, causing some configurations
to fail.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy's firmware version isn't being parsed properly as it's
currently parsed like the rest of the 848xx phys.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a problem in current 84833 phy configuration -
in case 1Gb link is configured and jumbo-sized packets are being
used, device will experience RX crc errors.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when link is using KR2 it cannot be forced to any speed other
than 20g.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.om>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e5d3a51cef ("bnx2x: extend DCBx support") was missing HSI
changes for big-endian machine, breaking compilation on such
platforms.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates setup_tc so we can pass additional parameters into
the ndo op in a generic way. To do this we provide structured union
and type flag.
This lets each classifier and qdisc provide its own set of attributes
without having to add new ndo ops or grow the signature of the
callback.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_setup_tc() op was added to support drivers offloading tx
qdiscs however only support for mqprio was ever added. So we
only ever added support for passing the number of traffic classes
to the driver.
This patch generalizes the ndo_setup_tc op so that a handle can
be provided to indicate if the offload is for ingress or egress
or potentially even child qdiscs.
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several scenarios where taking a register dump from a device
might log benign GRC timeout attentions to system logs.
Most common of those is when taking the dump from a 2-port device.
Sadly, there's no easy way to mask the problematic attentions during the
flow - Changing this behvaior would require a firmware update.
For now, simply warn users to ignore the warnings.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for default application priority.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver is currently looking at shared information for determining whether
DCBx can be supported for a given port.
On 4-port devices, up-to-date management firmware can support DCBx on
each port of a given engine independently - but that would cause bnx2x to
misinterpert the support and assume DCBx is supported on both.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the ability to perform RSS hashing based on encapsulated
headers for a geneve-encapsulated packet.
This also changes the Vxlan implementation in bnx2x to be uniform
for both vxlan and geneve [from configuration perspective].
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2x_schedule_sp_rtnl is exported by bnx2x, although no other module
uses it.
Reported-by: Benjamin Poirier <bpoirier@suse.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW has a rare corner case in which a fragmented packet using lots
of frags would not be linearized, causing the FW to assert while trying
to transmit the packet.
To prevent this, we need to make sure the window of fragements containing
MSS worth of data contains 1 BD less than for regular packets due to
the additional parsing BD.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These fields are updated but never read.
Remove the overhead.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under heavy TX load, bnx2x_poll() can loop forever and trigger
soft lockup bugs.
A napi poll handler must yield after one TX completion round,
risk of livelock is too high otherwise.
Bug is very easy to trigger using a debug build, and udp flood, because
of added cpu cycles in TX completion, and we do not receive enough
packets to break the loop.
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'flags' field in bnx2x_stats_arr[] serves only one purpose - to tell
us if the statistic is a per-port stat and thus should not be shown for
virtual functions. It's strange that the field can have three different
values. A boolean will do just fine.
Also remove IS_FUNC_STAT(). It was used only once and it's in fact just
a negation of IS_PORT_STAT().
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's supposed to be impossible for TPA to give us anything else
than IPv4 or IPv6 here. But in case there is a way to reach this error
by some strange received frames, we don't want to flood the kernel log.
WARN_ONCE is better for this purpose.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_pages() already prints a warning when it fails. No need to emit
another message. Certainly not at KERN_ERR level, because it is no big
deal if this GFP_ATOMIC allocation fails occasionally.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/renesas/ravb_main.c
kernel/bpf/syscall.c
net/ipv4/ipmr.c
All three conflicts were cases of overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 46e8a249423ff "bnx2x: Add FW 7.13.1.0" added said .bin FW to
linux-firmware; This patch incorporates the FW in the bnx2x driver.
This introduces 2 fixes/enhancements:
- In some management protocols there are outer-vlan configurations
that can be dynamically changed while device is running. This fixes
some corner cases where such a change did not take effect.
- Prevent VFs from sending MAC control frames; FW would treat a VF
sending such a packet as malicious and block any further communication
done by the VF.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today, port statistics are being presented when using `ethool -S' only
for single-function devices, but there are some port statistics which are
crucial for analyzing bottle-necks. E.g., HW Rx discards due to lack of
buffer space [when device isn't handling ingress traffic fast enough].
Judging the pros and cons, it was decided that in-order to better support
automatic dump-gathering tools, bnx2x should no longer hide those stats.
This leaves only VFs lacking the port statistics.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver already has an internal counter for number of times a given queue
had to be stopped due to Tx ring exhaustion.
This add the counter to the statistics presented by driver, e.g., by using
`ethtool -S'.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commmit ac7eccd4d4 "bnx2x: track vxlan port count" contains a bug -
Instead of achieving the required goal, vxlan configuration would not
be removed since we're decrementing the port instead of the counter.
CC: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NAPI drivers no longer need to observe a particular protocol
to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)
napi_hash_add() and napi_hash_del() are automatically called
from core networking stack, respectively from
netif_napi_add() and netif_napi_del()
This patch depends on free_netdev() and netif_napi_del() being
called from process context, which seems to be the norm.
Drivers might still prefer to call napi_hash_del() on their
own, since they might combine all the rcu grace periods into
a single one, knowing their NAPI structures lifetime, while
core networking stack has no idea of a possible combining.
Once this patch proves to not bring serious regressions,
we will cleanup drivers to either remove napi_hash_del()
or provide appropriate rcu grace periods combining.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We would like to automatically provide busy polling support
to all NAPI drivers, without them having to implement anything.
skb_mark_napi_id() can be called from napi_gro_receive() and
napi_get_frags().
Few drivers are still calling skb_mark_napi_id() because
they use netif_receive_skb(). They should eventually call
napi_gro_receive() instead. I will leave this to drivers
maintainers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 05cc5a39dd "bnx2x: add vlan filtering offload" introduced
a regression in regard for vlans for 57710, 57711 adapters -
Loading 8021q module on a machine with such an adapter would cause
a null pointer dereference, as the driver mistakenly publishes it
has capabilities for vlan CTAG filtering.
Reported-by: Otto Sabart <osabart@redhat.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts. They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve". __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".
Over time, callers had a requirement to not block when fallback options
were available. Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.
This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative. High priority users continue to use
__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim. __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.
This patch then converts a number of sites
o __GFP_ATOMIC is used by callers that are high priority and have memory
pools for those requests. GFP_ATOMIC uses this flag.
o Callers that have a limited mempool to guarantee forward progress clear
__GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
into this category where kswapd will still be woken but atomic reserves
are not used as there is a one-entry mempool to guarantee progress.
o Callers that are checking if they are non-blocking should use the
helper gfpflags_allow_blocking() where possible. This is because
checking for __GFP_WAIT as was done historically now can trigger false
positives. Some exceptions like dm-crypt.c exist where the code intent
is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
flag manipulations.
o Callers that built their own GFP flags instead of starting with GFP_KERNEL
and friends now also need to specify __GFP_KSWAPD_RECLAIM.
The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.
The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL. They may
now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
drivers/net/usb/asix_common.c
net/ipv4/inet_connection_sock.c
net/switchdev/switchdev.c
In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.
The other two conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len,
eedump_len & regdump_len fields in their .get_drvinfo() ethtool op.
It's not necessary as these fields is filled in ethtool_get_drvinfo().
v2: removed unused variable
v3: removed another unused variable
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Configuring 4-tuple RSS hsahing for UDP
[E.g., by using `ethtool -N <interface> rx-flow-hash udp4 sdfn']
on a 57710/57711 adapter would cause it to assert as HW does not
support such a configuration.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a good amount of debugging, I found bnx2x was byte swaping
the 40 bytes of rss_key.
If we byte swap the key, then bnx2x generates hashes matching
MSDN specs as documented in (Verifying the RSS Hash Calculation)
https://msdn.microsoft.com/en-us/library/windows/hardware/ff571021%
28v=vs.85%29.aspx
It is mostly a non issue, unless we want to mix different NIC
in a host, and want consistent hashing among all of them, ie
if they all use the boot time generated rss key, or if some application
is choosing specific tuple(s) so that incoming traffic lands into known
rx queue(s).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback for adding vxlan port can be called with the same port for
both IPv4 and IPv6. Do not disable the offloading when the same port for
both protocols is added and later one of them removed.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c48f350ff5 "bnx2x: Add MFW dump support" added the
bnx2x_update_mfw_dump() function that reads the current time and stores
it in a 32-bit field that gets passed into a buffer in a fixed format.
This is potentially broken when the epoch overflows in 2038, and
otherwise overflows in 2106. As we're trying to avoid uses of
struct timeval for this reason, I noticed the addition of this
function, and tried to rewrite it in a way that is more explicit
about the overflow and that will keep working once we deprecate
struct timeval.
I assume that it is not possible to change the ABI any more, otherwise
we should try to use a 64-bit field for the seconds right away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for 3 new PCI device combinations -
1077:16a1, 1077:16a4 and 1077:16ad.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit f34fa14cc0 ("bnx2x: Add vxlan RSS support") has introduced an
endianity issue when passing the vxlan UDP port to the HW.
Reported-by: <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest FW submission added some vxlan offload capabilities to our device.
This patch adds the ability to connect to the vxlan NDOs and configure
the UDP port associated with it in the HW.
The device would now be capable of performing RSS according to the
inner headers of the vxlan packets.
Signed-off-by: Rajesh Borundia <Rajesh.Borundia@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware tells driver in case bandwidth configuration for
a specific function exists, but [regretably] the same field has different
meanings depending on the multi-function mode - it can either be
a percentile value or an actual speed.
For newer multi-function modes current logic is incorrect -
driver understands values as actual speeds instead of percentages,
causing the resulting chip configuration to be incorrect.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/cavium/Kconfig
The cavium conflict was overlapping dependency
changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Writing each 4Kb page into flash might take up-to ~100 miliseconds,
during which time management firmware cannot acces the nvram for its
own uses.
Firmware upgrade utility use the ethtool API to burn new flash images
for the device via the ethtool API, doing so by writing several page-worth
of data on each command. Such action might create problems for the
management firmware, as the nvram might not be accessible for a long time.
This patch changes the write implementation, releasing the nvram lock on
the completion of each page, allowing the management firmware time to
claim it and perform its own required actions.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On error flows its possible to free an SKB even if it was not allocated.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 230d00eb4b ("bnx2x: new Multi-function mode - BD") adds support
for the new mode in bnx2x. This expands this support by implementing
APIs required by our storage drivers to support that mode.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 05cc5a39dd ("bnx2x: add vlan filtering offload") has introduced
an incorrect logic for checking whether pvid should be configured for
a vf, causing the hypervisor driver to send unneeded ramrods for all of
the vfs each time a pvid has changed.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 05cc5a39dd ("bnx2x: add vlan filtering offload") has broken
compilation when CONFIG_BNX2X_SRIOV is not set.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver always uses vlan-promisc mode, i.e., it receives both
tagged and untagged traffic and lets the network stack drop packets
tagged with unrequested vlan tags.
This patch implements vlan-filtering offload in the driver -
Unless explicitly configured to promisc mode, only untagged packets or
packets tagged with requested vlans would reach the Rx flow.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>