Sometimes the drivers and other code would find it handy to know some
internal information about upper device being changed. So allow upper-code
to pass information down to notifier listeners during linking.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate netdev_master_upper_dev_link_private and pass priv directly as
a parameter of netdev_master_upper_dev_link.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some code does not mind if a device is bond slave or team port and treats
them the same, as generic LAG ports.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some code does not mind if the master is bond or team and treats them
the same, as generic LAG.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to other helpers, caller can use this to find out if device is
team port.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to other helpers, caller can use this to find out if device is
team master.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to other speeds, add 100G to bonding 802.3ad code.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since CHANGEUPPER can now fail, add support for it in the newly
introduced netdev notifier error injection infrastructure.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
switchdev drivers reflect the newly requested topology to hardware when
CHANGEUPPER is received, after software links were already formed.
However, the operation can fail and user will not be notified, as the
return value of the notifier is not checked.
Add this check and rollback software links if necessary.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 7fd89545f3 ("i40e: remove BUG_ON from feature string building")
added defective output when I40E_FLAG_VEB_MODE_ENABLED was set in
function i40e_print_features.
Fix it.
Miscellanea:
- Remove unnecessary string variable
- Add space before not after fixed strings
- Use kmalloc not kzalloc
- Don't initialize i to 0, use result of first snprintf
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-12-03
This series contains updates to ixgbe and ixgbevf only.
Mark cleans up ixgbe_init_phy_ops_x550em, since this was designed to
initialize function pointers only and moves the KR PHY reset to the
ixgbe_setup_internal_phy_t_x550em which was designed to detect which
mode the PHY operates in and set it up. Added the new thermal alarm
type support used with newer X550EM_x devices. Fixed both ixgbe and
ixgbevf to use a private work queue to avoid hangs, which would
possibly occur when creating and destroying many VFS repeatedly.
Updated ixgbe PTP implementation to accommodate X550EM_x devices,
which handle clocking differently. Fixed specification violations
in the datasheet, which was reported by Dan Streetman. Fixed ixgbe
to check for and handle IPv6 extended headers so that Tx checksum
offload can be done, which was reported by Tom Herbert. Fixed ixgbe
link issue for some systems with X540 or X550 by only inhibiting the
turning PHY power off when manageability is present.
Alex Duyck refactors the MAC address configuration code, which in
turns fixes an issue where once 63 entries had been used, you could no
longer add additional filters. Updated ixgbe to use __dev_uc_sync
which also resolved an issue in which you could not remove an FDB
address without having to reset the port. Updated the ixgbe driver
to make use of all the free RAR entries for FDB use if needed.
v2: updated patch 13 to "Alex Duyck Approved" version, in the original
submission, I had grabbed a previous version of the patch and did not
catch it was superseded by a later version
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
aarch64 doesn't have native store immediate instruction, such operation
has to be implemented by the below instruction sequence:
Load immediate to register
Store register
Signed-off-by: Yang Shi <yang.shi@linaro.org>
CC: Zi Shen Lim <zlim.lnx@gmail.com>
CC: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While testing the np->opt RCU conversion, I found that UDP/IPv6 was
using a mixture of xchg() and sk_dst_lock to protect concurrent changes
to sk->sk_dst_cache, leading to possible corruptions and crashes.
ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest
way to fix the mess is to remove sk_dst_lock completely, as we did for
IPv4.
__ip6_dst_store() and ip6_dst_store() share same implementation.
sk_setup_caps() being called with socket lock being held or not,
we have to use sk_dst_set() instead of __sk_dst_set()
Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);"
in ip6_dst_store() before the sk_setup_caps(sk, dst) call.
This is because ip6_dst_store() can be called from process context,
without any lock held.
As soon as the dst is installed in sk->sk_dst_cache, dst can be freed
from another cpu doing a concurrent ip6_dst_store()
Doing the dst dereference before doing the install is needed to make
sure no use after free would trigger.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch completes the work I did in commit 45f6fad84c
("ipv6: add complete rcu protection around np->opt"), as I missed
sctp part.
This simply makes sure np->opt is used with proper RCU locking
and accessors.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Major changes:
ath10k
* support Manegement Frame Protection (MFP)
* add thermal throttling support for 10.4 firmware
* add support for pktlog in QCA99X0
* add debugfs file to enable Bluetooth coexistence feature
* use firmware's native mesh interface type instead of raw mode
Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for seeing how to coalesce the error handling
into one location.
Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of inhibiting PHY power control when manageability is
present, only inhibit turning PHY power off when manageability
is present. Consequently, PHY power will always be turned on when
requested. Without this patch, some systems with X540 or X550
devices in some conditions will never get link.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for recognizing problems with the first version
of this patch and recognizing how to coalesce error conditions
into a single location.
Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Save VF device pointers and take references to speed accesses used
to monitor the device behavior to avoid slot resets. The saved
information avoids lock contention during the search used to access
each of the VFs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
According to the datasheets, the driver should wait for the master
disable bit to read as being set before checking the status
register for master disable.
Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ixgbe driver was violating the specification in the datasheet
by not waiting 1ms before checking for the reset bit clearing. This
is called out for devices supported by ixgbe, so implement the
required delay.
Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X550EM_x devices handle clocking differently, so update the
PTP implementation to accommodate them. This involves significant
changes to ixgbe's PTP code to accommodate the new range of
behaviors including things like non-power-of-2 clock wrapping.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we allow the PF to make use of all free RAR
entries for FDB use if needed.
Previously the code limited us to 16 unicast entries, however this was
shared between MACVLAN which wasn't limited and the FDB code which was. So
instead of treating the FDB code as a second class citizen I have updated
it so that it has access to just as many entries as the MACVLAN filters.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change replaces the ixgbe_write_uc_addr_list call in ixgbe_set_rx_mode
with a call to __dev_uc_sync instead. This works much better with the MAC
addr list code that was already in place and solves an issue in which you
couldn't remove an FDB address without having to reset the port.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the process of tracking down a memory leak when adding/removing FDB
entries I had to go through the MAC address configuration code for ixgbe.
In the process of doing so I found a number of issues that impacted
readability and performance. This change updates the code in general to
clean it up so it becomes clear what each step is doing. From what I can
tell there a couple of bugs cleaned up in this code.
First is the fact that the MAC addresses were being double counted for the
PF. As a result once entries up to 63 had been used you could no longer
add additional filters.
A simple test case for this:
for i in `seq 0 96`
do
ip link add link ens8 name mv$i type macvlan
ip link set dev mv$i up
done
Test script:
ethregs -s 0:8.0 | grep -e "RAH" | grep 8000....$
When things are working correctly RAL/H registers 1 - 97 will be consumed.
In the failing case it will stop at 63 and prevent any further filters from
being added.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make some minor cleanups, such as simplifying return paths, deleting
unneeded initializations, return values more directly and so forth.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The newer copper PHY implementation used with newer X550EM_x
devices uses a different thermal alarm type than the earlier
one. Make changes to support both types.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes KR PHY reset from ixgbe_init_phy_ops_x550em,
since this function is meant to initialize function pointers for
the detected PHY type. Internal PHY reset was moved to
ixgbe_setup_internal_phy_t_x550em which will now detect which
mode the internal PHY operates in and set it up as required.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
1) remove duplicate include of tcp.h
2) put an ampersand at the end of a line instead of the beginning
3) remove a useless dev_info
4) match declaration of function to the implementation
5) repair incorrect comment
6) correct whitespace
7) remove unused define
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We shouldn't be using a bitwise operator here; it's not a bitwise
operation. Use a logical operator instead. Why doesn't c have a
logical-or-and-assign operator?
Change-ID: Id84f3ca884910bed7073c84b1e16a102e958d0de
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Operators should have spaces around them.
Change-ID: I64735e9aa8618b9a5059a87ace1c999d6d3bfcfb
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The separate functions to gather Flow control Rx XOFF stats was to
determine if the Tx for a queue was paused due to Link Flow Control(LFC)
or Priority Flow Control(PFC).
But, with recent change in the i40e driver the logic for checking th Tx
hang has been removed and these functions don't do anything meaningful.
Hence, there is no need to keep these separate functions to gather Rx
XOFF stats for LFC or PFC.
This patch removes these functions and moves the stat collection for
XOFF Rx to the i40e_update_pf_stats() that collects all the PF stats.
Change-ID: Iec1452dac3a6766f0d968e754cb407530d7c60cd
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of having our own custom symbol, we can just rely
on whether or not the kernel has the feature enabled.
In this case use IS_ENABLED(CONFIG_VXLAN) in order to handle
built-in or module in the current BKM way.
Change-ID: I5890fbb518ff8ed6bb07c3362fb0a8a829f9b241
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ethtool priv flags implementation to enable or disable packet split, which
is a hardware feature that inspects headers and will put headers in a
separate DMA buffer from the payload data. The driver was automatically
choosing to enable packet split in some cases and this gives the user the
ability to turn it off/on explicitly.
to query state:
ethtool --show-priv-flags ethx
to enable:
ethtool --set-priv-flags ethx packet-split on
to disable:
ethtool --set-priv-flags ethx packet-split off
Why would anyone want this?
Because some environments benefit from header/data split in the receive
buffer, and the driver defaults to one or the other depending on
environment/kernel parameters.
Why didn't you implement a generic ethtool control for this feature?
Because Intel hardware is the only hardware that supports header/data
split.
Change-ID: I803121e1eecc9ccb2884031fd85dd1110b3af66d
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don't use uint32_t type the kernel. Use u32 instead. No functional
change.
Change-ID: I77bbf3b6464edaef747c7104b43534032a4dba63
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
i40e_sync_vsi_filters() is the surly teenager of this driver. It says
it's going to report errors, but it doesn't actually do that most of the
time. And when it does, it leaves a mess.
Change this function to have a common exit point so it will properly
release the busy lock on the VSI. Propagate errors to the callers.
Finally, adjust a few callers to check for and deal with errors from
this function.
Change-ID: Ic6af4956491e72402ebb3c538a3c31a0ad7f8667
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
These allocations don't need to be at atomic level. GFP_KERNEL is fine
and they'll reduce stress on the allocator when the system is starved
for memory.
Change-ID: I3561d0399a681de0ad25291b6c848b224c1fde12
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the memory leak which would be seen otherwise when user
programs flow-director filter using ethtool (sideband filter programming).
When ethtool is used to program flow directory filter, 'raw_buf' gets
allocated and it is supposed to be freed as part of queue cleanup. But
check of 'tx_buffer->skb' was preventing it from being freed.
Change-ID: Ief4f0a1a32a653180498bf6e987c1b4342ab8923
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch contains following changes:
- detection and recovery logic (issue SW interrupt) has been moved to
service_task from timeout function.
- added some more debug info from tx_timeout.
Logic to detect and recover TX queue hung is now two step process:
- service_task detects TX queue hung and sets a bit(hung_detected) if
it was not set.
- if bit was set (means this is back-back hung condition detected),
issue SW interrupt and clear the bit.
- napi_poll clears the bit unconditionally since it cleans TX/RX queues.
Change-ID: Ieed03a48927c845a988b3ff375090bf37caeb903
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We already print the driver info string in probe, so don't print
it again in init. No need to repeat. No need to repeat.
Change-ID: Ief597997f580a8c54d5950e3a84c29f2075be66b
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use the helper function to set the real number of RX queues, and also
set the real number of TX queues.
Change-ID: I67982799de3f248fb4158ccdc9b1a74385f42ddd
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Future devices will allow for more queue pairs, so allocate a netdev
that can handle them. While we're at it, get rid of the separate
MAX_TX/MAX_RX defines. Since we always get matched queue pairs, having
these makes no sense.
Change-ID: I0e3556cd9a962506e509eb7c0afa36b329e8cb51
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Proxy entries could have null pointer to net-device.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Fixes: 84920c1420 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't open-code it.
CC: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If tcp_send_ack() can not allocate skb, we properly handle this
and setup a timer to try later.
Use __GFP_NOWARN to avoid polluting syslog in the case host is
under memory pressure, so that pertinent messages are not lost under
a flood of useless information.
sk_gfp_atomic() can use its gfp_mask argument (all callers currently
were using GFP_ATOMIC before this patch)
We rename sk_gfp_atomic() to sk_gfp_mask() to clearly express this
function now takes into account its second argument (gfp_mask)
Note that when tcp_transmit_skb() is called with clone_it set to false,
we do not attempt memory allocations, so can pass a 0 gfp_mask, which
most compilers can emit faster than a non zero or constant value.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'hv_netvsc-less-headroom'
K. Y. Srinivasan says:
====================
hv_netvsc: Eliminate the additional head room
In an attempt to avoid having to allocate memory on the send path, the netvsc
driver was requesting additional head room so that both rndis header and the
netvsc packet (the state that had to persist) could be placed in the skb.
Since the amount of head room requested was exceeding the default head room
as set in LL_MAX_HEADER, we were forcing a reallocation of skb.
With this patch-set, I have reduced the size of the netvsc packet to less
than 20 bytes and with this reduction we don't need to ask for any additional
headroom. We place the rndis header in the skb head room and we place the
netvsc packet in control buffer area in the skb.
V2: - Addressed review comments:
- Eliminated more fields from netvsc packet structure.
V3: - Fixed a typo in patch: hv_netvsc: Don't ask for additional head room in the skb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>