Commit Graph

20113 Commits

Author SHA1 Message Date
David S. Miller 8ab190fbe9 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2017-10-26

This series contains fixes to e1000, igb, ixgbe and i40e.

Vincenzo Maffione fixes a potential race condition which would result in
the interface being up but transmits are disabled in the hardware.

Colin Ian King fixes a possible NULL pointer dereference in e1000, which
was found by Coverity.

Jean-Philippe Brucker fixes a possible kernel panic when a driver cannot
map a transmit buffer, which is caused by an erroneous test.

Alex provides a fix for ixgbe, which is a partial revert of the commit
ffed21bcee ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
because the previous commit messed up the exception handling path by
adding the count back in when we did not need to.  Also fixed a typo,
where the transmit ITR setting was being used to determine if we were
using adaptive receive interrupt moderation or not.  Lastly, fixed a
memory leak by including programming descriptors in the cleaned count.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-28 00:05:34 +09:00
Jose Abreu 6d9f0790af net: stmmac: First Queue must always be in DCB mode
According to DWMAC databook the first queue operating mode
must always be in DCB.

As MTL_QUEUE_DCB = 1, we need to always set the first queue
operating mode to DCB otherwise driver will think that queue
is in AVB mode (because MTL_QUEUE_AVB = 0).

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:32:56 +09:00
Jose Abreu 4894ac6b6c net: stmmac: dwc-qos-eth: Fix typo in DT bindings parsing
According to DT bindings documentation we are expecting a
property called "snps,read-requests" but we are parsing
instead a property called "read,read-requests".

This is clearly a typo. Fix it.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 23:23:19 +09:00
David S. Miller 5be9541a09 Merge tag 'mlx5-fixes-2017-10-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-10-26

The series includes some misc fixes for mlx5 core and etherent driver.
Please pull and let me know if there's any problem.

For -Stable:
net/mlx5e: Properly deal with encap flows add/del under neigh update (kernels >= 4.12)
net/mlx5: Fix health work queue spin lock to IRQ safe  (kernels >= 4.13)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-27 22:23:41 +09:00
Alexander Duyck 62b4c6694d i40e: Add programming descriptors to cleaned_count
This patch updates the i40e driver to include programming descriptors in
the cleaned_count. Without this change it becomes possible for us to leak
memory as we don't trigger a large enough allocation when the time comes to
allocate new buffers and we end up overwriting a number of rx_buffers equal
to the number of programming descriptors we encountered.

Fixes: 0e626ff7cc ("i40e: Fix support for flow director programming status")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Anders K. Pedersen <akp@cohaesio.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Alexander Duyck 10781348ca i40e: Fix incorrect use of tx_itr_setting when checking for Rx ITR setup
It looks like there was either a copy/paste error or just a typo that
resulted in the Tx ITR setting being used to determine if we were using
adaptive Rx interrupt moderation or not.

This patch fixes the typo.

Fixes: 65e87c0398 ("i40evf: support queue-specific settings for interrupt moderation")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Alexander Duyck 069db9cd0b ixgbe: Fix Tx map failure path
This patch is a partial revert of "ixgbe: Don't bother clearing buffer
memory for descriptor rings". Specifically I messed up the exception
handling path a bit and this resulted in us incorrectly adding the count
back in when we didn't need to.

In order to make this simpler I am reverting most of the exception handling
path change and instead just replacing the bit that was handled by the
unmap_and_free call.

Fixes: ffed21bcee ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Jean-Philippe Brucker 104ba83363 igb: Fix TX map failure path
When the driver cannot map a TX buffer, instead of rolling back
gracefully and retrying later, we currently get a panic:

[  159.885994] igb 0000:00:00.0: TX DMA map failed
[  159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8
               ...
[  159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8

Fix the erroneous test that leads to this situation.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:58 -07:00
Colin Ian King 5983587c8c e1000: avoid null pointer dereference on invalid stat type
Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously.  Fix this by skipping over the read of p on an invalid
stat type.

Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:57 -07:00
Vincenzo Maffione 44c445c3d1 e1000: fix race condition between e1000_down() and e1000_watchdog
This patch fixes a race condition that can result into the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
allows e1000_watchdog() interleave with e1000_down().

    CPU x                           CPU y
    --------------------------------------------------------------------
    e1000_down():
        netif_carrier_off()
                                    e1000_watchdog():
                                        if (carrier == off) {
                                            netif_carrier_on();
                                            enable_hw_transmit();
                                        }
        disable_hw_transmit();
                                    e1000_watchdog():
                                        /* carrier on, do nothing */

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2017-10-26 07:42:57 -07:00
Antoine Tenart 239dd4ee48 net: mvpp2: do not sleep in set_rx_mode
This patch replaces GFP_KERNEL by GFP_ATOMIC to avoid sleeping in the
ndo_set_rx_mode() call which is called with BH disabled.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:07:52 +09:00
Antoine Tenart 20746d717e net: mvpp2: fix invalid parameters order when calling the tcam init
When calling mvpp2_prs_mac_multi_set() from mvpp2_prs_mac_init(), two
parameters (the port index and the table index) are inverted. Fixes
this.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:07:52 +09:00
Antoine Tenart ef4816f0ee net: mvpp2: fix typo in the tcam setup
This patch fixes a typo in the mvpp2_prs_tcam_data_cmp() function, as
the shift value is inverted with the data.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 17:07:52 +09:00
Huy Nguyen be0f161ef1 net/mlx5e: DCBNL, Implement tc with ets type and zero bandwidth
Previously, tc with ets type and zero bandwidth is not accepted
by driver. This behavior does not follow the IEEE802.1qaz spec.

If there are tcs with ets type and zero bandwidth, these tcs are
assigned to the lowest priority tc_group #0. We equally distribute
100% bw of the tc_group #0 to these zero bandwidth ets tcs.
Also, the non zero bandwidth ets tcs are assigned to tc_group #1.

If there is no zero bandwidth ets tc, the non zero bandwidth ets tcs
are assigned to tc_group #0.

Fixes: cdcf11212b ("net/mlx5e: Validate BW weight values of ETS")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26 00:47:27 -07:00
Or Gerlitz 3c37745ec6 net/mlx5e: Properly deal with encap flows add/del under neigh update
Currently, the encap action offload is handled in the actions parse
function and not in mlx5e_tc_add_fdb_flow() where we deal with all
the other aspects of offloading actions (vlan, modify header) and
the rule itself.

When the neigh update code (mlx5e_tc_encap_flows_add()) recreates the
encap entry and offloads the related flows, we wrongly call again into
mlx5e_tc_add_fdb_flow(), this for itself would cause us to handle
again the offloading of vlans and header re-write which puts things
in non consistent state and step on freed memory (e.g the modify
header parse buffer which is already freed).

Since on error, mlx5e_tc_add_fdb_flow() detaches and may release the
encap entry, it causes a corruption at the neigh update code which goes
over the list of flows associated with this encap entry, or double free
when the tc flow is later deleted by user-space.

When neigh update (mlx5e_tc_encap_flows_del()) unoffloads the flows related
to an encap entry which is now invalid, we do a partial repeat of the eswitch
flow removal code which is wrong too.

To fix things up we do the following:

(1) handle the encap action offload in the eswitch flow add function
    mlx5e_tc_add_fdb_flow() as done for the other actions and the rule itself.

(2) modify the neigh update code (mlx5e_tc_encap_flows_add/del) to only
    deal with the encap entry and rules delete/add and not with any of
    the other offloaded actions.

Fixes: 232c001398 ('net/mlx5e: Add support to neighbour update flow')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26 00:47:27 -07:00
Huy Nguyen 4ca637a20a net/mlx5: Delay events till mlx5 interface's add complete for pci resume
mlx5_ib_add is called during mlx5_pci_resume after a pci error.
Before mlx5_ib_add completes, there are multiple events which trigger
function mlx5_ib_event. This cause kernel panic because mlx5_ib_event
accesses unitialized resources.

The fix is to extend Erez Shitrit's patch <97834eba7c19>
("net/mlx5: Delay events till ib registration ends") to cover
the pci resume code path.

Trace:
mlx5_core 0001:01:00.6: mlx5_pci_resume was called
mlx5_core 0001:01:00.6: firmware version: 16.20.1011
mlx5_core 0001:01:00.6: mlx5_attach_interface:164:(pid 779):
mlx5_ib_event:2996:(pid 34777): warning: event on port 1
mlx5_ib_event:2996:(pid 34782): warning: event on port 1
Unable to handle kernel paging request for data at address 0x0001c104
Faulting instruction address: 0xd000000008f411fc
Oops: Kernel access of bad area, sig: 11 [#1]
...
...
Call Trace:
[c000000fff77bb70] [d000000008f4119c] mlx5_ib_event+0x64/0x470 [mlx5_ib] (unreliable)
[c000000fff77bc60] [d000000008e67130] mlx5_core_event+0xb8/0x210 [mlx5_core]
[c000000fff77bd10] [d000000008e4bd00] mlx5_eq_int+0x528/0x860[mlx5_core]

Fixes: 97834eba7c ("net/mlx5: Delay events till ib registration ends")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26 00:47:27 -07:00
Moshe Shemesh 6377ed0bba net/mlx5: Fix health work queue spin lock to IRQ safe
spin_lock/unlock of health->wq_lock should be IRQ safe.
It was changed to spin_lock_irqsave since adding commit 0179720d6b
("net/mlx5: Introduce trigger_health_work function") which uses
spin_lock from asynchronous event (IRQ) context.
Thus, all spin_lock/unlock of health->wq_lock should have been moved
to IRQ safe mode.
However, one occurrence on new code using this lock missed that
change, resulting in possible deadlock:
  kernel: Possible unsafe locking scenario:
  kernel:       CPU0
  kernel:       ----
  kernel:  lock(&(&health->wq_lock)->rlock);
  kernel:  <Interrupt>
  kernel:    lock(&(&health->wq_lock)->rlock);
  kernel: #012 *** DEADLOCK ***

Fixes: 2a0165a034 ("net/mlx5: Cancel delayed recovery work when unloading the driver")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-10-26 00:47:27 -07:00
Pieter Jansen van Vuuren d309ae5c6a nfp: refuse offloading filters that redirects to upper devices
Previously we did not ensure that a netdev is a representative netdev
before dereferencing its private data. This can occur when an upper netdev
is created on a representative netdev. This patch corrects this by first
ensuring that the netdev is a representative netdev before using it.
Checking only switchdev_port_same_parent_id is not sufficient to ensure
that we can safely use the netdev. Failing to check that the netdev is also
a representative netdev would result in incorrect dereferencing.

Fixes: 1a1e586f54 ("nfp: add basic action capabilities to flower offloads")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-26 10:07:14 +09:00
Antoine Tenart 082297e614 net: mvpp2: do not call txq_done from the Tx path when Tx irqs are used
When Tx IRQs are used, txq_bufs_free() can be called from both the Tx
path and from NAPI poll(). This led to CPU stalls as if these two tasks
(Tx and Poll) are scheduled on two CPUs at the same time, DMA unmapping
operations are done on the same txq buffers.

This patch adds a check not to call txq_done() from the Tx path if Tx
interrupts are used as it does not make sense to do so.

Fixes: edc660fa09 ("net: mvpp2: replace TX coalescing interrupts with hrtimer")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 18:34:10 +09:00
Antoine Tenart 2092026788 net: mvpp2: do not unmap TSO headers buffers
The TSO header buffers are coming from a per cpu pool and should not
be unmapped as they are reused. The PPv2 driver was unmapping all
descriptors buffers unconditionally. This patch fixes this by checking
the buffers dma addresses before unmapping them, and by not unmapping
those who are located in the TSO header pool.

Fixes: 186cd4d4e4 ("net: mvpp2: software tso support")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 18:34:09 +09:00
Yan Markman 822eaf7cfb net: mvpp2: fix TSO headers allocation and management
TSO headers are managed with txq index and therefore should be aligned
with the txq size, not with the aggregated txq size.

Fixes: 186cd4d4e4 ("net: mvpp2: software tso support")
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Yan Markman <ymarkman@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-24 18:34:09 +09:00
Bernd Edlinger 8d5f4b0717 stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
This is the possible reason for different hard to reproduce
problems on my ARMv7-SMP test system.

The symptoms are in recent kernels imprecise external aborts,
and in older kernels various kinds of network stalls and
unexpected page allocation failures.

My testing indicates that the trouble started between v4.5 and v4.6
and prevails up to v4.14.

Using the dirty_tx before acquiring the spin lock is clearly
wrong and was first introduced with v4.6.

Fixes: e3ad57c967 ("stmmac: review RX/TX ring management")

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 03:24:43 +01:00
Jose Abreu 9454360dec net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
Prevent infinite loop by correctly setting the loop condition to
break when i == 10.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:50:40 +01:00
Jose Abreu 98870943a5 net: stmmac: Fix stmmac_get_rx_hwtstamp()
When using GMAC4 the valid timestamp is from CTX next desc but
we are passing the previous desc to get_rx_timestamp_status()
callback.

Fix this and while at it rework a little bit the function logic.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:50:40 +01:00
Jose Abreu 9c8080d068 net: stmmac: Add missing call to dev_kfree_skb()
When RX HW timestamp is enabled and a frame is discarded we are
not freeing the skb but instead only setting to NULL the entry.

Add a call to dev_kfree_skb_any() so that skb entry is correctly
freed.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:50:40 +01:00
Petr Machata dcbda2820f mlxsw: spectrum_router: Configure TIGCR on init
Spectrum tunnels do not default to ttl of "inherit" like the Linux ones
do. Configure TIGCR on router init so that the TTL of tunnel packets is
copied from the overlay packets.

Fixes: ee954d1a91 ("mlxsw: spectrum_router: Support GRE tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:19:03 +01:00
Petr Machata 14aefd9011 mlxsw: reg: Add Tunneling IPinIP General Configuration Register
The TIGCR register is used for setting up the IPinIP Tunnel
configuration.

Fixes: ee954d1a91 ("mlxsw: spectrum_router: Support GRE tunnels")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 02:19:03 +01:00
Igor Russkikh 417a3ae4b1 net: aquantia: Bad udp rate on default interrupt coalescing
Default Tx rates cause very long ISR delays on Tx.
0xff is 510us delay, giving only ~ 2000 interrupts per seconds for
Tx rings cleanup. With these settings udp tx rate was never higher than
~800Mbps on a single stream. Changing min delay to 0xF makes it
way better with ~6Gbps

TCP stream performance is almost unaffected by this change, since LSO
optimizations play important role.

CPU load is affected insignificantly by this change.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh b82ee71a86 net: aquantia: Enable coalescing management via ethtool interface
Aquantia NIC allows both TX and RX interrupt throttle rate (ITR)
management, but this was used in a very limited way via predefined
values. This patch allows to setup ITR default values via module
command line arguments and via standard ethtool coalescing settings.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh 6849540adc net: aquantia: mmio unmap was not performed on driver removal
That may lead to mmio resource leakage.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh 4c8bb609d3 net: aquantia: Limit number of MSIX irqs to the number of cpus
There is no much practical use from having MSIX vectors more that number
of cpus, thus cap this first with preconfigured limit, then with number
of cpus online.

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh 93d87b8fbe net: aquantia: Fixed transient link up/down/up notification
When doing ifconfig down/up, driver did not reported carrier_off neither
in nic_stop nor in nic_start. That caused link to be visible as "up"
during couple of seconds immediately after "ifconfig up".

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh 5d8d84e91d net: aquantia: Add queue restarts stats counter
Queue stat strings are cleaned up, duplicate stat name strings removed,
queue restarts counter added

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Igor Russkikh 65e665e68d net: aquantia: Reset nic statistics on interface up/down
Internal statistics system on chip never gets reset until hardware
reboot. This is quite inconvenient in terms of ethtool statistics usage.

This patch implements incremental statistics update inside of
service callback.

Upon nic initialization, first request is done to fetch
initial stat data, current collected stat data gets cleared.
Internal statistics mailbox readout is improved to save space and
increase readability

Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-21 12:32:24 +01:00
Netanel Belgazal a59df39676 net: ena: fix wrong max Tx/Rx queues on ethtool
ethtool ena_get_channels() expose the max number of queues as the max
number of queues ENA supports (128 queues) and not the actual number
of created queues.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Netanel Belgazal 411838e7b4 net: ena: fix rare kernel crash when bar memory remap fails
This failure is rare and only found on testing where deliberately fail
devm_ioremap()

[  451.170464] ena 0000:04:00.0: failed to remap regs bar
451.170549] Workqueue: pciehp-1 pciehp_power_thread
[  451.170551] task: ffff88085a5f2d00 task.stack: ffffc9000756c000
[  451.170552] RIP: 0010:devm_iounmap+0x2d/0x40
[  451.170553] RSP: 0018:ffffc9000756fac0 EFLAGS: 00010282
[  451.170554] RAX: 00000000fffffffe RBX: 0000000000000000 RCX:
0000000000000000
[  451.170555] RDX: ffffffff813a7e00 RSI: 0000000000000282 RDI:
0000000000000282
[  451.170556] RBP: ffffc9000756fac8 R08: 00000000fffffffe R09:
00000000000009b7
[  451.170557] R10: 0000000000000005 R11: 00000000000009b6 R12:
ffff880856c9d0a0
[  451.170558] R13: ffffc9000f5c90c0 R14: ffff880856c9d0a0 R15:
0000000000000028
[  451.170559] FS:  0000000000000000(0000) GS:ffff88085f400000(0000)
knlGS:0000000000000000
[  451.170560] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  451.170561] CR2: 00007f169038b000 CR3: 0000000001c09000 CR4:
00000000003406f0
[  451.170562] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  451.170562] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  451.170563] Call Trace:
[  451.170572]  ena_release_bars.isra.48+0x34/0x60 [ena]
[  451.170574]  ena_probe+0x144/0xd90 [ena]
[  451.170579]  ? ida_simple_get+0x98/0x100
[  451.170585]  ? kernfs_next_descendant_post+0x40/0x50
[  451.170591]  local_pci_probe+0x45/0xa0
[  451.170592]  pci_device_probe+0x157/0x180
[  451.170599]  driver_probe_device+0x2a8/0x460
[  451.170600]  __device_attach_driver+0x7e/0xe0
[  451.170602]  ? driver_allows_async_probing+0x30/0x30
[  451.170603]  bus_for_each_drv+0x68/0xb0
[  451.170605]  __device_attach+0xdd/0x160
[  451.170607]  device_attach+0x10/0x20
[  451.170610]  pci_bus_add_device+0x4f/0xa0
[  451.170611]  pci_bus_add_devices+0x39/0x70
[  451.170613]  pciehp_configure_device+0x96/0x120
[  451.170614]  pciehp_enable_slot+0x1b3/0x290
[  451.170616]  pciehp_power_thread+0x3b/0xb0
[  451.170622]  process_one_work+0x149/0x360
[  451.170623]  worker_thread+0x4d/0x3c0
[  451.170626]  kthread+0x109/0x140
[  451.170627]  ? rescuer_thread+0x380/0x380
[  451.170628]  ? kthread_park+0x60/0x60
[  451.170632]  ret_from_fork+0x25/0x30

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Netanel Belgazal cd7aea1875 net: ena: reduce the severity of some printouts
Decrease log level of checksum errors as these messages can be
triggered remotely by bad packets.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Thomas Falcon 2de09681e4 ibmvnic: Fix calculation of number of TX header descriptors
This patch correctly sets the number of additional header descriptors
that will be sent in an indirect SCRQ entry.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:20:39 +01:00
Ido Schimmel d965465b60 mlxsw: core: Fix possible deadlock
When an EMAD is transmitted, a timeout work item is scheduled with a
delay of 200ms, so that another EMAD will be retried until a maximum of
five retries.

In certain situations, it's possible for the function waiting on the
EMAD to be associated with a work item that is queued on the same
workqueue (`mlxsw_core`) as the timeout work item. This results in
flushing a work item on the same workqueue.

According to commit e159489baa ("workqueue: relax lockdep annotation
on flush_work()") the above may lead to a deadlock in case the workqueue
has only one worker active or if the system in under memory pressure and
the rescue worker is in use. The latter explains the very rare and
random nature of the lockdep splats we have been seeing:

[   52.730240] ============================================
[   52.736179] WARNING: possible recursive locking detected
[   52.742119] 4.14.0-rc3jiri+ #4 Not tainted
[   52.746697] --------------------------------------------
[   52.752635] kworker/1:3/599 is trying to acquire lock:
[   52.758378]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c4fa4>] flush_work+0x3a4/0x5e0
[   52.767837]
               but task is already holding lock:
[   52.774360]  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.784495]
               other info that might help us debug this:
[   52.791794]  Possible unsafe locking scenario:
[   52.798413]        CPU0
[   52.801144]        ----
[   52.803875]   lock(mlxsw_core_driver_name);
[   52.808556]   lock(mlxsw_core_driver_name);
[   52.813236]
                *** DEADLOCK ***
[   52.819857]  May be due to missing lock nesting notation
[   52.827450] 3 locks held by kworker/1:3/599:
[   52.832221]  #0:  (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.842846]  #1:  ((&(&bridge->fdb_notify.dw)->work)){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0
[   52.854537]  #2:  (rtnl_mutex){+.+.}, at: [<ffffffff822ad8e7>] rtnl_lock+0x17/0x20
[   52.863021]
               stack backtrace:
[   52.867890] CPU: 1 PID: 599 Comm: kworker/1:3 Not tainted 4.14.0-rc3jiri+ #4
[   52.875773] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[   52.886267] Workqueue: mlxsw_core mlxsw_sp_fdb_notify_work [mlxsw_spectrum]
[   52.894060] Call Trace:
[   52.909122]  __lock_acquire+0xf6f/0x2a10
[   53.025412]  lock_acquire+0x158/0x440
[   53.047557]  flush_work+0x3c4/0x5e0
[   53.087571]  __cancel_work_timer+0x3ca/0x5e0
[   53.177051]  cancel_delayed_work_sync+0x13/0x20
[   53.182142]  mlxsw_reg_trans_bulk_wait+0x12d/0x7a0 [mlxsw_core]
[   53.194571]  mlxsw_core_reg_access+0x586/0x990 [mlxsw_core]
[   53.225365]  mlxsw_reg_query+0x10/0x20 [mlxsw_core]
[   53.230882]  mlxsw_sp_fdb_notify_work+0x2a3/0x9d0 [mlxsw_spectrum]
[   53.237801]  process_one_work+0x8f1/0x12f0
[   53.321804]  worker_thread+0x1fd/0x10c0
[   53.435158]  kthread+0x28e/0x370
[   53.448703]  ret_from_fork+0x2a/0x40
[   53.453017] mlxsw_spectrum 0000:01:00.0: EMAD retries (2/5) (tid=bf4549b100000774)
[   53.453119] mlxsw_spectrum 0000:01:00.0: EMAD retries (5/5) (tid=bf4549b100000770)
[   53.453132] mlxsw_spectrum 0000:01:00.0: EMAD reg access failed (tid=bf4549b100000770,reg_id=200b(sfn),type=query,status=0(operation performed))
[   53.453143] mlxsw_spectrum 0000:01:00.0: Failed to get FDB notifications

Fix this by creating another workqueue for EMAD timeouts, thereby
preventing the situation of a work item trying to flush a work item
queued on the same workqueue.

Fixes: caf7297e7a ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-18 12:19:15 +01:00
Sankar Patchineelam 5b1e1a9ce0 bnxt_en: Fix possible corruption in DCB parameters from firmware.
hwrm_send_message() is replaced with _hwrm_send_message(), and
hwrm_cmd_lock mutex lock is grabbed for the whole period of
firmware call until the firmware DCB parameters have been copied.
This will prevent possible corruption of the firmware data.

Fixes: 7df4ae9fe8 ("bnxt_en: Implement DCBNL to support host-based DCBX.")
Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Michael Chan cc72f3b1fe bnxt_en: Fix possible corrupted NVRAM parameters from firmware response.
In bnxt_find_nvram_item(), it is copying firmware response data after
releasing the mutex.  This can cause the firmware response data
to be corrupted if the next firmware response overwrites the response
buffer.  The rare problem shows up when running ethtool -i repeatedly.

Fix it by calling the new variant _hwrm_send_message_silent() that requires
the caller to take the mutex and to release it after the response data has
been copied.

Fixes: 3ebf6f0a09 ("bnxt_en: Add installed-package version reporting via Ethtool GDRVINFO")
Reported-by: Sarveswara Rao Mygapula <sarveswararao.mygapula@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Michael Chan 021570793d bnxt_en: Fix VF resource checking.
In bnxt_sriov_enable(), we calculate to see if we have enough hardware
resources to enable the requested number of VFs.  The logic to check
for minimum completion rings and statistics contexts is missing.  Add
the required checks so that VF configuration won't fail.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Vasundhara Volam 7ab0760f51 bnxt_en: Fix VF PCIe link speed and width logic.
PCIE PCIE_EP_REG_LINK_STATUS_CONTROL register is only defined in PF
config space, so we must read it from the PF.

Fixes: 90c4f788f6 ("bnxt_en: Report PCIe link speed and width during driver load")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Michael Chan e2dc9b6e38 bnxt_en: Don't use rtnl lock to protect link change logic in workqueue.
As a further improvement to the PF/VF link change logic, use a private
mutex instead of the rtnl lock to protect link change logic.  With the
new mutex, we don't have to take the rtnl lock in the workqueue when
we have to handle link related functions.  If the VF and PF drivers
are running on the same host and both take the rtnl lock and one is
waiting for the other, it will cause timeout.  This patch fixes these
timeouts.

Fixes: 90c694bb71 ("bnxt_en: Fix RTNL lock usage on bnxt_update_link().")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Michael Chan c213eae8d3 bnxt_en: Improve VF/PF link change logic.
Link status query firmware messages originating from the VFs are forwarded
to the PF.  The driver handles these interactions in a workqueue for the
VF and PF.  The VF driver waits for the response from the PF in the
workqueue.  If the PF and VF driver are running on the same host and the
work for both PF and VF are queued on the same workqueue, the VF driver
may not get the response if the PF work item is queued behind it on the
same workqueue.  This will lead to the VF link query message timing out.

To prevent this, we create a private workqueue for PFs instead of using
the common workqueue.  The VF query and PF response will never be on
the same workqueue.

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-14 18:51:51 -07:00
Emiliano Ingrassia b984986067 net: stmmac: dwmac_lib: fix interchanged sleep/timeout values in DMA reset function
The DMA reset timeout, used in read_poll_timeout, is
ten times shorter than the sleep time.
This patch fixes these values interchanging them, as it was
before the read_poll_timeout introduction.

Fixes: 8a70aeca80 ("net: stmmac: Use readl_poll_timeout")

Signed-off-by: Emiliano Ingrassia <ingrassia@epigenesys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13 10:19:52 -07:00
Arnd Bergmann e7ad97938e liquidio: fix timespec64_to_ns typo
While experimenting with changes to the timekeeping code, I
ran into a build error in the liquidio driver:

drivers/net/ethernet/cavium/liquidio/lio_main.c: In function 'liquidio_ptp_settime':
drivers/net/ethernet/cavium/liquidio/lio_main.c:1850:22: error: passing argument 1 of 'timespec_to_ns' from incompatible pointer type [-Werror=incompatible-pointer-types]

The driver had a type mismatch since it was first merged, but
this never caused problems because it is only built on 64-bit
architectures that define timespec and timespec64 to the same
type.

If we ever want to compile-test the driver on 32-bit or change
the way that 64-bit timespec64 is defined, we need to fix it,
so let's just do it now.

Fixes: f21fb3ed36 ("Add support of Cavium Liquidio ethernet adapters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-13 10:18:38 -07:00
Daniel Drake bde135a672 r8169: only enable PCI wakeups when WOL is active
rtl_init_one() currently enables PCI wakeups if the ethernet device
is found to be WOL-capable. There is no need to do this when
rtl8169_set_wol() will correctly enable or disable the same wakeup flag
when WOL is activated/deactivated.

This works around an ACPI DSDT bug which prevents the Acer laptop models
Aspire ES1-533, Aspire ES1-732, PackardBell ENTE69AP and Gateway NE533
from entering S3 suspend - even when no ethernet cable is connected.

On these platforms, the DSDT says that GPE08 is a wakeup source for
ethernet, but this GPE fires as soon as the system goes into suspend,
waking the system up immediately. Having the wakeup normally disabled
avoids this issue in the default case.

With this change, WOL will continue to be unusable on these platforms
(it will instantly wake up if WOL is later enabled by the user) but we
do not expect this to be a commonly used feature on these consumer
laptops. We have separately determined that WOL works fine without any
ACPI GPEs enabled during sleep, so a DSDT fix or override would be
possible to make WOL work.

Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-11 15:26:17 -07:00
Jakub Kicinski 5f0ca2fb71 nfp: handle page allocation failures
page_address() does not handle NULL argument gracefully,
make sure we NULL-check the page pointer before passing it
to page_address().

Fixes: ecd63a0217 ("nfp: add XDP support in the driver")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10 13:18:33 -07:00
Jakub Kicinski c3d64ad4fe nfp: fix ethtool stats gather retry
The while loop fetching 64 bit ethtool statistics may have
to retry multiple times, it shouldn't modify the outside state.

Fixes: 4c3523623d ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-10 13:18:33 -07:00