Currently the driver support only ethernet eswitch, and we want to
protect downstream IPoIB netdev from trying to access it in IB link.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to have different RX handler per profile, fix and refactor the
current code to take the rx handler directly from the netdevice profile
rather than computing it on runtime as it was done with the switchdev
mode representor rx handler.
This will also remove the current wrong assumption in mlx5e_alloc_rq
code that mlx5e_priv->ppriv is of the type vport_rep.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement mlx5e's IPoIB SKB transmit using the helper functions provided
by mlx5e ethernet tx flow, the only difference in the code between
mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill
(UD datagram segment) in the TX descriptor (WQE) and it doesn't need to
have any vlan handling.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Break current mlx5e xmit flow into smaller blocks (helper functions)
in order to reuse them for IPoIB SKB transmission.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create IPoIB underlay QP needed by the IPoIB netdevice profile for RSS
and TX HW context to perform on IPoIB traffic.
Reset the underlay QP on dev_uninit ndo to stop IPoIB traffic going
through this QP when the ULP IPoIB decides to cleanup.
Implement attach/detach mcast RDMA netdev callbacks for later RDMA
netdev use.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement open/close of IPoIB netdevice ndos using mlx5e's
channels API to manage data path resources (RQs/SQs/CQs).
Set IPoIB netdev address on dev_init ndo.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify mlx5e tis creation function to accept underlay qp number, which
will be needed by IPoIB.
Implement mlx5i (IPoIB) tx init/cleanup netdevice profile flows to
create one TIS with the IPoIB underlay qp, for IPoIB TX SQs.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like the mlx5e ethernet mode, on IPoIB mode we need to create RX steering
tables, but IPoIB do not require MAC and VLAN steering tables so the
only tables we create in here are:
1. TTC Table (Traffic Type Classifier table for RSS steering)
2. ARFS Table (for accelerated RFS support)
Creation of those tables is identical to mlx5e ethernet mode, hence the
use of mlx5e_create_ttc_table and mlx5e_arfs_create_tables.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement IPoIB RX RSS (RQTs and TIRs) HW objects creation,
All we do here is simply reuse the mlx5e implementation to create
direct and indirect (RSS) steering HW objects.
For that we just expose
mlx5e_{create,destroy}_{direct,indirect}_{rqt,tir} functions into en.h
and call them from ipoib.c in init/cleanup_rx IPoIB netdevice profile
callbacks.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create mlx5e IPoIB netdevice profile skeleton in the new ipoib.c
file with empty implementation.
Downstream patches will provide the full mlx5 rdma netdevice acceleration
support for IPoIB into this new file, by using the mlx5e netdevice
profile and new mlx5_channels APIs and infrastructures.
Same as already done in mlx5e NIC netdevice and switchdev mode VF
representors.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for mlx5e RDMA net_device support, here we generalize
mlx5e_attach/detach in a way that those functions will be agnostic
to link type. For that we move ethernet specific NIC net device logic out
of those functions into {nic,rep}_{enable/disable} mlx5e NIC and
representor profiles callbacks.
Also some of the logic was moved only to NIC profile since it is not right
to have this logic for representor net device (e.g. set port MTU).
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get the relevant capabilities if supports ipoib_enhanced_offloads and
init the flow steering table accordingly.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IB flow tables need the underlay qp to perform flow steering.
Here we change the API of the flow tables creation to accept the
underlay QP number as a parameter in order to support IB (IPoIB) flow
steering.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Azure hosts are not supporting non-TCP port numbers in vRSS hashing for
now. For example, UDP packet loss rate will be high if port numbers are
also included in vRSS hash.
So, we created this patch to use only IP numbers for hashing in non-TCP
traffic.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the outgoing skb has a RX queue mapping available, we use the queue
number directly, other than put it through Send Indirection Table.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts were simply overlapping changes. In the net/ipv4/route.c
case the code had simply moved around a little bit and the same fix
was made in both 'net' and 'net-next'.
In the net/sched/sch_generic.c case a fix in 'net' happened at
the same time that a new argument was added to qdisc_hash_add().
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Things seem to be settling down as far as networking is concerned,
let's hope this trend continues...
1) Add iov_iter_revert() and use it to fix the behavior of
skb_copy_datagram_msg() et al., from Al Viro.
2) Fix the protocol used in the synthetic SKB we cons up for the
purposes of doing a simulated route lookup for RTM_GETROUTE
requests. From Florian Larysch.
3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong
Wang.
4) Don't call netdev_change_features with the team lock held, from
Xin Long.
5) Revert TCP F-RTO extension to catch more spurious timeouts because
it interacts very badly with some middle-boxes. From Yuchung
Cheng.
6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from
Guillaume Nault.
7) ctnetlink uses bit positions where it should be using bit masks,
fix from Liping Zhang.
8) Missing RCU locking in netfilter helper code, from Gao Feng.
9) Avoid double frees and use-after-frees in tcp_disconnect(), from
Eric Dumazet.
10) Don't do a changelink before we register the netdevice in
bridging, from Ido Schimmel.
11) Lock the ipv6 device address list properly, from Rabin Vincent"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
netfilter: nft_hash: do not dump the auto generated seed
drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
ipv6: Fix idev->addr_list corruption
net: xdp: don't export dev_change_xdp_fd()
bridge: netlink: register netdevice before executing changelink
bridge: implement missing ndo_uninit()
bpf: reference may_access_skb() from __bpf_prog_run()
tcp: clear saved_syn in tcp_disconnect()
netfilter: nf_ct_expect: use proper RCU list traversal/update APIs
netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL
netfilter: make it safer during the inet6_dev->addr_list traversal
netfilter: ctnetlink: make it safer when checking the ct helper name
netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find
netfilter: ctnetlink: using bit to represent the ct event
netfilter: xt_TCPMSS: add more sanity tests on tcph->doff
net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb
l2tp: don't mask errors in pppol2tp_getsockopt()
l2tp: don't mask errors in pppol2tp_setsockopt()
tcp: restrict F-RTO to work-around broken middle-boxes
...
virtio pci rework using shared interrupts caused a lot of issues. We
tried to fix them but run out of time. Revert for now, and revisit the
issue for the next kernel.
Luckily we are able to do this without loosing automatic
interrupt NUMA affinity which was the main motivator for the
rework.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJY6/oVAAoJECgfDbjSjVRpiDQH/3WL4zujwShOmEFSaUkka+BK
+Il64oVliZk1BMsMTqLsFYGqJtSlqOkQzWkQ2hyPwS9/U4pBzPZ4eJZCng/245YK
5NsT51/m8x3mjRATh0fPqsAwz8CdkWfMpwLYBS6V73RB1XCTVB4IV9vVk6g922oe
dkKlq6s3XvBqBJD02CkV1ApAYFyozF8ppyWdt7F/MsM9HdpM8uWR9F5fh/qDizbZ
ifPUkTSk8BcFzyZ57P/9rdn+cTpPY4PeKIurKwttCGFRm9++5a6RdIwP+zQm7ypC
LaI9StOj8ixloWjhS2eETMi/qLFkwf93gVFhRWhQzIetkjgqZoRIbcg+iLsi6uU=
=W6NP
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael S. Tsirkin:
"virtio oops fixes
The virtio pci rework using shared interrupts caused a lot of issues.
We tried to fix them but run out of time. Revert for now, and revisit
the issue for the next kernel.
Luckily we are able to do this without loosing automatic interrupt
NUMA affinity which was the main motivator for the rework"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-pci: Remove affinity hint before freeing the interrupt
Revert "virtio_pci: remove struct virtio_pci_vq_info"
Revert "virtio_pci: use shared interrupts for virtqueues"
Revert "virtio_pci: don't duplicate the msix_enable flag in struct pci_dev"
Revert "virtio_pci: simplify MSI-X setup"
Revert "virtio_pci: fix out of bound access for msix_names"
MAINTAINERS: fix virtio file pattern
virtio_console: fix uninitialized variable use
virtio_net: clear MTU when out of range
virtio: allow drivers to validate features
virtio_net: enable big packets for large MTU values
Pass the new extended ACK reporting struct to all of the generic
netlink parsing functions. For now, pass NULL in almost all callers
(except for some in the core.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When link transitions from LINK_FAIL to LINK_UP, the commit phase is
not called. This leads to an erroneous state causing slave-link state to
get stuck in "going down" state while its speed and duplex are perfectly
fine. This issue is a side-effect of splitting link-set into propose and
commit phases introduced by de77ecd4ef ("bonding: improve link-status
update in mii-monitoring")
This patch fixes these issues by calling commit phase whenever link
state change is proposed.
Fixes: de77ecd4ef ("bonding: improve link-status update in mii-monitoring")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is necessary to provide ethtool support for displaying and
modifying parameters of dwc-xlgmac.
Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Field FL/TPL in register TDES3 is not correctly set on GMAC4.
TX appears to be functional on GMAC 4.10a even if this field is not set,
however, to avoid relying on undefined behavior, set the length in TDES3.
The field has a different meaning depending on if the TSE bit in TDES3
is set or not (TSO). However, regardless of the TSE bit, the field is
not optional. The field is already set correctly when the TSE bit is set.
Since there is no limit for the number of descriptors that can be
used for a single packet, the field should be set to the sum of
the buffers contained in:
[<desc with First Descriptor bit set> ... <desc n> ...
<desc with Last Descriptor bit set>], which should be equal to skb->len.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save the filter tid while creating the server filter, which is used
later to retrieve the corresponding filter instance while handling
the filter reply.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Telit LE920A4 uses the same pid 0x1201 of LE920, but modem
implementation is different, since it requires DTR to be set for
answering to qmi messages.
This patch replaces QMI_FIXED_INTF with QMI_QUIRK_SET_DTR: tests on
LE920 have been performed in order to verify backward compatibility.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow up to three clocks to be specified and enabled for the orion-mdio
interface, which are required for this interface to be accessible on
Armada 8k platforms.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disable the MDIO interrupt, falling back to polled mode, if the resource
size does not allow us to access the interrupt registers. All current
DT bindings use a size of 0x84, which allows access, but verifying it is
good practice.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pre-existing write to disable interrupts on the remove path happens
whether we have an interrupt or not. While this may seem to be a good
idea, this driver is re-used in many different implementations, some
where the binding only specifies four bytes of register space. This
access causes us to access registers outside of the binding.
Make it conditional on the interrupt being present, which is the same
condition used when enabling the interrupt in the first place.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the mvmdio driver has an interrupt, it enables the "done" interrupt
after requesting its interrupt handler. However, probe failure results
in the interrupt being left enabled. Disable it on the failure path.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
I haven't seen any improvement above that size on the machines
I've tested with.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We set an arbitrary max at 1024 since we pre-allocate the actual
descriptor arrays and skb arrays to the full size to keep the
code a bit simpler and avoid allocation failures in the reset
task.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear stale interrupts on entry, configure FIFO sizes, set FIFO
thresholds, configure interrupt mitigation.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The helpers just take space but don't provide much value. Simple
one line comments are more explanatory.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To remove more confusion. This function is about obtaining the
initial MAC address at driver probe time.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid confusion with the ndo callback and generally be
clearer about the purpose of that function
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
So features can be turned on/off via ethtool
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We found out that HW checksum generation only works from AST2500
onward. This disables it on AST2400 and removes the "no-hw-checksum"
properties in the device-trees. The problem we had wasn't related
to NC-SI.
Also rework the logic testing for that property so it can be used
to disable HW checksum generation and checking regardless of whether
NC-SI is used or not in case other variants out there need this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We test for aspeed chips to handle a couple of special cases,
but we do that by checking the machine type which isn't right.
Instead check the actual device compatible property. This also
updates the dtsi files for the aspeed SoC to match.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The documentation describes NETIF_F_IP_CSUM as deprecated
so let's switch to NETIF_F_HW_CSUM and use the helper to
handle unhandled protocols.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the submission of the lastest multiple buffer patch set, this fix was lost.
I am sending this patch to put it right again. The fix was originally proposed
by Arnd Bergmann.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The errata ERR007885 HW fix don't add to i.MX6ul ENET IP version,
so add sw workaroud for the chip.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the errata number ERR006358 comment typo.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many boards use i2c/spi expander gpio as phy-reset-gpios and these
gpios maybe registered after fec port, driver should check the return
value of .of_get_named_gpio().
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In aarch64 system, it requires to trasfer ->dev to dma_alloc_coherent()
API, otherwise allocate failed and print kernel warning.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In aarch64 system, the BD pointer is 64bit, and the high-order 32-bits
of the address is effective, so replace usigned with (void *) type to
aovid 64bit address is casted to 32bit in .fec_enet_get_nextdesc() and
.fec_enet_get_prevdesc() functions.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add return value check after calling .of_property_read_u32() to avoid
the warning reported by coverity.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial conversion as only one vector is supported, but at least we
lose the useless msix_entry member in the per-device structure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove deprecated pci_enable_msix API in favour of its
successor pci_alloc_irq_vectors.
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the deprecated pci_enable_msix API in favour of its successor.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>