Replace the representor private data to a net_device pointer holding the
representor netdevice, instead of void pointer holding mlx5e_priv.
It will be used by a new eswitch service function, returning the uplink representor
netdevice.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF Representor udp tunnel ndo entries were removed by mistake,
return them.
Fixes: 370bad0f9a ('net/mlx5e: Support HW (offloaded) and SW counters for SRIOV switchdev mode')
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to support hardware offloading when the device given by the tc
rule is different from the Hardware underline device, extract the mirred
(egress) device from the tc action when a filter is added, using the new
tc_action_ops, get_dev().
Flower caches the information about the mirred device and use it for
calling ndo_setup_tc in filter change, update stats and delete.
Calling ndo_setup_tc of the mirred (egress) device instead of the
ingress device will allow a resolution between the software ingress
device and the underline hardware device.
The resolution will take place inside the offloading driver using
'egress_device' flag added to tc_to_netdev struct which is provided to
the offloading driver.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding support to a new tc_action_ops.
get_dev is a general option which allows to get the underline
device when trying to offload a tc rule.
In case of mirred action the returned device is the mirred (egress)
device.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of providing many arguments to fl_hw_{replace/destroy}_filter
functions, just provide cls_fl_filter struct that includes all the relevant
args.
This patches doesn't add any new functionality.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check skip_hw flag isn't set before calling
fl_hw_{replace/destroy}_filter and fl_hw_update_stats functions.
Replace the call to tc_should_offload with tc_can_offload.
tc_can_offload only checks if the device supports offloading, the check for
skip_hw flag is done earlier in the flow.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Creating a difference between two possible cases:
1. Not offloading tc rule since the user sets 'skip_hw' flag.
2. Not offloading tc rule since the device doesn't support offloading.
This patch doesn't add any new functionality.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric says: "By looking at tcpdump, and TS val of xmit packets of multiple
flows, we can deduct the relative qdisc delays (think of fq pacing).
This should work even if we have one flow per remote peer."
Having random per flow (or host) offsets doesn't allow that anymore so add
a way to turn this off.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
jiffies based timestamps allow for easy inference of number of devices
behind NAT translators and also makes tracking of hosts simpler.
commit ceaa1fef65 ("tcp: adding a per-socket timestamp offset")
added the main infrastructure that is needed for per-connection ts
randomization, in particular writing/reading the on-wire tcp header
format takes the offset into account so rest of stack can use normal
tcp_time_stamp (jiffies).
So only two items are left:
- add a tsoffset for request sockets
- extend the tcp isn generator to also return another 32bit number
in addition to the ISN.
Re-use of ISN generator also means timestamps are still monotonically
increasing for same connection quadruple, i.e. PAWS will still work.
Includes fixes from Eric Dumazet.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Rangankar says:
====================
Add QLogic FastLinQ iSCSI (qedi) driver.
This series introduces hardware offload iSCSI initiator driver for the
41000 Series Converged Network Adapters (579xx chip) by Qlogic. The overall
driver design includes a common module ('qed') and protocol specific
dependent modules ('qedi' for iSCSI).
This is an open iSCSI driver, modifications to open iSCSI user components
'iscsid', 'iscsiuio', etc. are required for the solution to work. The user
space changes are also in the process of being submitted.
https://groups.google.com/forum/#!forum/open-iscsi
The 'qed' common module, under drivers/net/ethernet/qlogic/qed/, is
enhanced with functionality required for the iSCSI support. This series
is based on:
net tree base: Merge of net and net-next as of 11/29/2016
Changes from RFC v2:
1. qedi patches are squashed into single patch to prevent krobot
warning.
2. Fixed 'hw_p_cpuq' incompatible pointer type.
3. Fixed sparse incompatible types in comparison expression.
4. Misc fixes with latest 'checkpatch --strict' option.
5. Remove int_mode option from MODULE_PARAM.
6. Prefix all MODULE_PARAM params with qedi_*.
7. Use CONFIG_QED_ISCSI instead of CONFIG_QEDI
8. Added bad task mem access fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds out of order packet handling for hardware offloaded
iSCSI. Out of order packet handling requires driver buffer allocation
and assistance.
Signed-off-by: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the backbone required for the various HW initalizations
which are necessary for the iSCSI driver (qedi) for QLogic FastLinQ
4xxxx line of adapters - FW notification, resource initializations, etc.
Signed-off-by: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Yuval Mintz <yuval.mintz@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit ae148b0858
("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()").
skb->protocol is now set in __ip_local_out() and __ip6_local_out() before
dst_output() is called. It is no longer necessary to do it for each tunnel.
Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When xfrm is applied to TSO/GSO packets, it follows this path:
xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
where skb_gso_segment() relies on skb->protocol to function properly.
This patch sets skb->protocol to ETH_P_IPV6 before dst_output() is called,
fixing a bug where GSO packets sent through an ipip6 tunnel are dropped
when xfrm is involved.
Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When xfrm is applied to TSO/GSO packets, it follows this path:
xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
where skb_gso_segment() relies on skb->protocol to function properly.
This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
fixing a bug where GSO packets sent through a sit tunnel are dropped
when xfrm is involved.
Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When packet_set_ring creates a ring buffer it will initialize a
struct timer_list if the packet version is TPACKET_V3. This value
can then be raced by a different thread calling setsockopt to
set the version to TPACKET_V1 before packet_set_ring has finished.
This leads to a use-after-free on a function pointer in the
struct timer_list when the socket is closed as the previously
initialized timer will not be deleted.
The bug is fixed by taking lock_sock(sk) in packet_setsockopt when
changing the packet version while also taking the lock at the start
of packet_set_ring.
Fixes: f6fb8f100b ("af-packet: TPACKET_V3 flexible buffer implementation.")
Signed-off-by: Philip Pettersson <philip.pettersson@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All architectures avoid memory corruption in an error path.
ARM prevents bogus acknowledgement of interrupts.
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJYQaDPAAoJEED/6hsPKofoq8gH/iJR/fcYg1ovboEaDIDdm/PI
XzbNgrYZID8Nk04chU2Dh1eD8k3DG64txuOEs+jf3XBPYNnU8TAlw6qHVMG6kzGJ
zA0CLGgH62DKXLuvnDJ75mpiJmzioGd4hdk0G8CIb9W2ySSUgrcmXMI3AoVP44lY
LKTCITKq6ePfQ7AIbd3a6YXaR0ZTNP52e1Y4vx+Hsl9WcrMUGKyCmd9IcDI9DrZr
ahMn+wx3Wzvb/NzH25OYkMAC9X5C7+b6O0IZm0ie8F8iU+JLlgGNiAHxQ5yAbu28
hzINTTUnwIxgoi/ZN0M8i+fo0RLKq5OCzPMTnUUgdloBL786XREW2t3Ca0kqavg=
=XdK2
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"All architectures avoid memory corruption in an error path. ARM
prevents bogus acknowledgement of interrupts"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: use after free in kvm_ioctl_create_device()
KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs
Pull i2c fix from Wolfram Sang:
"Here is the revert for the regression of the i2c-octeon driver I
mentioned last time. I wished for a bit more feedback, but all people
working actively on it are in need of this patch, so here it goes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: octeon: thunderx: Limit register access retries"
The driver already uses its private lock for synchronization between xmit
and xmit completion handler making the additional use of the xmit_lock
unnecessary.
Furthermore the driver does not set NETIF_F_LLTX resulting in xmit to be
called with the xmit_lock held and then taking the private lock while xmit
completion handler does the reverse, first take the private lock, then the
xmit_lock.
Fix these issues by not taking the xmit_lock in the tx completion handler.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
An explicit dma sync for device directly after mapping as well as an
explicit dma sync for cpu directly before unmapping is unnecessary and
costly on the hotpath. So remove these calls.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is already using the %pM printf extension; might as well also use
%ph to make the code smaller.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When CONFIG_INET is disabled, the new selftest results in a link
error:
drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.o: In function `mlx5e_test_loopback':
en_selftest.c:(.text.mlx5e_test_loopback+0x2ec): undefined reference to `ip_send_check'
en_selftest.c:(.text.mlx5e_test_loopback+0x34c): undefined reference to `udp4_hwcsum'
This hides the specific test in that configuration.
Fixes: 0952da791c ("net/mlx5e: Add support for loopback selftest")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
With binutils-2.26 and before, a weak missing symbol was kept during the
final link, and a missing CRC for an export would lead to that CRC being
treated as zero implicitly. With binutils-2.27, the crc symbol gets
dropped, and any module trying to use it will fail to load.
This sets the weak CRC symbol to zero explicitly, making it defined in
vmlinux, which in turn lets us load the modules referring to that CRC.
The comment above the __CRC_SYMBOL macro suggests that this was always
the intention, although it also seems that all symbols defined in C have
a correct CRC these days, and only the exports that are now done in
assembly need this.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Adam Borowski <kilobyte@angband.pl>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The core and the cluster sleep state entry latencies can't be same as
cluster sleep involves more work compared to core level e.g. shared
cache maintenance.
Experiments have shown on an average about 100us more latency for the
cluster sleep state compared to the core level sleep. This patch fixes
the entry latency for the cluster sleep state.
Fixes: 28e10a8f3a ("arm64: dts: juno: Add idle-states to device tree")
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: "Jon Medhurst (Tixy)" <tixy@linaro.org>
Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
After 326fe02d1e ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"),
the rcu_read_lock() in bpf_prog_run_xdp() is superfluous, since callers
need to hold rcu_read_lock() already to make sure BPF program doesn't
get released in the background.
Thus, drop it from bpf_prog_run_xdp(), as it can otherwise be misleading.
Still keeping the bpf_prog_run_xdp() is useful as it allows for grepping
in XDP supported drivers and to keep the typecheck on the context intact.
For mlx4, this means we don't have a double rcu_read_lock() anymore. nfp can
just make use of bpf_prog_run_xdp(), too. For qede, just move rcu_read_lock()
out of the helper. When the driver gets atomic replace support, this will
move to call-sites eventually.
mlx5 needs actual fixing as it has the same issue as described already in
326fe02d1e ("net/mlx4_en: protect ring->xdp_prog with rcu_read_lock"),
that is, we're under RCU bh at this time, BPF programs are released via
call_rcu(), and call_rcu() != call_rcu_bh(), so we need to properly mark
read side as programs can get xchg()'ed in mlx5e_xdp_set() without queue
reset.
Fixes: 86994156c7 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only when ICMP packets are enqueued onto the error queue,
sk_err is also set. Before f5f99309fa (sock: do not set sk_err
in sock_dequeue_err_skb), a subsequent error queue read
would set sk_err to the next error on the queue, or 0 if empty.
As no error types other than ICMP set this field, sk_err should
not be modified upon dequeuing them.
Only for ICMP errors, reset the (racy) sk_err. Some applications,
like traceroute, rely on it and go into a futile busy POLLERR
loop otherwise.
In principle, sk_err has to be set while an ICMP error is queued.
Testing is_icmp_err_skb(skb_next) approximates this without
requiring a full queue walk. Applications that receive both ICMP
and other errors cannot rely on this legacy behavior, as other
errors do not set sk_err in the first place.
Fixes: f5f99309fa (sock: do not set sk_err in sock_dequeue_err_skb)
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf says:
====================
bpf: BPF for lightweight tunnel encapsulation
This series implements BPF program invocation from dst entries via the
lightweight tunnels infrastructure. The BPF program can be attached to
lwtunnel_input(), lwtunnel_output() or lwtunnel_xmit() and see an L3
skb as context. Programs attached to input and output are read-only.
Programs attached to lwtunnel_xmit() can modify and redirect, push headers
and redirect packets.
The facility can be used to:
- Collect statistics and generate sampling data for a subset of traffic
based on the dst utilized by the packet thus allowing to extend the
existing realms.
- Apply additional per route/dst filters to prohibit certain outgoing
or incoming packets based on BPF filters. In particular, this allows
to maintain per dst custom state across multiple packets in BPF maps
and apply filters based on statistics and behaviour observed over time.
- Attachment of L2 headers at transmit where resolving the L2 address
is not required.
- Possibly many more.
v3 -> v4:
- Bumped LWT_BPF_MAX_HEADROOM from 128 to 256 (Alexei)
- Renamed bpf_skb_push() helper to bpf_skb_change_head() to relate to
existing bpf_skb_change_tail() helper (Alexei/Daniel)
- Added check in __bpf_redirect_common() to verify that program added a
link header before redirecting to a l2 device. Adding the check to
lwt-bpf code was considered but dropped due to massive code required
due to retrieval of net_device via per-cpu redirect buffer. A test
case was added to cover the scenario when a program directs to an l2
device without adding an appropriate l2 header.
(Alexei)
- Prohibited access to tc_classid (Daniel)
- Collapsed bpf_verifier_ops instance for lwt in/out as they are
identical (Daniel)
- Some cosmetic changes
v2 -> v3:
- Added real world sample lwt_len_hist_kern.c which demonstrates how to
collect a histogram on packet sizes for all packets flowing through
a number of routes.
- Restricted output to be read-only. Since the header can no longer
be modified, the rerouting functionality has been removed again.
- Added test case which cover destructive modification of packet data.
v1 -> v2:
- Added new BPF_LWT_REROUTE return code for program to indicate
that new route lookup should be performed. Suggested by Tom.
- New sample to illustrate rerouting
- New patch 05: Recursion limit for lwtunnel_output for the case
when user creates circular dst redirection. Also resolves the
issue for ILA.
- Fix to ensure headroom for potential future L2 header is still
guaranteed
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Registers new BPF program types which correspond to the LWT hooks:
- BPF_PROG_TYPE_LWT_IN => dst_input()
- BPF_PROG_TYPE_LWT_OUT => dst_output()
- BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit()
The separate program types are required to differentiate between the
capabilities each LWT hook allows:
* Programs attached to dst_input() or dst_output() are restricted and
may only read the data of an skb. This prevent modification and
possible invalidation of already validated packet headers on receive
and the construction of illegal headers while the IP headers are
still being assembled.
* Programs attached to lwtunnel_xmit() are allowed to modify packet
content as well as prepending an L2 header via a newly introduced
helper bpf_skb_change_head(). This is safe as lwtunnel_xmit() is
invoked after the IP header has been assembled completely.
All BPF programs receive an skb with L3 headers attached and may return
one of the following error codes:
BPF_OK - Continue routing as per nexthop
BPF_DROP - Drop skb and return EPERM
BPF_REDIRECT - Redirect skb to device as per redirect() helper.
(Only valid in lwtunnel_xmit() context)
The return codes are binary compatible with their TC_ACT_
relatives to ease compatibility.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
A route on the output path hitting a RTN_LOCAL route will keep the dst
associated on its way through the loopback device. On the receive path,
the dst_input() call will thus invoke the input handler of the route
created in the output path. Thus, lwt redirection for input must be done
for dsts allocated in the otuput path as well.
Also, if a route is cached in the input path, the allocated dst should
respect lwtunnel configuration on the nexthop as well.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
orig_output for IPv4 was only set for dsts which hit an input route.
Set it consistently for locally generated traffic as well to allow
lwt to continue the dst_output() path as configured by the nexthop.
Fixes: 2536862311 ("lwt: Add support to redirect dst.input")
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed says:
====================
Mellanox 100G mlx5 updates 2016-11-29
The following series from Tariq and Roi, provides some critical fixes
and updates for the mlx5e driver.
From Tariq:
- Fix driver coherent memory huge allocation issues by fragmenting
completion queues, in a way that is transparent to the netdev driver by
providing a new buffer type "mlx5_frag_buf" with the same access API.
- Create UMR MKey per RQ to have better scalability.
From Roi:
- Some fixes for the encap-decap support and tc flower added lately to the
mlx5e driver.
v1->v2:
- Fix start index in error flow of mlx5_frag_buf_alloc_node, pointed out by Eric.
This series was generated against commit:
31ac1c1945 ("geneve: fix ip_hdr_len reserved for geneve6 tunnel.")
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Handling flow encap entry should be inside tc del flow
and is only relevant for offloaded eswitch TC rules.
Fixes: 11a457e9b6c1 ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the function that deletes offloaded TC rule to get
struct mlx5e_tc_flow instance which contains both the flow
handle and flow attributes. This is a cleanup needed for
downstream patches, it doesn't change any functionality.
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the reverse unwinding principle, on delete time we should
first handle deletion of the steering rule and later handle the vlan
deletion from the eswitch.
Fixes: 8b32580df1 ("net/mlx5e: Add TC vlan action for SRIOV offloads")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will never find a flow with the same cookie as cls_flower always
allocates a new flow and the cookie is the allocated memory address.
Fixes: e3a2b7ed01 ("net/mlx5e: Support offload cls_flower with drop action")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In Striding RQ implementation, we used a single UMR
(User-Mode Memory Registration) memory key for all RQs.
When the product of RQs number*size gets high, we hit a
limitation of u16 field size in FW.
Here we move to using a UMR memory key per RQ, so we can
scale to any number of rings, with the maximum buffer
size in each.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In next patch we are going to create a UMR MKey per RQ, we need
mlx5e_create_umr_mkey declared before mlx5e_create_rq.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new type of struct mlx5_frag_buf which is used to allocate fragmented
buffers rather than contiguous, and make the Completion Queues (CQs) use
it as they are big (default of 2MB per CQ in Striding RQ).
This fixes the failures of type:
"mlx5e_open_locked: mlx5e_open_channels failed, -12"
due to dma_zalloc_coherent insufficient contiguous coherent memory to
satisfy the driver's request when the user tries to setup more or larger
rings.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hovold says:
====================
net: stmmac: fix probe error handling and phydev leaks
This series fixes a number of issues with the stmmac-driver probe error
handling, which for example left clocks enabled after probe failures.
The final patch fixes a failure to deregister and free any fixed-link
PHYs that were registered during probe on probe errors and on driver
unbind. It also fixes a related of-node leak on late probe errors.
This series depends on the of_phy_deregister_fixed_link() helper that
was just merged to net.
As mentioned earlier, one staging driver also suffers from a similar
leak and can be fixed up once the above mentioned helper hits mainline.
Note that these patches have only been compile tested.
====================
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to deregister and free any fixed-link phy registered during
probe on probe errors and on driver unbind by adding a new glue helper
function.
Drop the of-node reference taken in the same path also on late probe
errors (and not just on driver unbind) by moving the put from
stmmac_dvr_remove() to the new helper.
Fixes: 277323814e ("stmmac: add fixed-link device-tree support")
Fixes: 4613b279be ("ethernet: stmicro: stmmac: add missing of_node_put
after calling of_parse_phandle")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the OF-helper function header to reflect that the function no longer
has a platform-data parameter.
Fixes: b0003ead75 ("stmmac: make stmmac_probe_config_dt return the
platform data struct")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to disable clocks before returning on late probe errors.
Fixes: 566e825162 ("net: stmmac: add a glue driver for the Amlogic
Meson 8b / GXBB DWMAC")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to call any exit() callback to undo the effect of init()
before returning on late probe errors.
Fixes: cf3f047b9a ("stmmac: move hw init in the probe (v2)")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to disable runtime PM, power down the PHY, and disable clocks
before returning on late probe errors.
Fixes: 27ffefd2d1 ("stmmac: dwmac-rk: create a new probe function")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to disable clocks before returning on late probe errors.
Fixes: 8387ee21f9 ("stmmac: dwmac-sti: turn setup callback into a
probe function")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to call stmmac_dvr_remove() before returning on late probe
errors so that memory is freed, clocks are disabled, and the netdev is
deregistered before its resources go away.
Fixes: 3c201b5a84 ("net: stmmac: socfpga: Remove re-registration of
reset controller")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neill Whillans says:
====================
net: Add support for SGMII PCS on Altera TSE MAC
These patches were created as part of work to add support for SGMII
PCS functionality to the Altera TSE MAC. Patches are based on 4.9-rc6
git tree.
The first patch in the series adds support for the VSC8572 dual-port
Gigabit Ethernet transceiver, used in integration testing.
The second patch adds support for the SGMII PCS functionality to the
Altera TSE driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the (optional) SGMII PCS functionality of the Altera
TSE MAC. If the phy-mode is set to 'sgmii' then we attempt to discover
and initialise the PCS so that the MAC can communicate to the PHY.
The PCS IP block provides a scratch register for testing presence of
the PCS, which is mapped into one of the two MDIO spaces present in
the MAC's register space. Once we have determined that the scratch
register is functioning, we attempt to initialise the PCS to
auto-negotiate an SGMII link with the PHY. There is no need to monitor
or manage the SGMII link beyond this, since the normal PHY MDIO will
then be used to monitor the media layer.
The Altera TSE MAC has only one way in which it can be configured with an
SGMII PCS, and as such, this patch only looks to the phy-mode to select
whether or not to attempt to initialise the PCS registers. During
initialisation, we report the PCS's equivalent of a PHY ID register.
This can be parameterised during the IP instantiation and is often left
as '0x00000000' which is not an error.
Signed-off-by: Neill Whillans <neill.whillans@codethink.co.uk>
Reviewed-by: Daniel Silverstone <daniel.silverstone@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the Vitesse VSC8572 which is functionally equivalent to
the already supported VSC8574. As such, all the same handling functions
are used since the VSC8572 merely has half the number of phy blocks
internally.
Signed-off-by: Stephen Agate <stephen.agate@uk.thalesgroup.com>
Signed-off-by: Neill Whillans <neill.whillans@codethink.co.uk>
Reviewed-by: Daniel Silverstone <daniel.silverstone@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>