Commit Graph

827343 Commits

Author SHA1 Message Date
Anirudh Venkataramanan 7b9ffc76bf ice: Add code for DCB initialization part 3/4
This patch adds a new function ice_pf_dcb_cfg (and related helpers)
which applies the DCB configuration obtained from the firmware. As
part of this, VSIs/netdevs are updated with traffic class information.

This patch requires a bit of a refactor of existing code.

1. For a MIB change event, the associated VSI is closed and brought up
   again. The gap between closing and opening the VSI can cause a race
   condition. Fix this by grabbing the rtnl_lock prior to closing the
   VSI and then only free it after re-opening the VSI during a MIB
   change event.

2. ice_sched_query_elem is used in ice_sched.c and with this patch, in
   ice_dcb.c as well. However, ice_dcb.c is not built when CONFIG_DCB is
   unset. This results in namespace warnings (ice_sched.o: Externally
   defined symbols with no external references) when CONFIG_DCB is unset.
   To avoid this move ice_sched_query_elem from ice_sched.c to
   ice_common.c.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan 0ebd3ff13c ice: Add code for DCB initialization part 2/4
This patch introduces a new top level function ice_init_dcb (and
related lower level helper functions) which continues the DCB init
flow.

This function uses ice_get_dcb_cfg to get, parse and store the DCB
configuration. Once this is done, it sets itself up to be notified
by the firmware on LLDP MIB change events.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan 37b6f6469f ice: Add code for DCB initialization part 1/4
This patch introduces a skeleton for ice_init_pf_dcb, the top level
function for DCB initialization. Subsequent patches will add to this
DCB init flow.

In this patch, ice_init_pf_dcb checks if DCB is a supported capability.
If so, an admin queue call to start the LLDP and DCBx in firmware is
issued. If not, an error is reported. Note that we don't fail the driver
init if DCB init fails.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan 802abbb44a ice: Bump version
Bump driver version to 0.7.3

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan f9867df6d9 ice: Fix incorrect use of abbreviations
Capitalize abbreviations and spell out some that aren't obvious.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Anirudh Venkataramanan 94c4441b5a ice: Fix typos in code comments
This patch fixes typos in code comments.

Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-18 08:38:47 -07:00
Larry Finger b5250c9c14 rtlwifi: rtl8188ee: Remove extraneous file
Somehow file drivers/net/wireless/realtek/rtlwifi/rtl8188ee/trx.c.rej was
incorporated into the sources. Obviously, it can be removed.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-18 16:08:23 +03:00
David Ahern b8fb1ab461 net ipv6: Prevent neighbor add if protocol is disabled on device
Disabling IPv6 on an interface removes existing entries but nothing prevents
new entries from being manually added. To that end, add a new neigh_table
operation, allow_add, that is called on RTM_NEWNEIGH to see if neighbor
entries are allowed on a given device. If IPv6 is disabled on the device,
allow_add returns false and passes a message back to the user via extack.

  $ echo 1 > /proc/sys/net/ipv6/conf/eth1/disable_ipv6
  $ ip -6 neigh add fe80::4c88:bff:fe21:2704 dev eth1 lladdr de:ad:be:ef:01:01
  Error: IPv6 is disabled on this device.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:19:07 -07:00
David S. Miller cea29a7072 Merge branch 'ipv6-Use-fib6_result-for-fib_lookups'
David Ahern says:

====================
ipv6: Use fib6_result for fib_lookups

Add fib6_result as a single data structure to hold results from a fib
lookup. IPv6 currently has everything in 1 data structure - a fib6_info,
but with nexthop objects the fib6_nh can be in a nexthop or a nexthop
can be a blackhole which affects the fib6_type and flags (REJECT).

v2
- fixed 2 bugs in patch12:
  i. checking return from fib6_table_lookup in fib6_lookup
  ii. call to fib6_rule_saddr in fib6_rule_action_alt should use res->nh
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:11:52 -07:00
David Ahern 7d21fec904 ipv6: Add fib6_type and fib6_flags to fib6_result
Add the fib6_flags and fib6_type to fib6_result. Update the lookup helpers
to set them and update post fib lookup users to use the version from the
result.

This allows nexthop objects to have blackhole nexthop.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:11:30 -07:00
David Ahern effda4dd97 ipv6: Pass fib6_result to fib lookups
Change fib6_lookup and fib6_table_lookup to take a fib6_result and set
f6i and nh rather than returning a fib6_info. For now both always
return 0.

A later patch set can make these more like the IPv4 counterparts and
return EINVAL, EACCESS, etc based on fib6_type.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:47 -07:00
David Ahern 8ff2e5b26c ipv6: Pass fib6_result to fib6_table_lookup tracepoint
Change fib6_table_lookup tracepoint to take the fib6_result and use
the fib6_info and fib6_nh from it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:47 -07:00
David Ahern b7bc4b6a62 ipv6: Pass fib6_result to rt6_select and find_rr_leaf
Pass fib6_result to rt6_select. Instead of returning the fib entry, it
will set f6i and nh based on the lookup.

find_rr_leaf is changed to remove the match option in favor of taking
fib6_result and having __find_rr_leaf set f6i in the result.

In the process, update fib6_info references in __find_rr_leaf to f6i names.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:47 -07:00
David Ahern 75ef7389dd ipv6: Pass fib6_result to rt6_device_match
Pass fib6_result to rt6_device_match with f6i set. rt6_device_match
updates f6i in the result if it finds a better match and sets nh.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:47 -07:00
David Ahern b748f26092 ipv6: Pass fib6_result to ip6_mtu_from_fib6 and fib6_mtu
Change ip6_mtu_from_fib6 and fib6_mtu to take a fib6_result over a
fib6_info. Update both to use the fib6_nh from fib6_result.

Since the signature of ip6_mtu_from_fib6 is already changing, add const
to daddr and saddr.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern 5012f0a594 ipv6: Pass fib6_result to rt6_insert_exception
Update rt6_insert_exception to take a fib6_result over a fib6_info.
Change ort to f6i from the fib6_result and rename to better reflect
what it references (a fib6_info).

Since this function is already getting changed, update the comments
to reference fib6_info variables rather than the older rt6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern 0d16158149 ipv6: Pass fib6_result to ip6_rt_get_dev_rcu and ip6_rt_copy_init
Now that all callers are update to have a fib6_result, pass it down
to ip6_rt_get_dev_rcu, ip6_rt_copy_init, and ip6_rt_init_dst.

In the process, change ort to f6i in ip6_rt_copy_init to make it
clear it is a reference to a fib6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern db3fedee0c ipv6: Pass fib6_result to pcpu route functions
Update ip6_rt_pcpu_alloc, rt6_get_pcpu_route and rt6_make_pcpu_route
to a fib6_result over a fib6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern 9b6b35abfb ipv6: Pass fib6_result to ip6_create_rt_rcu
Change ip6_create_rt_rcu to take fib6_result over a fib6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern 85bd05deb3 ipv6: Pass fib6_result to ip6_rt_cache_alloc
Change ip6_rt_cache_alloc to take a fib6_result over a fib6_info.

Since ip6_rt_cache_alloc is only the caller, update the
rt6_is_gw_or_nonexthop helper to take fib6_result.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:10:46 -07:00
David Ahern 7e4b512875 ipv6: Pass fib6_result to rt6_find_cached_rt
Simplify rt6_find_cached_rt for the fast path cases and pass fib6_result
to rt6_find_cached_rt. Rename the local return variable to ret to maintain
consisting with fib6_result name.

Update the comment in rt6_find_cached_rt to reference the new names in
a fib6_info vs the old name when fib entries were an rt6_info.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:08:51 -07:00
David Ahern b1d4099150 ipv6: Rename fib6_multipath_select and pass fib6_result
Add 'struct fib6_result' to hold the fib entry and fib6_nh from a fib
lookup as separate entries, similar to what IPv4 now has with fib_result.

Rename fib6_multipath_select to fib6_select_path, pass fib6_result to
it, and set f6i and nh in the result once a path selection is done.
Call fib6_select_path unconditionally for path selection which means
moving the sibling and oif check to fib6_select_path. To handle the two
different call paths (2 only call multipath_select if flowi6_oif == 0 and
the other always calls it), add a new have_oif_match that controls the
sibling walk if relevant.

Update callers of fib6_multipath_select accordingly and have them use the
fib6_info and fib6_nh from the result.

This is needed for multipath nexthop objects where a single f6i can
point to multiple fib6_nh (similar to IPv4).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 23:08:51 -07:00
Yonghong Song ba02de1aa0 selftests/bpf: fix a compilation error
I hit the following compilation error with gcc 4.8.5.

  prog_tests/flow_dissector.c: In function ‘test_flow_dissector’:
  prog_tests/flow_dissector.c:155:2: error: ‘for’ loop initial declarations are only allowed in C99 mode
    for (int i = 0; i < ARRAY_SIZE(tests); i++) {
    ^
  prog_tests/flow_dissector.c:155:2: note: use option -std=c99 or -std=gnu99 to compile your code

Let us fix the issue by avoiding this particular c99 feature.

Fixes: a5cb33464e ("selftests/bpf: make flow dissector tests more extensible")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 22:29:51 -07:00
Alexei Starovoitov 193d0002ef Merge branch 'bulk-cpumap-redirect'
Jesper Dangaard Brouer says:

====================
This patchset utilize a number of different kernel bulk APIs for optimizing
the performance for the XDP cpumap redirect feature.

Benchmark details are available here:
 https://github.com/xdp-project/xdp-project/blob/master/areas/cpumap/cpumap03-optimizations.org

Performance measurements can be considered micro benchmarks, as they measure
dropping packets at different stages in the network stack.
Summary based on above:

Baseline benchmarks
- baseline-redirect: UdpNoPorts: 3,180,074
- baseline-redirect: iptables-raw drop: 6,193,534

Patch1: bpf: cpumap use ptr_ring_consume_batched
- redirect: UdpNoPorts: 3,327,729
- redirect: iptables-raw drop: 6,321,540

Patch2: net: core: introduce build_skb_around
- redirect: UdpNoPorts: 3,221,303
- redirect: iptables-raw drop: 6,320,066

Patch3: bpf: cpumap do bulk allocation of SKBs
- redirect: UdpNoPorts: 3,290,563
- redirect: iptables-raw drop: 6,650,112

Patch4: bpf: cpumap memory prefetchw optimizations for struct page
- redirect: UdpNoPorts: 3,520,250
- redirect: iptables-raw drop: 7,649,604

In this V2 submission I have chosen drop the SKB-list patch using
netif_receive_skb_list() as it was not showing a performance improvement for
these micro benchmarks.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 19:09:26 -07:00
Jesper Dangaard Brouer 86d231459d bpf: cpumap memory prefetchw optimizations for struct page
A lot of the performance gain comes from this patch.

While analysing performance overhead it was found that the largest CPU
stalls were caused when touching the struct page area. It is first read with
a READ_ONCE from build_skb_around via page_is_pfmemalloc(), and when freed
written by page_frag_free() call.

Measurements show that the prefetchw (W) variant operation is needed to
achieve the performance gain. We believe this optimization it two fold,
first the W-variant saves one step in the cache-coherency protocol, and
second it helps us to avoid the non-temporal prefetch HW optimizations and
bring this into all cache-levels. It might be worth investigating if
prefetch into L2 will have the same benefit.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 19:09:25 -07:00
Jesper Dangaard Brouer 8f0504a97e bpf: cpumap do bulk allocation of SKBs
As cpumap now batch consume xdp_frame's from the ptr_ring, it knows how many
SKBs it need to allocate. Thus, lets bulk allocate these SKBs via
kmem_cache_alloc_bulk() API, and use the previously introduced function
build_skb_around().

Notice that the flag __GFP_ZERO asks the slab/slub allocator to clear the
memory for us. This does clear a larger area than needed, but my micro
benchmarks on Intel CPUs show that this is slightly faster due to being a
cacheline aligned area is cleared for the SKBs. (For SLUB allocator, there
is a future optimization potential, because SKBs will with high probability
originate from same page. If we can find/identify continuous memory areas
then the Intel CPU memset rep stos will have a real performance gain.)

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 19:09:25 -07:00
Jesper Dangaard Brouer ba0509b688 net: core: introduce build_skb_around
The function build_skb() also have the responsibility to allocate and clear
the SKB structure. Introduce a new function build_skb_around(), that moves
the responsibility of allocation and clearing to the caller. This allows
caller to use kmem_cache (slab/slub) bulk allocation API.

Next patch use this function combined with kmem_cache_alloc_bulk.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 19:09:24 -07:00
Jesper Dangaard Brouer 77361825bb bpf: cpumap use ptr_ring_consume_batched
Move ptr_ring dequeue outside loop, that allocate SKBs and calls network
stack, as these operations that can take some time. The ptr_ring is a
communication channel between CPUs, where we want to reduce/limit any
cacheline bouncing.

Do a concentrated bulk dequeue via ptr_ring_consume_batched, to shorten the
period and times the remote cacheline in ptr_ring is read

Batch size 8 is both to (1) limit BH-disable period, and (2) consume one
cacheline on 64-bit archs. After reducing the BH-disable section further
then we can consider changing this, while still thinking about L1 cacheline
size being active.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-04-17 19:09:24 -07:00
David S. Miller 6b0a7f84ea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflict resolution of af_smc.c from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 11:26:25 -07:00
David S. Miller cea0aa9cbd Merge branch 's390-next'
Julian Wiedmann says:

====================
s390/qeth: updates 2019-04-17

please apply some additional qeth patches to net-next. This patchset
converts the driver to use the kernel's multiqueue model.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann 54a50941b7 s390/qeth: stop/wake TX queues based on their fill level
Current xmit code only stops the txq after attempting to fill an
IO buffer that hasn't been TX-completed yet. In many-connection
scenarios, this can result in frequent rejected TX attempts, requeuing
of skbs with NETDEV_TX_BUSY and extra overhead.

Now that we have a proper 1-to-1 relation between stack-side txqs and
our HW Queues, overhaul the stop/wake logic so that the xmit code
stops the txq as needed.
Given that we might map multiple skbs into a single buffer, it's crucial
to ensure that the queue always provides an _entirely_ empty IO buffer.
Otherwise large skbs (eg TSO) might not fit into the last available
buffer. So whenever qeth_do_send_packet() first utilizes an _empty_
buffer, it updates & checks the used_buffers count.

This now ensures that an skb passed to qeth_xmit() can always be mapped
into an IO buffer, so remove all of the -EBUSY roll-back handling in the
TX path. We preserve the minimal safety-checks ("Is this IO buffer
really available?"), just in case some nasty future bug ever attempts to
corrupt an in-use buffer.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann e6c15b5f34 s390/qeth: simplify QoS code
qeth_get_priority_queue() is no longer used for IQD devices, remove the
special-casing of their mcast queue.

This effectively reverts
commit 70deb01662 ("qeth: omit outbound queue 3 for unicast packets in Priority Queuing on HiperSockets").

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann 73dc2daf11 s390/qeth: add TX multiqueue support for OSA devices
This adds trivial support for multiple TX queues on OSA-style devices
(both real HW and z/VM NICs). For now we expose the driver's existing
QoS mechanism via .ndo_select_queue, and adjust the number of available
TX queues when qeth_update_from_chp_desc() detects that the
HW configuration has changed.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann 3a18d75400 s390/qeth: add TX multiqueue support for IQD devices
qeth has been supporting multiple HW Output Queues for a long time. But
rather than exposing those queues to the stack, it uses its own queue
selection logic in .ndo_start_xmit... with all the drawbacks that
entails.
Start off by switching IQD devices over to a proper mqs net_device,
and converting all the netdev_queue management code.

One oddity with IQD devices is the requirement to place all mcast
traffic on the _highest_ established HW queue. Doing so via
.ndo_select_queue seems straight-forward - but that won't work if only
some of the HW queues are active
(ie. when dev->real_num_tx_queues < dev->num_tx_queues), since
netdev_cap_txqueue() will not allow us to put skbs on the higher queues.

To make this work, we
1. let .ndo_select_queue() map all mcast traffic to netdev_queue 0, and
2. later re-map the netdev_queue and HW queue indices in
   .ndo_start_xmit and the TX completion handler.

With this patch we default to a fixed set of 1 ucast and 1 mcast queue.
Support for dynamic reconfiguration is added at a later time.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann 333ef9d1d5 s390/qeth: don't keep statistics for tx timeout
struct netdev_queue contains a counter for tx timeouts, which gets
updated by dev_watchdog(). So let's not attempt to maintain our own
statistics, in particular not by overloading the skb-error counter.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann fdd1a5303e s390/qeth: don't bother updating the last-tx time
As the documentation for netif_trans_update() says, netdev_start_xmit()
already updates the last-tx time after every good xmit. So don't
duplicate that effort.

One odd case is that qeth_flush_buffers() also gets called from our
TX completion handler, to flush out any partially filled buffer when
we switch the queue to non-packing mode. But as the TX completion
handler will _always_ wake the txq, we don't have to worry about
the TX watchdog there.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann a4cdc9baee s390/qeth: handle error from qeth_update_from_chp_desc()
Subsequent code relies on the values that qeth_update_from_chp_desc()
reads from the CHP descriptor. Rather than dealing with weird errors
later on, just handle it properly here.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
Julian Wiedmann 41c47da3b6 s390/qeth: clarify naming for some QDIO helpers
The naming of several QDIO helpers doesn't match their actual
functionality, or the structures they operate on. Clean this up.

s/qeth_alloc_qdio_buffers/qeth_alloc_qdio_queues
s/qeth_free_qdio_buffers/qeth_free_qdio_queues
s/qeth_alloc_qdio_out_buf/qeth_alloc_output_queue
s/qeth_clear_outq_buffers/qeth_drain_output_queue
s/qeth_clear_qdio_buffers/qeth_drain_output_queues

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:33:59 -07:00
David S. Miller 3a6f7892ac Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2019-04-16

This series contains updates to i40e driver only.

Adam fixes i40e so that queues can be restored to its original value if
configuring queue channels fails.  Bumped the maximum API version
supported and added the API version to error messages to clarify
supported firmware API versions.  Fixed the problem with the driver
being able to add only 7 multicast MAC address filters instead of 16.

Aleksandr adds support for Dynamic Device Personalization (DDP) which
allows loading profiles that change the way internal parser interprets
processed frames.

Nick fixes an issue where if we modify the VLAN stripping options when a
port VLAN is configured, it will break traffic for the VSI, so prevent
changes from being made.

Jake fixes an issue where a device reset can mess up the clock time
because we reset the clock time based on the kernel time every reset.
This causes us to potentially completely reset the PTP time, and can
cause unexpected behavior in programs like ptp4l.

Piotr fixes an LED blink issue with the 'ethtool -p' command, so that
identification blinking will work on all hardware.

Chinh fixed the error returned to correctly reflect the current state
when LLDP or DCBx is not in an operational state.

Grzegorz cleans up a misleading error message when untrusted VF tries to
exceed addresses beyond the NIC limit.

Carolyn fixes the error return code to correctly reflect the error case.

v2: updated the URL provided in the DDP patch (#2)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:31:21 -07:00
Linus Torvalds fe5cdef29e Fixes for some bugs cause by recent changes. One crash if you
feed bad data to the module parameters, one BUG that sometimes
 occurs when a user closes the connection, and one bug that
 cause the driver to not work if the configuration information
 only comes in from SMBIOS.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAly3SdwACgkQYfOMkJGb
 /4HbXQ/8C/STdGNlR5NYSj11H58XNw3WskUb9xgRjnkee1vOfjeVHuQa5DndxUNj
 02ngdqwG+cg6W7CIujXiAm41Ty6YUjRttkUhlO+ByK6AR9failaf2emhLuoNDhht
 8UD2mhR6SK1ZUIFZjU0vWy9YH/IngbBm4BbNQn8nvUIL8xuSL/Bhx/5X0X7UNppr
 6ehBnGNlVCdeSGjor364Tero1ENYsvwJIwtI5lMWSi5Ig28BfuNtusWod7vTQU7y
 /yoUFUaTo8fwDczc7l5PpAgwxSqGF26bwuT7VYMEvudeNjq+FKieMNDCJy03gDmY
 F6OUcUcINTL+UW6cS6rdVO53lkD4qjupMM9MNSgED7XUI126OD5Igo3b+Du8fntT
 t9RszxXgn8VDpwBZCEiOpTg4+WpElWaWZFHxhEwhxUcrfBVQo37doqOi8JTozU9E
 OVOJn2M87S9AeHltNrGiMp2or8uma0X3c01loypgLCct7F5aO/YzjtzZyuHQ3InS
 Y2RwmNhoKs8z6sx4y7raS7BDUjsxh4N074PhFYgQ+9sWGWKJuHcp7OTlRlBXthf4
 zMbyvKj9U7nSfhz43eXkP6EqC6reLXfoHAZXRxiUsjQO6ROzFsak/y+cw1PKSkca
 NRA9U9pmdik67bF7Y0HGG3vo1om1lFNbeSVCt9CbG9A6AQwePtY=
 =jOqw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi

Pull IPMI fixes from Corey Minyard:
 "Fixes for some bugs cause by recent changes. One crash if you feed bad
  data to the module parameters, one BUG that sometimes occurs when a
  user closes the connection, and one bug that cause the driver to not
  work if the configuration information only comes in from SMBIOS"

* tag 'for-linus-5.1-2' of git://github.com/cminyard/linux-ipmi:
  ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier
  ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash
  ipmi: Fix failure on SMBIOS specified devices
2019-04-17 10:25:25 -07:00
David S. Miller e77b8ba640 Merge branch 'stmmac-Enable-Flow-Control'
Jose Abreu says:

====================
net: stmmac: Enable Flow Control

I don't know of any specific reason why Flow Control is off by default but
do let me know if there is any.

Tested in B2B between XGMAC2 and GMAC5.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:14:28 -07:00
Jose Abreu e998933906 net: stmmac: Set Flow Control to automatic mode in the driver
By default Flow Control feature is not being enabled in stmmac.

This is a useful feature that can prevent loss of packets and now that
XGMAC already supports it (along with GMAC and QoS) it makes sense to
activate it.

Switch the module parameter to FLOW_AUTO so that Flow Control is
activated.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:14:28 -07:00
Jose Abreu ff82cfc783 net: stmmac: dwxgmac: Finish the Flow Control implementation
Finish the implementation of Flow Control feature. In order for it to
work correctly we need to set EHFC bit and the correct threshold values
for activating and deactivating it.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-17 10:14:28 -07:00
Linus Torvalds 2a3a028fc6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle init flow failures properly in iwlwifi driver, from Shahar S
    Matityahu.

 2) mac80211 TXQs need to be unscheduled on powersave start, from Felix
    Fietkau.

 3) SKB memory accounting fix in A-MDSU aggregation, from Felix Fietkau.

 4) Increase RCU lock hold time in mlx5 FPGA code, from Saeed Mahameed.

 5) Avoid checksum complete with XDP in mlx5, also from Saeed.

 6) Fix netdev feature clobbering in ibmvnic driver, from Thomas Falcon.

 7) Partial sent TLS record leak fix from Jakub Kicinski.

 8) Reject zero size iova range in vhost, from Jason Wang.

 9) Allow pending work to complete before clcsock release from Karsten
    Graul.

10) Fix XDP handling max MTU in thunderx, from Matteo Croce.

11) A lot of protocols look at the sa_family field of a sockaddr before
    validating it's length is large enough, from Tetsuo Handa.

12) Don't write to free'd pointer in qede ptp error path, from Colin Ian
    King.

13) Have to recompile IP options in ipv4_link_failure because it can be
    invoked from ARP, from Stephen Suryaputra.

14) Doorbell handling fixes in qed from Denis Bolotin.

15) Revert net-sysfs kobject register leak fix, it causes new problems.
    From Wang Hai.

16) Spectre v1 fix in ATM code, from Gustavo A. R. Silva.

17) Fix put of BROPT_VLAN_STATS_PER_PORT in bridging code, from Nikolay
    Aleksandrov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (111 commits)
  socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW
  tcp: tcp_grow_window() needs to respect tcp_space()
  ocelot: Clean up stats update deferred work
  ocelot: Don't sleep in atomic context (irqs_disabled())
  net: bridge: fix netlink export of vlan_stats_per_port option
  qed: fix spelling mistake "faspath" -> "fastpath"
  tipc: set sysctl_tipc_rmem and named_timeout right range
  tipc: fix link established but not in session
  net: Fix missing meta data in skb with vlan packet
  net: atm: Fix potential Spectre v1 vulnerabilities
  net/core: work around section mismatch warning for ptp_classifier
  net: bridge: fix per-port af_packet sockets
  bnx2x: fix spelling mistake "dicline" -> "decline"
  route: Avoid crash from dereferencing NULL rt->from
  MAINTAINERS: normalize Woojung Huh's email address
  bonding: fix event handling for stacked bonds
  Revert "net-sysfs: Fix memory leak in netdev_register_kobject"
  rtnetlink: fix rtnl_valid_stats_req() nlmsg_len check
  qed: Fix the DORQ's attentions handling
  qed: Fix missing DORQ attentions
  ...
2019-04-17 09:57:45 -07:00
Corey Minyard 3b9a907223 ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier
free_user() could be called in atomic context.

This patch pushed the free operation off into a workqueue.

Example:

 BUG: sleeping function called from invalid context at kernel/workqueue.c:2856
 in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27
 CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1
 Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015
 Call Trace:
  dump_stack+0x5c/0x7b
  ___might_sleep+0xec/0x110
  __flush_work+0x48/0x1f0
  ? try_to_del_timer_sync+0x4d/0x80
  _cleanup_srcu_struct+0x104/0x140
  free_user+0x18/0x30 [ipmi_msghandler]
  ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler]
  deliver_response+0xbd/0xd0 [ipmi_msghandler]
  deliver_local_response+0xe/0x30 [ipmi_msghandler]
  handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler]
  ? dequeue_entity+0xa0/0x960
  handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler]
  tasklet_action_common.isra.22+0x103/0x120
  __do_softirq+0xf8/0x2d7
  run_ksoftirqd+0x26/0x50
  smpboot_thread_fn+0x11d/0x1e0
  kthread+0x103/0x140
  ? sort_range+0x20/0x20
  ? kthread_destroy_worker+0x40/0x40
  ret_from_fork+0x1f/0x40

Fixes: 77f8269606 ("ipmi: fix use-after-free of user->release_barrier.rda")

Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 5.0
Cc: Yang Yingliang <yangyingliang@huawei.com>
2019-04-17 10:29:27 -05:00
Arnd Bergmann e6986423d2 socket: fix compat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW
It looks like the new socket options only work correctly
for native execution, but in case of compat mode fall back
to the old behavior as we ignore the 'old_timeval' flag.

Rework so we treat SO_RCVTIMEO_NEW/SO_SNDTIMEO_NEW the
same way in compat and native 32-bit mode.

Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Fixes: a9beb86ae6 ("sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:52:22 -07:00
Eric Dumazet 50ce163a72 tcp: tcp_grow_window() needs to respect tcp_space()
For some reason, tcp_grow_window() correctly tests if enough room
is present before attempting to increase tp->rcv_ssthresh,
but does not prevent it to grow past tcp_space()

This is causing hard to debug issues, like failing
the (__tcp_select_window(sk) >= tp->rcv_wnd) test
in __tcp_ack_snd_check(), causing ACK delays and possibly
slow flows.

Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio,
we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000"
after about 60 round trips, when the active side no longer sends
immediate acks.

This bug predates git history.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:47:39 -07:00
David S. Miller 17f780b364 Merge branch 'dpaa2-eth-Add-flow-steering-support-without-masking'
Ioana Ciocoi Radulescu says:

====================
dpaa2-eth: Add flow steering support without masking

On DPAA2 platforms that lack a TCAM (like LS1088A), masking of
flow steering keys is not supported. Until now we didn't offer
flow steering capabilities at all on these platforms.

Introduce a limited support for flow steering, where we only
allow ethtool rules that share a common key (i.e. have the same
header fields). If a rule with a new composition key is wanted,
the user must first manually delete all previous rules.

First patch fixes a minor bug, the next two cleanup and prepare
the code and the last one introduces the actual FS support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu 2d6802374c dpaa2-eth: Add flow steering support without masking
On platforms that lack a TCAM (like LS1088A), masking of
flow steering keys is not supported. Until now we didn't
offer flow steering capabilities at all on these platforms,
since our driver implementation configured a "comprehensive"
FS key (containing all supported header fields), with masks
used to ignore the fields not present in the rules provided
by the user.

We now allow ethtool rules that share a common key (i.e. have
the same header fields). The FS key is now kept in the driver
private data and initialized when the first rule is added to
an empty table, rather than at probe time. If a rule with a new
composition key is wanted, the user must first manually delete
all previous rules.

When building a FS table entry to pass to firmware, we still use
the old building algorithm, which assumes an all-supported-fields
key, and later collapse the fields which aren't actually needed.

Masked rules are not supported; if provided, the mask value
will be ignored. For firmware versions older than MC10.7.0
(that only offer the legacy ABIs for configuring distribution
keys) flow steering without masking support remains unavailable.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:46:19 -07:00
Ioana Ciocoi Radulescu 3a1e6b84ad dpaa2-eth: Update hash key composition code
Introduce an internal id bitfield to uniquely identify header fields
supported by the Rx distribution keys. For the hash key, add a
conversion from the RXH_* bitmask provided by ethtool to the internal
ids.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-04-16 21:46:19 -07:00