Commit Graph

278 Commits

Author SHA1 Message Date
Kittipon Meesompop ee75fb8615 s390/qeth: query IPv6 assists during hardsetup
For new functionality, the L2 subdriver will start using IPv6 assists.
So move the query from the L3 subdriver into the common setup path.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:49 -04:00
Kittipon Meesompop 3aade31b2f s390/qeth: add stats counter for RX csum offload
This matches the statistics we gather for the TX offload path.

Signed-off-by: Kittipon Meesompop <kmeesomp@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:48 -04:00
Julian Wiedmann d4ac024688 s390/qeth: convert vlan spinlock to mutex
As the vid_list is only accessed from process context, there's no need to
protect it with a spinlock.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:47 -04:00
Julian Wiedmann 7bcd64eb8c s390/qeth: skip QDIO queue handler indirection
Both qeth sub drivers use the same QDIO queue handlers, there's no need
to expose them via the driver's discipline. No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-27 13:38:47 -04:00
Julian Wiedmann b7493e91c1 s390/qeth: use Read device to query hypervisor for MAC
For z/VM NICs, qeth needs to consider which of the three CCW devices in
an MPC group it uses for requesting a managed MAC address.

On the Base device, the hypervisor returns a default MAC which is
pre-assigned when creating the NIC (this MAC is also returned by the
READ MAC primitive). Querying any other device results in the allocation
of an additional MAC address.

For consistency with READ MAC and to avoid using up more addresses than
necessary, it is preferable to use the NIC's default MAC. So switch the
the diag26c over to using a NIC's Read device, which should always be
identical to the Base device.

Fixes: ec61bd2fd2 ("s390/qeth: use diag26c to get MAC address on L2")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:32 -04:00
Julian Wiedmann db71bbbd11 s390/qeth: fix request-side race during cmd IO timeout
Submitting a cmd IO request (usually on the WRITE device, but for IDX
also on the READ device) is currently done with ccw_device_start()
and a manual timeout in the caller.
On timeout, the caller cleans up the related resources (eg. IO buffer).
But 1) the IO might still be active and utilize those resources, and
    2) when the IO completes, qeth_irq() will attempt to clean up the
       same resources again.

Instead of introducing additional resource locking, switch to
ccw_device_start_timeout() to ensure IO termination after timeout, and
let the IRQ handler alone deal with cleaning up after a request.

This also removes a stray write->irq_pending reset from
clear_ipacmd_list(). The routine doesn't terminate any pending IO on
the WRITE device, so this should be handled properly via IO timeout
in the IRQ handler.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:32 -04:00
Julian Wiedmann a936b1ef37 s390/qeth: handle failure on workqueue creation
Creating the global workqueue during driver init may fail, deal with it.
Also, destroy the created workqueue on any subsequent error.

Fixes: 0f54761d16 ("qeth: Support VEPA mode")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Julian Wiedmann 901e3f49fa s390/qeth: avoid control IO completion stalls
For control IO, qeth currently tracks the index of the buffer that it
expects to complete the next IO on each qeth_channel. If the channel
presents an IRQ while this buffer has not yet completed, no completion
processing for _any_ completed buffer takes place.
So if the 'next buffer' is skipped for any sort of reason* (eg. when it
is released due to error conditions, before the IO is started), the
buffer obviously won't switch to PROCESSED until it is eventually
allocated for a _different_ IO and completes.
Until this happens, all completion processing on that channel stalls
and pending requests possibly time out.

As a fix, remove the whole 'next buffer' logic and simply process any
IO buffer right when it completes. A channel will never have more than
one IO pending, so there's no risk of processing out-of-sequence.

*Note: currently just one location in the code really handles this problem,
       by advancing the 'next' index manually.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Julian Wiedmann 686c97ee29 s390/qeth: fix error handling in adapter command callbacks
Make sure to check both return code fields before(!) processing the
command response. Otherwise we risk operating on invalid data.

This matches an earlier fix for SETASSPARMS commands, see
commit ad3cbf6133 ("s390/qeth: fix error handling in checksum cmd callback").

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-22 14:42:31 -04:00
Linus Torvalds becdce1c66 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:

 - Improvements for the spectre defense:
    * The spectre related code is consolidated to a single file
      nospec-branch.c
    * Automatic enable/disable for the spectre v2 defenses (expoline vs.
      nobp)
    * Syslog messages for specve v2 are added
    * Enable CONFIG_GENERIC_CPU_VULNERABILITIES and define the attribute
      functions for spectre v1 and v2

 - Add helper macros for assembler alternatives and use them to shorten
   the code in entry.S.

 - Add support for persistent configuration data via the SCLP Store Data
   interface. The H/W interface requires a page table that uses 4K pages
   only, the code to setup such an address space is added as well.

 - Enable virtio GPU emulation in QEMU. To do this the depends
   statements for a few common Kconfig options are modified.

 - Add support for format-3 channel path descriptors and add a binary
   sysfs interface to export the associated utility strings.

 - Add a sysfs attribute to control the IFCC handling in case of
   constant channel errors.

 - The vfio-ccw changes from Cornelia.

 - Bug fixes and cleanups.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (40 commits)
  s390/kvm: improve stack frame constants in entry.S
  s390/lpp: use assembler alternatives for the LPP instruction
  s390/entry.S: use assembler alternatives
  s390: add assembler macros for CPU alternatives
  s390: add sysfs attributes for spectre
  s390: report spectre mitigation via syslog
  s390: add automatic detection of the spectre defense
  s390: move nobp parameter functions to nospec-branch.c
  s390/cio: add util_string sysfs attribute
  s390/chsc: query utility strings via fmt3 channel path descriptor
  s390/cio: rename struct channel_path_desc
  s390/cio: fix unbind of io_subchannel_driver
  s390/qdio: split up CCQ handling for EQBS / SQBS
  s390/qdio: don't retry EQBS after CCQ 96
  s390/qdio: restrict buffer merging to eligible devices
  s390/qdio: don't merge ERROR output buffers
  s390/qdio: simplify math in get_*_buffer_frontier()
  s390/decompressor: trim uncompressed image head during the build
  s390/crypto: Fix kernel crash on aes_s390 module remove.
  s390/defkeymap: fix global init to zero
  ...
2018-04-09 09:04:10 -07:00
Sebastian Ott ded27d8d2e s390/cio: rename struct channel_path_desc
Rename struct channel_path_desc to struct channel_path_desc_fmt0
to fit the scheme. Provide a macro for the function wrappers that
gather this and related data from firmware.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-03-26 16:13:11 +02:00
David S. Miller 03fe2debbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Fun set of conflict resolutions here...

For the mac80211 stuff, these were fortunately just parallel
adds.  Trivially resolved.

In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
function phy_disable_interrupts() earlier in the file, whilst in
'net-next' the phy_error() call from this function was removed.

In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
'rt_table_id' member of rtable collided with a bug fix in 'net' that
added a new struct member "rt_mtu_locked" which needs to be copied
over here.

The mlxsw driver conflict consisted of net-next separating
the span code and definitions into separate files, whilst
a 'net' bug fix made some changes to that moved code.

The mlx5 infiniband conflict resolution was quite non-trivial,
the RDMA tree's merge commit was used as a guide here, and
here are their notes:

====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch.  This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
            drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f95
            (IB/mlx5: Fix cleanup order on unload) added to for-rc and
            commit b5ca15ad7e (IB/mlx5: Add proper representors support)
            add as part of the devel cycle both needed to modify the
            init/de-init functions used by mlx5.  To support the new
            representors, the new functions added by the cleanup patch
            needed to be made non-static, and the init/de-init list
            added by the representors patch needed to be modified to
            match the init/de-init list changes made by the cleanup
            patch.
    Updates:
            drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
            prototypes added by representors patch to reflect new function
            names as changed by cleanup patch
            drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
            stage list to match new order from cleanup patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-23 11:31:58 -04:00
Julian Wiedmann a6c3d93963 s390/qeth: on channel error, reject further cmd requests
When the IRQ handler determines that one of the cmd IO channels has
failed and schedules recovery, block any further cmd requests from
being submitted. The request would inevitably stall, and prevent the
recovery from making progress until the request times out.

This sort of error was observed after Live Guest Relocation, where
the pending IO on the READ channel intentionally gets terminated to
kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
triggering qeth to issue a QUERY CARD INFO command. The command
then stalled in the inoperabel WRITE channel.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann 17bf8c9b3d s390/qeth: lock read device while queueing next buffer
For calling ccw_device_start(), issue_next_read() needs to hold the
device's ccwlock.
This is satisfied for the IRQ handler path (where qeth_irq() gets called
under the ccwlock), but we need explicit locking for the initial call by
the MPC initialization.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann 1063e432bb s390/qeth: when thread completes, wake up all waiters
qeth_wait_for_threads() is potentially called by multiple users, make
sure to notify all of them after qeth_clear_thread_running_bit()
adjusted the thread_running_mask. With no timeout, callers would
otherwise stall.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann 6be687395b s390/qeth: free netdevice when removing a card
On removal, a qeth card's netdevice is currently not properly freed
because the call chain looks as follows:

qeth_core_remove_device(card)
	lx_remove_device(card)
		unregister_netdev(card->dev)
		card->dev = NULL			!!!
	qeth_core_free_card(card)
		if (card->dev)				!!!
			free_netdev(card->dev)

Fix it by free'ing the netdev straight after unregistering. This also
fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
where the need to free the current netdevice was not considered at all.

Note that free_netdev() takes care of the netif_napi_del() for us too.

Fixes: 4a71df5004 ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-22 11:52:30 -04:00
Julian Wiedmann 1b45c80be0 s390/qeth: reset NAPI context during queue init
init_qdio_queues() resets the Input Queue's overall QDIO state, and
positions the buffer cursor back to 0. So this is the obvious place to
also reset the queue's NAPI context (in contrast to doing it rather
randomly in the middle of the big set_online() path).
No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:05 -05:00
Julian Wiedmann 37cf05d2ff s390/qeth: allocate skb from NAPI cache
napi_alloc_skb() doesn't need to disable IRQs during the allocation,
and thus may save us a few cycles.
Doing so requires a small fix-up in the HiperTransport path, which
currently assumes a fixed NET_SKB_PAD headroom padding. napi_alloc_skb()
adds an additional NET_IP_ALIGN padding, so use the proper helper for
setting up the mac_header offset.

Use this opportunity to convert the non-NAPI path to netdev_alloc_skb(),
which means that skb->dev is now always set-up during allocation and
doesn't need to be assigned manually.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann d857e11193 s390/qeth: remove outdated portname debug msg
The 'portname' attribute is deprecated and setting it has no effect.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann ff5caa7a28 s390/qeth: use __ipa_cmd() for casting an IPA cmd buffer
"s390/qeth: fix SETIP command handling" introduced a new helper, apply
it driver-wide.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-09 13:10:04 -05:00
Julian Wiedmann d22ffb5a71 s390/qeth: fix IPA command submission race
If multiple IPA commands are build & sent out concurrently,
fill_ipacmd_header() may assign a seqno value to a command that's
different from what send_control_data() later assigns to this command's
reply.
This is due to other commands passing through send_control_data(),
and incrementing card->seqno.ipa along the way.

So one IPA command has no reply that's waiting for its seqno, while some
other IPA command has multiple reply objects waiting for it.
Only one of those waiting replies wins, and the other(s) times out and
triggers a recovery via send_ipa_cmd().

Fix this by making sure that the same seqno value is assigned to
a command and its reply object.
Do so immediately before submitting the command & while holding the
irq_pending "lock", to produce nicely ascending seqnos.

As a side effect, *all* IPA commands now use a reply object that's
waiting for its actual seqno. Previously, early IPA commands that were
submitted while the card was still DOWN used the "catch-all" IDX seqno.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 11:13:12 -05:00
Julian Wiedmann 12472af896 s390/qeth: fix overestimated count of buffer elements
qeth_get_elements_for_range() doesn't know how to handle a 0-length
range (ie. start == end), and returns 1 when it should return 0.
Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
of the skb's linear data) are skipped when mapping the skb into regular
buffer elements.

This overestimation may cause several performance-related issues:
1. sub-optimal IO buffer selection, where the next buffer gets selected
   even though the skb would actually still fit into the current buffer.
2. forced linearization, if the element count for a non-linear skb
   exceeds QETH_MAX_BUFFER_ELEMENTS.

Rather than modifying qeth_get_elements_for_range() and adding overhead
to every caller, fix up those callers that are in risk of passing a
0-length range.

Fixes: 2863c61334 ("qeth: refactor calculation of SBALE count")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 11:13:12 -05:00
Julian Wiedmann 1c5b2216fb s390/qeth: fix SETIP command handling
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.

Fixes: 5b54e16f1a ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-09 14:30:23 -05:00
Julian Wiedmann 615dff2283 s390/qeth: support early setup for z/VM NICs
The transport mode that a z/VM NIC is configured in, must match the
hypervisor-internal network which the NIC is coupled to.

To get this right automatically, have qeth issue a diag26c hypervisor call
that provides all sorts of information for a specific VNIC.
With z/VM update VM65918, this also includes the VNIC's required
transport mode.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 13:52:23 -05:00
David S. Miller fba961ab29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Lots of overlapping changes.  Also on the net-next side
the XDP state management is handled more in the generic
layers so undo the 'net' nfp fix which isn't applicable
in net-next.

Include a necessary change by Jakub Kicinski, with log message:

====================
cls_bpf no longer takes care of offload tracking.  Make sure
netdevsim performs necessary checks.  This fixes a warning
caused by TC trying to remove a filter it has not added.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-22 11:16:31 -05:00
Julian Wiedmann 99f0b85d5f s390/qeth: use ether_addr_* helpers
Be a little more self-documenting, and get rid of OSA_ADDR_LEN.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-20 15:23:45 -05:00
Elena Reshetova ae6959273a qeth: convert qeth_reply.refcnt from atomic_t to refcount_t
atomic_t variables are currently used to implement reference
counters with the following properties:
 - counter is initialized to 1 using atomic_set()
 - a resource is freed upon counter reaching zero
 - once counter reaches zero, its further
   increments aren't allowed
 - counter schema uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)

Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.

The variable qeth_reply.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.

Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
[jwi: removed the WARN_ONs. Use CONFIG_REFCOUNT_FULL if you care.]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-20 15:23:44 -05:00
Julian Wiedmann ad3cbf6133 s390/qeth: fix error handling in checksum cmd callback
Make sure to check both return code fields before processing the
response. Otherwise we risk operating on invalid data.

Fixes: c9475369bd ("s390/qeth: rework RX/TX checksum offload")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-20 15:11:49 -05:00
Julian Wiedmann 02f510f326 s390/qeth: update takeover IPs after configuration change
Any modification to the takeover IP-ranges requires that we re-evaluate
which IP addresses are takeover-eligible. Otherwise we might do takeover
for some addresses when we no longer should, or vice-versa.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:43 -05:00
Julian Wiedmann 7fbd9493f0 s390/qeth: apply takeover changes when mode is toggled
Just as for an explicit enable/disable, toggling the takeover mode also
requires that the IP addresses get updated. Otherwise all IPs that were
added to the table before the mode-toggle, get registered with the old
settings.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15 11:29:42 -05:00
Linus Torvalds 236fa078c6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Various TCP control block fixes, including one that crashes with
    SELinux, from David Ahern and Eric Dumazet.

 2) Fix ACK generation in rxrpc, from David Howells.

 3) ipvlan doesn't set the mark properly in the ipv4 route lookup key,
    from Gao Feng.

 4) SIT configuration doesn't take on the frag_off ipv4 field
    configuration properly, fix from Hangbin Liu.

 5) TSO can fail after device down/up on stmmac, fix from Lars Persson.

 6) Various bpftool fixes (mostly in JSON handling) from Quentin Monnet.

 7) Various SKB leak fixes in vhost/tun/tap (mostly observed as
    performance problems). From Wei Xu.

 8) mvpps's TX descriptors were not zero initialized, from Yan Markman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
  tcp: use IPCB instead of TCP_SKB_CB in inet_exact_dif_match()
  tcp: add tcp_v4_fill_cb()/tcp_v4_restore_cb()
  rxrpc: Fix the MAINTAINERS record
  rxrpc: Use correct netns source in rxrpc_release_sock()
  liquidio: fix incorrect indentation of assignment statement
  stmmac: reset last TSO segment size after device open
  ipvlan: Add the skb->mark as flow4's member to lookup route
  s390/qeth: build max size GSO skbs on L2 devices
  s390/qeth: fix GSO throughput regression
  s390/qeth: fix thinko in IPv4 multicast address tracking
  tap: free skb if flags error
  tun: free skb in early errors
  vhost: fix skb leak in handle_rx()
  bnxt_en: Fix a variable scoping in bnxt_hwrm_do_send_msg()
  bnxt_en: fix dst/src fid for vxlan encap/decap actions
  bnxt_en: wildcard smac while creating tunnel decap filter
  bnxt_en: Need to unconditionally shut down RoCE in bnxt_shutdown
  phylink: ensure we take the link down when phylink_stop() is called
  sfp: warn about modules requiring address change sequence
  sfp: improve RX_LOS handling
  ...
2017-12-04 11:14:46 -08:00
Julian Wiedmann 6d69b1f1eb s390/qeth: fix GSO throughput regression
Using GSO with small MTUs currently results in a substantial throughput
regression - which is caused by how qeth needs to map non-linear skbs
into its IO buffer elements:
compared to a linear skb, each GSO-segmented skb effectively consumes
twice as many buffer elements (ie two instead of one) due to the
additional header-only part. This causes the Output Queue to be
congested with low-utilized IO buffers.

Fix this as follows:
If the MSS is low enough so that a non-SG GSO segmentation produces
order-0 skbs (currently ~3500 byte), opt out from NETIF_F_SG. This is
where we anticipate the biggest savings, since an SG-enabled
GSO segmentation produces skbs that always consume at least two
buffer elements.

Larger MSS values continue to get a SG-enabled GSO segmentation, since
1) the relative overhead of the additional header-only buffer element
becomes less noticeable, and
2) the linearization overhead increases.

With the throughput regression fixed, re-enable NETIF_F_SG by default to
reap the significant CPU savings of GSO.

Fixes: 5722963a8e ("qeth: do not turn on SG per default")
Reported-by: Nils Hoppmann <niho@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-02 21:35:21 -05:00
Greg Kroah-Hartman ab9953ff0f s390: net: add SPDX identifiers to the remaining files
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.

Update the drivers/s390/net/ files with the correct SPDX license
identifier based on the license text in the file itself.  The SPDX
identifier is a legally binding shorthand, which can be used instead of
the full boiler plate text.

This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.

Cc: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Cc: Ursula Braun <ubraun@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-24 14:28:43 +01:00
Linus Torvalds 5bbcc0f595 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Maintain the TCP retransmit queue using an rbtree, with 1GB
      windows at 100Gb this really has become necessary. From Eric
      Dumazet.

   2) Multi-program support for cgroup+bpf, from Alexei Starovoitov.

   3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew
      Lunn.

   4) Add meter action support to openvswitch, from Andy Zhou.

   5) Add a data meta pointer for BPF accessible packets, from Daniel
      Borkmann.

   6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet.

   7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli.

   8) More work to move the RTNL mutex down, from Florian Westphal.

   9) Add 'bpftool' utility, to help with bpf program introspection.
      From Jakub Kicinski.

  10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper
      Dangaard Brouer.

  11) Support 'blocks' of transformations in the packet scheduler which
      can span multiple network devices, from Jiri Pirko.

  12) TC flower offload support in cxgb4, from Kumar Sanghvi.

  13) Priority based stream scheduler for SCTP, from Marcelo Ricardo
      Leitner.

  14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg.

  15) Add RED qdisc offloadability, and use it in mlxsw driver. From
      Nogah Frankel.

  16) eBPF based device controller for cgroup v2, from Roman Gushchin.

  17) Add some fundamental tracepoints for TCP, from Song Liu.

  18) Remove garbage collection from ipv6 route layer, this is a
      significant accomplishment. From Wei Wang.

  19) Add multicast route offload support to mlxsw, from Yotam Gigi"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits)
  tcp: highest_sack fix
  geneve: fix fill_info when link down
  bpf: fix lockdep splat
  net: cdc_ncm: GetNtbFormat endian fix
  openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start
  netem: remove unnecessary 64 bit modulus
  netem: use 64 bit divide by rate
  tcp: Namespace-ify sysctl_tcp_default_congestion_control
  net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()
  ipv6: set all.accept_dad to 0 by default
  uapi: fix linux/tls.h userspace compilation error
  usbnet: ipheth: prevent TX queue timeouts when device not ready
  vhost_net: conditionally enable tx polling
  uapi: fix linux/rxrpc.h userspace compilation errors
  net: stmmac: fix LPI transitioning for dwmac4
  atm: horizon: Fix irq release error
  net-sysfs: trigger netlink notification on ifalias change via sysfs
  openvswitch: Using kfree_rcu() to simplify the code
  openvswitch: Make local function ovs_nsh_key_attr_size() static
  openvswitch: Fix return value check in ovs_meter_cmd_features()
  ...
2017-11-15 11:56:19 -08:00
Julian Wiedmann 52c44d2975 s390/qeth: don't dump control cmd twice
A few lines down, qeth_prepare_control_data() makes further changes to
the control cmd buffer, and then also writes a trace entry for it.
So the first entry just pollutes the trace file with intermediate data,
drop it.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:05 +01:00
Julian Wiedmann 978759e826 s390/qeth: support GRO flush timer
Switch to napi_complete_done(), and thus enable delayed GRO flushing.
The timeout is configured via /sys/class/net/<if>/gro_flush_timeout.

Default timeout is 0, so no change in behaviour.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:05 +01:00
Julian Wiedmann 864c17c3d8 s390/qeth: try harder to get packets from RX buffer
Current code bails out when two subsequent buffer elements hold
insufficient data to contain a qeth_hdr packet descriptor.
This seems reasonable, but it would be legal for quirky hardware to
leave a few elements empty and then present packets in a subsequent
element. These packets would currently be dropped.

So make sure to check all buffer elements, until we hit the LAST_ENTRY
indication.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:05 +01:00
Julian Wiedmann 8d68af6af6 s390/qeth: consolidate skb allocation
Move the allocation of SG skbs into the main path. This allows for
a little code sharing, and handling ENOMEM from within one place.

As side effect, L2 SG skbs now get the proper amount of additional
headroom (read: zero) instead of the hard-coded ETH_HLEN.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:05 +01:00
Julian Wiedmann b6f72f9698 s390/qeth: clean up page frag creation
Replace the open-coded skb_add_rx_frag(), and use a fall-through
to remove some duplicated code.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:04 +01:00
Julian Wiedmann 6e6f472d92 s390/qeth: clean up initial MTU determination
1. Drop the support for Token Ring,
2. use the ETH_DATA_LEN macro for the default L2 MTU,
3. handle OSM via the default case (as OSM is L2-only), and
4. document why the L3 MTU is reduced.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:04 +01:00
Julian Wiedmann ed2e93efc3 s390/qeth: remove duplicated device matching
With commit "s390/ccwgroup: tie a ccwgroup driver to its ccw driver",
the ccwgroup core now ensures that a qeth group device only consists of
ccw devices which are supported by qeth. Therefore remove qeth's
internal device matching, and use .driver_info to determine the card
type.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:04 +01:00
Julian Wiedmann ce34435641 s390/qeth: rely on kernel for feature recovery
When recovering a device, qeth needs to re-run the IPA commands that
enable all previously active HW features.
Instead of duplicating qeth_set_features(), let netdev_update_features()
recover the missing HW features from dev->wanted_features.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-20 13:11:04 +01:00
Julian Wiedmann f9a5d70cfa s390/ccwgroup: tie a ccwgroup driver to its ccw driver
When grouping devices, the ccwgroup core only checks whether all of the
devices are bound to the same ccw_driver. It has no means of checking
if the requesting ccwgroup driver actually supports this device type.
qeth implements its own device matching in qeth_core_probe_device(),
while ctcm and lcs currently have no sanity-checking at all.

Enable ccwgroup drivers to optionally defer the device type checking to
the ccwgroup core, by specifying their supported ccw_driver.
This allows us drop the device type matching from qeth, and improves
the robustness of ctcm and lcs.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-29 15:51:30 +02:00
Julian Wiedmann 7c2e9ba373 s390/qeth: don't take queue lock in send_packet_fast()
Locking the output queue prior to TX is needed on OSA devices,
to synchronize against a packing flush from the TX completion code
(via qeth_check_outbound_queue()).
But send_packet_fast() is only used for IQDs, which don't do packing.
So remove the locking, and apply some easy cleanups.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-18 14:41:37 -07:00
Julian Wiedmann 9627923062 s390/qeth: remove unused code in qdio_establish_cq()
Storing the number of input buffers into 'i' has no effect, it is
immediately re-assigned in the next line.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-18 14:41:37 -07:00
Julian Wiedmann 0d6f02d375 s390/qeth: use skb_cow_head() for L2 OSA xmit
Taking a full copy via skb_realloc_headroom() on every xmit is overkill
and wastes CPU time; all we actually need is to push on the qeth_hdr.
So rework the L2 OSA TX path to avoid the copy.
Minor complications arise because struct qeth_hdr must not cross a page
boundary. So add a new helper qeth_push_hdr() that catches this, and
falls back to the hdr cache that we already use for IQDs.

This change uncovered that qeth's TX completion takes rather long.
Now that we no longer free the original skb straight away and thus call
skb->destructor later than before, throughput regresses significantly.
For now, restore old behaviour by adding an explicit skb_orphan(),
and a big TODO to improve the TX completion time.

Tested-by: Nils Hoppmann <niho@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:21:30 -07:00
Julian Wiedmann eaf3cc087f s390/qeth: unify code to build header elements
After plenty of refactoring, use hd_len as single indication that
the skb needs a dedicated header element.

This preserves existing behaviour for TSO, as 'hdr' always points
to skb->data.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:21:30 -07:00
Julian Wiedmann f1588177b2 s390/qeth: pass full IQD header length to fill_buffer()
This is a prerequisite for unifying the code to build header elements.
The TSO header has a different size, so we can no longer rely on implicitly
adding the size of a normal qeth_hdr.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:21:30 -07:00
Julian Wiedmann 9c3bfda999 s390/qeth: pass TSO data offset to fill_buffer()
For TSO we need to skip the skb's qeth/IP/TCP headers when mapping
it into buffer elements. Instead of (mis)using skb_pull(), pass a
corresponding offset to fill_buffer() like we already do for IQDs.

No actual change in the resulting TSO buffers.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:21:30 -07:00
Julian Wiedmann 13ddacb526 s390/qeth: pass TSO header length to fill_buffer()
The TSO code already calculates the length of its header element,
no need to duplicate this in the low-level code again.

Use this opportunity to make hd_len unsigned, and for TSO match
its calculation to what tso_fill_header() does.

No functional change.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-18 10:21:30 -07:00