Commit Graph

162 Commits

Author SHA1 Message Date
David Vrabel 1650d5455b xen-netback: always fully coalesce guest Rx packets
Always fully coalesce guest Rx packets into the minimum number of ring
slots.  Reducing the number of slots per packet has significant
performance benefits when receiving off-host traffic.

Results from XenServer's performance benchmarks:

                         Baseline    Full coalesce
Interhost VM receive      7.2 Gb/s   11 Gb/s
Interhost aggregate      24 Gb/s     24 Gb/s
Intrahost single stream  14 Gb/s     14 Gb/s
Intrahost aggregate      34 Gb/s     34 Gb/s

However, this can increase the number of grant ops per packet which
decreases performance of backend (dom0) to VM traffic (by ~10%)
/unless/ grant copy has been optimized for adjacent ops with the same
source or destination (see "grant-table: defer releasing pages
acquired in a grant copy"[1] expected in Xen 4.6).

[1] http://lists.xen.org/archives/html/xen-devel/2015-01/msg01118.html

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-23 18:01:58 -08:00
Palik, Imre 07ff890dae xen-netback: fixing the propagation of the transmit shaper timeout
Since e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue struct"),
the transimt shaper timeout is always set to 0.  The value the user sets via
xenbus is never propagated to the transmit shaper.

This patch fixes the issue.

Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-06 14:17:37 -05:00
David Vrabel 26c0e10258 xen-netback: support frontends without feature-rx-notify again
Commit bc96f648df (xen-netback: make
feature-rx-notify mandatory) incorrectly assumed that there were no
frontends in use that did not support this feature.  But the frontend
driver in MiniOS does not and since this is used by (qemu) stubdoms,
these stopped working.

Netback sort of works as-is in this mode except:

- If there are no Rx requests and the internal Rx queue fills, only
  the drain timeout will wake the thread.  The default drain timeout
  of 10 s would give unacceptable pauses.

- If an Rx stall was detected and the internal Rx queue is drained,
  then the Rx thread would never wake.

Handle these two cases (when feature-rx-notify is disabled) by:

- Reducing the drain timeout to 30 ms.

- Disabling Rx stall detection.

Reported-by: John <jw@nuclearfallout.net>
Tested-by: John <jw@nuclearfallout.net>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-18 12:49:49 -05:00
David S. Miller 22f10923dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/amd/xgbe/xgbe-desc.c
	drivers/net/ethernet/renesas/sh_eth.c

Overlapping changes in both conflict cases.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-10 15:48:20 -05:00
Jan Beulich f15650b7f9 netback: don't store invalid vif pointer
When xenvif_alloc() fails, it returns a non-NULL error indicator. To
avoid eventual races, we shouldn't store that into struct backend_info
as readers of it only check for NULL.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09 18:30:08 -05:00
David S. Miller 60b7379dc5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-11-29 20:47:48 -08:00
Alexey Khoroshilov 2dd34339ac xen-netback: do not report success if backend_create_xenvif() fails
If xenvif_alloc() or xenbus_scanf() fail in backend_create_xenvif(),
xenbus is left in offline mode but netback_probe() reports success.

The patch implements propagation of error code for backend_create_xenvif().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24 16:14:45 -05:00
Malcolm Crossley 7e5d775395 xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path
Unconditionally pulling 128 bytes into the linear area is not required
for:

- security: Every protocol demux starts with pskb_may_pull() to pull
  frag data into the linear area, if necessary, before looking at
  headers.

- performance: Netback has already grant copied up-to 128 bytes from
  the first slot of a packet into the linear area. The first slot
  normally contain all the IPv4/IPv6 and TCP/UDP headers.

The unconditional pull would often copy frag data unnecessarily.  This
is a performance problem when running on a version of Xen where grant
unmap avoids TLB flushes for pages which are not accessed.  TLB
flushes can now be avoided for > 99% of unmaps (it was 0% before).

Grant unmap TLB flush avoidance will be available in a future version
of Xen (probably 4.6).

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06 14:40:18 -05:00
David S. Miller 55b42b5ca2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/phy/marvell.c

Simple overlapping changes in drivers/net/phy/marvell.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-01 14:53:27 -04:00
Zoltan Kiss 44cc8ed17e xen-netback: Remove __GFP_COLD
This flag is unnecessary, it came from some old code.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29 15:59:37 -04:00
Zoltan Kiss 8fe78989c3 xen-netback: Disable NAPI after disabling interrupts
Otherwise the interrupt handler still calls napi_complete. Although it
won't schedule NAPI again as either NAPI_STATE_DISABLE or
NAPI_STATE_SCHED is set, it is just unnecessary, and it makes more
sense to do this way.

Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29 15:59:37 -04:00
David Vrabel ecf08d2dbb xen-netback: reintroduce guest Rx stall detection
If a frontend not receiving packets it is useful to detect this and
turn off the carrier so packets are dropped early instead of being
queued and drained when they expire.

A to-guest queue is stalled if it doesn't have enough free slots for a
an extended period of time (default 60 s).

If at least one queue is stalled, the carrier is turned off (in the
expectation that the other queues will soon stall as well).  The
carrier is only turned on once all queues are ready.

When the frontend connects, all the queues start in the stalled state
and only become ready once the frontend queues enough Rx requests.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-25 14:15:20 -04:00
David Vrabel f48da8b14d xen-netback: fix unlimited guest Rx internal queue and carrier flapping
Netback needs to discard old to-guest skb's (guest Rx queue drain) and
it needs detect guest Rx stalls (to disable the carrier so packets are
discarded earlier), but the current implementation is very broken.

1. The check in hard_start_xmit of the slot availability did not
   consider the number of packets that were already in the guest Rx
   queue.  This could allow the queue to grow without bound.

   The guest stops consuming packets and the ring was allowed to fill
   leaving S slot free.  Netback queues a packet requiring more than S
   slots (ensuring that the ring stays with S slots free).  Netback
   queue indefinately packets provided that then require S or fewer
   slots.

2. The Rx stall detection is not triggered in this case since the
   (host) Tx queue is not stopped.

3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback
   will consider this an Rx purge event which may result in it taking
   the carrier down unnecessarily.  It also considers a queue with
   only 1 slot free as unstalled (even though the next packet might
   not fit in this).

The internal guest Rx queue is limited by a byte length (to 512 Kib,
enough for half the ring).  The (host) Tx queue is stopped and started
based on this limit.  This sets an upper bound on the amount of memory
used by packets on the internal queue.

This allows the estimatation of the number of slots for an skb to be
removed (it wasn't a very good estimate anyway).  Instead, the guest
Rx thread just waits for enough free slots for a maximum sized packet.

skbs queued on the internal queue have an 'expires' time (set to the
current time plus the drain timeout).  The guest Rx thread will detect
when the skb at the head of the queue has expired and discard expired
skbs.  This sets a clear upper bound on the length of time an skb can
be queued for.  For a guest being destroyed the maximum time needed to
wait for all the packets it sent to be dropped is still the drain
timeout (10 s) since it will not be sending new packets.

Rx stall detection is reintroduced in a later commit.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-25 14:15:20 -04:00
David Vrabel bc96f648df xen-netback: make feature-rx-notify mandatory
Frontends that do not provide feature-rx-notify may stall because
netback depends on the notification from frontend to wake the guest Rx
thread (even if can_queue is false).

This could be fixed but feature-rx-notify was introduced in 2006 and I
am not aware of any frontends that do not implement this.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-25 14:15:20 -04:00
David Vrabel 95afae4814 xen: remove DEFINE_XENBUS_DRIVER() macro
The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse
errors.

Replace the uses with standard structure definitions instead.  This is
similar to pci and usb device registration.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2014-10-06 10:27:57 +01:00
Wei Liu e24f8191cc xen-netback: move netif_napi_add before binding interrupt
Interrupt is enabled when bind_interdomain_evtchn_to_irqhandler returns.
If there's interrupt pending interrupt handler is invoked.

NAPI needs to be initialised before binding interrupt otherwise the
interrupt handler will try to scheduling a NAPI instance that is not
initialised yet, resulting in kernel OOPS.

This fixes a regression introduced in ea2c5e13 ("xen-netback: move NAPI
add/remove calls").

Ideally function calls to create kthreads should also be moved before
binding but I intent to fix this regression with minimal changes and
refactor the code with another patch.

Reported-by: Thomas Leonard <talex5@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-25 17:31:42 -07:00
Wei Liu b125285821 xen-netback: remove loop waiting function
The original implementation relies on a loop to check if all inflight
packets are freed. Now we have proper reference counting, there's no
need to use loop anymore.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
Wei Liu a64bd93452 xen-netback: don't stop dealloc kthread too early
Reference count the number of packets in host stack, so that we don't
stop the deallocation thread too early. If not, we can end up with
xenvif_free permanently waiting for deallocation thread to unmap grefs.

Reported-by: Thomas Leonard <talex5@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
Wei Liu ea2c5e1342 xen-netback: move NAPI add/remove calls
Originally netif_napi_add was in xenvif_init_queue and netif_napi_del
was in xenvif_deinit_queue, while kthreads were handled in
xenvif_connect and xenvif_disconnect. Move netif_napi_add and
netif_napi_del to xenvif_connect and xenvif_disconnect so that they
reside together with kthread operations.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
Wei Liu 628fa76b09 xen-netback: fix debugfs entry creation
The original code is bogus. The function gets called in a loop which
leaks entries created in previous rounds.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:43 -07:00
Wei Liu 5c807005fa xen-netback: fix debugfs write length check
Enlarge buffer size and check input length properly, so that we don't
misuse -ENOSPC.

Note that command like "kickXXXX" is still allowed, that's one patch for
another day if we really want to be very strict on this.

Reported-by: SeeChen Ng <seechen81@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:43 -07:00
Zoltan Kiss 2561cc15e3 xen-netback: Don't deschedule NAPI when carrier off
In the patch called "xen-netback: Turn off the carrier if the guest is not able
to receive" NAPI was descheduled when the carrier was set off. That's
not what most of the drivers do, and we don't have any specific reason to do so
as well, so revert that change.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-11 14:07:47 -07:00
Zoltan Kiss 743b0a92b9 xen-netback: Fix vif->disable handling
In the patch called "xen-netback: Turn off the carrier if the guest is not able
to receive" new branches were introduced to this if statement, risking that a
queue with non-zero id can reenable the disabled interface.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-07 16:02:56 -07:00
Zoltan Kiss f34a4cf9c9 xen-netback: Turn off the carrier if the guest is not able to receive
Currently when the guest is not able to receive more packets, qdisc layer starts
a timer, and when it goes off, qdisc is started again to deliver a packet again.
This is a very slow way to drain the queues, consumes unnecessary resources and
slows down other guests shutdown.
This patch change the behaviour by turning the carrier off when that timer
fires, so all the packets are freed up which were stucked waiting for that vif.
Instead of the rx_queue_purge bool it uses the VIF_STATUS_RX_PURGE_EVENT bit to
signal the thread that either the timeout happened or an RX interrupt arrived,
so the thread can check what it should do. It also disables NAPI, so the guest
can't transmit, but leaves the interrupts on, so it can resurrect.
Only the queues which brought down the interface can enable it again, the bit
QUEUE_STATUS_RX_STALLED makes sure of that.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:04:46 -07:00
Zoltan Kiss 3d1af1df97 xen-netback: Using a new state bit instead of carrier
This patch introduces a new state bit VIF_STATUS_CONNECTED to track whether the
vif is in a connected state. Using carrier will not work with the next patch
in this series, which aims to turn the carrier temporarily off if the guest
doesn't seem to be able to receive packets.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org

v2:
- rename the bitshift type to "enum state_bit_shift" here, not in the next patch
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-05 16:04:46 -07:00
David S. Miller 8fd90bb889 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/cxgb4/device.c

The cxgb4 conflict was simply overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22 00:44:59 -07:00
Zoltan Kiss d8cfbfc466 xen-netback: Fix pointer incrementation to avoid incorrect logging
Due to this pointer is increased prematurely, the error log contains rubbish.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 20:56:06 -07:00
Zoltan Kiss 1b860da040 xen-netback: Fix releasing header slot on error path
This patch makes this function aware that the first frag and the header might
share the same ring slot. That could happen if the first slot is bigger than
PKT_PROT_LEN. Due to this the error path might release that slot twice or never,
depending on the error scenario.
xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 20:56:06 -07:00
Zoltan Kiss b42cc6e421 xen-netback: Fix releasing frag_list skbs in error path
When the grant operations failed, the skb is freed up eventually, and it tries
to release the frags, if there is any. For the main skb nr_frags is set to 0 to
avoid this, but on the frag_list it iterates through the frags array, and tries
to call put_page on the page pointer which contains garbage at that time.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 20:56:06 -07:00
Zoltan Kiss 1a998d3e6b xen-netback: Fix handling frag_list on grant op error path
The error handling for skb's with frag_list was completely wrong, it caused
double unmap attempts to happen if the error was on the first skb. Move it to
the right place in the loop.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-20 20:56:05 -07:00
Tom Gundersen c835a67733 net: set name_assign_type in alloc_netdev()
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15 16:12:48 -07:00
Zoltan Kiss f51de24356 xen-netback: Adding debugfs "io_ring_qX" files
This patch adds debugfs capabilities to netback. There used to be a similar
patch floating around for classic kernel, but it used procfs. It is based on a
very similar blkback patch.
It creates xen-netback/[vifname]/io_ring_q[queueno] files, reading them output
various ring variables etc. Writing "kick" into it imitates an interrupt
happened, it can be useful to check whether the ring is just stalled due to a
missed interrupt.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08 20:48:36 -07:00
Wei Liu f7b50c4e7c xen-netback: bookkeep number of active queues in our own module
The original code uses netdev->real_num_tx_queues to bookkeep number of
queues and invokes netif_set_real_num_tx_queues to set the number of
queues. However, netif_set_real_num_tx_queues doesn't allow
real_num_tx_queues to be smaller than 1, which means setting the number
to 0 will not work and real_num_tx_queues is untouched.

This is bogus when xenvif_free is invoked before any number of queues is
allocated. That function needs to iterate through all queues to free
resources. Using the wrong number of queues results in NULL pointer
dereference.

So we bookkeep the number of queues in xen-netback to solve this
problem. This fixes a regression introduced by multiqueue patchset in
3.16-rc1.

There's another bug in original code that the real number of RX queues
is never set. In current Xen multiqueue design, the number of TX queues
and RX queues are in fact the same. We need to set the numbers of TX and
RX queues to the same value.

Also remove xenvif_select_queue and leave queue selection to core
driver, as suggested by David Miller.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-25 15:59:47 -07:00
Arnd Bergmann e7b599d7c1 net: xen-netback: include linux/vmalloc.h again
commit e9ce7cb6b1 ("xen-netback: Factor queue-specific data into
queue struct") added a use of vzalloc/vfree to interface.c, but
removed the #include <linux/vmalloc.h> statement at the same time,
which causes this build error:

drivers/net/xen-netback/interface.c: In function 'xenvif_free':
drivers/net/xen-netback/interface.c:754:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
  vfree(vif->queues);
  ^
cc1: some warnings being treated as errors

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11 15:19:28 -07:00
David S. Miller f666f87b94 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netback/netback.c
	net/core/filter.c

A filter bug fix overlapped some cleanups and a conversion
over to some new insn generation macros.

A xen-netback bug fix overlapped the addition of multi-queue
support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-05 16:22:02 -07:00
Zoltan Kiss 59ae9fc670 xen-netback: Fix handling of skbs requiring too many slots
A recent commit (a02eb4 "xen-netback: worse-case estimate in xenvif_rx_action is
underestimating") capped the slot estimation to MAX_SKB_FRAGS, but that triggers
the next BUG_ON a few lines down, as the packet consumes more slots than
estimated.
This patch introduces full_coalesce on the skb callback buffer, which is used in
start_new_rx_buffer() to decide whether netback needs coalescing more
aggresively. By doing that, no packet should need more than
(XEN_NETIF_MAX_TX_SIZE + 1) / PAGE_SIZE data slots (excluding the optional GSO
slot, it doesn't carry data, therefore irrelevant in this case), as the provided
buffers are fully utilized.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-05 15:09:08 -07:00
Andrew J. Bennieston 8d3d53b3e4 xen-netback: Add support for multiple queues
Builds on the refactoring of the previous patch to implement multiple
queues between xen-netfront and xen-netback.

Writes the maximum supported number of queues into XenStore, and reads
the values written by the frontend to determine how many queues to use.

Ring references and event channels are read from XenStore on a per-queue
basis and rings are connected accordingly.

Also adds code to handle the cleanup of any already initialised queues
if the initialisation of a subsequent queue fails.

Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 14:48:16 -07:00
Wei Liu e9ce7cb6b1 xen-netback: Factor queue-specific data into queue struct
In preparation for multi-queue support in xen-netback, move the
queue-specific data from struct xenvif into struct xenvif_queue, and
update the rest of the code to use this.

Also adds loops over queues where appropriate, even though only one is
configured at this point, and uses alloc_netdev_mq() and the
corresponding multi-queue netif wake/start/stop functions in preparation
for multiple active queues.

Finally, implements a trivial queue selection function suitable for
ndo_select_queue, which simply returns 0 for a single queue and uses
skb_get_hash() to compute the queue index otherwise.

Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 14:48:16 -07:00
Andrew J. Bennieston a55d9766ce xen-netback: Move grant_copy_op array back into struct xenvif.
This array was allocated separately in commit ac3d5ac2 ("xen-netback:
fix guest-receive-side array sizes") due to it being very large, and a
struct xenvif is allocated as the netdev_priv part of a struct
net_device, i.e. via kmalloc() but falling back to vmalloc() if the
initial alloc. fails.

In preparation for the multi-queue patches, where this array becomes
part of struct xenvif_queue and is always allocated through vzalloc(),
move this back into the struct xenvif.

Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04 14:48:16 -07:00
David S. Miller 54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
David Vrabel 0d08fceb2e xen-netback: fix race between napi_complete() and interrupt handler
When the NAPI budget was not all used, xenvif_poll() would call
napi_complete() /after/ enabling the interrupt.  This resulted in a
race between the napi_complete() and the napi_schedule() in the
interrupt handler.  The use of local_irq_save/restore() avoided by
race iff the handler is running on the same CPU but not if it was
running on a different CPU.

Fix this properly by calling napi_complete() before reenabling
interrupts (in the xenvif_napi_schedule_or_enable_irq() call).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 16:27:23 -04:00
Zoltan Kiss 583757446b xen-netback: Fix grant ref resolution in RX path
The original series for reintroducing grant mapping for netback had a patch [1]
to handle receiving of packets from an another VIF. Grant copy on the receiving
side needs the grant ref of the page to set up the op.
The original patch assumed (wrongly) that the frags array haven't changed. In
the case reported by Sander, the sending guest sent a packet where the linear
buffer and the first frag were under PKT_PROT_LEN (=128) bytes.
xenvif_tx_submit() then pulled up the linear area to 128 bytes, and ditched the
first frag. The receiving side had an off-by-one problem when gathered the grant
refs.
This patch fixes that by checking whether the actual frag's page pointer is the
same as the page in the original frag list. It can handle any kind of changes on
the original frags array, like:
- removing granted frags from the array at any point
- adding local pages to the frags list anywhere
- reordering the frags
It's optimized to the most common case, when there is 1:1 relation between the
frags and the list, plus works optimal when frags are removed from the end or
the beginning.

[1]: 3e2234: xen-netback: Handle foreign mapped pages on the guest RX path

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 23:32:36 -04:00
Wilfried Klaebe 7ad24ea4bf net: get rid of SET_ETHTOOL_OPS
net: get rid of SET_ETHTOOL_OPS

Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone.
This does that.

Mostly done via coccinelle script:
@@
struct ethtool_ops *ops;
struct net_device *dev;
@@
-       SET_ETHTOOL_OPS(dev, ops);
+       dev->ethtool_ops = ops;

Compile tested only, but I'd seriously wonder if this broke anything.

Suggested-by: Dave Miller <davem@davemloft.net>
Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-13 17:43:20 -04:00
Zoltan Kiss 00aefceb2f xen-netback: Trivial format string fix
There is a "%" after pending_idx instead of ":".

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-04 10:49:53 -04:00
Zoltan Kiss bdab82759b xen-netback: Grant copy the header instead of map and memcpy
An old inefficiency of the TX path that we are grant mapping the first slot,
and then copy the header part to the linear area. Instead, doing a grant copy
for that header straight on is more reasonable. Especially because there are
ongoing efforts to make Xen avoiding TLB flush after unmap when the page were
not touched in Dom0. In the original way the memcpy ruined that.
The key changes:
- the vif has a tx_copy_ops array again
- xenvif_tx_build_gops sets up the grant copy operations
- we don't have to figure out whether the header and first frag are on the same
  grant mapped page or not
Note, we only grant copy PKT_PROT_LEN bytes from the first slot, the rest (if
any) will be on the first frag, which is grant mapped. If the first slot is
smaller than PKT_PROT_LEN, then we grant copy that, and later __pskb_pull_tail
will pull more from the frags (if any)

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:22:40 -04:00
Zoltan Kiss 9074ce2493 xen-netback: Rename map ops
Rename identifiers to state explicitly that they refer to map ops.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-03 14:22:38 -04:00
Wei Liu e9d8b2c296 xen-netback: disable rogue vif in kthread context
When netback discovers frontend is sending malformed packet it will
disables the interface which serves that frontend.

However disabling a network interface involving taking a mutex which
cannot be done in softirq context, so we need to defer this process to
kthread context.

This patch does the following:
1. introduce a flag to indicate the interface is disabled.
2. check that flag in TX path, don't do any work if it's true.
3. check that flag in RX path, turn off that interface if it's true.

The reason to disable it in RX path is because RX uses kthread. After
this change the behavior of netback is still consistent -- it won't do
any TX work for a rogue frontend, and the interface will be eventually
turned off.

Also change a "continue" to "break" after xenvif_fatal_tx_err, as it
doesn't make sense to continue processing packets if frontend is rogue.

This is a fix for XSA-90.

Reported-by: Török Edwin <edwin@etorok.net>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-01 16:25:51 -04:00
David S. Miller 0b70195e0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netback/netback.c

A bug fix overlapped with changing how the netback SKB control
block is implemented.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31 16:56:43 -04:00
Paul Durrant 1425c7a4e8 xen-netback: BUG_ON in xenvif_rx_action() not catching overflow
The BUG_ON to catch ring overflow in xenvif_rx_action() makes the assumption
that meta_slots_used == ring slots used. This is not necessarily the case
for GSO packets, because the non-prefix GSO protocol consumes one more ring
slot than meta-slot for the 'extra_info'. This patch changes the test to
actually check ring slots.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29 18:50:34 -04:00
Paul Durrant a02eb4732c xen-netback: worse-case estimate in xenvif_rx_action is underestimating
The worse-case estimate for skb ring slot usage in xenvif_rx_action()
fails to take fragment page_offset into account. The page_offset does,
however, affect the number of times the fragmentation code calls
start_new_rx_buffer() (i.e. consume another slot) and the worse-case
should assume that will always return true. This patch adds the page_offset
into the DIV_ROUND_UP for each frag.

Unfortunately some frontends aggressively limit the number of requests
they post into the shared ring so to avoid an estimate that is 'too'
pessimal it is capped at MAX_SKB_FRAGS.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29 18:50:34 -04:00