The logic in connect() is currently written with the assumption that
xenbus_watch_pathfmt() will return an error for a node that does not
exist. This assumption is incorrect: xenstore does allow a watch to
be registered for a nonexistent node (and will send notifications
should the node be subsequently created).
As of commit 1f2565780 ("xen-netback: remove 'hotplug-status' once it
has served its purpose"), this leads to a failure when a domU
transitions into XenbusStateConnected more than once. On the first
domU transition into Connected state, the "hotplug-status" node will
be deleted by the hotplug_status_changed() callback in dom0. On the
second or subsequent domU transition into Connected state, the
hotplug_status_changed() callback will therefore never be invoked, and
so the backend will remain stuck in InitWait.
This failure prevents scenarios such as reloading the xen-netfront
module within a domU, or booting a domU via iPXE. There is
unfortunately no way for the domU to work around this dom0 bug.
Fix by explicitly checking for existence of the "hotplug-status" node,
thereby creating the behaviour that was previously assumed to exist.
Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following W=1 kernel build warning(s):
drivers/net/xen-netback/xenbus.c:419: warning: Function parameter or member 'dev' not described in 'frontend_changed'
drivers/net/xen-netback/xenbus.c:419: warning: Function parameter or member 'frontend_state' not described in 'frontend_changed'
drivers/net/xen-netback/xenbus.c:1001: warning: Function parameter or member 'dev' not described in 'netback_probe'
drivers/net/xen-netback/xenbus.c:1001: warning: Function parameter or member 'id' not described in 'netback_probe'
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some code does not directly make 'xenbus_watch' object and call
'register_xenbus_watch()' but use 'xenbus_watch_path()' instead. This
commit adds support of 'will_handle' callback in the
'xenbus_watch_path()' and it's wrapper, 'xenbus_watch_pathfmt()'.
This is part of XSA-349
Cc: stable@vger.kernel.org
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reported-by: Michael Kurth <mku@amazon.de>
Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
If handling logics of watch events are slower than the events enqueue
logic and the events can be created from the guests, the guests could
trigger memory pressure by intensively inducing the events, because it
will create a huge number of pending events that exhausting the memory.
Fortunately, some watch events could be ignored, depending on its
handler callback. For example, if the callback has interest in only one
single path, the watch wouldn't want multiple pending events. Or, some
watches could ignore events to same path.
To let such watches to volutarily help avoiding the memory pressure
situation, this commit introduces new watch callback, 'will_handle'. If
it is not NULL, it will be called for each new event just before
enqueuing it. Then, if the callback returns false, the event will be
discarded. No watch is using the callback for now, though.
This is part of XSA-349
Cc: stable@vger.kernel.org
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reported-by: Michael Kurth <mku@amazon.de>
Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
the patch basically adds the offset adjustment and netfront
state reading to make XDP work on netfront side.
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
By re-attaching RX, TX, and CTL rings during connect() rather than
assuming they are freshly allocated (i.e. assuming the counters are zero),
and avoiding forcing state to Closed in netback_remove() it is possible
for vif instances to be unbound and re-bound from and to (respectively) a
running guest.
Dynamic unbind/bind is a highly useful feature for a backend module as it
allows it to be unloaded and re-loaded (i.e. updated) without requiring
domUs to be halted.
This has been tested by running iperf as a server in the test VM and
then running a client against it in a continuous loop, whilst also
running:
while true;
do echo vif-$DOMID-$VIF >unbind;
echo down;
rmmod xen-netback;
echo unloaded;
modprobe xen-netback;
cd $(pwd);
brctl addif xenbr0 vif$DOMID.$VIF;
ip link set vif$DOMID.$VIF up;
echo up;
sleep 5;
done
in dom0 from /sys/bus/xen-backend/drivers/vif to continuously unbind,
unload, re-load, re-bind and re-plumb the backend.
Clearly a performance drop was seen but no TCP connection resets were
observed during this test and moreover a parallel SSH connection into the
guest remained perfectly usable throughout.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing the 'hotplug-status' node in netback_remove() is wrong; the script
may not have completed. Only remove the node once the watch has fired and
has been unregistered.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
...as the comment above the function states.
The switch to Initialising at the start of the function is somewhat bogus
as the toolstack will have set that initial state anyway. To behave
correctly, a backend should switch to InitWait once it has set up all
xenstore values that may be required by a initialising frontend. This
patch calls backend_switch_state() to make the transition at the
appropriate point.
NOTE: backend_switch_state() ignores errors from xenbus_switch_state()
and so this patch removes an error path from netback_probe(). This
means a failure to change state at this stage (in the absence of
other failures) will leave the device instantiated. This is highly
unlikley to happen as a failure to change state would indicate a
failure to write to xenstore, and that will trigger other error
paths. Also, a 'stuck' device can still be cleaned up using 'unbind'
in any case.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
...of xenbus.c
This is a cosmetic function re-ordering to reduce churn in a subsequent
patch. Some style fix-up was done to make checkpatch.pl happier.
No functional change.
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not see http www gnu org licenses
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details [based]
[from] [clk] [highbank] [c] you should have received a copy of the
gnu general public license along with this program if not see http
www gnu org licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 355 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During coredump analysis, it is not easy to obtain the address of
backend_info in xen-netback.
So far there are two ways to obtain backend_info:
1. Do what xenbus_device_find() does for vmcore to find the xenbus_device
and then derive it from dev_get_drvdata().
2. Extract backend_info from callstack of xenwatch (e.g., netback_remove()
or frontend_changed()).
This patch adds a reference from xenvif to backend_info so that it would be
much more easier to obtain backend_info during coredump analysis.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enabling
-Wimplicit-fallthrough.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
debugfs_remove_recursive has taken the IS_ERR_OR_NULL into account. Just
remove the unnecessary condition check.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prefer the direct use of octal for permissions.
Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace
and some typing.
Miscellanea:
o Whitespace neatening around these conversions.
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases during XenBus disconnect event handling and subsequent
queue resource release there may be some TX handlers active on
other processors. Use RCU in order to synchronize with them.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces use of 'be->vif' with 'vif' and hence generally
makes the function look tidier. No semantic change.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today a Xenstore watch event is delivered via a callback function
declared as:
void (*callback)(struct xenbus_watch *,
const char **vec, unsigned int len);
As all watch events only ever come with two parameters (path and token)
changing the prototype to:
void (*callback)(struct xenbus_watch *,
const char *path, const char *token);
is the natural thing to do.
Apply this change and adapt all users.
Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com
Cc: wei.liu2@citrix.com
Cc: paul.durrant@citrix.com
Cc: netdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
vif->lock is used to protect statistics gathering agents from using the
queue structure during cleaning.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate memory leaks introduced several years ago by cleaning the
queue resources which are allocated on XenBus connection event. Namely, queue
structure array and pages used for IO rings.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJYT5HMAAoJELDendYovxMvhNQH/1g/3ahM4JKN8Z0SbjKBEdQm
yj2xOj6cE3l6wMSUblKjZD2DLLhpmcHT/E97Xro/lZQEfQJoMXXWWDFowMU/P1LA
mJxb7Fzq5Wr+6eGSAlIQB270MrpNi/luf+CWHMwVA3V7R3KRXwonOdGQSkISIzCd
tgIydEA3a9r2+HgeIBpZFZ4GcSrJQU75krMyl2tjD1C+jeYVd+zdoj2OnDsZQDZQ
hDWApMpNbpSBAn7JtSSdXWSTBsGH0lUECebeYPhPQ2sX2P6Y8+UCGwA7i6FFdbTa
agXfVSdRz8dCe3k19VcKDAw6nK9BTTMnEeEHmkmygIh6wuHPP44CzigTXIbJoXI=
=zjfm
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Xen features and fixes for 4.10
These are some fixes, a move of some arm related headers to share them
between arm and arm64 and a series introducing a helper to make code
more readable.
The most notable change is David stepping down as maintainer of the
Xen hypervisor interface. This results in me sending you the pull
requests for Xen related code from now on"
* tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits)
xen/balloon: Only mark a page as managed when it is released
xenbus: fix deadlock on writes to /proc/xen/xenbus
xen/scsifront: don't request a slot on the ring until request is ready
xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
x86: Make E820_X_MAX unconditionally larger than E820MAX
xen/pci: Bubble up error and fix description.
xen: xenbus: set error code on failure
xen: set error code on failures
arm/xen: Use alloc_percpu rather than __alloc_percpu
arm/arm64: xen: Move shared architecture headers to include/xen/arm
xen/events: use xen_vcpu_id mapping for EVTCHNOP_status
xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
xen-scsifront: Add a missing call to kfree
MAINTAINERS: update XEN HYPERVISOR INTERFACE
xenfs: Use proc_create_mount_point() to create /proc/xen
xen-platform: use builtin_pci_driver
xen-netback: fix error handling output
xen: make use of xenbus_read_unsigned() in xenbus
xen: make use of xenbus_read_unsigned() in xen-pciback
xen: make use of xenbus_read_unsigned() in xen-fbfront
...
The connect function prints an unintialized error code after an
earlier initialization was removed:
drivers/net/xen-netback/xenbus.c: In function 'connect':
drivers/net/xen-netback/xenbus.c:938:3: error: 'err' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This prints it as -EINVAL instead, which seems to be the most
appropriate error code. Before the patch that caused the warning,
this would print a positive number returned by vsscanf() instead,
which is also wrong. We probably don't need a backport though,
as fixing the warning here should be sufficient.
Fixes: f95842e7a9 ("xen: make use of xenbus_read_unsigned() in xen-netback")
Fixes: 8d3d53b3e4 ("xen-netback: Add support for multiple queues")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
For single items being collected this should be preferred as being more
typesafe (as the compiler can check format string and to-be-written-to
variable match) and more efficient (requiring one less parameter to be
passed).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires to change the type of some reads from int to unsigned,
but these cases have been wrong before: negative values are not allowed
for the modified cases.
Cc: wei.liu2@citrix.com
Cc: paul.durrant@citrix.com
Cc: netdev@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
It is useful to be able to see the hash configuration when running tests.
This patch adds a debugfs node for that purpose.
The original version of this patch (commit c0c64c1523) was reverted due
to build failures caused by a conflict with commit 0364a8824c
("xen-netback: switch to threaded irq for control ring"). This new version
of the patch is nearly identical to the original, the only difference
being that creation of the debugfs node is predicated on 'ctrl_irq' being
non-zero rather then the now non-existent 'ctrl_task'.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As far as I am aware only very old Windows network frontends make use of
this style of passing GSO packets from backend to frontend. These
frontends can easily be replaced by the freely available Xen Project
Windows PV network frontend, which uses the 'default' mechanism for
passing GSO packets, which is also used by all Linux frontends.
NOTE: Removal of this feature will not cause breakage in old Windows
frontends. They simply will no longer receive GSO packets - the
packets instead being fragmented in the backend.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause and invalid transaction
and set_backend_state() to BUG().
Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create two new valid state transitions on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed,
and from XenbusStateInitialising to XenbusStateInitWait.
Signed-off-by: Filipe Manco <filipe.manco@neclab.eu>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.
This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the "prepare for reconnect" pr_info in xenbus.c. It's largely
uninteresting and the states of the frontend and backend can easily be
observed by watching the (o)xenstored log.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent patch to the Xen Project documents a protocol for 'dynamic
multicast control' in netif.h. This extends the previous multicast control
protocol to not require a shared ring reconnection to turn the feature off.
Instead the backend watches the "request-multicast-control" key in xenstore
and turns the feature off if the key value is written to zero.
This patch adds support for dynamic multicast control in xen-netback.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since vzalloc can be failed in memory pressure,
writes -ENOMEM to xenstore to indicate error.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.
The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.
To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be failed with NETIF_RSP_ERROR.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit edafc132ba ("xen-netback: making the bandwidth limiter runtime settable")
introduced the capability to change the bandwidth rate limit at runtime.
But it also introduced a possible crashing bug.
If netback receives two XenbusStateConnected without getting the
hotplug-status watch firing in between, then it will try to register the
watches for the rate limiter again. But this triggers a BUG() in the watch
registration code.
The fix modifies connect() to remove the possibly existing packet-rate
watches before trying to install those watches. This behaviour is in line
with how connect() deals with the hotplug-status watch.
Signed-off-by: Imre Palik <imrep@amazon.de>
Cc: Matt Wilson <msw@amazon.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue
struct") introduced a regression when moving queue-specific data into
the queue struct by failing to set the credit_bytes field. This
prevented bandwidth limiting from working. Initialize the field as it
was done before multiqueue support was added.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the current netback, the bandwidth limiter's parameters are only
settable during vif setup time. This patch register a watch on them, and
thus makes them runtime changeable.
When the watch fires, the timer is reset. The timer's mutex is used for
fencing the change.
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since e9ce7cb6b1 ("xen-netback: Factor queue-specific data into queue struct"),
the transimt shaper timeout is always set to 0. The value the user sets via
xenbus is never propagated to the transmit shaper.
This patch fixes the issue.
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: Imre Palik <imrep@amazon.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bc96f648df (xen-netback: make
feature-rx-notify mandatory) incorrectly assumed that there were no
frontends in use that did not support this feature. But the frontend
driver in MiniOS does not and since this is used by (qemu) stubdoms,
these stopped working.
Netback sort of works as-is in this mode except:
- If there are no Rx requests and the internal Rx queue fills, only
the drain timeout will wake the thread. The default drain timeout
of 10 s would give unacceptable pauses.
- If an Rx stall was detected and the internal Rx queue is drained,
then the Rx thread would never wake.
Handle these two cases (when feature-rx-notify is disabled) by:
- Reducing the drain timeout to 30 ms.
- Disabling Rx stall detection.
Reported-by: John <jw@nuclearfallout.net>
Tested-by: John <jw@nuclearfallout.net>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When xenvif_alloc() fails, it returns a non-NULL error indicator. To
avoid eventual races, we shouldn't store that into struct backend_info
as readers of it only check for NULL.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If xenvif_alloc() or xenbus_scanf() fail in backend_create_xenvif(),
xenbus is left in offline mode but netback_probe() reports success.
The patch implements propagation of error code for backend_create_xenvif().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a frontend not receiving packets it is useful to detect this and
turn off the carrier so packets are dropped early instead of being
queued and drained when they expire.
A to-guest queue is stalled if it doesn't have enough free slots for a
an extended period of time (default 60 s).
If at least one queue is stalled, the carrier is turned off (in the
expectation that the other queues will soon stall as well). The
carrier is only turned on once all queues are ready.
When the frontend connects, all the queues start in the stalled state
and only become ready once the frontend queues enough Rx requests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netback needs to discard old to-guest skb's (guest Rx queue drain) and
it needs detect guest Rx stalls (to disable the carrier so packets are
discarded earlier), but the current implementation is very broken.
1. The check in hard_start_xmit of the slot availability did not
consider the number of packets that were already in the guest Rx
queue. This could allow the queue to grow without bound.
The guest stops consuming packets and the ring was allowed to fill
leaving S slot free. Netback queues a packet requiring more than S
slots (ensuring that the ring stays with S slots free). Netback
queue indefinately packets provided that then require S or fewer
slots.
2. The Rx stall detection is not triggered in this case since the
(host) Tx queue is not stopped.
3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback
will consider this an Rx purge event which may result in it taking
the carrier down unnecessarily. It also considers a queue with
only 1 slot free as unstalled (even though the next packet might
not fit in this).
The internal guest Rx queue is limited by a byte length (to 512 Kib,
enough for half the ring). The (host) Tx queue is stopped and started
based on this limit. This sets an upper bound on the amount of memory
used by packets on the internal queue.
This allows the estimatation of the number of slots for an skb to be
removed (it wasn't a very good estimate anyway). Instead, the guest
Rx thread just waits for enough free slots for a maximum sized packet.
skbs queued on the internal queue have an 'expires' time (set to the
current time plus the drain timeout). The guest Rx thread will detect
when the skb at the head of the queue has expired and discard expired
skbs. This sets a clear upper bound on the length of time an skb can
be queued for. For a guest being destroyed the maximum time needed to
wait for all the packets it sent to be dropped is still the drain
timeout (10 s) since it will not be sending new packets.
Rx stall detection is reintroduced in a later commit.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frontends that do not provide feature-rx-notify may stall because
netback depends on the notification from frontend to wake the guest Rx
thread (even if can_queue is false).
This could be fixed but feature-rx-notify was introduced in 2006 and I
am not aware of any frontends that do not implement this.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DEFINE_XENBUS_DRIVER() macro looks a bit weird and causes sparse
errors.
Replace the uses with standard structure definitions instead. This is
similar to pci and usb device registration.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The original code is bogus. The function gets called in a loop which
leaks entries created in previous rounds.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enlarge buffer size and check input length properly, so that we don't
misuse -ENOSPC.
Note that command like "kickXXXX" is still allowed, that's one patch for
another day if we really want to be very strict on this.
Reported-by: SeeChen Ng <seechen81@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>