Commit Graph

419 Commits

Author SHA1 Message Date
Wei Yongjun b226acab2f VSOCK: Use kvfree()
Use kvfree() instead of open-coding it.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 16:56:08 +03:00
Michael S. Tsirkin 4d93824561 vhost: split out vringh Kconfig
vringh is pulled in by caif and mic, but the other
vhost config does not need to be there.
In particular, it makes no sense to have vhost net/scsi/sock
under caif/mic.

Create a separate Kconfig file and put vringh bits there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 16:54:28 +03:00
Michael S. Tsirkin ec33d031a1 vhost: detect 32 bit integer wrap around
Detect and fail early if long wrap around is triggered.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 16:54:28 +03:00
Jason Wang 6b1e6cc785 vhost: new device IOTLB API
This patch tries to implement an device IOTLB for vhost. This could be
used with userspace(qemu) implementation of DMA remapping
to emulate an IOMMU for the guest.

The idea is simple, cache the translation in a software device IOTLB
(which is implemented as an interval tree) in vhost and use vhost_net
file descriptor for reporting IOTLB miss and IOTLB
update/invalidation. When vhost meets an IOTLB miss, the fault
address, size and access can be read from the file. After userspace
finishes the translation, it writes the translated address to the
vhost_net file to update the device IOTLB.

When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq
addresses set by ioctl are treated as iova instead of virtual address and
the accessing can only be done through IOTLB instead of direct userspace
memory access. Before each round or vq processing, all vq metadata is
prefetched in device IOTLB to make sure no translation fault happens
during vq processing.

In most cases, virtqueues are contiguous even in virtual address space.
The IOTLB translation for virtqueue itself may make it a little
slower. We might add fast path cache on top of this patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>
[mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ]
[mst: fix build warnings ]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[ weiyj.lk: missing unlock on error ]
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
2016-08-02 16:53:54 +03:00
Jason Wang a9709d6874 vhost: convert pre sorted vhost memory array to interval tree
Current pre-sorted memory region array has some limitations for future
device IOTLB conversion:

1) need extra work for adding and removing a single region, and it's
   expected to be slow because of sorting or memory re-allocation.
2) need extra work of removing a large range which may intersect
   several regions with different size.
3) need trick for a replacement policy like LRU

To overcome the above shortcomings, this patch convert it to interval
tree which can easily address the above issue with almost no extra
work.

The patch could be used for:

- Extend the current API and only let the userspace to send diffs of
  memory table.
- Simplify Device IOTLB implementation.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:31 +03:00
Jason Wang bfe2bc5128 vhost: introduce vhost memory accessors
This patch introduces vhost memory accessors which were just wrappers
for userspace address access helpers. This is a requirement for vhost
device iotlb implementation which will add iotlb translations in those
accessors.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:30 +03:00
Asias He 304ba62fd4 VSOCK: Add Makefile and Kconfig
Enable virtio-vsock and vhost-vsock.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:30 +03:00
Asias He 433fc58e6b VSOCK: Introduce vhost_vsock.ko
VM sockets vhost transport implementation.  This driver runs on the
host.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:30 +03:00
Michael S. Tsirkin 6190efb08c vhost: drop vringh dependency
vringh isn't used by vhost net or scsi - it's used
by CAIF only at the moment. Drop the dependency.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-02 02:57:28 +03:00
Jason Wang 04b96e5528 vhost: lockless enqueuing
We use spinlock to synchronize the work list now which may cause
unnecessary contentions. So this patch switch to use llist to remove
this contention. Pktgen tests shows about 5% improvement:

Before:
~1300000 pps
After:
~1370000 pps

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01 21:44:51 +03:00
Jason Wang 7235acdb11 vhost: simplify work flushing
We used to implement the work flushing through tracking queued seq,
done seq, and the number of flushing. This patch simplify this by just
implement work flushing through another kind of vhost work with
completion. This will be used by lockless enqueuing patch.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-01 21:44:50 +03:00
Jason Wang 1576d98605 tun: switch to use skb array for tx
We used to queue tx packets in sk_receive_queue, this is less
efficient since it requires spinlocks to synchronize between producer
and consumer.

This patch tries to address this by:

- switch from sk_receive_queue to a skb_array, and resize it when
  tx_queue_len was changed.
- introduce a new proto_ops peek_len which was used for peeking the
  skb length.
- implement a tun version of peek_len for vhost_net to use and convert
  vhost_net to use peek_len if possible.

Pktgen test shows about 15.3% improvement on guest receiving pps for small
buffers:

Before: ~1300000pps
After : ~1500000pps

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 05:32:17 -04:00
Jason Wang 8241a1e466 vhost_net: stop polling socket during rx processing
We don't stop rx polling socket during rx processing, this will lead
unnecessary wakeups from under layer net devices (E.g
sock_def_readable() form tun). Rx will be slowed down in this
way. This patch avoids this by stop polling socket during rx
processing. A small drawback is that this introduces some overheads in
light load case because of the extra start/stop polling, but single
netperf TCP_RR does not notice any change. In a super heavy load case,
e.g using pktgen to inject packet to guest, we get about ~8.8%
improvement on pps:

before: ~1240000 pkt/s
after:  ~1350000 pkt/s

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 14:46:11 -07:00
Christoph Hellwig 36ec2ddc0d target: make close_session optional
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-05-10 01:19:26 -07:00
Christoph Hellwig 22d11759a4 target: make ->shutdown_session optional
Turns out the template and thus many drivers got the return value wrong:
0 means the fabrics driver needs to put a session reference, which no
driver except for the iSCSI target drivers did.  Fortunately none of these
drivers supports explicit Node ACLs, so the bug was harmless.

Even without that only qla2xxx and iscsi every did real work in
shutdown_session, so get rid of the boilerplate code in all other
drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-05-10 01:19:18 -07:00
Linus Torvalds 5266e5b12c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Add target_alloc_session() w/ callback helper for doing se_session
     allocation + tag + se_node_acl lookup.  (HCH + nab)

   - Tree-wide fabric driver conversion to use target_alloc_session()

   - Convert sbp-target to use percpu_ida tag pre-allocation, and
     TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab)

   - Convert usb-gadget to use percpu_ida tag pre-allocation, and
     TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab)

   - Convert xen-scsiback to use percpu_ida tag pre-allocation, and
     TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab)

   - Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs

   - Convert ib_srpt to use percpu_ida tag pre-allocation

   - Add DebugFS node for qla2xxx target sess list (Quinn)

   - Rework iser-target connection termination (Jenny + Sagi)

   - Convert iser-target to new CQ API (HCH)

   - Add pass-through WRITE_SAME support for IBLOCK (Mike Christie)

   - Introduce data_bitmap for asynchronous access of data area (Sheng
     Yang + Andy)

   - Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani)

  Also, there is a separate PULL request coming for cxgb4 NIC driver
  prerequisites for supporting hw iscsi segmentation offload (ISO), that
  will be the base for a number of v4.7 developments involving
  iscsi-target hw offloads"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
  target: Fix target_release_cmd_kref shutdown comp leak
  target: Avoid DataIN transfers for non-GOOD SAM status
  target/user: Report capability of handling out-of-order completions to userspace
  target/user: Fix size_t format-spec build warning
  target/user: Don't free expired command when time out
  target/user: Introduce data_bitmap, replace data_length/data_head/data_tail
  target/user: Free data ring in unified function
  target/user: Use iovec[] to describe continuous area
  target: Remove enum transport_lunflags_table
  target/iblock: pass WRITE_SAME to device if possible
  iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc
  iser-target: Kill struct isert_rdma_wr
  iser-target: Convert to new CQ API
  iser-target: Split and properly type the login buffer
  iser-target: Remove ISER_RECV_DATA_SEG_LEN
  iser-target: Remove impossible condition from isert_wait_conn
  iser-target: Remove redundant wait in release_conn
  iser-target: Rework connection termination
  iser-target: Separate flows for np listeners and connections cma events
  iser-target: Add new state ISER_CONN_BOUND to isert_conn
  ...
2016-03-22 12:41:14 -07:00
Nicholas Bellinger 65ea789869 vhost/scsi: Convert to target_alloc_session usage
This patch converts vhost/scsi pre-allocation of vhost_scsi_cmd
descriptors to use the new alloc_session callback().

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2016-03-10 21:45:25 -08:00
Jason Wang 0308813724 vhost_net: basic polling support
This patch tries to poll for new added tx buffer or socket receive
queue for a while at the end of tx/rx processing. The maximum time
spent on polling were specified through a new kind of vring ioctl.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 02:18:53 +02:00
Jason Wang d4a60603fa vhost: introduce vhost_vq_avail_empty()
This patch introduces a helper which will return true if we're sure
that the available ring is empty for a specific vq. When we're not
sure, e.g vq access failure, return false instead. This could be used
for busy polling code to exit the busy loop.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 02:18:50 +02:00
Jason Wang 526d3e7ff5 vhost: introduce vhost_has_work()
This path introduces a helper which can give a hint for whether or not
there's a work queued in the work list. This could be used for busy
polling code to exit the busy loop.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 02:18:45 +02:00
Greg Kurz 80f7d0301e vhost: rename vhost_init_used()
Looking at how callers use this, maybe we should just rename init_used
to vhost_vq_init_access. The _used suffix was a hint that we
access the vq used ring. But maybe what callers care about is
that it must be called after access_ok.

Also, this function manipulates the vq->is_le field which isn't related
to the vq used ring.

This patch simply renames vhost_init_used() to vhost_vq_init_access() as
suggested by Michael.

No behaviour change.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02 17:02:04 +02:00
Greg Kurz c507203756 vhost: rename cross-endian helpers
The default use case for vhost is when the host and the vring have the
same endianness (default native endianness). But there are cases where
they differ and vhost should byteswap when accessing the vring.

The first case is when the host is big endian and the vring belongs to
a virtio 1.0 device, which is always little endian.

This is covered by the vq->is_le field. This field is initialized when
userspace calls the VHOST_SET_FEATURES ioctl. It is reset when the device
stops.

We already have a vhost_init_is_le() helper, but the reset operation is
opencoded as follows:

	vq->is_le = virtio_legacy_is_little_endian();

It isn't clear that we are resetting vq->is_le here.

This patch moves the code to a helper with a more explicit name.

The other case where we may have to byteswap is when the architecture can
switch endianness at runtime (bi-endian). If endianness differs in the host
and in the guest, then legacy devices need to be used in cross-endian mode.

This mode is available with CONFIG_VHOST_CROSS_ENDIAN_LEGACY=y, which
introduces a vq->user_be field. Userspace may enable cross-endian mode
by calling the SET_VRING_ENDIAN ioctl before the device is started. The
cross-endian mode is disabled when the device is stopped.

The current names of the helpers that manipulate vq->user_be are unclear.

This patch renames those helpers to clearly show that this is cross-endian
stuff and with explicit enable/disable semantics.

No behaviour change.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02 17:02:00 +02:00
Greg Kurz e1f33be918 vhost: fix error path in vhost_init_used()
We don't want side effects. If something fails, we rollback vq->is_le to
its previous value.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-02 17:01:49 +02:00
Michael S. Tsirkin 5fba13b5cf vhost: replace % with & on data path
We know vring num is a power of 2, so use &
to mask the high bits.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07 17:28:10 +02:00
Michael S. Tsirkin d542483876 vhost: relax log address alignment
commit 5d9a07b0de ("vhost: relax used
address alignment") fixed the alignment for the used virtual address,
but not for the physical address used for logging.

That's a mistake: alignment should clearly be the same for virtual and
physical addresses,

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07 17:27:54 +02:00
Linus Torvalds 9aa3d651a9 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "This series contains HCH's changes to absorb configfs attribute
  ->show() + ->store() function pointer usage from it's original
  tree-wide consumers, into common configfs code.

  It includes usb-gadget, target w/ drivers, netconsole and ocfs2
  changes to realize the improved simplicity, that now renders the
  original include/target/configfs_macros.h CPP magic for fabric drivers
  and others, unnecessary and obsolete.

  And with common code in place, new configfs attributes can be added
  easier than ever before.

  Note, there are further improvements in-flight from other folks for
  v4.5 code in configfs land, plus number of target fixes for post -rc1
  code"

In the meantime, a new user of the now-removed old configfs API came in
through the char/misc tree in commit 7bd1d4093c ("stm class: Introduce
an abstraction for System Trace Module devices").

This merge resolution comes from Alexander Shishkin, who updated his stm
class tracing abstraction to account for the removal of the old
show_attribute and store_attribute methods in commit 517982229f
("configfs: remove old API") from this pull.  As Alexander says about
that patch:

 "There's no need to keep an extra wrapper structure per item and the
  awkward show_attribute/store_attribute item ops are no longer needed.

  This patch converts policy code to the new api, all the while making
  the code quite a bit smaller and easier on the eyes.

  Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>"

That patch was folded into the merge so that the tree should be fully
bisectable.

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits)
  configfs: remove old API
  ocfs2/cluster: use per-attribute show and store methods
  ocfs2/cluster: move locking into attribute store methods
  netconsole: use per-attribute show and store methods
  target: use per-attribute show and store methods
  spear13xx_pcie_gadget: use per-attribute show and store methods
  dlm: use per-attribute show and store methods
  usb-gadget/f_serial: use per-attribute show and store methods
  usb-gadget/f_phonet: use per-attribute show and store methods
  usb-gadget/f_obex: use per-attribute show and store methods
  usb-gadget/f_uac2: use per-attribute show and store methods
  usb-gadget/f_uac1: use per-attribute show and store methods
  usb-gadget/f_mass_storage: use per-attribute show and store methods
  usb-gadget/f_sourcesink: use per-attribute show and store methods
  usb-gadget/f_printer: use per-attribute show and store methods
  usb-gadget/f_midi: use per-attribute show and store methods
  usb-gadget/f_loopback: use per-attribute show and store methods
  usb-gadget/ether: use per-attribute show and store methods
  usb-gadget/f_acm: use per-attribute show and store methods
  usb-gadget/f_hid: use per-attribute show and store methods
  ...
2015-11-13 20:04:17 -08:00
Michael S. Tsirkin e407f39afd vhost: fix performance on LE hosts
commit 2751c9882b ("vhost: cross-endian
support for legacy devices") introduced a minor regression: even with
cross-endian disabled, and even on LE host, vhost_is_little_endian is
checking is_le flag so there's always a branch.

To fix, simply check virtio_legacy_is_little_endian first.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-27 20:17:03 -07:00
Christoph Hellwig 2eafd72939 target: use per-attribute show and store methods
This also allows to remove the target-specific old configfs macros, and
gets rid of the target_core_fabric_configfs.h header which only had one
function declaration left that could be moved to a better place.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Acked-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-10-13 22:17:49 -07:00
Linus Torvalds 00ade1f553 virtio: fixes on top of 4.3-rc1
This fixes the virtio-test tool, and improves
 the error handling for virtio-ccw.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJV+TrqAAoJECgfDbjSjVRpsBwH/0CqspPuCPR/QMHmw0w66lQh
 I2kYdkvJy/nAXtY+6JHhVgxbPzJeZ2im5BK3+8pIUoMXSHRn1QMn+/a0WCA0GvC2
 iXSnIGFs/gMK80k2ULrf9HmopnxVtEZUY+6/DQXYD3lRDi25k8Xbs5x4ygPwGywD
 iV5/JPNjJvWLQQ4uDnlp6SoPfxa/P8lbpdZlIJHKAM0pTuWeU4rgncMo8SvHjVEL
 zyrg3ofDq3V0objwLOnLk0Z8i6uwEQrfZzprvDcZglK58B+jWm2cuAiaGoLHgwN6
 xkSQEWMKeV66FTuqzI+bCsvWf04+EkGbxk75RLx3GfT2LJg59XgxqEpN7yOCKFg=
 =O2nr
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes and cleanups from Michael Tsirkin:
 "This fixes the virtio-test tool, and improves the error handling for
  virtio-ccw"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio/s390: handle failures of READ_VQ_CONF ccw
  tools/virtio: propagate V=X to kernel build
  vhost: move features to core
  tools/virtio: fix build after 4.2 changes
2015-09-18 09:28:20 -07:00
Michael S. Tsirkin 4e9fa50c6c vhost: move features to core
virtio 1 and any layout are core features, move them
there. This fixes vhost test.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-16 12:48:07 +03:00
Greg Kroah-Hartman 5d44f4b348 Merge 4.2-rc6 into char-misc-next
We want the fixes in Linus's tree in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-09 16:28:09 -07:00
Greg Kroah-Hartman f368ed6088 char: make misc_deregister a void function
With well over 200+ users of this api, there are a mere 12 users that
actually checked the return value of this function.  And all of them
really didn't do anything with that information as the system or module
was shutting down no matter what.

So stop pretending like it matters, and just return void from
misc_deregister().  If something goes wrong in the call, you will get a
WARNING splat in the syslog so you know how to fix up your driver.
Other than that, there's nothing that can go wrong.

Cc: Alasdair Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Wim Van Sebroeck <wim@iguana.be>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 10:35:49 -07:00
Igor Mammedov 1e0994730f vhost: fix error handling for memory region alloc
callers of vhost_kvzalloc() expect the same behaviour on
allocation error as from kmalloc/vmalloc i.e. NULL return
value. So just return vzmalloc() returned value instead of
returning ERR_PTR(-ENOMEM)

Fixes: 4de7255f7d ("vhost: extend memory regions allocation to vmalloc")

Spotted-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-27 18:05:05 +03:00
Marc-André Lureau 7932c0bd77 vhost: actually track log eventfd file
While reviewing vhost log code, I found out that log_file is never
set. Note: I haven't tested the change (QEMU doesn't use LOG_FD yet).

Cc: stable@vger.kernel.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-27 18:04:58 +03:00
Linus Torvalds d1a343a023 virtio/vhost: fixes for 4.2
Bugfixes and documentation fixes. Igor's patch that allows
 users to tweak memory table size is borderline,
 but it does fix known crashes, so I merged it.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVpBzpAAoJECgfDbjSjVRpQXEIAMEqetqPuRduynjIw2HNktle
 fe/UUhvipwTsAM4R2pmcEl5tW04A/M54RkXN4iVy0rPshAfG3Fh4XTjLmzSGU0fI
 KD6qX8/Zc/+DUWnfe3aUC6jOOrjb7c4xRKOlQ9X8lZgi2M6AzrPeoZHFTtbX34CU
 2kcnv5Sb1SaI/t2SaCc2CaKilQalEOcd59gGjje2QXjidZnIVHwONrOyjBiINUy6
 GzfTgvAje8ZiG6951W3HDwwSfcqXezin27LJqnZgaLCwTCKt2gdQ2MAKjrfP2aVF
 +gX2B4XxcFLutMVx/obsjvA1ceipubyUauRLB3mnO3P5VOj1qbofa2lj4pzQ80k=
 =EKPr
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost fixes from Michael Tsirkin:
 "Bugfixes and documentation fixes.

  Igor's patch that allows users to tweak memory table size is
  borderline, but it does fix known crashes, so I merged it"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: add max_mem_regions module parameter
  vhost: extend memory regions allocation to vmalloc
  9p/trans_virtio: reset virtio device on remove
  virtio/s390: rename drivers/s390/kvm -> drivers/s390/virtio
  MAINTAINERS: separate section for s390 virtio drivers
  virtio: define virtio_pci_cfg_cap in header.
  virtio: Fix typecast of pointer in vring_init()
  virtio scsi: fix unused variable warning
  vhost: use binary search instead of linear in find_region()
  virtio_net: document VIRTIO_NET_CTRL_GUEST_OFFLOADS
2015-07-23 13:07:04 -07:00
Igor Mammedov c9ce42f72f vhost: add max_mem_regions module parameter
it became possible to use a bigger amount of memory
slots, which is used by memory hotplug for
registering hotplugged memory.
However QEMU crashes if it's used with more than ~60
pc-dimm devices and vhost-net enabled since host kernel
in module vhost-net refuses to accept more than 64
memory regions.

Allow to tweak limit via max_mem_regions module paramemter
with default value set to 64 slots.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-13 23:17:19 +03:00
Igor Mammedov 4de7255f7d vhost: extend memory regions allocation to vmalloc
with large number of memory regions we could end up with
high order allocations and kmalloc could fail if
host is under memory pressure.
Considering that memory regions array is used on hot path
try harder to allocate using kmalloc and if it fails resort
to vmalloc.
It's still better than just failing vhost_set_memory() and
causing guest crash due to it when a new memory hotplugged
to guest.

I'll still look at QEMU side solution to reduce amount of
memory regions it feeds to vhost to make things even better,
but it doesn't hurt for kernel to behave smarter and don't
crash older QEMU's which could use large amount of memory
regions.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-13 23:17:18 +03:00
Linus Torvalds 5c755fe142 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "It's been a busy development cycle for target-core in a number of
  different areas.

  The fabric API usage for se_node_acl allocation is now within
  target-core code, dropping the external API callers for all fabric
  drivers tree-wide.

  There is a new conversion to RCU hlists for se_node_acl and
  se_portal_group LUN mappings, that turns fast-past LUN lookup into a
  completely lockless code-path.  It also removes the original
  hard-coded limitation of 256 LUNs per fabric endpoint.

  The configfs attributes for backends can now be shared between core
  and driver code, allowing existing drivers to use common code while
  still allowing flexibility for new backend provided attributes.

  The highlights include:

   - Merge sbc_verify_dif_* into common code (sagi)
   - Remove iscsi-target support for obsolete IFMarker/OFMarker
     (Christophe Vu-Brugier)
   - Add bidi support in target/user backend (ilias + vangelis + agover)
   - Move se_node_acl allocation into target-core code (hch)
   - Add crc_t10dif_update common helper (akinobu + mkp)
   - Handle target-core odd SGL mapping for data transfer memory
     (akinobu)
   - Move transport ID handling into target-core (hch)
   - Move task tag into struct se_cmd + support 64-bit tags (bart)
   - Convert se_node_acl->device_list[] to RCU hlist (nab + hch +
     paulmck)
   - Convert se_portal_group->tpg_lun_list[] to RCU hlist (nab + hch +
     paulmck)
   - Simplify target backend driver registration (hch)
   - Consolidate + simplify target backend attribute implementations
     (hch + nab)
   - Subsume se_port + t10_alua_tg_pt_gp_member into se_lun (hch)
   - Drop lun_sep_lock for se_lun->lun_se_dev RCU usage (hch + nab)
   - Drop unnecessary core_tpg_register TFO parameter (nab)
   - Use 64-bit LUNs tree-wide (hannes)
   - Drop left-over TARGET_MAX_LUNS_PER_TRANSPORT limit (hannes)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (76 commits)
  target: Bump core version to v5.0
  target: remove target_core_configfs.h
  target: remove unused TARGET_CORE_CONFIG_ROOT define
  target: consolidate version defines
  target: implement WRITE_SAME with UNMAP bit using ->execute_unmap
  target: simplify UNMAP handling
  target: replace se_cmd->execute_rw with a protocol_data field
  target/user: Fix inconsistent kmap_atomic/kunmap_atomic
  target: Send UA when changing LUN inventory
  target: Send UA upon LUN RESET tmr completion
  target: Send UA on ALUA target port group change
  target: Convert se_lun->lun_deve_lock to normal spinlock
  target: use 'se_dev_entry' when allocating UAs
  target: Remove 'ua_nacl' pointer from se_ua structure
  target_core_alua: Correct UA handling when switching states
  xen-scsiback: Fix compile warning for 64-bit LUN
  target: Remove TARGET_MAX_LUNS_PER_TRANSPORT
  target: use 64-bit LUNs
  target: Drop duplicate + unused se_dev_check_wce
  target: Drop unnecessary core_tpg_register TFO parameter
  ...
2015-07-04 14:13:43 -07:00
Linus Torvalds 5fc835284d virtio/vhost: cross endian support
I have just queued some more bugfix patches today but none fix regressions and
 none are related to these ones, so it looks like a good time for a merge for
 -rc1.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJVk7JOAAoJECgfDbjSjVRpHgEIAKrgLd7gIQ8lO+LCYqne6WLQ
 Ky8rOUnaxX4gD5N0akhfJFr/m/yIyAfk9+ALZZUo3kfuFiEsT2rn32iK/2Gj8pcu
 HFoAWhS+7b/ZsfpHRPtv/zVD3q4c3nWsWpfWK09J+4t0UJuC8fmGMoBzkS0kjZtd
 dQnHlJi5+1u4ch2x9sYYeVx7GOJ8a1W0q7cWJnWdOffWLEP9/zB8fgRVLFp/7AAd
 uBlza93RU81wS7q5tSUph6ESPqt2yu357e//4jnWjVx5EUXDRBL3A/T1JpC1qYSn
 WV2Gv14x+LVz2G8WgGmwfMq1H9Dvd/OzNToX5R8SIRx6Rh5L6gxFQjqt4dclGj8=
 =nKap
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost cross endian support from Michael Tsirkin:
 "I have just queued some more bugfix patches today but none fix
  regressions and none are related to these ones, so it looks like a
  good time for a merge for -rc1.

  The motivation for this is support for legacy BE guests on the new LE
  hosts.  There are two redeeming properties that made me merge this:

   - It's a trivial amount of code: since we wrap host/guest accesses
     anyway, almost all of it is well hidden from drivers.

   - Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY,
     and when it's clear, there's zero overhead (as some point it was
     tested by compiling with and without the patches, got the same
     stripped binary).

  Maybe we could create a Kconfig symbol to enforce the second point:
  prevent people from enabling it eg on x86.  I will look into this"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-pci: alloc only resources actually used.
  macvtap/tun: cross-endian support for little-endian hosts
  vhost: cross-endian support for legacy devices
  virtio: add explicit big-endian support to memory accessors
  vhost: introduce vhost_is_little_endian() helper
  vringh: introduce vringh_is_little_endian() helper
  macvtap: introduce macvtap_is_little_endian() helper
  tun: add tun_is_little_endian() helper
  virtio: introduce virtio_is_little_endian() helper
2015-07-03 16:02:25 -07:00
Igor Mammedov bcfeacab45 vhost: use binary search instead of linear in find_region()
For default region layouts performance stays the same
as linear search i.e. it takes around 210ns average for
translate_desc() that inlines find_region().

But it scales better with larger amount of regions,
235ns BS vs 300ns LS with 55 memory regions
and it will be about the same values when allowed number
of slots is increased to 509 like it has been done in kvm.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-01 10:12:12 +02:00
Linus Torvalds e0456717e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 & des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
2015-06-24 16:49:49 -07:00
Linus Torvalds acd53127c4 SCSI misc on 20150622
This is the usual grab bag of driver updates (lpfc, hpsa,
 megaraid_sas, cxgbi, be2iscsi) plus an assortment of minor updates.
 There are also one new driver: the Cisco snic; the advansys driver has
 been rewritten to get rid of the warning about converting it to the
 DMA API, the tape statistics patch got in and finally, there's a
 resuffle of SCSI header files to separate more cleanly initiator from
 target mode (and better share the common definitions).
 
 Signed-off-by: James Bottomley <JBottomley@Odin.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJViKWdAAoJEDeqqVYsXL0MAr8IAMmlA6HBVjMJJFCEOY9corHj
 e70MNQa7LUgf+JCdOtzGcvHXTiFFd4IHZAwXUJAnsC4IU2QWEfi1bjUTErlqBIGk
 LoZlXXpEHnFpmWot3OluOzzcGcxede8rVgPiKWVVdojIngBC2+LL/i2vPCJ84ri9
 WCVlk6KBvWZXuU6JuOKAb2FO9HOX7Q61wuKAMast2Qc6RNc2ksgc7VbstsITqzZ9
 FVEsjmQ5lqUj+xdxBpiUOdUpc22IJ4VcpBgQ2HrThvg6vf4aq937RJ/g4vi/g0SU
 Utk0a3bUw1H/WnYAfJVFx83nVEsS/954Z7/ERDg1sjlfLYwQtQnpov0XIbPIbZU=
 =k9IT
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is the usual grab bag of driver updates (lpfc, hpsa,
  megaraid_sas, cxgbi, be2iscsi) plus an assortment of minor updates.

  There is also one new driver: the Cisco snic.  The advansys driver has
  been rewritten to get rid of the warning about converting it to the
  DMA API, the tape statistics patch got in and finally, there's a
  resuffle of SCSI header files to separate more cleanly initiator from
  target mode (and better share the common definitions)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (156 commits)
  snic: driver for Cisco SCSI HBA
  qla2xxx: Fix indentation
  qla2xxx: Comment out unreachable code
  fusion: remove dead MTRR code
  advansys: fix compilation errors and warnings when CONFIG_PCI is not set
  mptsas: fix depth param in scsi_track_queue_full
  megaraid: fix irq setup process regression
  lpfc: Update version to 10.7.0.0 for upstream patch set.
  lpfc: Fix to drop PLOGIs from fabric node till LOGO processing completes
  lpfc: Fix scsi task management error message.
  lpfc: Fix cq_id masking problem.
  lpfc: Fix scsi prep dma buf error.
  lpfc: Add support for using block multi-queue
  lpfc: Devices are not discovered during takeaway/giveback testing
  lpfc: Fix vport deletion failure.
  lpfc: Check for active portpeerbeacon.
  lpfc: Update driver version for upstream patch set 10.6.0.1.
  lpfc: Change buffer pool empty message to miscellaneous category
  lpfc: Fix incorrect log message reported for empty FCF record.
  lpfc: Fix rport leak.
  ...
2015-06-23 15:55:44 -07:00
Nicholas Bellinger bc0c94b140 target: Drop unnecessary core_tpg_register TFO parameter
This patch drops unnecessary target_core_fabric_ops parameter usage
for core_tpg_register() during fabric driver TFO->fabric_make_tpg()
se_portal_group creation callback execution.

Instead, use the existing se_wwn->wwn_tf->tf_ops pointer to ensure
fabric driver is really using the same TFO provided at module_init
time.

Also go ahead and drop the forward TFO declarations tree-wide, and
handling the special case for iscsi-target discovery TPG.

Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-06-15 23:23:22 -07:00
Bart Van Assche ba92999252 target: Minimize SCSI header #include directives
Only include SCSI initiator header files in target code that needs
these header files, namely the SCSI pass-through code and the tcm_loop
driver. Change SCSI_SENSE_BUFFERSIZE into TRANSPORT_SENSE_BUFFER in
target code because the former is intended for initiator code and the
latter for target code. With this patch the only initiator include
directives in target code that remain are as follows:

$ git grep -nHE 'include .scsi/(scsi.h|scsi_host.h|scsi_device.h|scsi_cmnd.h)' drivers/target drivers/infiniband/ulp/{isert,srpt} drivers/usb/gadget/legacy/tcm_*.[ch] drivers/{vhost,xen} include/{target,trace/events/target.h}
drivers/target/loopback/tcm_loop.c:29:#include <scsi/scsi.h>
drivers/target/loopback/tcm_loop.c:31:#include <scsi/scsi_host.h>
drivers/target/loopback/tcm_loop.c:32:#include <scsi/scsi_device.h>
drivers/target/loopback/tcm_loop.c:33:#include <scsi/scsi_cmnd.h>
drivers/target/target_core_pscsi.c:39:#include <scsi/scsi_device.h>
drivers/target/target_core_pscsi.c:40:#include <scsi/scsi_host.h>
drivers/xen/xen-scsiback.c:52:#include <scsi/scsi_host.h> /* SG_ALL */

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
2015-06-02 08:03:25 -07:00
David S. Miller dda922c831 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/phy/amd-xgbe-phy.c
	drivers/net/wireless/iwlwifi/Kconfig
	include/net/mac80211.h

iwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.

The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 22:51:30 -07:00
Greg Kurz 2751c9882b vhost: cross-endian support for legacy devices
This patch brings cross-endian support to vhost when used to implement
legacy virtio devices. Since it is a relatively rare situation, the
feature availability is controlled by a kernel config option (not set
by default).

The vq->is_le boolean field is added to cache the endianness to be
used for ring accesses. It defaults to native endian, as expected
by legacy virtio devices. When the ring gets active, we force little
endian if the device is modern. When the ring is deactivated, we
revert to the native endian default.

If cross-endian was compiled in, a vq->user_be boolean field is added
so that userspace may request a specific endianness. This field is
used to override the default when activating the ring of a legacy
device. It has no effect on modern devices.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2015-06-01 15:48:55 +02:00
Greg Kurz 7d82410950 virtio: add explicit big-endian support to memory accessors
The current memory accessors logic is:
- little endian if little_endian
- native endian (i.e. no byteswap) if !little_endian

If we want to fully support cross-endian vhost, we also need to be
able to convert to big endian.

Instead of changing the little_endian argument to some 3-value enum, this
patch changes the logic to:
- little endian if little_endian
- big endian if !little_endian

The native endian case is handled by all users with a trivial helper. This
patch doesn't change any functionality, nor it does add overhead.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2015-06-01 15:48:54 +02:00
Greg Kurz ab27c07f60 vhost: introduce vhost_is_little_endian() helper
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2015-06-01 15:48:53 +02:00
Christoph Hellwig 7ad34a9367 target: target_core_configfs.h is not needed in fabric drivers
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:39 -07:00
Bart Van Assche 649ee05499 target: Move task tag into struct se_cmd + support 64-bit tags
Simplify target core and target drivers by storing the task tag
a.k.a. command identifier inside struct se_cmd.

For several transports (e.g. SRP) tags are 64 bits wide.
Hence add support for 64-bit tags.

(Fix core_tmr_abort_task conversion spec warnings - nab)
(Fix up usb-gadget to use 16-bit tags - HCH + bart)

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:31 -07:00
Christoph Hellwig 2650d71e24 target: move transport ID handling to the core
Now that struct se_portal_group contains a protocol identifier field we can
take all the code to format an parse protocol identifiers in CDBs into common
code instead of leaving this to low-level drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:30 -07:00
Christoph Hellwig 2aeeafae6b target: remove the get_fabric_proto_ident method
Now that we store the protocol identifier in the tpg structure we don't
need this method.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:30 -07:00
Christoph Hellwig e4aae5af81 target: change core_tpg_register prototype
Remove the unneeded fabric_ptr argument, and change the type argument
to pass in a SPC protocol identifier.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:27 -07:00
Christoph Hellwig 144bc4c2a4 target: move node ACL allocation to core code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:23 -07:00
Christoph Hellwig c7d6a80392 target: refactor init/drop_nodeacl methods
By always allocating and adding, respectively removing and freeing
the se_node_acl structure in core code we can remove tons of repeated
code in the init_nodeacl and drop_nodeacl routines.  Additionally
this now respects the get_default_queue_depth method in this code
path as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:51 -07:00
Christoph Hellwig e1750d20e6 target: make the tpg_get_default_depth method optional
All fabric drivers except for iSCSI always return 1, so implement
that as default behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:50 -07:00
Christoph Hellwig 55570113a9 vhost/scsi: remove struct vhost_scsi_nacl
Except for the embedded struct se_node_acl none of the fields were
ever used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:50 -07:00
Bart Van Assche afc16604c0 target: Remove first argument of target_{get,put}_sess_cmd()
The first argument of these two functions is always identical
to se_cmd->se_sess. Hence remove the first argument.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:47 -07:00
Christoph Hellwig d588cf8f61 target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem
There is just one configfs subsystem in the target code, so we might as
well add two helpers to reference / unreference it from the core code
instead of passing pointers to it around.

This fixes a regression introduced for v4.1-rc1 with commit 9ac8928e6,
where configfs_depend_item() callers using se_tpg_tfo->tf_subsys would
fail, because the assignment from the original target_core_subsystem[]
is no longer happening at target_register_template() time.

(Fix target_core_exit_configfs pointer dereference - Sagi)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 18:04:20 -07:00
David S. Miller 5538d294dd treewide: Add missing vmalloc.h inclusion.
All of these files were only building on non-x86 because of
the indirect of inclusion of vmalloc.h by, of all things,
"net/inet_hashtables.h"

None of this got caught during build testing, because on x86
there is an implicit vmalloc.h include via on of the arch asm/
headers.

This fixes all of these

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-28 14:31:59 -07:00
Linus Torvalds c6668726d2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Lots of activity in target land the last months.

  The highlights include:

   - Convert fabric drivers tree-wide to target_register_template() (hch
     + bart)

   - iser-target hardening fixes + v1.0 improvements (sagi)

   - Convert iscsi_thread_set usage to kthread.h + kill
     iscsi_target_tq.c (sagi + nab)

   - Add support for T10-PI WRITE_STRIP + READ_INSERT operation (mkp +
     sagi + nab)

   - DIF fixes for CONFIG_DEBUG_SG=y + UNMAP file emulation (akinobu +
     sagi + mkp)

   - Extended TCMU ABI v2 for future BIDI + DIF support (andy + ilias)

   - Fix COMPARE_AND_WRITE handling for NO_ALLLOC drivers (hch + nab)

  Thanks to everyone who contributed this round with new features,
  bug-reports, fixes, cleanups and improvements.

  Looking forward, it's currently shaping up to be a busy v4.2 as well"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (69 commits)
  target: Put TCMU under a new config option
  target: Version 2 of TCMU ABI
  target: fix tcm_mod_builder.py
  target/file: Fix UNMAP with DIF protection support
  target/file: Fix SG table for prot_buf initialization
  target/file: Fix BUG() when CONFIG_DEBUG_SG=y and DIF protection enabled
  target: Make core_tmr_abort_task() skip TMFs
  target/sbc: Update sbc_dif_generate pr_debug output
  target/sbc: Make internal DIF emulation honor ->prot_checks
  target/sbc: Return INVALID_CDB_FIELD if DIF + sess_prot_type disabled
  target: Ensure sess_prot_type is saved across session restart
  target/rd: Don't pass incomplete scatterlist entries to sbc_dif_verify_*
  target: Remove the unused flag SCF_ACK_KREF
  target: Fix two sparse warnings
  target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling
  target: simplify the target template registration API
  target: simplify target_xcopy_init_pt_lun
  target: remove the unused SCF_CMD_XCOPY_PASSTHROUGH flag
  target/rd: reduce code duplication in rd_execute_rw()
  tcm_loop: fixup tpgt string to integer conversion
  ...
2015-04-24 10:22:09 -07:00
Christoph Hellwig 9ac8928e6a target: simplify the target template registration API
Instead of calling target_fabric_configfs_init() +
target_fabric_configfs_register() / target_fabric_configfs_deregister()
target_fabric_configfs_free() from every target driver, rewrite the API
so that we have simple register/unregister functions that operate on
a const operations vector.

This patch also fixes a memory leak in several target drivers. Several
target drivers namely called target_fabric_configfs_deregister()
without calling target_fabric_configfs_free().

A large part of this patch is based on earlier changes from
Bart Van Assche <bart.vanassche@sandisk.com>.

(v2: Add a new TF_CIT_SETUP_DRV macro so that the core configfs code
can declare attributes as either core only or for drivers)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-14 12:28:41 -07:00
Al Viro 01e97e6517 new helper: msg_data_left()
convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:53:35 -04:00
Nicholas Bellinger b1d75fe53e vhost/scsi: Add fabric_prot_type attribute support
This patch updates vhost-scsi to add a new fabric_prot_type TPG
attribute, used for controlling LLD level protection into LIO when
the backend device does not support T10-PI.

This is required for vhost-scsi to enable WRITE_STRIP + READ_INSERT
operations using software emulation + crct10dif instruction offload.

It's disabled by default and controls which se_sesion->sess_prot_type
are set at vhost_scsi_make_nexus() session registration time.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:46 -07:00
David S. Miller d5c1d8c567 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/netfilter/nf_tables_core.c

The nf_tables_core.c conflict was resolved using a conflict resolution
from Stephen Rothwell as a guide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:22:43 -04:00
Linus Torvalds e477f3e013 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "Here are current target-pending fixes for v4.0-rc5 code that have made
  their way into the queue over the last weeks.

  The fixes this round include:

   - Fix long-standing iser-target logout bug related to early
     conn_logout_comp completion, resulting in iscsi_conn use-after-tree
     OOpsen.  (Sagi + nab)

   - Fix long-standing tcm_fc bug in ft_invl_hw_context() failure
     handing for DDP hw offload.  (DanC)

   - Fix incorrect use of unprotected __transport_register_session() in
     tcm_qla2xxx + other single local se_node_acl fabrics.  (Bart)

   - Fix reference leak in target_submit_cmd() -> target_get_sess_cmd()
     for ack_kref=1 failure path.  (Bart)

   - Fix pSCSI backend ->get_device_type() statistics OOPs with
     un-configured device.  (Olaf + nab)

   - Fix virtual LUN=0 target_configure_device failure OOPs at modprobe
     time.  (Claudio + nab)

   - Fix FUA write false positive failure regression in v4.0-rc1 code.
     (Christophe Vu-Brugier + HCH)"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0
  target: Fix virtual LUN=0 target_configure_device failure OOPs
  target/pscsi: Fix NULL pointer dereference in get_device_type
  tcm_fc: missing curly braces in ft_invl_hw_context()
  target: Fix reference leak in target_get_sess_cmd() error path
  loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session
  tcm_qla2xxx: Fix incorrect use of __transport_register_session
  iscsi-target: Avoid early conn_logout_comp for iser connections
  Revert "iscsi-target: Avoid IN_LOGOUT failure case for iser-target"
  target: Disallow changing of WRITE cache/FUA attrs after export
2015-03-21 11:24:38 -07:00
Bart Van Assche 2f450cc1fb loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session
This patch changes loopback, usb-gadget, vhost-scsi and xen-scsiback
fabric code to invoke transport_register_session() instead of the
unprotected flavour, to ensure se_tpg->session_lock is taken when
adding new session list nodes to se_tpg->tpg_sess_list.

Note that since these four fabric drivers already hold their own
internal TPG mutexes when accessing se_tpg->tpg_sess_list, and
consist of a single se_session created through configfs attribute
access, no list corruption can currently occur.

So for correctness sake, go ahead and use the se_tpg->session_lock
protected version for these four fabric drivers.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-03-19 23:15:14 -07:00
David S. Miller 71a83a6db6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/rocker/rocker.c

The rocker commit was two overlapping changes, one to rename
the ->vport member to ->pport, and another making the bitmask
expression use '1ULL' instead of plain '1'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-03 21:16:48 -05:00
Linus Torvalds 789d7f60cd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) If an IPVS tunnel is created with a mixed-family destination
    address, it cannot be removed.  Fix from Alexey Andriyanov.

 2) Fix module refcount underflow in netfilter's nft_compat, from Pablo
    Neira Ayuso.

 3) Generic statistics infrastructure can reference variables sitting on
    a released function stack, therefore use dynamic allocation always.
    Fix from Ignacy Gawędzki.

 4) skb_copy_bits() return value test is inverted in ip_check_defrag().

 5) Fix network namespace exit in openvswitch, we have to release all of
    the per-net vports.  From Pravin B Shelar.

 6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter.

 7) Fix rhashtable grow/shrink behavior, only expand during inserts and
    shrink during deletes.  From Daniel Borkmann.

 8) Netdevice names with semicolons should never be allowed, because
    they serve as a separator.  From Matthew Thode.

 9) Use {,__}set_current_state() where appropriate, from Fabian
    Frederick.

10) Revert byte queue limits support in r8169 driver, it's causing
    regressions we can't figure out.

11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to
    measure packets in flight, properly use tcp_packets_in_flight()
    instead.  From Neal Cardwell.

12) Fix accidental removal of support for bluetooth in CSR based Intel
    wireless cards.  From Marcel Holtmann.

13) We accidently added a behavioral change between native and compat
    tasks, wrt testing the MSG_CMSG_COMPAT bit.  Just ignore it if the
    user happened to set it in a native binary as that was always the
    behavior we had.  From Catalin Marinas.

14) Check genlmsg_unicast() return valud in hwsim netlink tx frame
    handling, from Bob Copeland.

15) Fix stale ->radar_required setting in mac80211 that can prevent
    starting new scans, from Eliad Peller.

16) Fix memory leak in nl80211 monitor, from Johannes Berg.

17) Fix race in TX index handling in xen-netback, from David Vrabel.

18) Don't enable interrupts in amx-xgbe driver until all software et al.
    state is ready for the interrupt handler to run.  From Thomas
    Lendacky.

19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric
    W Biederman.

20) The amount of header space needed in macvtap was not calculated
    properly, fix it otherwise we splat past the beginning of the
    packet.  From Eric Dumazet.

21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin.

22) Don't raw initialize or mod timers, use setup_timer() and
    mod_timer() instead.  From Vaishali Thakkar.

23) Fix software maintained statistics in bcmgenet and systemport
    drivers, from Florian Fainelli.

24) DMA descriptor updates in sh_eth need proper memory barriers, from
    Ben Hutchings.

25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal
    Kubecek.

26) Openvswitch's non-masked set actions aren't constructed properly
    into netlink messages, fix from Joe Stringer.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  openvswitch: Fix serialization of non-masked set actions.
  gianfar: Reduce logging noise seen due to phy polling if link is down
  ibmveth: Add function to enable live MAC address changes
  net: bridge: add compile-time assert for cb struct size
  udp: only allow UFO for packets from SOCK_DGRAM sockets
  sh_eth: Really fix padding of short frames on TX
  Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
  sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
  sh_eth: Ensure proper ordering of descriptor active bit write/read
  net/mlx4_en: Disbale GRO for incoming loopback/selftest packets
  net/mlx4_core: Fix wrong mask and error flow for the update-qp command
  net: systemport: fix software maintained statistics
  net: bcmgenet: fix software maintained statistics
  rxrpc: don't multiply with HZ twice
  rxrpc: terminate retrans loop when sending of skb fails
  net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface.
  net: pasemi: Use setup_timer and mod_timer
  net: stmmac: Use setup_timer and mod_timer
  net: 8390: axnet_cs: Use setup_timer and mod_timer
  net: 8390: pcnet_cs: Use setup_timer and mod_timer
  ...
2015-03-03 15:30:07 -08:00
Ying Xue 1b78414047 net: Remove iocb argument from sendmsg and recvmsg
After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig <hch@lst.de>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 13:06:31 -05:00
Michael S. Tsirkin 0d79a493e5 vhost: drop hard-coded num_buffers size
The 2 that we use for copy_to_iter comes from sizeof(u16),
it used to be that way before the iov iter update.
Fix it up, making it obvious the size of stack access
is right.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Michael S. Tsirkin 4c5a84421c vhost: cleanup iterator update logic
Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes-by iov_iter_advance.

Rearrange code to make this obvious.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Linus Torvalds e20d3ef540 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Update vhost-scsi to support F_ANY_LAYOUT using mm/iov_iter.c
     logic, and signal VERSION_1 support (MST + Viro + nab)

   - Fix iscsi/iser-target to remove problematic active_ts_set usage
     (Gavin Guo)

   - Update iscsi/iser-target to support multi-sequence sendtargets
     (Sagi)

   - Fix original PR_APTPL_BUF_LEN 8k size limitation (Martin Svec)

   - Add missing WRITE_SAME end-of-device sanity check (Bart)

   - Check for LBA + sectors wrap-around in sbc_parse_cdb() (nab)

   - Other various minor SPC/SBC compliance fixes based upon Ronnie
     Sahlberg test suite (nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (32 commits)
  target: Set LBPWS10 bit in Logical Block Provisioning EVPD
  target: Fail UNMAP when emulate_tpu=0
  target: Fail WRITE_SAME w/ UNMAP=1 when emulate_tpws=0
  target: Add sanity checks for DPO/FUA bit usage
  target: Perform PROTECT sanity checks for WRITE_SAME
  target: Fail I/O with PROTECT bit when protection is unsupported
  target: Check for LBA + sectors wrap-around in sbc_parse_cdb
  target: Add missing WRITE_SAME end-of-device sanity check
  iscsi-target: Avoid IN_LOGOUT failure case for iser-target
  target: Fix PR_APTPL_BUF_LEN buffer size limitation
  iscsi-target: Drop problematic active_ts_list usage
  iscsi/iser-target: Support multi-sequence sendtargets text response
  iser-target: Remove duplicate function names
  vhost/scsi: potential memory corruption
  vhost/scsi: Global tcm_vhost -> vhost_scsi rename
  vhost/scsi: Drop left-over scsi_tcq.h include
  vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
  vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq
  vhost/scsi: Add ANY_LAYOUT iov -> sgl mapping prerequisites
  vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
  ...
2015-02-21 13:21:19 -08:00
Jason Wang 0960b6417e vhost_net: fix wrong iter offset when setting number of buffers
In commit ba7438aed9 ("vhost: don't bother copying iovecs in
handle_rx(), kill memcpy_toiovecend()"), we advance iov iter fixup
sizeof(struct virtio_net_hdr) bytes and fill the number of buffers
after doing the socket recvmsg(). This work well but was broken after
commit 6e03f896b5 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") which tries
to advance sizeof(struct virtio_net_hdr_mrg_rxbuf). It will fill the
number of buffers at the wrong place. This patch fixes this.

Fixes 6e03f896b5
("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-15 08:17:15 -08:00
Dan Carpenter 59c816c1f2 vhost/scsi: potential memory corruption
This code in vhost_scsi_make_tpg() is confusing because we limit "tpgt"
to UINT_MAX but the data type of "tpg->tport_tpgt" and that is a u16.

I looked at the context and it turns out that in
vhost_scsi_set_endpoint(), "tpg->tport_tpgt" is used as an offset into
the vs_tpg[] array which has VHOST_SCSI_MAX_TARGET (256) elements so
anything higher than 255 then it is invalid.  I have made that the limit
now.

In vhost_scsi_send_evt() we mask away values higher than 255, but now
that the limit has changed, we don't need the mask.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-05 22:44:12 -08:00
David S. Miller 6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Michael S. Tsirkin 5201aa49b0 vhost/net: fix up num_buffers endian-ness
In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian
host, num_buffers wasn't byte-swapped correctly, so large incoming
packets got corrupted.

To fix, fill it in within hdr - this also makes sure it gets
the correct type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:59:31 -08:00
Nicholas Bellinger 1a1ff8256a vhost/scsi: Global tcm_vhost -> vhost_scsi rename
There is a large amount of code that still references the original
'tcm_vhost' naming conventions, instead of modern 'vhost_scsi'.

Go ahead and do a global rename to make the usage consistent.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:39 -08:00
Nicholas Bellinger f575c61233 vhost/scsi: Drop left-over scsi_tcq.h include
With the recent removal of MSG_*_TAG defines in commit 68d81f40,
vhost-scsi is now using TCM_*_TAG and doesn't depend upon host
side scsi_tcq.h definitions anymore.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:38 -08:00
Nicholas Bellinger 664ed90e62 vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
Signal support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
required for virtio-scsi 1.0 spec layout requirements.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:38 -08:00
Nicholas Bellinger 09b13fa8c1 vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq
This patch adds ANY_LAYOUT compatible support within the existing
vhost_scsi_handle_vq() ->handle_kick() callback.

It calculates data_direction + exp_data_len for the new tcm_vhost_cmd
descriptor by walking both outgoing + incoming iovecs using iov_iter,
assuming the layout of outgoing request header + T10_PI + Data payload
comes first.

It also uses copy_from_iter() to copy leading virtio-scsi request header
that may or may not include SCSI CDB, that returns a re-calculated iovec
to start of T10_PI or Data SGL memory.

Also, go ahead and drop the legacy pre virtio v1.0 !ANY_LAYOUT logic.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:37 -08:00
Nicholas Bellinger e8de56b5e7 vhost/scsi: Add ANY_LAYOUT iov -> sgl mapping prerequisites
This patch adds ANY_LAYOUT prerequisites logic for accepting a set of
protection + data payloads via iov_iter.  Also includes helpers for
calcuating SGLs + invoking vhost_scsi_map_to_sgl() with a known number
of iovecs.

Required by ANY_LAYOUT processing when struct iovec may be offset into
the first outgoing virtio-scsi request header.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:36 -08:00
Nicholas Bellinger b4078b5fac vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
This patch changes vhost_scsi_map_to_sgl() parameters to accept virtio
iovec ptr + len when determing pages_nr.

This is currently done with iov_num_pages() -> PAGE_ALIGN, so allow
the same parameters as well.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:36 -08:00
Nicholas Bellinger de1419e420 vhost/scsi: Fix incorrect early vhost_scsi_handle_vq failures
This patch fixes vhost_scsi_handle_vq() failure cases that result in BUG_ON()
getting triggered when vhost_scsi_free_cmd() is called, and ->tvc_se_cmd has
not been initialized by target_submit_cmd_map_sgls().

It changes tcm_vhost_release_cmd() to use tcm_vhost_cmd->tvc_nexus for obtaining
se_session pointer reference.  Also, avoid calling put_page() on NULL sg->page
entries in vhost_scsi_map_to_sgl() failure path.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:35 -08:00
Nicholas Bellinger 79c14141a4 vhost/scsi: Convert completion path to use copy_to_iter
Required for ANY_LAYOUT support when the incoming virtio-scsi response
header + fixed size sense buffer payload may span more than a single
iovec entry.

This changes existing code to save cmd->tvc_resp_iov instead of the
first single iovec base pointer from &vq->iov[out].

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:35 -08:00
Al Viro 57dd8a0735 vhost: vhost_scsi_handle_vq() should just use copy_from_user()
it has just verified that it asks no more than the length of the
first segment of iovec.

And with that the last user of stuff in lib/iovec.c is gone.
RIP.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro ba7438aed9 vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro 98a527aac1 vhost: don't bother with copying iovec in handle_tx()
just advance the msg.msg_iter and be done with that.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro aad9a1cec7 vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
David S. Miller 3f3558bb51 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netfront.c

Minor overlapping changes in xen-netfront.c, mostly to do
with some buffer management changes alongside the split
of stats into TX and RX.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 00:53:17 -05:00
Jiri Pirko df8a39defa net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Linus Torvalds fb43bd08af Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
 "Mostly minor fixes this time, including:

   - Add missing virtio-scsi -> TCM attribute conversion in vhost-scsi.
   - Fix persistent reservations write exclusive handling to allow
     readers for all registered I_T nexuses.
   - Drop arbitrary maximum I/O size limit in order to process I/Os
     larger than 4 MB, required for initiators that don't honor block
     limits EVPD.
   - Drop the now left-over fabric_max_sectors attribute"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix typos in enum cmd_flags_table
  MAINTAINERS: Add entry for iSER target driver
  target: Allow Write Exclusive non-reservation holders to READ
  target: Drop left-over fabric_max_sectors attribute
  target: Drop arbitrary maximum I/O size limit
  Documentation/target: Update fabric_ops to latest code
  vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
2015-01-13 15:23:26 +13:00
Michael S. Tsirkin 99975cc6ad vhost/net: length miscalculation
commit 8b38694a2d
    vhost/net: virtio 1.0 byte swap
had this chunk:
-       heads[headcount - 1].len += datalen;
+       heads[headcount - 1].len = cpu_to_vhost32(vq, len - datalen);

This adds datalen with the wrong sign, causing guest panics.

Fixes: 8b38694a2d
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-07 12:22:00 +02:00
Nicholas Bellinger 4624386080 vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
While looking at hch's recent conversion to drop the MSG_*_TAG
definitions, I noticed a long standing bug in vhost-scsi where
the VIRTIO_SCSI_S_* attribute definitions where incorrectly
being passed directly into target_submit_cmd_map_sgls().

This patch adds the missing virtio-scsi to TCM/SAM task attribute
conversion.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-01-06 13:46:29 -08:00
Michael S. Tsirkin 5d9a07b0de vhost: relax used address alignment
virtio 1.0 only requires used address to be 4 byte aligned,
vhost required 8 bytes (size of vring_used_elem).
Fix up vhost to match that.

Additionally, while vhost correctly requires 8 byte
alignment for log, it's unconnected to used ring:
it's a consequence that log has u64 entries.
Tweak code to make that clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-29 10:55:06 +02:00
Linus Torvalds 64ec45bff6 vhost/virtio: virtio 1.0 related fixes
Most importantly, this fixes using virtio_pci as a module.
 
 Further, the big virtio 1.0 conversion missed a couple of places. This fixes
 them up.
 
 This isn't 100% sparse-clean yet because on many architectures get_user
 triggers sparse warnings when used with __bitwise tag (when same tag is on both
 pointer and value read).
 
 I posted a patchset to fix it up by adding __force on all
 arches that don't already have it (many do), when that's
 merged these warnings will go away.
 
 Cc: Rusty Russell <rusty@rustcorp.com.au>
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUkSztAAoJECgfDbjSjVRp5DkH/ibw+0ZaEFP/SXWnw6WONpaG
 pzMsrfMG/vxlOfutSUdDqG+oqqU2fSLvFq5qDK6Xk9/emRSwGduz29ZaxGh8J1MZ
 /Ojqtu/HSLl+UASTs+fMw49itghoIjmAPBwwMkQjvanfqLclgdz9UxzoCOc4YkO0
 PJQA/Vw6blVSP1m0p97PvzZkAiIetI2ixZn2vPJZc8vSkOHtygM9HdXKTv785HbG
 ycRbR9B3OBMvq26FIuWeyuY93FnyX2Qtf2bzwSSRdzo7qlsNhVVG7sKyWOOR+1xG
 TLmjhyTF57oA2GgZCVfgnFxsfiuIKMumfG0jbABTXmBGgA/ULef7HcF/lzLgdq8=
 =32cU
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael S Tsirkin:
 "virtio 1.0 related fixes

  Most importantly, this fixes using virtio_pci as a module.

  Further, the big virtio 1.0 conversion missed a couple of places.
  This fixes them up.

  This isn't 100% sparse-clean yet because on many architectures
  get_user triggers sparse warnings when used with __bitwise tag (when
  same tag is on both pointer and value read).

  I posted a patchset to fix it up by adding __force on all arches that
  don't already have it (many do), when that's merged these warnings
  will go away"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_pci: restore module attributes
  mic/host: fix up virtio 1.0 APIs
  vringh: update for virtio 1.0 APIs
  vringh: 64 bit features
  tools/virtio: add virtio 1.0 in vringh_test
  tools/virtio: add virtio 1.0 in virtio_test
  tools/virtio: enable -Werror
  tools/virtio: 64 bit features
  tools/virtio: fix vringh test
  tools/virtio: more stubs
  virtio: core support for config generation
  virtio_pci: add VIRTIO_PCI_NO_LEGACY
  virtio_pci: move probe to common file
  virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY
  virtio_config: fix virtio_cread_bytes
  virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore
2014-12-18 20:50:30 -08:00
Michael S. Tsirkin b9f7ac8c72 vringh: update for virtio 1.0 APIs
When switching everything over to virtio 1.0 memory access APIs,
I missed converting vringh.
Fortunately, it's straight-forward.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:28 +02:00
Michael S. Tsirkin b97a8a9006 vringh: 64 bit features
Pass u64 everywhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:23 +02:00
Linus Torvalds 70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Al Viro c0371da604 put iov_iter into msghdr
Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:29:03 -05:00