Commit Graph

370 Commits

Author SHA1 Message Date
Bart Van Assche 649ee05499 target: Move task tag into struct se_cmd + support 64-bit tags
Simplify target core and target drivers by storing the task tag
a.k.a. command identifier inside struct se_cmd.

For several transports (e.g. SRP) tags are 64 bits wide.
Hence add support for 64-bit tags.

(Fix core_tmr_abort_task conversion spec warnings - nab)
(Fix up usb-gadget to use 16-bit tags - HCH + bart)

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:31 -07:00
Christoph Hellwig 2650d71e24 target: move transport ID handling to the core
Now that struct se_portal_group contains a protocol identifier field we can
take all the code to format an parse protocol identifiers in CDBs into common
code instead of leaving this to low-level drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:30 -07:00
Christoph Hellwig 2aeeafae6b target: remove the get_fabric_proto_ident method
Now that we store the protocol identifier in the tpg structure we don't
need this method.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:30 -07:00
Christoph Hellwig e4aae5af81 target: change core_tpg_register prototype
Remove the unneeded fabric_ptr argument, and change the type argument
to pass in a SPC protocol identifier.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:27 -07:00
Christoph Hellwig 144bc4c2a4 target: move node ACL allocation to core code
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:42:23 -07:00
Christoph Hellwig c7d6a80392 target: refactor init/drop_nodeacl methods
By always allocating and adding, respectively removing and freeing
the se_node_acl structure in core code we can remove tons of repeated
code in the init_nodeacl and drop_nodeacl routines.  Additionally
this now respects the get_default_queue_depth method in this code
path as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:51 -07:00
Christoph Hellwig e1750d20e6 target: make the tpg_get_default_depth method optional
All fabric drivers except for iSCSI always return 1, so implement
that as default behavior.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:50 -07:00
Christoph Hellwig 55570113a9 vhost/scsi: remove struct vhost_scsi_nacl
Except for the embedded struct se_node_acl none of the fields were
ever used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:50 -07:00
Bart Van Assche afc16604c0 target: Remove first argument of target_{get,put}_sess_cmd()
The first argument of these two functions is always identical
to se_cmd->se_sess. Hence remove the first argument.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andy Grover <agrover@redhat.com>
Cc: <qla2xxx-upstream@qlogic.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 22:41:47 -07:00
Christoph Hellwig d588cf8f61 target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem
There is just one configfs subsystem in the target code, so we might as
well add two helpers to reference / unreference it from the core code
instead of passing pointers to it around.

This fixes a regression introduced for v4.1-rc1 with commit 9ac8928e6,
where configfs_depend_item() callers using se_tpg_tfo->tf_subsys would
fail, because the assignment from the original target_core_subsystem[]
is no longer happening at target_register_template() time.

(Fix target_core_exit_configfs pointer dereference - Sagi)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-05-30 18:04:20 -07:00
David S. Miller 5538d294dd treewide: Add missing vmalloc.h inclusion.
All of these files were only building on non-x86 because of
the indirect of inclusion of vmalloc.h by, of all things,
"net/inet_hashtables.h"

None of this got caught during build testing, because on x86
there is an implicit vmalloc.h include via on of the arch asm/
headers.

This fixes all of these

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-28 14:31:59 -07:00
Linus Torvalds c6668726d2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Lots of activity in target land the last months.

  The highlights include:

   - Convert fabric drivers tree-wide to target_register_template() (hch
     + bart)

   - iser-target hardening fixes + v1.0 improvements (sagi)

   - Convert iscsi_thread_set usage to kthread.h + kill
     iscsi_target_tq.c (sagi + nab)

   - Add support for T10-PI WRITE_STRIP + READ_INSERT operation (mkp +
     sagi + nab)

   - DIF fixes for CONFIG_DEBUG_SG=y + UNMAP file emulation (akinobu +
     sagi + mkp)

   - Extended TCMU ABI v2 for future BIDI + DIF support (andy + ilias)

   - Fix COMPARE_AND_WRITE handling for NO_ALLLOC drivers (hch + nab)

  Thanks to everyone who contributed this round with new features,
  bug-reports, fixes, cleanups and improvements.

  Looking forward, it's currently shaping up to be a busy v4.2 as well"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (69 commits)
  target: Put TCMU under a new config option
  target: Version 2 of TCMU ABI
  target: fix tcm_mod_builder.py
  target/file: Fix UNMAP with DIF protection support
  target/file: Fix SG table for prot_buf initialization
  target/file: Fix BUG() when CONFIG_DEBUG_SG=y and DIF protection enabled
  target: Make core_tmr_abort_task() skip TMFs
  target/sbc: Update sbc_dif_generate pr_debug output
  target/sbc: Make internal DIF emulation honor ->prot_checks
  target/sbc: Return INVALID_CDB_FIELD if DIF + sess_prot_type disabled
  target: Ensure sess_prot_type is saved across session restart
  target/rd: Don't pass incomplete scatterlist entries to sbc_dif_verify_*
  target: Remove the unused flag SCF_ACK_KREF
  target: Fix two sparse warnings
  target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling
  target: simplify the target template registration API
  target: simplify target_xcopy_init_pt_lun
  target: remove the unused SCF_CMD_XCOPY_PASSTHROUGH flag
  target/rd: reduce code duplication in rd_execute_rw()
  tcm_loop: fixup tpgt string to integer conversion
  ...
2015-04-24 10:22:09 -07:00
Christoph Hellwig 9ac8928e6a target: simplify the target template registration API
Instead of calling target_fabric_configfs_init() +
target_fabric_configfs_register() / target_fabric_configfs_deregister()
target_fabric_configfs_free() from every target driver, rewrite the API
so that we have simple register/unregister functions that operate on
a const operations vector.

This patch also fixes a memory leak in several target drivers. Several
target drivers namely called target_fabric_configfs_deregister()
without calling target_fabric_configfs_free().

A large part of this patch is based on earlier changes from
Bart Van Assche <bart.vanassche@sandisk.com>.

(v2: Add a new TF_CIT_SETUP_DRV macro so that the core configfs code
can declare attributes as either core only or for drivers)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-14 12:28:41 -07:00
Al Viro 01e97e6517 new helper: msg_data_left()
convert open-coded instances

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-11 15:53:35 -04:00
Nicholas Bellinger b1d75fe53e vhost/scsi: Add fabric_prot_type attribute support
This patch updates vhost-scsi to add a new fabric_prot_type TPG
attribute, used for controlling LLD level protection into LIO when
the backend device does not support T10-PI.

This is required for vhost-scsi to enable WRITE_STRIP + READ_INSERT
operations using software emulation + crct10dif instruction offload.

It's disabled by default and controls which se_sesion->sess_prot_type
are set at vhost_scsi_make_nexus() session registration time.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-04-07 23:27:46 -07:00
David S. Miller d5c1d8c567 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/netfilter/nf_tables_core.c

The nf_tables_core.c conflict was resolved using a conflict resolution
from Stephen Rothwell as a guide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23 22:22:43 -04:00
Linus Torvalds e477f3e013 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "Here are current target-pending fixes for v4.0-rc5 code that have made
  their way into the queue over the last weeks.

  The fixes this round include:

   - Fix long-standing iser-target logout bug related to early
     conn_logout_comp completion, resulting in iscsi_conn use-after-tree
     OOpsen.  (Sagi + nab)

   - Fix long-standing tcm_fc bug in ft_invl_hw_context() failure
     handing for DDP hw offload.  (DanC)

   - Fix incorrect use of unprotected __transport_register_session() in
     tcm_qla2xxx + other single local se_node_acl fabrics.  (Bart)

   - Fix reference leak in target_submit_cmd() -> target_get_sess_cmd()
     for ack_kref=1 failure path.  (Bart)

   - Fix pSCSI backend ->get_device_type() statistics OOPs with
     un-configured device.  (Olaf + nab)

   - Fix virtual LUN=0 target_configure_device failure OOPs at modprobe
     time.  (Claudio + nab)

   - Fix FUA write false positive failure regression in v4.0-rc1 code.
     (Christophe Vu-Brugier + HCH)"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0
  target: Fix virtual LUN=0 target_configure_device failure OOPs
  target/pscsi: Fix NULL pointer dereference in get_device_type
  tcm_fc: missing curly braces in ft_invl_hw_context()
  target: Fix reference leak in target_get_sess_cmd() error path
  loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session
  tcm_qla2xxx: Fix incorrect use of __transport_register_session
  iscsi-target: Avoid early conn_logout_comp for iser connections
  Revert "iscsi-target: Avoid IN_LOGOUT failure case for iser-target"
  target: Disallow changing of WRITE cache/FUA attrs after export
2015-03-21 11:24:38 -07:00
Bart Van Assche 2f450cc1fb loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session
This patch changes loopback, usb-gadget, vhost-scsi and xen-scsiback
fabric code to invoke transport_register_session() instead of the
unprotected flavour, to ensure se_tpg->session_lock is taken when
adding new session list nodes to se_tpg->tpg_sess_list.

Note that since these four fabric drivers already hold their own
internal TPG mutexes when accessing se_tpg->tpg_sess_list, and
consist of a single se_session created through configfs attribute
access, no list corruption can currently occur.

So for correctness sake, go ahead and use the se_tpg->session_lock
protected version for these four fabric drivers.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-03-19 23:15:14 -07:00
David S. Miller 71a83a6db6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/rocker/rocker.c

The rocker commit was two overlapping changes, one to rename
the ->vport member to ->pport, and another making the bitmask
expression use '1ULL' instead of plain '1'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-03 21:16:48 -05:00
Linus Torvalds 789d7f60cd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) If an IPVS tunnel is created with a mixed-family destination
    address, it cannot be removed.  Fix from Alexey Andriyanov.

 2) Fix module refcount underflow in netfilter's nft_compat, from Pablo
    Neira Ayuso.

 3) Generic statistics infrastructure can reference variables sitting on
    a released function stack, therefore use dynamic allocation always.
    Fix from Ignacy Gawędzki.

 4) skb_copy_bits() return value test is inverted in ip_check_defrag().

 5) Fix network namespace exit in openvswitch, we have to release all of
    the per-net vports.  From Pravin B Shelar.

 6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter.

 7) Fix rhashtable grow/shrink behavior, only expand during inserts and
    shrink during deletes.  From Daniel Borkmann.

 8) Netdevice names with semicolons should never be allowed, because
    they serve as a separator.  From Matthew Thode.

 9) Use {,__}set_current_state() where appropriate, from Fabian
    Frederick.

10) Revert byte queue limits support in r8169 driver, it's causing
    regressions we can't figure out.

11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to
    measure packets in flight, properly use tcp_packets_in_flight()
    instead.  From Neal Cardwell.

12) Fix accidental removal of support for bluetooth in CSR based Intel
    wireless cards.  From Marcel Holtmann.

13) We accidently added a behavioral change between native and compat
    tasks, wrt testing the MSG_CMSG_COMPAT bit.  Just ignore it if the
    user happened to set it in a native binary as that was always the
    behavior we had.  From Catalin Marinas.

14) Check genlmsg_unicast() return valud in hwsim netlink tx frame
    handling, from Bob Copeland.

15) Fix stale ->radar_required setting in mac80211 that can prevent
    starting new scans, from Eliad Peller.

16) Fix memory leak in nl80211 monitor, from Johannes Berg.

17) Fix race in TX index handling in xen-netback, from David Vrabel.

18) Don't enable interrupts in amx-xgbe driver until all software et al.
    state is ready for the interrupt handler to run.  From Thomas
    Lendacky.

19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric
    W Biederman.

20) The amount of header space needed in macvtap was not calculated
    properly, fix it otherwise we splat past the beginning of the
    packet.  From Eric Dumazet.

21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin.

22) Don't raw initialize or mod timers, use setup_timer() and
    mod_timer() instead.  From Vaishali Thakkar.

23) Fix software maintained statistics in bcmgenet and systemport
    drivers, from Florian Fainelli.

24) DMA descriptor updates in sh_eth need proper memory barriers, from
    Ben Hutchings.

25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal
    Kubecek.

26) Openvswitch's non-masked set actions aren't constructed properly
    into netlink messages, fix from Joe Stringer.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  openvswitch: Fix serialization of non-masked set actions.
  gianfar: Reduce logging noise seen due to phy polling if link is down
  ibmveth: Add function to enable live MAC address changes
  net: bridge: add compile-time assert for cb struct size
  udp: only allow UFO for packets from SOCK_DGRAM sockets
  sh_eth: Really fix padding of short frames on TX
  Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
  sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
  sh_eth: Ensure proper ordering of descriptor active bit write/read
  net/mlx4_en: Disbale GRO for incoming loopback/selftest packets
  net/mlx4_core: Fix wrong mask and error flow for the update-qp command
  net: systemport: fix software maintained statistics
  net: bcmgenet: fix software maintained statistics
  rxrpc: don't multiply with HZ twice
  rxrpc: terminate retrans loop when sending of skb fails
  net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface.
  net: pasemi: Use setup_timer and mod_timer
  net: stmmac: Use setup_timer and mod_timer
  net: 8390: axnet_cs: Use setup_timer and mod_timer
  net: 8390: pcnet_cs: Use setup_timer and mod_timer
  ...
2015-03-03 15:30:07 -08:00
Ying Xue 1b78414047 net: Remove iocb argument from sendmsg and recvmsg
After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig <hch@lst.de>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02 13:06:31 -05:00
Michael S. Tsirkin 0d79a493e5 vhost: drop hard-coded num_buffers size
The 2 that we use for copy_to_iter comes from sizeof(u16),
it used to be that way before the iov iter update.
Fix it up, making it obvious the size of stack access
is right.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Michael S. Tsirkin 4c5a84421c vhost: cleanup iterator update logic
Recent iterator-related changes in vhost made it
harder to follow the logic fixing up the header.
In fact, the fixup always happens at the same
offset: sizeof(virtio_net_hdr): sometimes the
fixup iterator is updated by copy_to_iter,
sometimes-by iov_iter_advance.

Rearrange code to make this obvious.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-27 15:53:44 -05:00
Linus Torvalds e20d3ef540 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Update vhost-scsi to support F_ANY_LAYOUT using mm/iov_iter.c
     logic, and signal VERSION_1 support (MST + Viro + nab)

   - Fix iscsi/iser-target to remove problematic active_ts_set usage
     (Gavin Guo)

   - Update iscsi/iser-target to support multi-sequence sendtargets
     (Sagi)

   - Fix original PR_APTPL_BUF_LEN 8k size limitation (Martin Svec)

   - Add missing WRITE_SAME end-of-device sanity check (Bart)

   - Check for LBA + sectors wrap-around in sbc_parse_cdb() (nab)

   - Other various minor SPC/SBC compliance fixes based upon Ronnie
     Sahlberg test suite (nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (32 commits)
  target: Set LBPWS10 bit in Logical Block Provisioning EVPD
  target: Fail UNMAP when emulate_tpu=0
  target: Fail WRITE_SAME w/ UNMAP=1 when emulate_tpws=0
  target: Add sanity checks for DPO/FUA bit usage
  target: Perform PROTECT sanity checks for WRITE_SAME
  target: Fail I/O with PROTECT bit when protection is unsupported
  target: Check for LBA + sectors wrap-around in sbc_parse_cdb
  target: Add missing WRITE_SAME end-of-device sanity check
  iscsi-target: Avoid IN_LOGOUT failure case for iser-target
  target: Fix PR_APTPL_BUF_LEN buffer size limitation
  iscsi-target: Drop problematic active_ts_list usage
  iscsi/iser-target: Support multi-sequence sendtargets text response
  iser-target: Remove duplicate function names
  vhost/scsi: potential memory corruption
  vhost/scsi: Global tcm_vhost -> vhost_scsi rename
  vhost/scsi: Drop left-over scsi_tcq.h include
  vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
  vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq
  vhost/scsi: Add ANY_LAYOUT iov -> sgl mapping prerequisites
  vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
  ...
2015-02-21 13:21:19 -08:00
Jason Wang 0960b6417e vhost_net: fix wrong iter offset when setting number of buffers
In commit ba7438aed9 ("vhost: don't bother copying iovecs in
handle_rx(), kill memcpy_toiovecend()"), we advance iov iter fixup
sizeof(struct virtio_net_hdr) bytes and fill the number of buffers
after doing the socket recvmsg(). This work well but was broken after
commit 6e03f896b5 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") which tries
to advance sizeof(struct virtio_net_hdr_mrg_rxbuf). It will fill the
number of buffers at the wrong place. This patch fixes this.

Fixes 6e03f896b5
("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-15 08:17:15 -08:00
Dan Carpenter 59c816c1f2 vhost/scsi: potential memory corruption
This code in vhost_scsi_make_tpg() is confusing because we limit "tpgt"
to UINT_MAX but the data type of "tpg->tport_tpgt" and that is a u16.

I looked at the context and it turns out that in
vhost_scsi_set_endpoint(), "tpg->tport_tpgt" is used as an offset into
the vs_tpg[] array which has VHOST_SCSI_MAX_TARGET (256) elements so
anything higher than 255 then it is invalid.  I have made that the limit
now.

In vhost_scsi_send_evt() we mask away values higher than 255, but now
that the limit has changed, we don't need the mask.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-05 22:44:12 -08:00
David S. Miller 6e03f896b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/vxlan.c
	drivers/vhost/net.c
	include/linux/if_vlan.h
	net/core/dev.c

The net/core/dev.c conflict was the overlap of one commit marking an
existing function static whilst another was adding a new function.

In the include/linux/if_vlan.h case, the type used for a local
variable was changed in 'net', whereas the function got rewritten
to fix a stacked vlan bug in 'net-next'.

In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next'
overlapped with an endainness fix for VHOST 1.0 in 'net'.

In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter
in 'net-next' whereas in 'net' there was a bug fix to pass in the
correct network namespace pointer in calls to this function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05 14:33:28 -08:00
Michael S. Tsirkin 5201aa49b0 vhost/net: fix up num_buffers endian-ness
In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian
host, num_buffers wasn't byte-swapped correctly, so large incoming
packets got corrupted.

To fix, fill it in within hdr - this also makes sure it gets
the correct type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-04 13:59:31 -08:00
Nicholas Bellinger 1a1ff8256a vhost/scsi: Global tcm_vhost -> vhost_scsi rename
There is a large amount of code that still references the original
'tcm_vhost' naming conventions, instead of modern 'vhost_scsi'.

Go ahead and do a global rename to make the usage consistent.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:39 -08:00
Nicholas Bellinger f575c61233 vhost/scsi: Drop left-over scsi_tcq.h include
With the recent removal of MSG_*_TAG defines in commit 68d81f40,
vhost-scsi is now using TCM_*_TAG and doesn't depend upon host
side scsi_tcq.h definitions anymore.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:38 -08:00
Nicholas Bellinger 664ed90e62 vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
Signal support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits
required for virtio-scsi 1.0 spec layout requirements.

Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:38 -08:00
Nicholas Bellinger 09b13fa8c1 vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq
This patch adds ANY_LAYOUT compatible support within the existing
vhost_scsi_handle_vq() ->handle_kick() callback.

It calculates data_direction + exp_data_len for the new tcm_vhost_cmd
descriptor by walking both outgoing + incoming iovecs using iov_iter,
assuming the layout of outgoing request header + T10_PI + Data payload
comes first.

It also uses copy_from_iter() to copy leading virtio-scsi request header
that may or may not include SCSI CDB, that returns a re-calculated iovec
to start of T10_PI or Data SGL memory.

Also, go ahead and drop the legacy pre virtio v1.0 !ANY_LAYOUT logic.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:37 -08:00
Nicholas Bellinger e8de56b5e7 vhost/scsi: Add ANY_LAYOUT iov -> sgl mapping prerequisites
This patch adds ANY_LAYOUT prerequisites logic for accepting a set of
protection + data payloads via iov_iter.  Also includes helpers for
calcuating SGLs + invoking vhost_scsi_map_to_sgl() with a known number
of iovecs.

Required by ANY_LAYOUT processing when struct iovec may be offset into
the first outgoing virtio-scsi request header.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:36 -08:00
Nicholas Bellinger b4078b5fac vhost/scsi: Change vhost_scsi_map_to_sgl to accept iov ptr + len
This patch changes vhost_scsi_map_to_sgl() parameters to accept virtio
iovec ptr + len when determing pages_nr.

This is currently done with iov_num_pages() -> PAGE_ALIGN, so allow
the same parameters as well.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:36 -08:00
Nicholas Bellinger de1419e420 vhost/scsi: Fix incorrect early vhost_scsi_handle_vq failures
This patch fixes vhost_scsi_handle_vq() failure cases that result in BUG_ON()
getting triggered when vhost_scsi_free_cmd() is called, and ->tvc_se_cmd has
not been initialized by target_submit_cmd_map_sgls().

It changes tcm_vhost_release_cmd() to use tcm_vhost_cmd->tvc_nexus for obtaining
se_session pointer reference.  Also, avoid calling put_page() on NULL sg->page
entries in vhost_scsi_map_to_sgl() failure path.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:35 -08:00
Nicholas Bellinger 79c14141a4 vhost/scsi: Convert completion path to use copy_to_iter
Required for ANY_LAYOUT support when the incoming virtio-scsi response
header + fixed size sense buffer payload may span more than a single
iovec entry.

This changes existing code to save cmd->tvc_resp_iov instead of the
first single iovec base pointer from &vq->iov[out].

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-02-04 10:55:35 -08:00
Al Viro 57dd8a0735 vhost: vhost_scsi_handle_vq() should just use copy_from_user()
it has just verified that it asks no more than the length of the
first segment of iovec.

And with that the last user of stuff in lib/iovec.c is gone.
RIP.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro ba7438aed9 vhost: don't bother copying iovecs in handle_rx(), kill memcpy_toiovecend()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro 98a527aac1 vhost: don't bother with copying iovec in handle_tx()
just advance the msg.msg_iter and be done with that.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:16 -05:00
Al Viro aad9a1cec7 vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
David S. Miller 3f3558bb51 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/xen-netfront.c

Minor overlapping changes in xen-netfront.c, mostly to do
with some buffer management changes alongside the split
of stats into TX and RX.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15 00:53:17 -05:00
Jiri Pirko df8a39defa net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13 17:51:08 -05:00
Linus Torvalds fb43bd08af Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
 "Mostly minor fixes this time, including:

   - Add missing virtio-scsi -> TCM attribute conversion in vhost-scsi.
   - Fix persistent reservations write exclusive handling to allow
     readers for all registered I_T nexuses.
   - Drop arbitrary maximum I/O size limit in order to process I/Os
     larger than 4 MB, required for initiators that don't honor block
     limits EVPD.
   - Drop the now left-over fabric_max_sectors attribute"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix typos in enum cmd_flags_table
  MAINTAINERS: Add entry for iSER target driver
  target: Allow Write Exclusive non-reservation holders to READ
  target: Drop left-over fabric_max_sectors attribute
  target: Drop arbitrary maximum I/O size limit
  Documentation/target: Update fabric_ops to latest code
  vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
2015-01-13 15:23:26 +13:00
Michael S. Tsirkin 99975cc6ad vhost/net: length miscalculation
commit 8b38694a2d
    vhost/net: virtio 1.0 byte swap
had this chunk:
-       heads[headcount - 1].len += datalen;
+       heads[headcount - 1].len = cpu_to_vhost32(vq, len - datalen);

This adds datalen with the wrong sign, causing guest panics.

Fixes: 8b38694a2d
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-01-07 12:22:00 +02:00
Nicholas Bellinger 4624386080 vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
While looking at hch's recent conversion to drop the MSG_*_TAG
definitions, I noticed a long standing bug in vhost-scsi where
the VIRTIO_SCSI_S_* attribute definitions where incorrectly
being passed directly into target_submit_cmd_map_sgls().

This patch adds the missing virtio-scsi to TCM/SAM task attribute
conversion.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2015-01-06 13:46:29 -08:00
Michael S. Tsirkin 5d9a07b0de vhost: relax used address alignment
virtio 1.0 only requires used address to be 4 byte aligned,
vhost required 8 bytes (size of vring_used_elem).
Fix up vhost to match that.

Additionally, while vhost correctly requires 8 byte
alignment for log, it's unconnected to used ring:
it's a consequence that log has u64 entries.
Tweak code to make that clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-29 10:55:06 +02:00
Linus Torvalds 64ec45bff6 vhost/virtio: virtio 1.0 related fixes
Most importantly, this fixes using virtio_pci as a module.
 
 Further, the big virtio 1.0 conversion missed a couple of places. This fixes
 them up.
 
 This isn't 100% sparse-clean yet because on many architectures get_user
 triggers sparse warnings when used with __bitwise tag (when same tag is on both
 pointer and value read).
 
 I posted a patchset to fix it up by adding __force on all
 arches that don't already have it (many do), when that's
 merged these warnings will go away.
 
 Cc: Rusty Russell <rusty@rustcorp.com.au>
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJUkSztAAoJECgfDbjSjVRp5DkH/ibw+0ZaEFP/SXWnw6WONpaG
 pzMsrfMG/vxlOfutSUdDqG+oqqU2fSLvFq5qDK6Xk9/emRSwGduz29ZaxGh8J1MZ
 /Ojqtu/HSLl+UASTs+fMw49itghoIjmAPBwwMkQjvanfqLclgdz9UxzoCOc4YkO0
 PJQA/Vw6blVSP1m0p97PvzZkAiIetI2ixZn2vPJZc8vSkOHtygM9HdXKTv785HbG
 ycRbR9B3OBMvq26FIuWeyuY93FnyX2Qtf2bzwSSRdzo7qlsNhVVG7sKyWOOR+1xG
 TLmjhyTF57oA2GgZCVfgnFxsfiuIKMumfG0jbABTXmBGgA/ULef7HcF/lzLgdq8=
 =32cU
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael S Tsirkin:
 "virtio 1.0 related fixes

  Most importantly, this fixes using virtio_pci as a module.

  Further, the big virtio 1.0 conversion missed a couple of places.
  This fixes them up.

  This isn't 100% sparse-clean yet because on many architectures
  get_user triggers sparse warnings when used with __bitwise tag (when
  same tag is on both pointer and value read).

  I posted a patchset to fix it up by adding __force on all arches that
  don't already have it (many do), when that's merged these warnings
  will go away"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_pci: restore module attributes
  mic/host: fix up virtio 1.0 APIs
  vringh: update for virtio 1.0 APIs
  vringh: 64 bit features
  tools/virtio: add virtio 1.0 in vringh_test
  tools/virtio: add virtio 1.0 in virtio_test
  tools/virtio: enable -Werror
  tools/virtio: 64 bit features
  tools/virtio: fix vringh test
  tools/virtio: more stubs
  virtio: core support for config generation
  virtio_pci: add VIRTIO_PCI_NO_LEGACY
  virtio_pci: move probe to common file
  virtio_pci_common.h: drop VIRTIO_PCI_NO_LEGACY
  virtio_config: fix virtio_cread_bytes
  virtio: set VIRTIO_CONFIG_S_FEATURES_OK on restore
2014-12-18 20:50:30 -08:00
Michael S. Tsirkin b9f7ac8c72 vringh: update for virtio 1.0 APIs
When switching everything over to virtio 1.0 memory access APIs,
I missed converting vringh.
Fortunately, it's straight-forward.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:28 +02:00
Michael S. Tsirkin b97a8a9006 vringh: 64 bit features
Pass u64 everywhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-15 23:49:23 +02:00
Linus Torvalds 70e71ca0af Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers.  Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets.  From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet.  This tries to resolve the conflicting goals between the
    desired handling of bulk vs.  RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU.  From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
2014-12-11 14:27:06 -08:00
Al Viro c0371da604 put iov_iter into msghdr
Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-12-09 16:29:03 -05:00
Jason Wang 2e73c716ba vhost: remove unnecessary forward declarations in vhost.h
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:06:33 +02:00
Michael S. Tsirkin e72fd72e26 vhost/scsi: partial virtio 1.0 support
Include all endian conversions as required by virtio 1.0.
Don't set virtio 1.0 yet, since that requires ANY_LAYOUT
which we don't yet support.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-09 12:06:31 +02:00
Michael S. Tsirkin 41e3e42108 vhost/net: enable virtio 1.0
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:30 +02:00
Michael S. Tsirkin e4fca7d6ff vhost/net: larger header for virtio 1.0
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:30 +02:00
Michael S. Tsirkin 8b38694a2d vhost/net: virtio 1.0 byte swap
I had to add an explicit tag to suppress compiler warning:
gcc isn't smart enough to notice that
len is always initialized since function is called with size > 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin 3b1bbe8935 vhost: virtio 1.0 endian-ness support
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin 64f7f0510c vhost: switch to __get/__put_user exclusively
Most places in vhost can use __get/__put_user rather than
get/put_user since addresses are pre-validated.
This should be good for performance, but this also
will help make code sparse-clean: get/put_user macros
don't play well with __virtioXX bitwise tags.
Switch to get/put_user to __ variants everywhere in vhost.
There's one exception - for consistency switch that
as well, and add an explicit access_ok check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin bf99573496 vhost/net: force len for TX to host endian
vhost/net keeps a copy of the used ring in host memory but (ab)uses
the length field for internal house-keeping. This works because the
length in the used ring for tx is always 0. In order to suppress sparse
warnings, we force native endianness here.
Note that these values are never exposed to guests.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin e05fd12b38 vhost: add memory access wrappers
Add guest memory access wrappers to handle virtio endianness
conversions.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin bd82752a83 vhost: make features 64 bit
We need to use bit 32 for virtio 1.0.
Make vhost_has_feature bool to avoid discarding high bits.

Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2014-12-09 12:05:28 +02:00
Nicholas Bellinger ab8edab132 vhost-scsi: Take configfs group dependency during VHOST_SCSI_SET_ENDPOINT
This patch addresses a bug where individual vhost-scsi configfs endpoint
groups can be removed from below while active exports to QEMU userspace
still exist, resulting in an OOPs.

It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
an explicit dependency on se_tpg->tpg_group in order to prevent individual
vhost-scsi WWPN endpoints from being released via normal configfs methods
while an QEMU ioctl reference still exists.

Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
to release the dependency, once QEMU's reference to the individual group
at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.

(Fix up vhost_scsi_clear_endpoint() error path - DanC)

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-10-28 13:54:16 -07:00
Michael S. Tsirkin 6840444155 vhost-scsi: don't open-code kvfree
Now that we have kvfree, use it in vhost-scsi instead of
the open-coded version.

Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 09:22:48 +03:00
Romain Francoise d04257b07f vhost-net: don't open-code kvfree
Commit 23cc5a991c ("vhost-net: extend device allocation to vmalloc")
added another open-coded version of kvfree (which is available since
v3.15-rc5), nuke it.

Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-23 09:22:48 +03:00
Linus Torvalds ed9ea4ed3a Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

   - Add support for T10 PI pass-through between vhost-scsi +
     virtio-scsi (MST + Paolo + MKP + nab)
   - Add support for T10 PI in qla2xxx target mode (Quinn + MKP + hch +
     nab, merged through scsi.git)
   - Add support for percpu-ida pre-allocation in qla2xxx target code
     (Quinn + nab)
   - A number of iser-target fixes related to hardening the network
     portal shutdown path (Sagi + Slava)
   - Fix response length residual handling for a number of control CDBs
     (Roland + Christophe V.)
   - Various iscsi RFC conformance fixes in the CHAP authentication path
     (Tejas and Calsoft folks + nab)
   - Return TASK_SET_FULL status for tcm_fc(FCoE) DataIn + Response
     failures (Vasu + Jun + nab)
   - Fix long-standing ABORT_TASK + session reset hang (nab)
   - Convert iser-initiator + iser-target to include T10 bytes into EDTL
     (Sagi + Or + MKP + Mike Christie)
   - Fix NULL pointer dereference regression related to XCOPY introduced
     in v3.15 + CC'ed to v3.12.y (nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (34 commits)
  target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmd
  vhost-scsi: Include prot_bytes into expected data transfer length
  TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire
  libiscsi, iser: Adjust data_length to include protection information
  scsi_cmnd: Introduce scsi_transfer_length helper
  target: Report correct response length for some commands
  target/sbc: Check that the LBA and number of blocks are correct in VERIFY
  target/sbc: Remove sbc_check_valid_sectors()
  Target/iscsi: Fix sendtargets response pdu for iser transport
  Target/iser: Fix a wrong dereference in case discovery session is over iser
  iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
  target: Use complete_all for se_cmd->t_transport_stop_comp
  target: Set CMD_T_ACTIVE bit for Task Management Requests
  target: cleanup some boolean tests
  target/spc: Simplify INQUIRY EVPD=0x80
  tcm_fc: Generate TASK_SET_FULL status for response failures
  tcm_fc: Generate TASK_SET_FULL status for DataIN failures
  iscsi-target: Reject mutual authentication with reflected CHAP_C
  iscsi-target: Remove no-op from iscsit_tpg_del_portal_group
  iscsi-target: Fix CHAP_A parameter list handling
  ...
2014-06-12 22:38:32 -07:00
Linus Torvalds 3c81bdd9e7 vhost: infrastructure changes for 3.16
This reworks vhost core dropping unnecessary RCU uses in favor of VQ mutexes
 which are used on fast path anyway.  This fixes worst-case latency for users
 which change the memory mappings a lot.
 Memory allocation for vhost-net now supports fallback on vmalloc (same as for
 vhost-scsi) this makes it possible to create the device on systems where memory
 is very fragmented, with slightly lower performance.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTmFpNAAoJECgfDbjSjVRpD5oH/225t/P6h6kZWxzeZP+P+/Rj
 cEonwdwYRg+OOLWPnmJptK1zeWZPQhVoNxxREn3S8Zx3BN7QiBidegrMErM2x+uQ
 ZDcF0WC1mNc6rHPQ5n4N0ZItVrG8KQz5r2EYe0eKwMoy16C4rhZBY1Zj16VePMDK
 A00AqPDiUH1M7XPUQabXdRqNlxLXFoQYbrkAY+bgBeSl2qAfEbdMLW4ty7vqLOyT
 W5ZyJSBuv4BYmYh1KhmLW0WZBCleSFKkoGjHy5EZZ8fqZ63hpbvpqswl6PJeT4qW
 sq1yxyt8CBTi5lJwW3hIiiJwzDNsIc4zDhbyEfkQ3uHtOJ/SBzFptEWHun1hbRI=
 =DYAy
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull vhost infrastructure updates from Michael S. Tsirkin:
 "This reworks vhost core dropping unnecessary RCU uses in favor of VQ
  mutexes which are used on fast path anyway.  This fixes worst-case
  latency for users which change the memory mappings a lot.  Memory
  allocation for vhost-net now supports fallback on vmalloc (same as for
  vhost-scsi) this makes it possible to create the device on systems
  where memory is very fragmented, with slightly lower performance"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost: move memory pointer to VQs
  vhost: move acked_features to VQs
  vhost: replace rcu with mutex
  vhost-net: extend device allocation to vmalloc
2014-06-11 17:08:16 -07:00
Nicholas Bellinger 9f977ef7b6 vhost-scsi: Include prot_bytes into expected data transfer length
This patch updates vhost_scsi_get_tag() to accept the combined
expected data transfer length + T10 PI bytes as the value passed
into target_submit_cmd().

This is required now that target-core logic in commit 14ef9200
expects to subtract se_cmd->prot_length from se_cmd->data_length.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-11 13:06:50 -07:00
Michael S. Tsirkin 47283bef7e vhost: move memory pointer to VQs
commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
replaced rcu sync for memory accesses with VQ mutex locl/unlock.
This is correct since all accesses are under VQ mutex, but incomplete:
we still do useless rcu lock/unlock operations, someone might copy this
code into some other context where this won't be right.
This use of RCU is also non standard and hard to understand.
Let's copy the pointer to each VQ structure, this way
the access rules become straight-forward, and there's
no need for RCU anymore.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:07 +03:00
Michael S. Tsirkin ea16c51433 vhost: move acked_features to VQs
Refactor code to make sure features are only accessed
under VQ mutex. This makes everything simpler, no need
for RCU here anymore.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Michael S. Tsirkin 98f9ca0a3f vhost: replace rcu with mutex
All memory accesses are done under some VQ mutex.
So lock/unlock all VQs is a faster equivalent of synchronize_rcu()
for memory access changes.
Some guests cause a lot of these changes, so it's helpful
to make them faster.

Reported-by: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Michael S. Tsirkin 23cc5a991c vhost-net: extend device allocation to vmalloc
Michael Mueller provided a patch to reduce the size of
vhost-net structure as some allocations could fail under
memory pressure/fragmentation. We are still left with
high order allocations though.

This patch is handling the problem at the core level, allowing
vhost structures to use vmalloc() if kmalloc() failed.

As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

People are still looking at cleaner ways to handle the problem
at the API level, probably passing in multiple iovecs.
This hack seems consistent with approaches
taken since then by drivers/vhost/scsi.c and net/core/dev.c

Based on patch by Romain Francoise.

Cc: Michael Mueller <mimu@linux.vnet.ibm.com>
Signed-off-by: Romain Francoise <romain@orebokech.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:05 +03:00
Nicholas Bellinger 95e7c4341b vhost/scsi: Enable T10 PI IOV -> SGL memory mapping
This patch updates vhost_scsi_handle_vq() to check for the existance
of virtio_scsi_cmd_req_pi comparing vq->iov[0].iov_len in order to
calculate seperate data + protection SGLs from data_num.

Also update tcm_vhost_submission_work() to pass the pre-allocated
cmd->tvc_prot_sgl[] memory into target_submit_cmd_map_sgls(), and
update vhost_scsi_get_tag() parameters to accept scsi_tag, lun, and
task_attr.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:14 -07:00
Nicholas Bellinger e31885dd90 vhost/scsi: Add T10 PI IOV -> SGL memory mapping logic
This patch adds vhost_scsi_map_iov_to_prot() to perform the mapping of
T10 data integrity memory between virtio iov + struct scatterlist using
get_user_pages_fast() following existing code.

As with vhost_scsi_map_iov_to_sgl(), this does sanity checks against the
total prot_sgl_count vs. pre-allocated SGLs, and loops across protection
iovs using vhost_scsi_map_to_sgl() to perform the actual memory mapping.

Also update tcm_vhost_release_cmd() to release associated tvc_prot_sgl[]
struct page.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: H. Peter Anvin <hpa@zytor.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:07 -07:00
Nicholas Bellinger b1935f687b vhost/scsi: Add preallocation of protection SGLs
This patch updates tcm_vhost_make_nexus() to pre-allocate per descriptor
tcm_vhost_cmd->tvc_prot_sgl[] used to expose protection SGLs from within
virtio-scsi guest memory to vhost-scsi.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:42:01 -07:00
Nicholas Bellinger 5a01d08217 vhost/scsi: Move sanity check into vhost_scsi_map_iov_to_sgl
Move the overflow check for sgl_count > TCM_VHOST_PREALLOC_SGLS into
vhost_scsi_map_iov_to_sgl() so that it's based on the total number
of SGLs for all IOVs, instead of single IOVs.

Also, rename TCM_VHOST_PREALLOC_PAGES -> TCM_VHOST_PREALLOC_UPAGES
to better describe pointers to user-space pages.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-02 12:41:54 -07:00
Peter Zijlstra 4e857c58ef arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Linus Torvalds 141eaccd01 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the target pending updates for v3.15-rc1.  Apologies in
  advance for waiting until the second to last day of the merge window
  to send these out.

  The highlights this round include:

   - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
   - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
   - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
   - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
   - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
   - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
   - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

  Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
  metadata into KVM guest have been left-out for now, as there where a
  few comments from MST + Paolo that where not able to be addressed in
  time for v3.15.  Please expect this feature for v3.16-rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
  ib_srpt: Use correct ib_sg_dma primitives
  target/tcm_fc: Rename ft_tport_create to ft_tport_get
  target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
  target/tcm_fc: Rename structs and list members for clarity
  target/tcm_fc: Limit to 1 TPG per wwn
  target/tcm_fc: Don't export ft_lport_list
  target/tcm_fc: Fix use-after-free of ft_tpg
  target: Add check to prevent Abort Task from aborting itself
  target: Enable READ_STRIP emulation in target_complete_ok_work
  target/sbc: Add sbc_dif_read_strip software emulation
  target: Enable WRITE_INSERT emulation in target_execute_cmd
  target/sbc: Add sbc_dif_generate software emulation
  target/sbc: Only expose PI read_cap16 bits when supported by fabric
  target/spc: Only expose PI mode page bits when supported by fabric
  target/spc: Only expose PI inquiry bits when supported by fabric
  target: Pass in transport supported PI at session initialization
  target/iblock: Fix double bioset_integrity_free bug
  Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
  target/rd: T10-Dif: RAM disk is allocating more space than required.
  iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
  ...
2014-04-12 16:51:08 -07:00
Nicholas Bellinger e70beee783 target: Pass in transport supported PI at session initialization
In order to support local WRITE_INSERT + READ_STRIP operations for
non PI enabled fabrics, the fabric driver needs to be able signal
what protection offload operations are supported.

This is done at session initialization time so the modes can be
signaled by individual se_wwn + se_portal_group endpoints, as well
as optionally across different transports on the same endpoint.

For iser-target, set TARGET_PROT_ALL if the underlying ib_device
has already signaled PI offload support, and allow this to be
exposed via a new iscsit_transport->iscsit_get_sup_prot_ops()
callback.

For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
operation.

For all other drivers, set TARGET_PROT_NORMAL to disable fabric
level PI.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-04-07 01:48:54 -07:00
Nicholas Bellinger 131e6abc67 target: Add TFO->abort_task for aborted task resources release
Now that TASK_ABORTED status is not generated for all cases by
TMR ABORT_TASK + LUN_RESET, a new TFO->abort_task() caller is
necessary in order to give fabric drivers a chance to unmap
hardware / software resources before the se_cmd descriptor is
released via the normal TFO->release_cmd() codepath.

This patch adds TFO->aborted_task() in core_tmr_abort_task()
in place of the original transport_send_task_abort(), and
also updates all fabric drivers to implement this caller.

The fabric drivers that include changes to perform cleanup
via ->aborted_task() are:

  - iscsi-target
  - iser-target
  - srpt
  - tcm_qla2xxx

The fabric drivers that currently set ->aborted_task() to
NOPs are:

  - loopback
  - tcm_fc
  - usb-gadget
  - sbp-target
  - vhost-scsi

For the latter five, there appears to be no additional cleanup
required before invoking TFO->release_cmd() to release the
se_cmd descriptor.

v2 changes:
  - Move ->aborted_task() call into transport_cmd_finish_abort (Alex)

Cc: Alex Leung <amleung21@yahoo.com>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Vu Pham <vu@mellanox.com>
Cc: Chris Boot <bootc@bootc.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Giridhar Malavali <giridhar.malavali@qlogic.com>
Cc: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: Quinn Tran <quinn.tran@qlogic.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-04-07 01:48:51 -07:00
Al Viro 09aaacf02a vhost: don't open-code sockfd_put()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-01 23:19:10 -04:00
Michael S. Tsirkin a39ee449f9 vhost: validate vhost_get_vq_desc return value
vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.

The code in question was introduced in commit
8dd014adfe
    vhost-net: mergeable buffers support

CVE-2014-0055

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:10:35 -04:00
Michael S. Tsirkin d8316f3991 vhost: fix total length when packets are too short
When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.

This was intentional in order for make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.

Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.

Fix this up by detecting this overrun and doing packet drop
immediately.

CVE-2014-0077

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:07:31 -04:00
Linus Torvalds 702256e604 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the series are bugfixes for qla2xxx target NPIV support
  that went in for v3.14-rc1.  Also included are a few DIF related
  fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
  protocol version related fix from Venkatesh.

  Also just a heads up that a series to address a number of issues with
  iser-target active I/O reset/shutdown is still being tested, and will
  be included in a separate -rc6 PULL request"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
  qla2xxx: Fix kernel panic on selective retransmission request
  Target/sbc: Don't use sg as iterator in sbc_verify_read
  target: Add DIF sense codes in transport_generic_request_failure
  target/sbc: Fix sbc_dif_copy_prot addr offset bug
  tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
  tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
  tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
  qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
  qla2xxx: Fix qlt_lport_register base_vha callback race
2014-03-01 21:33:09 -06:00
Venkatesh Srinivas 7fe412d07d vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
The virtio spec requires byte 0 of the virtio-scsi LUN structure
to be '1'.

Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-02-24 16:19:43 -08:00
Michael S. Tsirkin b0c057ca7e vhost: fix a theoretical race in device cleanup
vhost_zerocopy_callback accesses VQ right after it drops a ubuf
reference.  In theory, this could race with device removal which waits
on the ubuf kref, and crash on use after free.

Do all accesses within rcu read side critical section, and synchronize
on release.

Since callbacks are always invoked from bh, synchronize_rcu_bh seems
enough and will help release complete a bit faster.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13 18:47:30 -05:00
Michael S. Tsirkin 0ad8b480d6 vhost: fix ref cnt checking deadlock
vhost checked the counter within the refcnt before decrementing.  It
really wanted to know that it is the one that has the last reference, as
a way to batch freeing resources a bit more efficiently.

Note: we only let refcount go to 0 on device release.

This works well but we now access the ref counter twice so there's a
race: all users might see a high count and decide to defer freeing
resources.
In the end no one initiates freeing resources until the last reference
is gone (which is on VM shotdown so might happen after a looooong time).

Let's do what we probably should have done straight away:
switch from kref to plain atomic, documenting the
semantics, return the refcount value atomically after decrement,
then use that to avoid the deadlock.

Reported-by: Qin Chuanyu <qinchuanyu@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13 18:47:30 -05:00
Linus Torvalds 4e13c5d021 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "The highlights this round include:

  - add support for SCSI Referrals (Hannes)
  - add support for T10 DIF into target core (nab + mkp)
  - add support for T10 DIF emulation in FILEIO + RAMDISK backends (Sagi + nab)
  - add support for T10 DIF -> bio_integrity passthrough in IBLOCK backend (nab)
  - prep changes to iser-target for >= v3.15 T10 DIF support (Sagi)
  - add support for qla2xxx N_Port ID Virtualization - NPIV (Saurav + Quinn)
  - allow percpu_ida_alloc() to receive task state bitmask (Kent)
  - fix >= v3.12 iscsi-target session reset hung task regression (nab)
  - fix >= v3.13 percpu_ref se_lun->lun_ref_active race (nab)
  - fix a long-standing network portal creation race (Andy)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits)
  target: Fix percpu_ref_put race in transport_lun_remove_cmd
  target/iscsi: Fix network portal creation race
  target: Report bad sector in sense data for DIF errors
  iscsi-target: Convert gfp_t parameter to task state bitmask
  iscsi-target: Fix connection reset hang with percpu_ida_alloc
  percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
  iscsi-target: Pre-allocate more tags to avoid ack starvation
  qla2xxx: Configure NPIV fc_vport via tcm_qla2xxx_npiv_make_lport
  qla2xxx: Enhancements to enable NPIV support for QLOGIC ISPs with TCM/LIO.
  qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
  IB/isert: pass scatterlist instead of cmd to fast_reg_mr routine
  IB/isert: Move fastreg descriptor creation to a function
  IB/isert: Avoid frwr notation, user fastreg
  IB/isert: seperate connection protection domains and dma MRs
  tcm_loop: Enable DIF/DIX modes in SCSI host LLD
  target/rd: Add DIF protection into rd_execute_rw
  target/rd: Add support for protection SGL setup + release
  target/rd: Refactor rd_build_device_space + rd_release_device_space
  target/file: Add DIF protection support to fd_execute_rw
  target/file: Add DIF protection init/format support
  ...
2014-01-31 15:31:23 -08:00
Kent Overstreet 6f6b5d1ec5 percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask
This patch changes percpu_ida_alloc() + callers to accept task state
bitmask for prepare_to_wait() for code like target/iscsi that needs
it for interruptible sleep, that is provided in a subsequent patch.

It now expects TASK_UNINTERRUPTIBLE when the caller is able to sleep
waiting for a new tag, or TASK_RUNNING when the caller cannot sleep,
and is forced to return a negative value when no tags are available.

v2 changes:
  - Include blk-mq + tcm_fc + vhost/scsi + target/iscsi changes
  - Drop signal_pending_state() call
v3 changes:
  - Only call prepare_to_wait() + finish_wait() when != TASK_RUNNING
    (PeterZ)

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: <stable@vger.kernel.org> #3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-01-23 20:17:18 +00:00
Nicholas Bellinger def2b339b4 target: Add protection SGLs to target_submit_cmd_map_sgls
This patch adds support to target_submit_cmd_map_sgls() for
accepting 'sgl_prot' + 'sgl_prot_count' parameters for
DIF protection information.

Note the passed parameters are stored at se_cmd->t_prot_sg
and se_cmd->t_prot_nents respectively.

Also, update tcm_loop and vhost-scsi fabrics usage of
target_submit_cmd_map_sgls() to take into account the
new parameters.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-01-18 09:58:09 +00:00
Zhi Yong Wu 59566b6e8c vhost: remove the dead branch
Since vhost_dev_init() forever return 0, some branches are never run,
therefore need to be removed.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:22:05 -05:00
Linus Torvalds b0e3636f65 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Things have been quiet this round with mostly bugfixes, percpu
  conversions, and other minor iscsi-target conformance testing changes.

  The highlights include:

   - Add demo_mode_discovery attribute for iscsi-target (Thomas)
   - Convert tcm_fc(FCoE) to use percpu-ida pre-allocation
   - Add send completion interrupt coalescing for ib_isert
   - Convert target-core to use percpu-refcounting for se_lun
   - Fix mutex_trylock usage bug in iscsit_increment_maxcmdsn
   - tcm_loop updates (Hannes)
   - target-core ALUA cleanups + prep for v3.14 SCSI Referrals support (Hannes)

  v3.14 is currently shaping to be a busy development cycle in target
  land, with initial support for T10 Referrals and T10 DIF currently on
  the roadmap"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits)
  iscsi-target: chap auth shouldn't match username with trailing garbage
  iscsi-target: fix extract_param to handle buffer length corner case
  iscsi-target: Expose default_erl as TPG attribute
  target_core_configfs: split up ALUA supported states
  target_core_alua: Make supported states configurable
  target_core_alua: Store supported ALUA states
  target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED
  target_core_alua: spellcheck
  target core: rename (ex,im)plict -> (ex,im)plicit
  percpu-refcount: Add percpu-refcount.o to obj-y
  iscsi-target: Do not reject non-immediate CmdSNs exceeding MaxCmdSN
  iscsi-target: Convert iscsi_session statistics to atomic_long_t
  target: Convert se_device statistics to atomic_long_t
  target: Fix delayed Task Aborted Status (TAS) handling bug
  iscsi-target: Reject unsupported multi PDU text command sequence
  ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call
  iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn
  target: Core does not need blkdev.h
  target: Pass through I/O topology for block backstores
  iser-target: Avoid using FRMR for single dma entry requests
  ...
2013-11-22 10:52:03 -08:00
Nicholas Bellinger 60a01f558a vhost/scsi: Fix incorrect usage of get_user_pages_fast write parameter
This patch addresses a long-standing bug where the get_user_pages_fast()
write parameter used for setting the underlying page table entry permission
bits was incorrectly set to write=1 for data_direction=DMA_TO_DEVICE, and
passed into get_user_pages_fast() via vhost_scsi_map_iov_to_sgl().

However, this parameter is intended to signal WRITEs to pinned userspace
PTEs for the virtio-scsi DMA_FROM_DEVICE -> READ payload case, and *not*
for the virtio-scsi DMA_TO_DEVICE -> WRITE payload case.

This bug would manifest itself as random process segmentation faults on
KVM host after repeated vhost starts + stops and/or with lots of vhost
endpoints + LUNs.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Asias He <asias@redhat.com>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-25 11:03:34 -07:00
Andy Grover d80e224dd5 target: Remove TF_CIT_TMPL macro
Remove a lingering macro that just hid a dereference.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-16 13:35:02 -07:00
Nicholas Bellinger 4a47d3a1ff vhost/scsi: Use GFP_ATOMIC with percpu_ida_alloc for obtaining tag
Fix GFP_KERNEL -> GFP_ATOMIC usage of percpu_ida_alloc() within
vhost_scsi_get_tag(), as this code is expected to be called directly
from interrupt context.

v2 changes:

  - Handle possible tag < 0 failure with GFP_ATOMIC

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Asias He <asias@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-10-01 21:27:31 -07:00
Michael S. Tsirkin d3d665a654 vhost-scsi: whitespace tweak
Remove space at start of line that sneaked in.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-17 22:56:09 +03:00
Michael S. Tsirkin 595cb75498 vhost/scsi: use vmalloc for order-10 allocation
As vhost scsi device struct is large, if the device is
created on a busy system, kzalloc() might fail, so this patch does a
fallback to vzalloc().

As vmalloc() adds overhead on data-path, add __GFP_REPEAT
to kzalloc() flags to do this fallback only when really needed.

Reviewed-by: Asias He <asias@redhat.com>
Reported-by: Dan Aloni <alonid@stratoscale.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-17 22:55:46 +03:00
Qin Chuanyu ac9fde2474 vhost: wake up worker outside spin_lock
the wake_up_process func is included by spin_lock/unlock in
vhost_work_queue,
but it could be done outside the spin_lock.
I have test it with kernel 3.0.27 and guest suse11-sp2 using iperf,
the num as below.
                  original                 modified
thread_num  tp(Gbps)   vhost(%)  |  tp(Gbps)     vhost(%)
1           9.59        28.82    |   9.59        27.49
8           9.61        32.92    |   9.62        26.77
64          9.58        46.48    |   9.55        38.99
256         9.6         63.7     |   9.6         52.59

Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-17 09:21:32 +03:00
Linus Torvalds 48efe453e6 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
 "Lots of activity again this round for I/O performance optimizations
  (per-cpu IDA pre-allocation for vhost + iscsi/target), and the
  addition of new fabric independent features to target-core
  (COMPARE_AND_WRITE + EXTENDED_COPY).

  The main highlights include:

   - Support for iscsi-target login multiplexing across individual
     network portals
   - Generic Per-cpu IDA logic (kent + akpm + clameter)
   - Conversion of vhost to use per-cpu IDA pre-allocation for
     descriptors, SGLs and userspace page pointer list
   - Conversion of iscsi-target + iser-target to use per-cpu IDA
     pre-allocation for descriptors
   - Add support for generic COMPARE_AND_WRITE (AtomicTestandSet)
     emulation for virtual backend drivers
   - Add support for generic EXTENDED_COPY (CopyOffload) emulation for
     virtual backend drivers.
   - Add support for fast memory registration mode to iser-target (Vu)

  The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of
  particular significance, which make us the first and only open source
  target to support the full set of VAAI primitives.

  Currently Linux clients are lacking upstream support to actually
  utilize these primitives.  However, with server side support now in
  place for folks like MKP + ZAB working on the client, this logic once
  reserved for the highest end of storage arrays, can now be run in VMs
  on their laptops"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits)
  target/iscsi: Bump versions to v4.1.0
  target: Update copyright ownership/year information to 2013
  iscsi-target: Bump default TCP listen backlog to 256
  target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out
  iscsi-target; Bump default CmdSN Depth to 64
  iscsi-target: Remove unnecessary wait_for_completion in iscsi_get_thread_set
  iscsi-target: Add thread_set->ts_activate_sem + use common deallocate
  iscsi-target: Fix race with thread_pre_handler flush_signals + ISCSI_THREAD_SET_DIE
  target: remove unused including <linux/version.h>
  iser-target: introduce fast memory registration mode (FRWR)
  iser-target: generalize rdma memory registration and cleanup
  iser-target: move rdma wr processing to a shared function
  target: Enable global EXTENDED_COPY setup/release
  target: Add Third Party Copy (3PC) bit in INQUIRY response
  target: Enable EXTENDED_COPY setup in spc_parse_cdb
  target: Add support for EXTENDED_COPY copy offload emulation
  target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check
  target: Add global device list for EXTENDED_COPY
  target: Make helpers non static for EXTENDED_COPY command setup
  target: Make spc_parse_naa_6h_vendor_specific non static
  ...
2013-09-12 16:11:45 -07:00
Nicholas Bellinger 4c76251e8e target: Update copyright ownership/year information to 2013
Update copyright ownership/year information for target-core,
loopback, iscsi-target, tcm_qla2xx, vhost and iser-target.

Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-09-10 20:23:36 -07:00
Nicholas Bellinger 3aee26b4ae vhost/scsi: Add pre-allocation for tv_cmd SGL + upages memory
This patch adds support for pre-allocation of per tv_cmd descriptor
scatterlist + user-space page pointer memory using se_sess->sess_cmd_map
within tcm_vhost_make_nexus() code.

This includes sanity checks within vhost_scsi_map_to_sgl()
to reject I/O that exceeds these initial hardcoded values, and
the necessary cleanup in tcm_vhost_make_nexus() failure path +
tcm_vhost_drop_nexus().

v3 changes:
  - Rebase to v3.11-rc5 code

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Asias He <asias@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Reviewed-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2013-09-09 14:29:19 -07:00