Mike Christie <michael.christie@oracle.com> says:
The patches in this thread allow us to use the block pr_ops with LIO's
target_core_iblock module to support cluster applications in VMs. They
were built over Linus's tree. They also apply over linux-next and
Martin's tree and Jens's trees.
Currently, to use windows clustering or linux clustering (pacemaker +
cluster labs scsi fence agents) in VMs with LIO and vhost-scsi, you
have to use tcmu or pscsi or use a cluster aware FS/framework for the
LIO pr file. Setting up a cluster FS/framework is pain and waste when
your real backend device is already a distributed device, and pscsi
and tcmu are nice for specific use cases, but iblock gives you the
best performance and allows you to use stacked devices like
dm-multipath. So these patches allow iblock to work like pscsi/tcmu
where they can pass a PR command to the backend module. And then
iblock will use the pr_ops to pass the PR command to the real devices
similar to what we do for unmap today.
The patches are separated in the following groups:
Patch 1 - 2:
- Add block layer callouts for reading reservations and rename reservation
error code.
Patch 3 - 5:
- SCSI support for new callouts.
Patch 6:
- DM support for new callouts.
Patch 7 - 13:
- NVMe support for new callouts.
Patch 14 - 18:
- LIO support for new callouts.
This patchset has been tested with the libiscsi PGR ops and with
window's failover cluster verification test. Note that for scsi
backend devices we need this patchset:
https://lore.kernel.org/linux-scsi/20230123221046.125483-1-michael.christie@oracle.com/T/#m4834a643ffb5bac2529d65d40906d3cfbdd9b1b7
to handle UAs. To reduce the size of this patchset that's being done
separately to make reviewing easier. And to make merging easier this
patchset and the one above do not have any conflicts so can be merged
in different trees.
Link: https://lore.kernel.org/r/20230407200551.12660-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The iblock pr_ops support does not support commands that require port or
I_T Nexus info. This adds a struct target_opcode_descriptor as an argument
to the enabled callout so we can still have the common tcm_is_pr_enabled
and tcm_is_scsi2_reservations_enabled functions and also determine if the
command is supported based on the command and service action and device
settings.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230407200551.12660-17-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Mike Christie <michael.christie@oracle.com> says:
The following patches apply over Martin's 6.4 branches and Linus's tree.
They fix a couple regressions in iscsit that occur when there are TMRs
executing and a connection is closed. It also includes Dimitry's fixes in
related code paths for cmd cleanup when ERL2 is used and the write pending
hang during conn cleanup.
This version of the patchset brings it back to just regressions and fixes
for bugs we have a lot of users hitting. I'm going to fix isert and get it
hooked into iscsit properly in a second patchset, because this one was
getting so large. I've also moved my cleanup type of patches for a 3rd
patchset.
Link: https://lore.kernel.org/r/20230319015620.96006-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This fixes a bug where an initiator thinks a LUN_RESET has cleaned up
running commands when it hasn't. The bug was added in commit 51ec502a32
("target: Delete tmr from list before processing").
The problem occurs when:
1. We have N I/O cmds running in the target layer spread over 2 sessions.
2. The initiator sends a LUN_RESET for each session.
3. session1's LUN_RESET loops over all the running commands from both
sessions and moves them to its local drain_task_list.
4. session2's LUN_RESET does not see the LUN_RESET from session1 because
the commit above has it remove itself. session2 also does not see any
commands since the other reset moved them off the state lists.
5. sessions2's LUN_RESET will then complete with a successful response.
6. sessions2's inititor believes the running commands on its session are
now cleaned up due to the successful response and cleans up the running
commands from its side. It then restarts them.
7. The commands do eventually complete on the backend and the target
starts to return aborted task statuses for them. The initiator will
either throw a invalid ITT error or might accidentally lookup a new
task if the ITT has been reallocated already.
Fix the bug by reverting the patch, and serialize the execution of
LUN_RESETs and Preempt and Aborts.
Also prevent us from waiting on LUN_RESETs in core_tmr_drain_tmr_list,
because it turns out the original patch fixed a bug that was not
mentioned. For LUN_RESET1 core_tmr_drain_tmr_list can see a second
LUN_RESET and wait on it. Then the second reset will run
core_tmr_drain_tmr_list and see the first reset and wait on it resulting in
a deadlock.
Fixes: 51ec502a32 ("target: Delete tmr from list before processing")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230319015620.96006-8-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
iSCSI needs to wait on outstanding commands like how SRP and the FC/FCoE
drivers do. It can't use target_stop_session() because for MCS support we
can't stop the entire session during recovery because if other connections
are OK then we want to be able to continue to execute I/O on them.
Move the per session cmd counters to a new struct so iSCSI can allocate
them per connection. The xcopy code can also just not allocate in the
future since it doesn't need to track commands.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230319015620.96006-2-michael.christie@oracle.com
Reviewed-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
RELATIVE TARGET PORT IDENTIFIER can be read and configured via configfs:
$ echo 0x10 > $TARGET/tpgt_N/rtpi
RTPI can be changed only on disabled target ports.
Co-developed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20230301084512.21956-5-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The code is not needed since target port-based RTPI allocation replaced it.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20230301084512.21956-4-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SAM-5 4.6.5.2 (Relative Port Identifier attribute) defines the attribute as
unique across SCSI target ports.
The change introduces RTPI attribute to se_portal group. The value is
unique across all enabled SCSI target ports. It also limits number of SCSI
target ports to 65535.
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20230301084512.21956-2-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A MAXIMUM TRANSFER LENGTH value indicates the maximum transfer length in
logical blocks that the device server accepts for a single command. Fix
function sending the length in sectors instead of blocks.
This patch also removes the special casing for fileio in block_size_store
since this logic in now unified in spc_emulate_evpd_b0() for all backends.
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Link: https://lore.kernel.org/r/20221114102500.88892-2-a.kovaleva@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
libiscsi tests check the support of DPO & FUA bits in usage bits of RSOC
response. This patch adds support for dynamic usage bits for each opcode.
Set support of DPO & FUA bits in usage_bits of RSOC response depending on
support DPOFUA in the backstore device.
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20220906103421.22348-7-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow support for RSOC to be turned off via the emulate_rsoc attibute.
This is just for testing purposes.
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20220906103421.22348-5-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Report supported opcodes depending on a dynamic device configuration.
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20220906103421.22348-4-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add support for REPORT SUPPORTED OPERATION CODES command according to SPC4.
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20220906103421.22348-2-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
se_lun and se_lun_acl are immutable pointers of struct se_dev_entry.
Remove RCU usage for access to those pointers.
Link: https://lore.kernel.org/r/20220727214125.19647-3-d.bogdanov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We are only holding the lun_tg_pt_gp_lock in target_alua_state_check() to
make sure tg_pt_gp is not freed from under us while we copy the state,
delay, ID values. We can instead use RCU here to access the tg_pt_gp.
With this patch IOPs can increase up to 10% for jobs like:
fio --filename=/dev/sdX --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=64 --numjobs=N
when there are multiple sessions (running that fio command to each /dev/sdX
or using multipath and there are over 8 paths), or more than 8 queues for
the loop or vhost with multiple threads case and numjobs > 8.
Link: https://lore.kernel.org/r/20210930020422.92578-5-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch fixes the following bugs:
1. If there are multiple ordered cmds queued and multiple simple cmds
completing, target_restart_delayed_cmds() could be called on different
CPUs and each instance could start a ordered cmd. They could then run in
different orders than they were queued.
2. target_restart_delayed_cmds() and target_handle_task_attr() can race
where:
1. target_handle_task_attr() has passed the simple_cmds == 0 check.
2. transport_complete_task_attr() then decrements simple_cmds to 0.
3. transport_complete_task_attr() runs target_restart_delayed_cmds() and
it does not see any cmds on the delayed_cmd_list.
4. target_handle_task_attr() adds the cmd to the delayed_cmd_list.
The cmd will then end up timing out.
3. If we are sent > 1 ordered cmds and simple_cmds == 0, we can execute
them out of order, because target_handle_task_attr() will hit that
simple_cmds check first and return false for all ordered cmds sent.
4. We run target_restart_delayed_cmds() after every cmd completion, so if
there is more than 1 simple cmd running, we start executing ordered cmds
after that first cmd instead of waiting for all of them to complete.
5. Ordered cmds are not supposed to start until HEAD OF QUEUE and all older
cmds have completed, and not just simple.
6. It's not a bug but it doesn't make sense to take the delayed_cmd_lock
for every cmd completion when ordered cmds are almost never used. Just
replacing that lock with an atomic increases IOPs by up to 10% when
completions are spread over multiple CPUs and there are multiple
sessions/ mqs/thread accessing the same device.
This patch moves the queued delayed handling to a per device work to
serialze the cmd executions for each device and adds a new counter to track
HEAD_OF_QUEUE and SIMPLE cmds. We can then check the new counter to
determine when to run the work on the completion path.
Link: https://lore.kernel.org/r/20210930020422.92578-3-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Many fabric modules provide their own implementation of enable attribute in
tpg.
Provide a way to remove code duplication in the fabric modules and
automatically add "enable" attribute if a fabric module has an
implementation of fabric_enable_tpg().
Link: https://lore.kernel.org/r/20210910084133.17956-2-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, backend drivers can fail I/O with SAM_STAT_CHECK_CONDITION which
gets us TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE.
Add a new helper that allows backend drivers to fail with specific sense
codes.
This is based on a patch from Mike Christie <michael.christie@oracle.com>.
Cc: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20210803145410.80147-2-s.samoylenko@yadro.com
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Sergey Samoylenko <s.samoylenko@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
These members are only used for ALUA sense detail propagation, which can
just as easily be done via sense_reason_t.
Link: https://lore.kernel.org/r/20210728115353.2396-4-ddiss@suse.de
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement an attribute which provides a way to set a company specific WWN
in configfs via:
target/core/$backstore/$name/wwn/company_id
The Open Fabrics Alliance ID 001405h remains the default.
Link: https://lore.kernel.org/r/20210420185920.42431-3-s.samoylenko@yadro.com
Signed-off-by: Sergey Samoylenko <s.samoylenko@yadro.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It may not always be best to complete the IO on same CPU as it was
submitted on. This commit allows userspace to configure it.
This has been useful for vhost-scsi where we have a single thread for
submissions and completions. If we force the completion on the submission
CPU we may be adding conflicts with what the user has setup in the lower
levels with settings like the block layer rq_affinity or the driver's IRQ
or softirq (the network's rps_cpus value) settings.
We may also want to set it up where the vhost thread runs on CPU N and does
its submissions/completions there, and then have LIO do its completion
booking on CPU M, but can't configure the lower levels due to issues like
using dm-multipath with lots of paths (the path selector can throw commands
all over the system because it's only taking into account latency/throughput
at its level).
The new setting is in:
/sys/kernel/config/target/$fabric/$target/param/cmd_completion_affinity
Writing:
-1 -> Gives the current default behavior of completing on the
submission CPU.
-2 -> Completes the cmd on the CPU the lower layers sent it to us from.
> 0 -> Completes on the CPU userspace has specified.
Link: https://lore.kernel.org/r/20210227170006.5077-26-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
target_core_iblock is plugging and unplugging on every command and this is
causing perf issues for drivers that prefer batched cmds. With recent
patches we can now take multiple cmds from a fabric driver queue and then
pass them down the backend drivers in a batch. This patch adds this support
by adding 2 callouts to the backend for plugging and unplugging the
device. Subsequent commits will add support for iblock and tcmu device
plugging.
Link: https://lore.kernel.org/r/20210227170006.5077-22-michael.christie@oracle.com
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We have a couple holes in the cmd flags definitions. This cleans up the
definitions to fix that and make it easier to read.
Link: https://lore.kernel.org/r/20210227170006.5077-21-michael.christie@oracle.com
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
loop and vhost/scsi do their target cmd submission from driver
workqueues. This allows them to avoid an issue where the backend may block
waiting for resources like tags/requests, mem/locks, etc and that ends up
blocking their entire submission path and for the case of vhost-scsi both
the submission and completion path.
This patch adds a helper drivers can use to submit from a LIO workqueue.
This code will then be extended in the next patches to fix the plugging of
backend devices.
We are only converting vhost/loop initially, but the workqueue based
submission will work for other drivers and have similar benefits where the
main target loops will not end up blocking one some backend resource.
Link: https://lore.kernel.org/r/20210227170006.5077-17-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
According to FCP-4 (9.4.2):
If the command requested that data beyond the length specified by the
FCP_DL field be transferred, then the device server shall set the
FCP_RESID_OVER bit (see 9.5.8) to one in the FCP_RSP IU and:
a) process the command normally except that data beyond the FCP_DL count
shall not be requested or transferred;
b) transfer no data and return CHECK CONDITION status with the sense key
set to ILLEGAL REQUEST and the additional sense code set to INVALID FIELD
IN COMMAND INFORMATION UNIT; or
c) may transfer data and return CHECK CONDITION status with the sense key
set to ABORTED COMMAND and the additional sense code set to INVALID FIELD
IN COMMAND INFORMATION UNIT.
TCM follows b) and transfers no data for residual writes but returns
INVALID FIELD IN CDB instead of INVALID FIELD IN COMMAND INFORMATION UNIT.
Change the ASCQ to INVALID FIELD IN COMMAND INFORMATION UNIT to meet the
standard.
Link: https://lore.kernel.org/r/20201203082035.54566-4-a.kovaleva@yadro.com
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Do a state_list/execute_task_lock per CPU, so we can do submissions from
different CPUs without contention with each other.
Note: tcm_fc was passing TARGET_SCF_USE_CPUID, but never set cpuid. The
assumption is that it wanted to set the cpuid to the CPU it was submitting
from so it will get this behavior with this patch.
[mkp: s/printk/pr_err/ + resolve COMPARE AND WRITE patch conflict]
Link: https://lore.kernel.org/r/1604257174-4524-8-git-send-email-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Drop the sess_cmd_lock by:
- Removing the sess_cmd_list use from LIO core, because it's been
moved to qla2xxx.
- Removing sess_tearing_down check in the I/O path. Instead of using that
bit and the sess_cmd_lock, we rely on the cmd_count percpu ref. To do
this we switch to percpu_ref_kill_and_confirm/percpu_ref_tryget_live.
Link: https://lore.kernel.org/r/1604257174-4524-7-git-send-email-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
cmd.bad_sector currently gets packed into the sense INFORMATION field for
TCM_LOGICAL_BLOCK_{GUARD,APP_TAG,REF_TAG}_CHECK_FAILED errors, which carry
an .add_sector_info flag in the sense_detail_table to ensure this.
In preparation for propagating a byte offset on COMPARE AND WRITE
TCM_MISCOMPARE_VERIFY error, rename cmd.bad_sector to cmd.sense_info and
sense_detail.add_sector_info to sense_detail.add_sense_info so that it
better reflects the sense INFORMATION field destination.
[ddiss: update previously overlooked ib_isert]
Link: https://lore.kernel.org/r/20201031233211.5207-3-ddiss@suse.de
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Target core is modified to call an optional backend callback function if a
TMR is received or commands are aborted implicitly after a PR command was
received. The backend function takes as parameters the se_dev, the type of
the TMR, and the list of aborted commands. If no commands were aborted, an
empty list is supplied.
Link: https://lore.kernel.org/r/20200726153510.13077-3-bstroesser@ts.fujitsu.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pgr_support and alua_support device attributes show the inverted value of
the transport_flags:
* TRANSPORT_FLAG_PASSTHROUGH_PGR
* TRANSPORT_FLAG_PASSTHROUGH_ALUA
These attributes are per device, while the flags are per backend. Rename
the transport_flags in backend/transport to transport_flags_default and use
this value to initialize the new transport_flags field in the se_device
structure.
Now data and attribute both are per se_device.
Link: https://lore.kernel.org/r/20200427150823.15350-4-bstroesser@ts.fujitsu.com
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The emulate_ua_intlck_ctrl device attribute accepts values of 0, 1 or 2 via
ConfigFS, which map to unit attention interlocks control codes in the MODE
SENSE control Mode Page. Use an enum to track these values so that it's
clear that, unlike the remaining emulate_X attributes,
emulate_ua_intlck_ctrl isn't boolean.
Link: https://marc.info/?l=target-devel&m=158227825428798
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This should harden us against configfs API regressions similar to the one
fixed by the previous commit.
Link: https://marc.info/?l=target-devel&m=158211731505174
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The LIO unmap_zeroes_data device attribute is mapped to the LBPRZ flag in
the READ CAPACITY(16) and Thin Provisioning VPD INQUIRY responses.
The unmap_zeroes_data attribute is exposed via configfs, where any write
value is correctly validated via strtobool(). However, when initialised via
target_configure_unmap_from_queue() it takes the value of the device's
max_write_zeroes_sectors queue limit, which is non-boolean.
A non-boolean value can be read from configfs, but attempting to write the
same value back results in -EINVAL, causing problems for configuration
utilities such as targetcli.
Link: https://marc.info/?l=target-devel&m=158213354011309
Fixes: 2237498f0b ("target/iblock: Convert WRITE_SAME to blkdev_issue_zeroout")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Maintaining tpg_list without ever iterating over it is not useful. Hence
remove tpg_list. This patch does not change the behavior of the SCSI target
code.
Cc: Mike Christie <mchristie@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Link: https://lore.kernel.org/r/20190930232224.58980-1-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of tracking the initiator that established an SPC-2 reservation,
track the session through which the SPC-2 reservation has been
established. This patch does not change any functionality.
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Whether or not a session is being torn down does not affect whether or not
SCSI commands are in the task set. Hence remove the "tearing down" checks
from the TMF code. The TRANSPORT_ISTATE_PROCESSING check is left out
because it is now safe to wait for a command that is in that state. The
CMD_T_PRE_EXECUTE is left out because abort processing is postponed until
after commands have left the pre-execute state since the patch that makes
TMF processing synchronous.
See also commit 1c21a48055 ("target: Avoid early CMD_T_PRE_EXECUTE
failures during ABORT_TASK").
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation for supporting user provided vendor strings, add an extra
byte to the vendor, model and revision arrays in struct t10_wwn. This
ensures that the full INQUIRY data can be carried in the arrays along with
a null-terminator.
Change a number of array readers and writers so that they account for
explicit null-termination:
- The pscsi_set_inquiry_info() and emulate_model_alias_store() codepaths
don't currently explicitly null-terminate; fix this.
- Existing t10_wwn field dumps use for-loops which step over
null-terminators for right-padding.
+ Use printf with width specifiers instead.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The TASK ABORTED STATUS (TAS) bit is defined as follows in SAM:
"TASK_ABORTED: this status shall be returned if a command is aborted by a
command or task management function on another I_T nexus and the control
mode page TAS bit is set to one". TAS handling is spread over the target
core and the iSCSI target driver. If a LUN RESET is received, the target
core will send the TASK_ABORTED response for all commands for which such a
response has to be sent. If an ABORT TASK is received, only the iSCSI
target driver will send the TASK_ABORTED response for the commands for
which that response has to be sent. That is a bug since all target drivers
have to honor the TAS bit. Fix this by moving the code that handles TAS
from the iSCSI target driver into the target core. Additionally, if a
command has been aborted, instead of sending the TASK_ABORTED status from
the context that processes the SCSI command send it from the context of the
ABORT TMF. The core_tmr_abort_task() change in this patch causes the
CMD_T_TAS flag to be set if a TASK_ABORTED status has to be sent back to
the initiator that submitted the command. If that flag has been set
transport_cmd_finish_abort() will send the TASK_ABORTED response.
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Disseldorp <ddiss@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch does not change any functionality but makes the patch that makes
TMF handling synchronous easier to read.
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Disseldorp <ddiss@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A quote from SAM-5: "The order in which task management requests are
processed is not specified by the SCSI architecture model. The SCSI
architecture model does not require in-order delivery of such task
management requests or processing by the task manager in the order
received. To guarantee the processing order of task management requests
referencing sent to a specific logical unit, an application client should
not have more than one such task management request pending to that logical
unit." This means that it is safe to use the system workqueues instead of
tmr_wq for processing TMFs. An intended side effect of this patch is that
it enables concurrent processing of TMFs.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: David Disseldorp <ddiss@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A session must only be released after all code that accesses the session
structure has finished. Make sure that this is the case by introducing a
new command counter per session that is only decremented after the
.release_cmd() callback has finished. This patch fixes the following crash:
BUG: KASAN: use-after-free in do_raw_spin_lock+0x1c/0x130
Read of size 4 at addr ffff8801534b16e4 by task rmdir/14805
CPU: 16 PID: 14805 Comm: rmdir Not tainted 4.18.0-rc2-dbg+ #5
Call Trace:
dump_stack+0xa4/0xf5
print_address_description+0x6f/0x270
kasan_report+0x241/0x360
__asan_load4+0x78/0x80
do_raw_spin_lock+0x1c/0x130
_raw_spin_lock_irqsave+0x52/0x60
srpt_set_ch_state+0x27/0x70 [ib_srpt]
srpt_disconnect_ch+0x1b/0xc0 [ib_srpt]
srpt_close_session+0xa8/0x260 [ib_srpt]
target_shutdown_sessions+0x170/0x180 [target_core_mod]
core_tpg_del_initiator_node_acl+0xf3/0x200 [target_core_mod]
target_fabric_nacl_base_release+0x25/0x30 [target_core_mod]
config_item_release+0x9c/0x110 [configfs]
config_item_put+0x26/0x30 [configfs]
configfs_rmdir+0x3b8/0x510 [configfs]
vfs_rmdir+0xb3/0x1e0
do_rmdir+0x262/0x2c0
do_syscall_64+0x77/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Disseldorp <ddiss@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Since transport_clear_lun_ref() already waits until the percpu-refcount
.release() method is called, it is not necessary to wait first until
percpu_ref_kill_and_confirm() has finished transitioning the refcount into
atomic mode. Remove the code that waits for percpu_ref_kill_and_confirm()
to complete and also the completion object that is used by that code. This
patch does not change the behavior of the SCSI target code.
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Disseldorp <ddiss@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
On write, the pi_prot_format configfs attribute invokes the device
format_prot() callback if present. Read dumps the contents of
se_dev_attrib.pi_prot_format which is always zero. Make the configfs
attribute write-only, and drop the always zero se_dev_attrib.pi_prot_format
storage.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The new emulate_pr backstore attribute allows for Persistent Reservation
and SCSI2 RESERVE/RELEASE support to be completely disabled. This can be
useful for scenarios such as:
- Ensuring ATS (Compare & Write) usage on recent VMware ESXi initiators.
- Allowing clustered (e.g. tcm-user) backends to block such requests,
avoiding the multi-node reservation state propagation.
When explicitly disabled, PR and RESERVE/RELEASE requests receive Invalid
Command Operation Code response sense data.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 057085e522 ("target: Fix race for SCF_COMPARE_AND_WRITE_POST
checking") removed the code that checks the SCF_COMPARE_AND_WRITE_POST
flag. Hence also remove the flag itself.
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
se_dev_entry.ua_count is only used to check whether or not
se_dev_entry.ua_list is empty. Use list_empty_careful() instead. Checking
whether or not ua_list is empty without holding the lock that protects that
list is fine because the code that dequeues from that list will check again
whether or not that list is empty.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of embedding the completion that is used for waiting for command
completion in struct se_cmd, let the context that waits for command
completion allocate it. This makes it possible to have a single code path
for non-aborted and aborted commands in target_release_cmd_kref() and
avoids that transport_generic_free_cmd() has to call
cmd->se_tfo->release_cmd() directly. This patch does not change any
functionality. Note: transport_generic_free_cmd() only waits until the
se_cmd reference count has reached zero after it has set both
CMD_T_FABRIC_STOP and CMD_T_ABORTED.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Target drivers must call target_sess_cmd_list_set_waiting() and
target_wait_for_sess_cmds() before freeing a session. Since freeing a
session is only safe after all commands that are associated with a session
have finished, make target_wait_for_sess_cmds() also wait for commands that
are being aborted. Instead of setting a flag in each pending command from
target_sess_cmd_list_set_waiting() and waiting in
target_wait_for_sess_cmds() on a per-command completion, only set a
per-session flag in the former function and wait on a per-session
completion in the latter function. This change is safe because once a SCSI
initiator system has submitted a command a target system is always allowed
to execute it to completion. See also commit 0f4a943168 ("target: Fix
remote-port TMR ABORT + se_cmd fabric stop").
This patch is based on the following two patches:
* Bart Van Assche, target: Simplify session shutdown code, February 19, 2015
(8df5463d7d).
* Christoph Hellwig, target: Rework session shutdown code, December 7, 2015
(http://thread.gmane.org/gmane.linux.scsi.target.devel/10695).
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The sbitmap and the percpu_ida perform essentially the same task,
allocating tags for commands. The sbitmap outperforms the percpu_ida as
documented here: https://lkml.org/lkml/2014/4/22/553
The sbitmap interface is a little harder to use, but being able to remove
the percpu_ida code and getting better performance justifies the additional
complexity.
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> # f_tcm
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>