Commit Graph

102 Commits

Author SHA1 Message Date
Mike Christie 9972cebb59 tcmu: fix unmap thread race
If the unmap thread has already run find_free_blocks
but not yet run prepare_to_wait when a wake_up(&unmap_wait)
call is done, the unmap thread is going to miss the wake
call. Instead of adding checks for if new waiters were added
this just has us use a work queue which will run us again
in this type of case.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2018-01-12 15:07:12 -08:00
Mike Christie 89ec9cfd3b tcmu: split unmap_thread_fn
Separate unmap_thread_fn to make it easier to read.

Note: this patch does not fix the bug where we might
miss a wake up call. The next patch will fix that.
This patch only separates the code into functions.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2018-01-12 15:07:11 -08:00
Mike Christie bf99ec1332 tcmu: merge common block release code
Have unmap_thread_fn use tcmu_blocks_release.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2018-01-12 15:07:10 -08:00
tangwenji 26d2b3106f tcmu: fix page addr in tcmu_flush_dcache_range
The page addr should be update.

Signed-off-by: tangwenji <tang.wenji@zte.com.cn>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2018-01-12 15:07:10 -08:00
Linus Torvalds 844056fd74 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:

 - The final conversion of timer wheel timers to timer_setup().

   A few manual conversions and a large coccinelle assisted sweep and
   the removal of the old initialization mechanisms and the related
   code.

 - Remove the now unused VSYSCALL update code

 - Fix permissions of /proc/timer_list. I still need to get rid of that
   file completely

 - Rename a misnomed clocksource function and remove a stale declaration

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  m68k/macboing: Fix missed timer callback assignment
  treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
  timer: Remove redundant __setup_timer*() macros
  timer: Pass function down to initialization routines
  timer: Remove unused data arguments from macros
  timer: Switch callback prototype to take struct timer_list * argument
  timer: Pass timer_list pointer to callbacks unconditionally
  Coccinelle: Remove setup_timer.cocci
  timer: Remove setup_*timer() interface
  timer: Remove init_timer() interface
  treewide: setup_timer() -> timer_setup() (2 field)
  treewide: setup_timer() -> timer_setup()
  treewide: init_timer() -> setup_timer()
  treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
  s390: cmm: Convert timers to use timer_setup()
  lightnvm: Convert timers to use timer_setup()
  drivers/net: cris: Convert timers to use timer_setup()
  drm/vc4: Convert timers to use timer_setup()
  block/laptop_mode: Convert timers to use timer_setup()
  net/atm/mpc: Avoid open-coded assignment of timer callback function
  ...
2017-11-25 08:37:16 -10:00
Linus Torvalds eda5d47134 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:

 "This series is predominantly bug-fixes, with a few small improvements
  that have been outstanding over the last release cycle.

  As usual, the associated bug-fixes have CC' tags for stable.

  Also, things have been particularly quiet wrt new developments the
  last months, with most folks continuing to focus on stability atop 4.x
  stable kernels for their respective production configurations.

  Also at this point, the stable trees have been synced up with
  mainline. This will continue to be a priority, as production users
  tend to run exclusively atop stable kernels, a few releases behind
  mainline.

  The highlights include:

   - Fix PR PREEMPT_AND_ABORT null pointer dereference regression in
     v4.11+ (tangwenji)

   - Fix OOPs during removing TCMU device (Xiubo Li + Zhang Zhuoyu)

   - Add netlink command reply supported option for each device (Kenjiro
     Nakayama)

   - cxgbit: Abort the TCP connection in case of data out timeout (Varun
     Prakash)

   - Fix PR/ALUA file path truncation (David Disseldorp)

   - Fix double se_cmd completion during ->cmd_time_out (Mike Christie)

   - Fix QUEUE_FULL + SCSI task attribute handling in 4.1+ (Bryant Ly +
     nab)

   - Fix quiese during transport_write_pending_qf endless loop (nab)

   - Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK in 3.14+
     (Don White + nab)"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (35 commits)
  tcmu: Add a missing unlock on an error path
  tcmu: Fix some memory corruption
  iscsi-target: Fix non-immediate TMR reference leak
  iscsi-target: Make TASK_REASSIGN use proper se_cmd->cmd_kref
  target: Avoid early CMD_T_PRE_EXECUTE failures during ABORT_TASK
  target: Fix quiese during transport_write_pending_qf endless loop
  target: Fix caw_sem leak in transport_generic_request_failure
  target: Fix QUEUE_FULL + SCSI task attribute handling
  iSCSI-target: Use common error handling code in iscsi_decode_text_input()
  target/iscsi: Detect conn_cmd_list corruption early
  target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
  target/iscsi: Modify iscsit_do_crypto_hash_buf() prototype
  target/iscsi: Fix endianness in an error message
  target/iscsi: Use min() in iscsit_dump_data_payload() instead of open-coding it
  target/iscsi: Define OFFLOAD_BUF_SIZE once
  target: Inline transport_put_cmd()
  target: Suppress gcc 7 fallthrough warnings
  target: Move a declaration of a global variable into a header file
  tcmu: fix double se_cmd completion
  target: return SAM_STAT_TASK_SET_FULL for TCM_OUT_OF_RESOURCES
  ...
2017-11-24 19:19:20 -10:00
Kees Cook e99e88a9d2 treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
 _E->_timer@_stl.function = _callback;
|
 _E->_timer@_stl.function = &_callback;
|
 _E->_timer@_stl.function = (_cast_func)_callback;
|
 _E->_timer@_stl.function = (_cast_func)&_callback;
|
 _E._timer@_stl.function = _callback;
|
 _E._timer@_stl.function = &_callback;
|
 _E._timer@_stl.function = (_cast_func)_callback;
|
 _E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:07 -08:00
Dan Carpenter 97488c7319 tcmu: Add a missing unlock on an error path
We added a new error path here but we forgot to drop the lock first
before returning.

Fixes: 0d44374c1a ("tcmu: fix double se_cmd completion")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-08 01:42:35 -08:00
Dan Carpenter 16b9327704 tcmu: Fix some memory corruption
"udev->nl_reply_supported" is an int but on 64 bit arches we are writing
8 bytes of data to it so it corrupts four bytes beyond the end of the
struct.

Fixes: b849b45675 ("target: Add netlink command reply supported option for each device")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-08 01:42:31 -08:00
Mike Christie 0d44374c1a tcmu: fix double se_cmd completion
If cmd_time_out != 0, then tcmu_queue_cmd_ring could end up
sleeping waiting for ring space, timing out and then returning
failure to lio, and tcmu_check_expired_cmd could also detect
the timeout and call target_complete_cmd on the cmd.

This patch just delays setting up the deadline value and adding
the cmd to the udev->commands idr until we have allocated ring
space and are about to send the cmd to userspace.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-04 15:01:55 -07:00
Kenjiro Nakayama b849b45675 target: Add netlink command reply supported option for each device
Currently netlink command reply support option
(TCMU_ATTR_SUPP_KERN_CMD_REPLY) can be enabled only on module
scope. Because of that, once an application enables the netlink
command reply support, all applications using target_core_user.ko
would be expected to support the netlink reply. To make matters worse,
users will not be able to add a device via configfs manually.

To fix these issues, this patch adds an option to make netlink command
reply disabled on each device through configfs. Original
TCMU_ATTR_SUPP_KERN_CMD_REPLY is still enabled on module scope to keep
backward-compatibility and used by default, however once users set
nl_reply_supported=<NAGATIVE_VALUE> via configfs for a particular
device, the device disables the netlink command reply support.

Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-04 14:54:07 -07:00
Kenjiro Nakayama b5ab697c62 target/tcmu: Use macro to call container_of in tcmu_cmd_time_out_show
This patch makes a tiny change that using TCMU_DEV in
tcmu_cmd_time_out_show so it is consistent with other functions.

Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-04 14:52:39 -07:00
Xiubo Li c22adc0b0c tcmu: fix crash when removing the tcmu device
Before the nl REMOVE msg has been sent to the userspace, the ring's
and other resources have been released, but the userspace maybe still
using them. And then we can see the crash messages like:

ring broken, not handling completions
BUG: unable to handle kernel paging request at ffffffffffffffd0
IP: tcmu_handle_completions+0x134/0x2f0 [target_core_user]
PGD 11bdc0c067
P4D 11bdc0c067
PUD 11bdc0e067
PMD 0

Oops: 0000 [#1] SMP
cmd_id not found, ring is broken
RIP: 0010:tcmu_handle_completions+0x134/0x2f0 [target_core_user]
RSP: 0018:ffffb8a2d8983d88 EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffb8a2aaa4e000 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000220
R10: 0000000076c71401 R11: ffff8d2e76c713f0 R12: ffffb8a2aad56bc0
R13: 000000000000001c R14: ffff8d2e32c90000 R15: ffff8d2e76c713f0
FS:  00007f411ffff700(0000) GS:ffff8d1e7fdc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd0 CR3: 0000001027070000 CR4:
00000000001406e0
Call Trace:
? tcmu_irqcontrol+0x2a/0x40 [target_core_user]
? uio_write+0x7b/0xc0 [uio]
? __vfs_write+0x37/0x150
? __getnstimeofday64+0x3b/0xd0
? vfs_write+0xb2/0x1b0
? syscall_trace_enter+0x1d0/0x2b0
? SyS_write+0x55/0xc0
? do_syscall_64+0x67/0x150
? entry_SYSCALL64_slow_path+0x25/0x25
Code: 41 5d 41 5e 41 5f 5d c3 83 f8 01 0f 85 cf 01 00
00 48 8b 7d d0 e8 dd 5c 1d f3 41 0f b7 74 24 04 48 8b
7d c8 31 d2 e8 5c c7 1b f3 <48> 8b 7d d0 49 89 c7 c6 07
00 0f 1f 40 00 4d 85 ff 0f 84 82 01  RIP:
tcmu_handle_completions+0x134/0x2f0 [target_core_user]
RSP: ffffb8a2d8983d88
CR2: ffffffffffffffd0

And the crash also could happen in tcmu_page_fault and other places.

Signed-off-by: Zhang Zhuoyu <zhangzhuoyu@cmss.chinamobile.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-11-04 14:48:21 -07:00
Mark Rutland 6aa7de0591 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:01:08 +02:00
Bryant G. Ly ededd039d1 tcmu: free old string on reconfig
On initial tcmu_configure_device call the info->name would
have already been allocated and set, so on the second call
make sure to free it first.

Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-30 15:23:21 -07:00
Xiubo Li c542942cb4 tcmu: Fix possible to/from address overflow when doing the memcpy
For most case the sg->length equals to PAGE_SIZE, so this bug won't
be triggered. Otherwise this will crash the kernel, for example when
all segments' sg->length equal to 1K.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-30 15:12:32 -07:00
Xiubo Li daf78c3051 tcmu: clean up the code and with one small fix
Remove useless blank line and code and at the same time add one error
path to catch the errors.

Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-11 10:48:07 -07:00
Xiubo Li b3743c71b7 tcmu: Fix possbile memory leak / OOPs when recalculating cmd base size
For all the entries allocated from the ring cmd area, the memory is
something like the stack memory, which will always reserve the old
data, so the entry->req.iov_bidi_cnt maybe none zero.

On some environments, the crash could be reproduce very easy and some
not. The following is the crash core trace as reported by Damien:

[  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted 4.12.0-rc1+ #3
[  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104 10/28/2014
[  240.157331] task: ffff8807de4f5800 task.stack: ffffc900047dc000
[  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
[  240.167377] RSP: 0018:ffffc900047dfc68 EFLAGS: 00010202
[  240.172621] RAX: ffffc9065db85540 RBX: ffff8807f7980000 RCX: 0000000000000010
[  240.179771] RDX: 0000000000000010 RSI: ffff8807de574fe0 RDI: ffffc9065db85540
[  240.186930] RBP: ffffc900047dfd30 R08: ffff8807de41b000 R09: 0000000000000000
[  240.194088] R10: 0000000000000040 R11: ffff8807e9b726f0 R12: 00000006565726b0
[  240.201246] R13: ffffc90007612ea0 R14: 000000065657d540 R15: 0000000000000000
[  240.208397] FS:  0000000000000000(0000) GS:ffff88081fa00000(0000) knlGS:0000000000000000
[  240.216510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  240.222280] CR2: ffffc9065db85540 CR3: 0000000001c0f000 CR4: 00000000001406f0
[  240.229430] Call Trace:
[  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
[  240.235916]  ? target_check_reservation+0xcd/0x6f0
[  240.240725]  __target_execute_cmd+0x27/0xa0
[  240.244918]  target_execute_cmd+0x232/0x2c0
[  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
[  240.253499]  iscsit_execute_cmd+0x20d/0x270
[  240.257693]  iscsit_sequence_cmd+0x110/0x190
[  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
[  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
[  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
[  240.279413]  kthread+0x113/0x150
[  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
[  240.290297]  ? kthread_create_on_node+0x40/0x40
[  240.296297]  ret_from_fork+0x2e/0x40
[  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48
c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48
89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: ffffc900047dfc68
[  240.328838] CR2: ffffc9065db85540
[  240.333667] ---[ end trace b7e5354cfb54d08b ]---

To fix this, just memset all the entry memory before using it, and
also to be more readable we adjust the bidi code.

Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
		memories)
Reported-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-11 10:47:58 -07:00
Bryant G. Ly de8c5221aa tcmu: Fix dev_config_store
Currently when there is a reconfig, the uio_info->name
does not get updated to reflect the change in the dev_config
name change.

On restart tcmu-runner there will be a mismatch between
the dev_config string in uio and the tcmu structure that contains
the string. When this occurs it'll reload the one in uio
and you lose the reconfigured device path.

v2: Created a helper function for the updating of uio_info

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-09 20:57:56 -07:00
Mike Christie 406f74c20d tcmu: fix sense handling during completion
We were just copying the sense to the cmd sense_buffer and
did not implement a transport_complete or set the
SCF_TRANSPORT_TASK_SENSE, so the sense was ignored.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:46 -07:00
Xiubo Li 9d62bc0e6d tcmu: Fix flushing cmd entry dcache page
When feeding the tcmu's cmd ring, we need to flush the dcache page
for the cmd entry to make sure these kernel stores are visible to
user space mappings of that page.

For the none PAD cmd entry, this will be flushed at the end of the
tcmu_queue_cmd_ring().

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:43 -07:00
Mike Christie 9260695d65 tcmu: fix multiple uio open/close sequences
If the uio device is open and closed multiple times, the
kref count will be off due to tcmu_release getting called
multiple times for each close. This patch integrates
Wenji Tang's patch to add a kref_get on open that now
matches the kref_put done on tcmu_release and adds
a kref_put in tcmu_destroy_device to match the kref_get
done in succesful tcmu_configure_device calls.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Cc: Wenji Tang <tang.wenji@zte.com.cn>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:42 -07:00
Mike Christie 531283ff75 tcmu: drop configured check in destroy
destroy_device is only called if we have successfully run
configure_device, so drop the duplicate tcmu_dev_configured check.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:41 -07:00
Mike Christie b3af66e243 tcmu: perfom device add, del and reconfig synchronously
This makes the device add, del reconfig operations sync. It fixes
the issue where for add and reconfig, we do not know if userspace
successfully completely the operation, so we leave invalid kernel
structs or report incorrect status for the config/reconfig operations.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:39 -07:00
Mike Christie 926347061e target: break up free_device callback
With this patch free_device is now used to free what is allocated in the
alloc_device callback and destroy_device tears down the resources that are
setup in the configure_device callback.

This patch will be needed in the next patch where tcmu needs
to be able to look up the device in the destroy callback.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:37 -07:00
Mike Christie 2d76443e02 tcmu: reconfigure netlink attr changes
1. TCMU_ATTR_TYPE is too generic when it describes only the
reconfiguration type, so rename to TCMU_ATTR_RECONFIG_TYPE.

2. Only return the reconfig type when it is a
TCMU_CMD_RECONFIG_DEVICE command.

3. CONFIG_* type is not needed. We can pass the value along with an
ATTR to userspace, so it does not need to read sysfs/configfs.

4. Fix leak in tcmu_dev_path_store and rename to dev_config to
reflect it is more than just a path that can be changed.

6. Don't update kernel struct value if netlink sending fails.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: "Bryant G. Ly" <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:36 -07:00
Colin Ian King 5821783bca tcmu: make array tcmu_attrib_attrs static const
The array tcmu_attrib_attrs does not need to be in global scope, so make
it static.

Cleans up sparse warning:
"symbol 'tcmu_attrib_attrs' was not declared. Should it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:34 -07:00
Xiubo Li 07932a023a tcmu: Fix module removal due to stuck unmap_thread thread again
Because the unmap code just after the schdule() returned may take
a long time and if the kthread_stop() is fired just when in this
routine, the module removal maybe stuck too.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:33 -07:00
Bryant G. Ly 8a45885c15 tcmu: Add Type of reconfig into netlink
This patch adds more info about the attribute being changed,
so that usersapce can easily figure out what is happening.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:32 -07:00
Bryant G. Ly ee01825220 tcmu: Make dev_config configurable
This allows for userspace to change the device path after
it has been created. Thus giving the user the ability to change
the path. The use case for this is to allow for virtual optical
to have media change.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:31 -07:00
Bryant G. Ly 801fc54d5d tcmu: Make dev_size configurable via userspace
Allow tcmu backstores to be able to set the device size
after it has been configured via set attribute.

Part of support in userspace to support certain backstores
changing device size.

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:30 -07:00
Bryant G. Ly 1068be7bd4 tcmu: Add netlink for device reconfiguration
This gives tcmu the ability to handle events that can cause
reconfiguration, such as resize, path changes, write_cache, etc...

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:30 -07:00
Bryant G. Ly 9a8bb60650 tcmu: Support emulate_write_cache
This will enable the toggling of write_cache in tcmu through targetcli-fb

Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-By: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-07-06 23:11:29 -07:00
Mike Christie f3cdbe39b2 tcmu: fix crash during device removal
We currently do

tcmu_free_device ->tcmu_netlink_event(TCMU_CMD_REMOVED_DEVICE) ->
uio_unregister_device -> kfree(tcmu_dev).

The problem is that the kernel does not wait for userspace to
do the close() on the uio device before freeing the tcmu_dev.
We can then hit a race where the kernel frees the tcmu_dev before
userspace does close() and so when close() -> release -> tcmu_release
is done, we try to access a freed tcmu_dev.

This patch made over the target-pending master branch moves the freeing
of the tcmu_dev to when the last reference has been dropped.

This also fixes a leak where if tcmu_configure_device was not called on a
device we did not free udev->name which was allocated at tcmu_alloc_device time.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-23 19:50:49 -07:00
Mike Christie d906d8af28 tcmu: fix module removal due to stuck thread
We need to do a kthread_should_stop to check when kthread_stop has been
called.

This was a regression added in

b6df4b79a5
tcmu: Add global data block pool support

so not sure if you wanted to merge it in with that patch or what.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-04 20:01:41 -07:00
Xiubo Li fe25cc3479 tcmu: Recalculate the tcmu_cmd size to save cmd area memories
For the "struct tcmu_cmd_entry" in cmd area, the minimum size
will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could
fill about (sizeof(struct rsp) - sizeof(struct req)) /
sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by
default.

For most tcmu_cmds, the data block indexes allocated from the
data area will be continuous. And for the continuous blocks they
will be merged into the same region using only one iovec. For
the current code, it will always allocates the same number of
iovecs with blocks for each tcmu_cmd, and it will wastes much
memories.

For example, when the block size is 4K and the DATA_OUT buffer
size is 64K, and the regions needed is less than 5(on my
environment is almost 99.7%). The current code will allocate
about 16 iovecs, and there will be (16 - 4) * sizeof(struct
iovec) = 192 Bytes cmd area memories wasted.

Here adds two helpers to calculate the base size and full size
of the tcmu_cmd. And will recalculate them again when it make sure
how many iovs is needed before insert it to cmd area.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-02 20:39:38 -07:00
Xiubo Li b6df4b79a5 tcmu: Add global data block pool support
For each target there will be one ring, when the target number
grows larger and larger, it could eventually runs out of the
system memories.

In this patch for each target ring, currently for the cmd area
the size will be fixed to 8MB and for the data area the size
will grow from 0 to max 256K * PAGE_SIZE(1G for 4K page size).

For all the targets' data areas, they will get empty blocks
from the "global data block pool", which has limited to 512K *
PAGE_SIZE(2G for 4K page size) for now.

When the "global data block pool" has been used up, then any
target could wake up the unmap thread routine to shrink other
targets' data area memories. And the unmap thread routine will
always try to truncate the ring vma from the last using block
offset.

When user space has touched the data blocks out of tcmu_cmd
iov[], the tcmu_page_fault() will try to return one zeroed blocks.

Here we move the timeout's tcmu_handle_completions() into unmap
thread routine, that's to say when the timeout fired, it will
only do the tcmu_check_expired_cmd() and then wake up the unmap
thread to do the completions() and then try to shrink its idle
memories. Then the cmdr_lock could be a mutex and could simplify
this patch because the unmap_mapping_range() or zap_* may go to
sleep.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:58:04 -07:00
Xiubo Li 141685a391 tcmu: Add dynamic growing data area feature support
Currently for the TCMU, the ring buffer size is fixed to 64K cmd
area + 1M data area, and this will be bottlenecks for high iops.

The struct tcmu_cmd_entry {} size is fixed about 112 bytes with
iovec[N] & N <= 4, and the size of struct iovec is about 16 bytes.

If N == 0, the ratio will be sizeof(cmd entry) : sizeof(datas) ==
112Bytes : (N * 4096)Bytes = 28 : 0, no data area is need.

If 0 < N <=4, the ratio will be sizeof(cmd entry) : sizeof(datas)
== 112Bytes : (N * 4096)Bytes = 28 : (N * 1024), so the max will
be 28 : 1024.

If N > 4, the sizeof(cmd entry) will be [(N - 4) *16 + 112] bytes,
and its corresponding data size will be [N * 4096], so the ratio
of sizeof(cmd entry) : sizeof(datas) == [(N - 4) * 16 + 112)Bytes
: (N * 4096)Bytes == 4/1024 - 12/(N * 1024), so the max is about
4 : 1024.

When N is bigger, the ratio will be smaller.

As the initial patch, we will set the cmd area size to 2M, and
the cmd area size to 32M. The TCMU will dynamically grows the data
area from 0 to max 32M size as needed.

The cmd area memory will be allocated through vmalloc(), and the
data area's blocks will be allocated individually later when needed.

The allocated data area block memory will be managed via radix tree.
For now the bitmap still be the most efficient way to search and
manage the block index, this could be update later.

Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianfei Hu <hujianfei@cmss.chinamobile.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-05-01 22:57:36 -07:00
Xiubo Li a5d68ba858 tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case
For the bidirectional case, the Data-Out buffer blocks will always at
the head of the tcmu_cmd's bitmap, and before gathering the Data-In
buffer, first of all it should skip the Data-Out ones, or the device
supporting BIDI commands won't work.

Fixed: 26418649ee ("target/user: Introduce data_bitmap, replace
		data_length/data_head/data_tail")
Reported-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Cc: stable@vger.kernel.org # 4.6+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-04-02 16:18:51 -07:00
Xiubo Li abe342a5b4 tcmu: Fix wrongly calculating of the base_command_size
The t_data_nents and t_bidi_data_nents are the numbers of the
segments, but it couldn't be sure the block size equals to size
of the segment.

For the worst case, all the blocks are discontiguous and there
will need the same number of iovecs, that's to say: blocks == iovs.
So here just set the number of iovs to block count needed by tcmu
cmd.

Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-30 01:36:53 -07:00
Xiubo Li ab22d2604c tcmu: Fix possible overwrite of t_data_sg's last iov[]
If there has BIDI data, its first iov[] will overwrite the last
iov[] for se_cmd->t_data_sg.

To fix this, we can just increase the iov pointer, but this may
introuduce a new memory leakage bug: If the se_cmd->data_length
and se_cmd->t_bidi_data_sg->length are all not aligned up to the
DATA_BLOCK_SIZE, the actual length needed maybe larger than just
sum of them.

So, this could be avoided by rounding all the data lengthes up
to DATA_BLOCK_SIZE.

Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-30 01:36:52 -07:00
Nicholas Bellinger 740372b76e tcmu: Allow cmd_time_out to be set to zero (disabled)
The new cmd_time_out configfs attribute for TCMU is allowed to
be disabled, so go ahead and drop the tcmu_cmd_time_out_store()
check.

Reported-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-30 01:36:44 -07:00
Nicholas Bellinger 7d7a743543 tcmu: Convert cmd_time_out into backend device attribute
Instead of putting cmd_time_out under ../target/core/user_0/foo/control,
which has historically been used by parameters needed for initial
backend device configuration, go ahead and move cmd_time_out into
a backend device attribute.

In order to do this, tcmu_module_init() has been updated to create
a local struct configfs_attribute **tcmu_attrs, that is based upon
the existing passthrough_attrib_attrs along with the new cmd_time_out
attribute.  Once **tcm_attrs has been setup, go ahead and point
it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core.

Also following MNC's previous change, ->cmd_time_out is stored in
milliseconds but exposed via configfs in seconds.  Also, note this
patch restricts the modification of ->cmd_time_out to before +
after the TCMU device has been configured, but not while it has
active fabric exports.

Cc: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:30 -07:00
Mike Christie af980e46a2 tcmu: make cmd timeout configurable
A single daemon could implement multiple types of devices
using multuple types of real devices that may not support
restarting from crashes and/or handling tcmu timeouts. This
makes the cmd timeout configurable, so handlers that do not
support it can turn if off for now.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:27 -07:00
Mike Christie 972c7f1679 tcmu: add helper to check if dev was configured
This adds a helper to check if the dev was configured. It
will be used in the next patch to prevent updates to some
config settings after the device has been setup.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 16:32:19 -07:00
Mike Christie 2579325ca0 tcmu: return on first Opt parse failure
We only were returing failure if the last opt to be parsed failed.
This has a return failure when we first detect a failure.

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:47:25 -07:00
Mike Christie 3abaa2bfdb tcmu: allow hw_max_sectors greater than 128
tcmu hard codes the hw_max_sectors to 128 which is a litle small.
Userspace uses the max_sectors to report the optimal IO size and
some initiators perform better with larger IOs (open-iscsi seems
to do better with 256 to 512 depending on the test).

(Fix do not display hw max sectors twice - MNC)

Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-18 14:46:52 -07:00
Linus Torvalds cf393195c3 Merge branch 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax
Pull IDR rewrite from Matthew Wilcox:
 "The most significant part of the following is the patch to rewrite the
  IDR & IDA to be clients of the radix tree. But there's much more,
  including an enhancement of the IDA to be significantly more space
  efficient, an IDR & IDA test suite, some improvements to the IDR API
  (and driver changes to take advantage of those improvements), several
  improvements to the radix tree test suite and RCU annotations.

  The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree
  for most of the last cycle. Coupled with the IDR test suite, I feel
  pretty confident that any remaining bugs are quite hard to hit. 0-day
  did a great job of watching my git tree and pointing out problems; as
  it hit them, I added new test-cases to be sure not to be caught the
  same way twice"

Willy goes on to expand a bit on the IDR rewrite rationale:
 "The radix tree and the IDR use very similar data structures.

  Merging the two codebases lets us share the memory allocation pools,
  and results in a net deletion of 500 lines of code. It also opens up
  the possibility of exposing more of the features of the radix tree to
  users of the IDR (and I have some interesting patches along those
  lines waiting for 4.12)

  It also shrinks the size of the 'struct idr' from 40 bytes to 24 which
  will shrink a fair few data structures that embed an IDR"

* 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits)
  radix tree test suite: Add config option for map shift
  idr: Add missing __rcu annotations
  radix-tree: Fix __rcu annotations
  radix-tree: Add rcu_dereference and rcu_assign_pointer calls
  radix tree test suite: Run iteration tests for longer
  radix tree test suite: Fix split/join memory leaks
  radix tree test suite: Fix leaks in regression2.c
  radix tree test suite: Fix leaky tests
  radix tree test suite: Enable address sanitizer
  radix_tree_iter_resume: Fix out of bounds error
  radix-tree: Store a pointer to the root in each node
  radix-tree: Chain preallocated nodes through ->parent
  radix tree test suite: Dial down verbosity with -v
  radix tree test suite: Introduce kmalloc_verbose
  idr: Return the deleted entry from idr_remove
  radix tree test suite: Build separate binaries for some tests
  ida: Use exceptional entries for small IDAs
  ida: Move ida_bitmap to a percpu variable
  Reimplement IDR and IDA using the radix tree
  radix-tree: Add radix_tree_iter_delete
  ...
2017-02-28 20:29:41 -08:00
Dave Jiang 11bac80004 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
  Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:54 -08:00
Matthew Wilcox d3e709e63e idr: Return the deleted entry from idr_remove
It is a relatively common idiom (8 instances) to first look up an IDR
entry, and then remove it from the tree if it is found, possibly doing
further operations upon the entry afterwards.  If we change idr_remove()
to return the removed object, all of these users can save themselves a
walk of the IDR tree.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
2017-02-13 21:44:03 -05:00