Most importantly:
- do not free the recv context (rpl_context) after a successful post_recv()
- but do free the send context (c) after a failed send.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
rdma_request() should never be in charge of freeing rc.
When an error occurs:
* Either the rc buffer has been recv_post()'ed.
then kfree()'ing it certainly is a bad idea.
* Or is has not, and in that case req->rc still points to it,
hence it needs not be freed.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The current code keeps track of the number of buffers posted in the RQ,
and will prevent it from overflowing. But it does so by simply dropping
post requests (And leaking memory in the process).
When this happens there will actually be too few buffers posted, and
soon the 9P server will complain about 'RNR retry counter exceeded'
errors.
Instead, use a semaphore, and block until the RQ is ready for another
buffer to be posted.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
A well-behaved server would not send twice the reply to a request.
But if it ever happens...
This additional check prevents the kernel from leaking memory
and possibly more nasty consequences in that unlikely event.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The current value is too low to get good performance.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The current code assumes that when a request in the request array
does have a tc, it also has a rc.
This is normally true, but not always : when using RDMA, req->rc
will temporarily be set to NULL after the request has been sent.
That is usually OK though, as when the reply arrives, req->rc will be
reassigned to a sane value before the request is recycled.
But there is a catch : if the request is flushed, the reply will never
arrive, and req->rc will be NULL, but not req->tc.
This patch fixes p9_tag_alloc to take this into account.
Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Allow requests for security.* and trusted.* xattr name spaces
to pass through to server.
The new files are 99% cut and paste from fs/9p/xattr_user.c with the
namespaces changed. It has the intended effect in superficial testing.
I do not know much detail about how these namespaces are used, but passing
them through to the server, which can decide whether to handle them or not,
seems reasonable.
I want to support a use case where an ext4 file system is mounted via 9P,
then re-exported via samba to windows clients in a cluster. Windows wants
to store xattrs such as security.NTACL. This works when ext4 directly
backs samba, but not when 9P is inserted. This use case is documented here:
http://code.google.com/p/diod/issues/detail?id=95
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If the privport option is specified, the tcp transport binds local
address to a reserved port before connecting to the 9p server.
In some cases when 9P AUTH cannot be implemented, this is better than
nothing.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Continue the approach taken by commit d2b57063e4 ("IB/core: Reserve
bits in enum ib_qp_create_flags for low-level driver use") and add
reserved entries to the ib_qp_type and ib_wr_opcode enums. Low-level
drivers can then define macros to use these reserved values, giving
proper names to the macros for readability. Also add a range of
reserved flags to enum ib_send_flags.
The mlx5 IB driver uses the new additions.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
This patch adds a check in isert_rx_opcode() to ignore non TEXT + LOGOUT
opcodes when SessionType=Discovery has been negotiated.
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The return value wasn't checked by any of the callers. Assuming this is
correct behaviour, we can simplify some code by not bothering to
generate it.
nab: Add srpt_queue_data_in() + srpt_queue_tm_rsp() nops around
srpt_queue_response() void return
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Three have been checked for but were never set. Remove the dead code.
Also renumbers the remaining ones to a) get rid of the holes after the
removal and b) avoid a collision between TMR_FUNCTION_COMPLETE==0 and
the uninitialized case. If we failed to set a code, we should rather
fall into the default case then return success.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch includes the conversion of iscsi-target configfs
attributes for NetworkPortal, NodeACL, TPG, IQN and Discovery
groups to use kstrtou*() instead of simple_strtou*().
It also cleans up new-line usage during iscsi_tpg_param_store_##name
to use isspace().
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes another potential buffer overflow while processing
iscsi_node_auth input for configfs attributes in v3.11 for-next
TPG tfc_tpg_auth_cit context.
Reported-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a potential buffer overflow while processing
iscsi_node_auth input for configfs attributes within NodeACL
tfc_tpg_nacl_auth_cit context.
Signed-off-by: Joern Engel <joern@logfs.org>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds isert_handle_text_cmd() to handle incoming
ISCSI_OP_TEXT PDU processing, along with isert_put_text_rsp()
for posting ISCSI_OP_TEXT_RSP ib_send_wr response.
It copies ISCSI_OP_TEXT payload using unsolicited payload at
&iser_rx_desc->data[0] into iscsi_cmd->text_in_ptr for usage
with outgoing isert_put_text_rsp() -> iscsit_build_text_rsp()
v2 changes:
- Let iscsit_build_text_rsp() determine any extra padding
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Now that these two variables are used for REJECT payloads as well
as SCSI response sense payloads, rename them to something that
makes more sense.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add output for ib_wc.vendor_err in isert_cq_[t,r]x_work(), which
is useful for debugging future issues.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The SBC-2 specification of READ CAPACITY(10) has PMI and LOGICAL BLOCK
ADDRESS fields in the CDB; in SBC-3 these fields are simply listed as
obsolete. However, SBC-2 also has the language
If the PMI bit is set to zero and the LOGICAL BLOCK ADDRESS field
is not set to zero, the device server shall terminate the command
with CHECK CONDITION status with the sense key set to ILLEGAL
REQUEST and the additional sense code set to INVALID FIELD IN CDB.
and in fact at least the Windows SCSI compliance test checks this
behavior. Since no one following SBC-3 is going to set these fields,
we might as well include the check from SBC-2 and pass this test.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We should use TCM_ADDRESS_OUT_OF_RANGE (-> sense data LOGICAL BLOCK
ADDRESS OUT OF RANGE) for IOs past the end of a device instead of
INVALID FIELD IN CDB.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds tracepoints to the target code for commands being
received and being completed, which is quite useful for debugging
interactions with initiators. For example, one can do something like the
following to watch commands that are completing unsuccessfully:
# echo 'scsi_status!=0' > /sys/kernel/debug/tracing/events/target/target_cmd_complete/filter
# echo 1 > /sys/kernel/debug/tracing/events/target/target_cmd_complete/enable
<run command that fails>
# cat /sys/kernel/debug/tracing/trace
iscsi_trx-0-1902 [003] ...1 990185.810385: target_cmd_complete: iqn.1993-08.org.debian:01:e51ede6aacfd <- LUN 001 status CHECK CONDITION (sense len 18 / 70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00) 0x95 data_length 512 CDB 95 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 (TA:SIMPLE C:00)
(v2: Drop undefined COMPARE_AND_WRITE)
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses a bug where RDMA_CM_EVENT_DISCONNECTED may occur
before the connection shutdown has been completed by rx/tx threads,
that causes isert_free_conn() to wait indefinately on ->conn_wait.
This patch allows isert_disconnect_work code to invoke rdma_disconnect
when isert_disconnect_work() process context is started by client
session reset before isert_free_conn() code has been reached.
It also adds isert_conn->conn_mutex protection for ->state within
isert_disconnect_work(), isert_cq_comp_err() and isert_free_conn()
code, along with isert_check_state() for wait_event usage.
(v2: Add explicit iscsit_cause_connection_reinstatement call
during isert_disconnect_work() to force conn reset)
Cc: stable@vger.kernel.org # 3.10+
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
"drm/nve0-/gr: some new gpc registers can have multiple copies"
5ee86c4190 caused a regression for nvc0, because the bit indicating last
transfer has occured was no longer set, resulting in random system lockups.
Reported-by: Ronald Uitermark <ronald645@gmail.com>
Tested-by: Ronald Uitermark <ronald645@gmail.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pull slave-dmaengine updates from Vinod Koul:
"Once you have some time from extended weekend celebrations please
consider pulling the following to get:
- Various fixes and PCI driver for dw_dmac by Andy
- DT binding for imx-dma by Markus & imx-sdma by Shawn
- DT fixes for dmaengine by Lars
- jz4740 dmac driver by Lars
- and various fixes across the drivers"
What "extended weekend celebrations"? I'm in the merge window, who has
time for extended celebrations..
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (40 commits)
DMA: shdma: add DT support
DMA: shdma: shdma_chan_filter() has to be in shdma-base.h
DMA: shdma: (cosmetic) don't re-calculate a pointer
dmaengine: at_hdmac: prepare clk before calling enable
dmaengine/trivial: at_hdmac: add curly brackets to if/else expressions
dmaengine: at_hdmac: remove unsuded atc_cleanup_descriptors()
dmaengine: at_hdmac: add FIFO configuration parameter to DMA DT binding
ARM: at91: dt: add header to define at_hdmac configuration
MIPS: jz4740: Correct clock gate bit for DMA controller
MIPS: jz4740: Remove custom DMA API
MIPS: jz4740: Register jz4740 DMA device
dma: Add a jz4740 dmaengine driver
MIPS: jz4740: Acquire and enable DMA controller clock
dma: mmp_tdma: disable irq when disabling dma channel
dmaengine: PL08x: Avoid collisions with get_signal() macro
dmaengine: dw: select DW_DMAC_BIG_ENDIAN_IO automagically
dma: dw: add PCI part of the driver
dma: dw: split driver to library part and platform code
dma: move dw_dmac driver to an own directory
dw_dmac: don't check resource with devm_ioremap_resource
...
Pull first stage of __cpuinit removal from Paul Gortmaker:
"The two commits here 1) dummy out all the __cpuinit macros so that we
no longer generate such sections, and then 2) remove all the section
processing that we used to do for those sections.
This makes all the __cpuinit and friends no-ops, so that we can remove
the use cases of it at our leisure. Expect stage 2, which does the
tree wide removal sweep at the end of the merge window."
* 'cpuinit-delete' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
modpost: remove all traces of cpuinit/cpuexit sections
init.h: remove __cpuinit sections from the kernel
While doing some code inspection, I noticed that the slob constructor
method can be called with a NULL pointer. If memory is tight and slob
fails to allocate with slob_alloc() or slob_new_pages() it still calls
the ctor() method with a NULL pointer. Looking at the first ctor()
method I found, I noticed that it can not handle a NULL pointer (I'm
sure others probably can't either):
static void sighand_ctor(void *data)
{
struct sighand_struct *sighand = data;
spin_lock_init(&sighand->siglock);
init_waitqueue_head(&sighand->signalfd_wqh);
}
The solution is to only call the ctor() method if allocation succeeded.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
CPU partial support can introduce level of indeterminism that is not
wanted in certain context (like a realtime kernel). Make it
configurable.
This patch is based on Christoph Lameter's "slub: Make cpu partial slab
support configurable V2".
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
At the moment, kmalloc() isn't even listed in the kernel API
documentation (DocBook/kernel-api.html after running "make htmldocs").
Another issue is that the documentation for kmalloc_node()
refers to kcalloc()'s documentation to describe its 'flags' parameter,
while kcalloc() refered to kmalloc()'s documentation, which doesn't exist!
This patch is a proposed fix for this. It also removes the documentation
for kmalloc() in include/linux/slob_def.h which isn't included to
generate the documentation anyway. This way, kmalloc() is described
in only one place.
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.
In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
init_lock_keys() dereference beyond kmalloc_caches[26].
This leads to an unbootable system (kernel panic at initializing SLAB)
if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.
Fix this by making sure that init_lock_keys() does not dereference beyond
kmalloc_caches[26] arrays.
Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-Love.SAKURA.ne.jp>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: <stable@vger.kernel.org> [3.10.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
In free path, we don't check number of cpu_partial, so one slab can
be linked in cpu partial list even if cpu_partial is 0. To prevent this,
we should check number of cpu_partial in put_cpu_partial().
Acked-by: Christoph Lameeter <cl@linux.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Use existing interface node_nr_slabs and node_nr_objs to get
nr_slabs and nr_objs.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch remove unused nr_partials variable.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Slab have some tunables like limit, batchcount, and sharedfactor can be
tuned through function slabinfo_write. Commit (b7454ad3: mm/sl[au]b: Move
slabinfo processing to slab_common.c) uncorrectly change /proc/slabinfo
unwriteable for slab, this patch fix it by revert to original mode.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
This patch shares s_next and s_stop between slab and slub.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
The drain_freelist is called to drain slabs_free lists for cache reap,
cache shrink, memory hotplug callback etc. The tofree parameter should
be the number of slab to free instead of the number of slab objects to
free.
This patch fix the callers that pass # of objects. Make sure they pass #
of slabs.
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
$ make C=1 M=drivers/vhost
drivers/vhost/net.c:168:5: warning: symbol 'vhost_net_set_ubuf_info' was not declared. Should it be static?
drivers/vhost/net.c:194:6: warning: symbol 'vhost_net_vq_reset' was not declared. Should it be static?
drivers/vhost/scsi.c:219:6: warning: symbol 'tcm_vhost_done_inflight' was not declared. Should it be static?
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, vhost-net and vhost-scsi are sharing the vhost core code.
However, vhost-scsi shares the code by including the vhost.c file
directly.
Making vhost a separate module makes it is easier to share code with
other vhost devices.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This way, we use cmd for struct tcm_vhost_cmd and evt for struct
tcm_vhost_cmd.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It was needed when struct tcm_vhost_tpg is in tcm_vhost.h
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_net_ubuf_put_and_wait has a confusing name:
it will actually also free it's argument.
Thus since commit 1280c27f8e
"vhost-net: flush outstanding DMAs on memory change"
vhost_net_flush tries to use the argument after passing it
to vhost_net_ubuf_put_and_wait, this results
in use after free.
To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
add an new API for callers that want to free ubufs.
Acked-by: Asias He <asias@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds target_get_sess_cmd reference counting for
iscsit_handle_task_mgt_cmd(), and adds a target_put_sess_cmd()
for the failure case.
It also fixes a bug where ISCSI_OP_SCSI_TMFUNC type commands
where leaking iscsi_cmd->i_conn_node and eventually triggering
an OOPs during struct isert_conn shutdown.
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>