Commit Graph

2224 Commits

Author SHA1 Message Date
Jack Wang 630e438f04 RDMA/rtrs: Introduce head/tail wr
Introduce a tail wr that is sent as the last wr. A later patch uses it to
send the local invalidate wr after the rdma wr.

While at it, also fix a coding style issue.

Link: https://lore.kernel.org/r/20210621055340.11789-2-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 21:02:20 -03:00
Leon Romanovsky bf194997c7 RDMA: Fix kernel-doc warnings about wrong comment
Compilation with W=1 produces warnings similar to the below.

  drivers/infiniband/ulp/ipoib/ipoib_main.c:320: warning: This comment
	starts with '/**', but isn't a kernel-doc comment. Refer
	Documentation/doc-guide/kernel-doc.rst

All such occurrences were found with the following one-liner:
 git grep -A 1 "\/\*\*" drivers/infiniband/

Link: https://lore.kernel.org/r/e57d5f4ddd08b7a19934635b44d6d632841b9ba7.1623823612.git.leonro@nvidia.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com> #rtrs
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 20:32:50 -03:00
Jack Wang a95fbe2aba RDMA/rtrs: Check device max_qp_wr limit when create QP
Currently we only check device max_qp_wr limit for IO connection, but not
for service connection. We should check for both.

So save the max_qp_wr device limit in wr_limit, and use it for both IO
connections and service connections.

While at it, also remove an outdated comment.

Link: https://lore.kernel.org/r/20210614090337.29557-6-jinpu.wang@ionos.com
Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Guoqing Jiang 354462eb7f RDMA/rtrs: Rename cq_size/queue_size to cq_num/queue_num
Those variables are passed to create_cq, create_qp, rtrs_iu_alloc and
rtrs_iu_free, so the *_size names actually mean a number of units. Likewise,
cq_size means the number of CQ elements.

Also move the setting of cq_num to common path.

Link: https://lore.kernel.org/r/20210614090337.29557-5-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Md Haris Iqbal b012f0ad53 RDMA/rtrs: RDMA_RXE requires more number of WR
When using rdma_rxe, post_one_recv() returns an ENOMEM error because the
recv queue is full. This patch increases the number of WRs for the receive
queue to support all devices.

Link: https://lore.kernel.org/r/20210614090337.29557-4-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:13 -03:00
Jack Wang 0509ebfa33 RDMA/rtrs-clt: Use minimal max_send_sge when create qp
We use the device limit max_send_sge, which is suboptimal for memory usage.
We don't need that much: for the user con, 1 is enough, and for an IO con,
sess->max_segments + 1 is enough.

Link: https://lore.kernel.org/r/20210614090337.29557-3-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:12 -03:00
Jack Wang 5e91eabf66 RDMA/rtrs-srv: Set minimal max_send_wr and max_recv_wr
Currently rtrs uses coarse numbers (bigger in general) when calling
create_qp, which makes the hardware create more resources that only waste
memory with no benefit.

For max_send_wr, we don't always need the max_qp_wr size when creating a
qp, so reduce it to cq_size.

For max_recv_wr, cq_size is enough.

With the patch, when sess_queue_depth=128, per-session (2 paths) memory
consumption is reduced from 188 MB to 65 MB.

When always_invalidate is enabled, we need to send more wrs, so treat it as
a special case.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210614090337.29557-2-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-18 13:47:12 -03:00
Weihang Li a5e27fb68f RDMA/ipoib: Use refcount_t instead of atomic_t for reference counting
The refcount_t API will WARN on underflow and overflow of a reference
counter, and avoid use-after-free risks.

Link: https://lore.kernel.org/r/1622194663-2383-13-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-08 15:00:04 -03:00
Kamal Heib a3e74fb924 RDMA/ipoib: Fix warning caused by destroying non-initial netns
After commit 5ce2dced8e ("RDMA/ipoib: Set rtnl_link_ops for ipoib
interfaces"), if the IPoIB device is moved to a non-initial netns,
destroying that netns makes the device vanish instead of moving it back to
the initial netns. This happens because default_device_exit() skips
the interfaces that have rtnl_link_ops set.

Steps to reproduce:
  ip netns add foo
  ip link set mlx5_ib0 netns foo
  ip netns delete foo

WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d
 fuse
CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S      W  5.13.0-rc1+ #1
Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
Workqueue: netns cleanup_net
RIP: 0010:netdev_exit+0x3f/0x50
Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48
8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b
c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206
RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d
RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00
RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00
R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620
R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20
FS:  0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ops_exit_list.isra.9+0x36/0x70
 cleanup_net+0x234/0x390
 process_one_work+0x1cb/0x360
 ? process_one_work+0x360/0x360
 worker_thread+0x30/0x370
 ? process_one_work+0x360/0x360
 kthread+0x116/0x130
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x30

To avoid the above warning and later on the kernel panic that could happen
on shutdown due to a NULL pointer dereference, make sure to set the
netns_refund flag that was introduced by commit 3a5ca85707 ("can: dev:
Move device back to init netns on owning netns delete") to properly
restore the IPoIB interfaces to the initial netns.

Fixes: 5ce2dced8e ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces")
Link: https://lore.kernel.org/r/20210525150134.139342-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 15:20:39 -03:00
Martin K. Petersen 1ff28f229b Merge branch '5.14/scsi-result' into 5.14/scsi-staging
Include Hannes' SCSI command result rework in the staging branch.

[mkp: remove DRIVER_SENSE from mpi3mr]

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 01:37:04 -04:00
Mike Christie 9e5fe17008 scsi: iscsi: Rel ref after iscsi_lookup_endpoint()
Subsequent commits allow the kernel to do ep_disconnect. In that case we
will have to get a proper refcount on the ep so one thread does not delete
it from under another.

Link: https://lore.kernel.org/r/20210525181821.7617-7-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 01:28:20 -04:00
Mike Christie 891e2639de scsi: iscsi: Stop queueing during ep_disconnect
During ep_disconnect we have been doing iscsi_suspend_tx/queue to block new
I/O but every driver except cxgbi and iscsi_tcp can still get I/O from
__iscsi_conn_send_pdu() if we haven't called iscsi_conn_failure() before
ep_disconnect. This could happen if we were terminating the session, and
the logout timed out before it was even sent to libiscsi.

Fix the issue by adding a helper which reverses the bind_conn call that
allows new I/O to be queued. Drivers implementing ep_disconnect can use this
to make sure new I/O is not queued to them when handling the disconnect.

Link: https://lore.kernel.org/r/20210525181821.7617-3-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-06-02 01:28:19 -04:00
Hannes Reinecke 3d45cefc8e scsi: core: Drop obsolete Linux-specific SCSI status codes
Originally the SCSI subsystem used 'special' SCSI status codes,
which were the SAM-specified ones but shifted by 1.  As most drivers have
now been modified to use the SAM-specified ones, having two nearly
identical sets of definitions only causes confusion.

The Linux-specific SCSI status codes have been marked obsolete for several
years so drop them and use the SAM-specified status codes throughout.

Link: https://lore.kernel.org/r/20210427083046.31620-41-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-05-31 23:59:18 -04:00
Jack Wang 0e8558476f RDMA/rtrs: Avoid Wtautological-constant-out-of-range-compare
drivers/infiniband/ulp/rtrs/rtrs-clt.c:1786:19: warning: result of comparison of
constant 'MAX_SESS_QUEUE_DEPTH' (65536) with expression of type 'u16'
(aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare]

To fix it, limit MAX_SESS_QUEUE_DEPTH to the u16 max, which is 65535, and
drop the check in rtrs-clt, since a u16 value can never exceed that limit.

Link: https://lore.kernel.org/r/20210531122835.58329-1-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-31 15:38:08 -03:00
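The warning fixed by commit 0e8558476f above can be illustrated with a small user-space sketch. This is not the rtrs code: the two macro names and the `depth_exceeds` helper are hypothetical, and only the values 65536 and 65535 come from the commit message. It shows that no u16 value can exceed either limit, which is why the compiler flags the original comparison as always false.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the values are the ones quoted in the commit message. */
#define OLD_MAX_SESS_QUEUE_DEPTH 65536u /* limit named in the warning text */
#define NEW_MAX_SESS_QUEUE_DEPTH 65535u /* u16 max, the limit after the fix */

/*
 * Illustrative helper: widen the 16-bit depth before comparing, so the
 * comparison is well defined for any 32-bit limit.
 */
static inline int depth_exceeds(uint16_t depth, uint32_t limit)
{
	return (uint32_t)depth > limit;
}
```

Even the largest u16 value fails both comparisons, so a range check against either limit is dead code for a u16 variable.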
Gioh Kim 7ecd7e290b RDMA/rtrs-clt: Fix memory leak of not-freed sess->stats and stats->pcpu_stats
The sess->stats and sess->stats->pcpu_stats objects are freed
when the sysfs entry is removed. If something goes wrong and the
session is closed before the sysfs entry is created, the
sess->stats and sess->stats->pcpu_stats objects are not freed.

This patch frees them in three places:
1. When client uses wrong address and session creation fails.
2. When client fails to create a sysfs entry.
3. When client adds wrong address via sysfs add_path.

Fixes: 215378b838 ("RDMA/rtrs: client: sysfs interface functions")
Link: https://lore.kernel.org/r/20210528113018.52290-21-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:59 -03:00
Md Haris Iqbal 5b73b799c2 RDMA/rtrs-clt: Check if the queue_depth has changed during a reconnection
The queue_depth is a module parameter for rtrs_server. It is used on the
client side to determine the queue_depth of the request queue for the RNBD
virtual block device.

During a reconnection event for an already mapped device, in case the
rtrs_server module queue_depth has changed, fail the reconnect attempt.

Also stop further auto reconnection attempts. A manual reconnect via
sysfs has to be triggered.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210528113018.52290-20-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:59 -03:00
Jack Wang 6bb97a2c1a RDMA/rtrs-srv: Fix memory leak when having multiple sessions
Gioh noticed the memory leak below:
unreferenced object 0xffff8880acda2000 (size 2048):
  comm "kworker/4:1", pid 77, jiffies 4295062871 (age 1270.730s)
  hex dump (first 32 bytes):
    00 20 da ac 80 88 ff ff 00 20 da ac 80 88 ff ff  . ....... ......
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000e85d85b5>] rtrs_srv_rdma_cm_handler+0x8e5/0xa90 [rtrs_server]
    [<00000000e31a988a>] cma_ib_req_handler+0xdc5/0x2b50 [rdma_cm]
    [<000000000eb02c5b>] cm_process_work+0x2d/0x100 [ib_cm]
    [<00000000e1650ca9>] cm_req_handler+0x11bc/0x1c40 [ib_cm]
    [<000000009c28818b>] cm_work_handler+0xe65/0x3cf2 [ib_cm]
    [<000000002b53eaa1>] process_one_work+0x4bc/0x980
    [<00000000da3499fb>] worker_thread+0x78/0x5c0
    [<00000000167127a4>] kthread+0x191/0x1e0
    [<0000000060802104>] ret_from_fork+0x3a/0x50
unreferenced object 0xffff88806d595d90 (size 8):
  comm "kworker/4:1H", pid 131, jiffies 4295062972 (age 1269.720s)
  hex dump (first 8 bytes):
    62 6c 61 00 6b 6b 6b a5                          bla.kkk.
  backtrace:
    [<000000004447d253>] kstrdup+0x2e/0x60
    [<0000000047259793>] kobject_set_name_vargs+0x2f/0xb0
    [<00000000c2ee3bc8>] dev_set_name+0xab/0xe0
    [<000000002b6bdfb1>] rtrs_srv_create_sess_files+0x260/0x290 [rtrs_server]
    [<0000000075d87bd7>] rtrs_srv_info_req_done+0x71b/0x960 [rtrs_server]
    [<00000000ccdf1bb5>] __ib_process_cq+0x94/0x100 [ib_core]
    [<00000000cbcb60cb>] ib_cq_poll_work+0x32/0xc0 [ib_core]
    [<000000002b53eaa1>] process_one_work+0x4bc/0x980
    [<00000000da3499fb>] worker_thread+0x78/0x5c0
    [<00000000167127a4>] kthread+0x191/0x1e0
    [<0000000060802104>] ret_from_fork+0x3a/0x50
unreferenced object 0xffff88806d6bb100 (size 256):
  comm "kworker/4:1H", pid 131, jiffies 4295062972 (age 1269.720s)
  hex dump (first 32 bytes):
    00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
    ff ff ff ff ff ff ff ff 00 59 4d 86 ff ff ff ff  .........YM.....
  backtrace:
    [<00000000a18a11e4>] device_add+0x74d/0xa00
    [<00000000a915b95f>] rtrs_srv_create_sess_files.cold+0x49/0x1fe [rtrs_server]
    [<0000000075d87bd7>] rtrs_srv_info_req_done+0x71b/0x960 [rtrs_server]
    [<00000000ccdf1bb5>] __ib_process_cq+0x94/0x100 [ib_core]
    [<00000000cbcb60cb>] ib_cq_poll_work+0x32/0xc0 [ib_core]
    [<000000002b53eaa1>] process_one_work+0x4bc/0x980
    [<00000000da3499fb>] worker_thread+0x78/0x5c0
    [<00000000167127a4>] kthread+0x191/0x1e0
    [<0000000060802104>] ret_from_fork+0x3a/0x50

The problem is that we increase the device refcount with get_device in
process_info_req for each path, but only call put_device for the last path,
which leads to a memory leak.

To fix it, also call put_device when dev_ref is not 0.

Fixes: e2853c4947 ("RDMA/rtrs-srv-sysfs: fix missing put_device")
Link: https://lore.kernel.org/r/20210528113018.52290-19-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
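The unbalanced get/put pattern described in commit 6bb97a2c1a above can be sketched in user space. Everything here is a hypothetical stand-in (`struct fake_dev`, `fake_get`, `fake_put`, and the teardown helpers are not kernel APIs), and the real fix is keyed on dev_ref rather than a path count. The sketch only shows why one put for several gets leaves a reference behind.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted device; not struct device. */
struct fake_dev {
	int refs;
};

static inline void fake_get(struct fake_dev *d) { d->refs++; }
static inline void fake_put(struct fake_dev *d) { d->refs--; }

/* Each path takes one reference when it is set up. */
static inline void add_paths(struct fake_dev *d, int npaths)
{
	for (int i = 0; i < npaths; i++)
		fake_get(d);
}

/* Old teardown (simplified): only the last path drops a reference. */
static inline void remove_paths_old(struct fake_dev *d, int npaths)
{
	if (npaths > 0)
		fake_put(d);
}

/* Fixed teardown (simplified): every path drops the reference it took. */
static inline void remove_paths_fixed(struct fake_dev *d, int npaths)
{
	for (int i = 0; i < npaths; i++)
		fake_put(d);
}
```

With two paths, the old teardown leaves one reference outstanding, so the release callback would never run and the object stays allocated.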
Gioh Kim 2371c40354 RDMA/rtrs-srv: Fix memory leak of unfreed rtrs_srv_stats object
When closing a session, the rtrs_srv_stats object in the closing
session is currently freed by the kobject release. But if session creation
fails for any reason, the rtrs_srv_stats object must be freed directly
because the kobject has not been created yet.

This problem was found by kmemleak as shown below:

1. One client machine maps /dev/nullb0 with session name 'bla':
root@test1:~# echo "sessname=bla path=ip:192.168.122.190 \
device_path=/dev/nullb0" > /sys/devices/virtual/rnbd-client/ctl/map_device

2. Another machine failed to create a session with the same name 'bla':
root@test2:~# echo "sessname=bla path=ip:192.168.122.190 \
device_path=/dev/nullb1" > /sys/devices/virtual/rnbd-client/ctl/map_device
-bash: echo: write error: Connection reset by peer

3. The kmemleak on server machine reported an error:
unreferenced object 0xffff888033cdc800 (size 128):
  comm "kworker/2:1", pid 83, jiffies 4295086585 (age 2508.680s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000a72903b2>] __alloc_sess+0x1d4/0x1250 [rtrs_server]
    [<00000000d1e5321e>] rtrs_srv_rdma_cm_handler+0xc31/0xde0 [rtrs_server]
    [<00000000bb2f6e7e>] cma_ib_req_handler+0xdc5/0x2b50 [rdma_cm]
    [<00000000e896235d>] cm_process_work+0x2d/0x100 [ib_cm]
    [<00000000b6866c5f>] cm_req_handler+0x11bc/0x1c40 [ib_cm]
    [<000000005f5dd9aa>] cm_work_handler+0xe65/0x3cf2 [ib_cm]
    [<00000000610151e7>] process_one_work+0x4bc/0x980
    [<00000000541e0f77>] worker_thread+0x78/0x5c0
    [<00000000423898ca>] kthread+0x191/0x1e0
    [<000000005a24b239>] ret_from_fork+0x3a/0x50

Fixes: 39c2d639ca ("RDMA/rtrs-srv: Set .release function for rtrs srv device during device init")
Link: https://lore.kernel.org/r/20210528113018.52290-18-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Gioh Kim 07c1402729 RDMA/rtrs-srv: Duplicated session name is not allowed
If two clients try to use the same session name, rtrs-server generates a
kernel error saying that it failed to create the sysfs entry because the
filename is duplicated.

This patch adds code to check whether the same session name already exists
with a different UUID. When a client adds another session, it sends the
UUID and the session name, so it is fine if the same session name already
exists with the same UUID. The rtrs-server must fail only if the same
session name exists with a different UUID.

Link: https://lore.kernel.org/r/20210528113018.52290-17-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
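The acceptance rule from commit 07c1402729 above (same name with the same UUID is fine, same name with a different UUID must fail) can be sketched as a single predicate. The `struct sess_id` type and the `sess_conflicts` function are hypothetical illustrations, not the rtrs-server data structures, and the 16-byte UUID size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical session identity; the 16-byte UUID size is an assumption. */
struct sess_id {
	const char *name;
	unsigned char uuid[16];
};

/*
 * Return true only when the incoming session reuses an existing name
 * with a different UUID, which is the case the server must reject.
 */
static inline bool sess_conflicts(const struct sess_id *existing,
				  const struct sess_id *incoming)
{
	return strcmp(existing->name, incoming->name) == 0 &&
	       memcmp(existing->uuid, incoming->uuid,
		      sizeof(existing->uuid)) != 0;
}
```

A server-side lookup would presumably apply a check like this to every known session before accepting a new one.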
Gioh Kim 64bce1ee97 RDMA/rtrs: Do not reset hb_missed_max after re-connection
When re-connecting, the code resets hb_missed_max to 0.
Before the first re-connection, the client triggers a re-connection
when it misses the hb-ack more than 5 times. But after the first
re-connection, the client re-connects whenever it does
not get an hb-ack, because hb_missed_max is 0.

There is no need to reset hb_missed_max when re-connecting.
hb_missed_max should be kept until the session is closed.

Fixes: c0894b3ea6 ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20210528113018.52290-16-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
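The effect described in commit 64bce1ee97 above can be modeled with a small simulation. The `struct hb_state` type and both helpers are hypothetical, and the "more than hb_missed_max" comparison is an assumption based on the commit text. The `buggy` flag models the old behaviour of zeroing the threshold on re-connection.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified heartbeat state; not the rtrs structures. */
struct hb_state {
	unsigned int missed_cnt; /* consecutive heartbeats without an ack */
	unsigned int missed_max; /* threshold that triggers a re-connection */
};

/* Record one missed hb-ack; return true when a re-connection should start. */
static inline bool hb_miss(struct hb_state *hb)
{
	hb->missed_cnt++;
	return hb->missed_cnt > hb->missed_max;
}

/* Re-connection handling; 'buggy' also zeroes the threshold (old behaviour). */
static inline void hb_reconnect(struct hb_state *hb, bool buggy)
{
	hb->missed_cnt = 0;
	if (buggy)
		hb->missed_max = 0;
}
```

With a threshold of 5, the buggy variant re-connects on the very first missed ack after a re-connection, while the fixed variant again tolerates five misses.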
Jack Wang 78df092c38 RDMA/rtrs-srv: convert scnprintf to sysfs_emit
Link: https://lore.kernel.org/r/20210528113018.52290-15-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Md Haris Iqbal 0cdfb3b207 RDMA/rtrs-srv: Replace atomic_t with percpu_ref for ids_inflight
ids_inflight is used to track the inflight IOs. But using an atomic_t
variable can cause performance drops and can also become a performance
bottleneck.

This commit replaces the atomic_t with a percpu_ref structure. Its
advantage is that it doesn't check whether the reference has fallen to 0
until the user explicitly signals it to, which is done by the
percpu_ref_kill() function call. After that, the percpu_ref structure
behaves like an atomic_t and, for every put call, checks whether the
reference has fallen to 0 or not.

rtrs_srv_stats_rdma_to_str shows the count of ids_inflight as 0
so that user-mode tools are not confused.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210528113018.52290-14-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Md Haris Iqbal 41db63a7ef RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its stats
When get_next_path_min_inflight is called to select the next path, it
iterates over the list of available rtrs_clt_sess (paths). It then reads
the number of inflight IOs for that path to select one which has the least
inflight IO.

But the rtrs_clt_sess (path) may no longer be in the
connected state, because closing or error recovery paths can change the
status of the rtrs_clt_sess.

For example, if the client sent a heartbeat and did not get a
response, it would change the session status and stop IO processing.
The check added by this patch prevents accessing the broken path
and generating duplicated error messages.

It is ok if the status changes after the check because
the error recovery path does not free memory and only tries to
reconnect. It is also ok if the session is closed after the check
because closing the session changes the session status and
flushes all IO before freeing memory. If the session is being accessed for
IO processing, the closing session will wait.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210528113018.52290-13-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Guoqing Jiang 7a2e0888b0 RDMA/rtrs-clt: Remove redundant 'break'
It duplicates the very next line.

Link: https://lore.kernel.org/r/20210528113018.52290-12-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Guoqing Jiang 0aedfb695f RDMA/rtrs-srv: Kill __rtrs_srv_change_state
No need since the only user is rtrs_srv_change_state.

Link: https://lore.kernel.org/r/20210528113018.52290-11-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Guoqing Jiang b0c633c482 RDMA/rtrs-clt: Kill rtrs_clt_disconnect_from_sysfs
The function is just a wrapper of rtrs_clt_close_conns, let's call
rtrs_clt_close_conns directly.

Link: https://lore.kernel.org/r/20210528113018.52290-10-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Guoqing Jiang 5e82ac7c00 RDMA/rtrs-clt: Kill rtrs_clt_{start,stop}_hb
The two wrappers are not needed since we can call rtrs_{start,stop}_hb
directly.

Link: https://lore.kernel.org/r/20210528113018.52290-9-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Dima Stepanov 2d612f0d3d RDMA/rtrs: Use strscpy instead of strlcpy
While running checkpatch, the following warning message was found:
  WARNING:STRLCPY: Prefer strscpy over strlcpy - see:
  https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Fix it by using strscpy calls instead of strlcpy.

Link: https://lore.kernel.org/r/20210528113018.52290-8-jinpu.wang@ionos.com
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Gioh Kim 3f3d0eabc1 RDMA/rtrs: Define MIN_CHUNK_SIZE
Define MIN_CHUNK_SIZE to replace the hard-coded number.
We need 4k for metadata, so MIN_CHUNK_SIZE should be at least 8k.

Link: https://lore.kernel.org/r/20210528113018.52290-7-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Gioh Kim 3a98ea7041 RDMA/rtrs: Change MAX_SESS_QUEUE_DEPTH
Max IB immediate data size is 2^28 (MAX_IMM_PAYL_BITS)
and the minimum chunk size is 4096 (2^12).
Therefore the maximum sess_queue_depth is 65536 (2^16).

Link: https://lore.kernel.org/r/20210528113018.52290-6-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
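The arithmetic in commit 3a98ea7041 above can be checked with a short sketch. MAX_IMM_PAYL_BITS and the values 28 and 12 come from the commit message. The MIN_CHUNK_BITS macro and the `max_sess_queue_depth` helper are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IMM_PAYL_BITS 28u /* immediate payload bits, per the commit */
#define MIN_CHUNK_BITS    12u /* hypothetical name: 4096-byte chunk = 2^12 */

/*
 * Illustrative helper: the number of minimum-sized chunks that the
 * immediate payload can address is 2^(imm_bits - chunk_bits).
 */
static inline uint32_t max_sess_queue_depth(unsigned int imm_bits,
					    unsigned int chunk_bits)
{
	return (uint32_t)1 << (imm_bits - chunk_bits);
}
```

With 28 payload bits and 12 chunk bits, the shift is 16, which gives the 65536 stated in the commit message.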
Guoqing Jiang 485f2fb1a0 RDMA/rtrs-srv: Clean up the code in __rtrs_srv_change_state
There is no need to use a double switch to check the state change everywhere;
change them to "if" to reduce size.

Link: https://lore.kernel.org/r/20210528113018.52290-5-jinpu.wang@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:58 -03:00
Md Haris Iqbal 6564b11031 RDMA/rtrs-srv: Add error messages for cases when failing RDMA connection
It was difficult to find out why establishing an RDMA connection
failed. This patch adds some messages to show which function
failed and why.

Link: https://lore.kernel.org/r/20210528113018.52290-4-jinpu.wang@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:52:41 -03:00
Gioh Kim 21c6f5674b RDMA/rtrs-clt: Remove MAX_SESS_QUEUE_DEPTH from rtrs_send_sess_info
Client receives queue_depth value from server. There is no need
to use MAX_SESS_QUEUE_DEPTH value.

Link: https://lore.kernel.org/r/20210528113018.52290-3-jinpu.wang@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:43:39 -03:00
Guoqing Jiang cfbeb0b9bb RDMA/rtrs-srv: Kill reject_w_econnreset label
We can goto the reject_w_err label after initializing err with -ECONNRESET.

Link: https://lore.kernel.org/r/20210528113018.52290-2-jinpu.wang@ionos.com
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:43:38 -03:00
YueHaibing 33e8234600 RDMA/srp: Use DEVICE_ATTR_*() macros
Use DEVICE_ATTR_*() helpers instead of plain DEVICE_ATTR, which makes the
code a bit shorter and easier to read.

Link: https://lore.kernel.org/r/20210528125750.20788-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:40:29 -03:00
YueHaibing 1f8f60f35f IB/ipoib: Use DEVICE_ATTR_*() macros
Use DEVICE_ATTR_*() helper instead of plain DEVICE_ATTR, which makes the
code a bit shorter and easier to read.

Link: https://lore.kernel.org/r/20210526132753.3092-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:39:16 -03:00
Max Gurtovoy 221442ea0f IB/isert: set rdma cm afonly flag
This will allow both IPv4 and IPv6 sockets to bind a single port at the
same time. Same behaviour is implemented in NVMe/RDMA target.

Link: https://lore.kernel.org/r/20210524085225.29064-1-mgurtovoy@nvidia.com
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:24:52 -03:00
Bart Van Assche ad215aaea4 RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent
Define .init_cmd_priv and .exit_cmd_priv callback functions in struct
scsi_host_template. Set .cmd_size such that the SCSI core allocates
per-command private data. Use scsi_cmd_priv() to access that private
data. Remove the req_ring pointer from struct srp_rdma_ch since it is no
longer necessary. Convert srp_alloc_req_data() and srp_free_req_data()
into functions that initialize one instance of the SRP-private command
data. This is a micro-optimization since this patch removes several
pointer dereferences from the hot path.

Note: due to commit e73a5e8e80 ("scsi: core: Only return started
requests from scsi_host_find_tag()"), it is no longer necessary to protect
the completion path against duplicate responses.

Link: https://lore.kernel.org/r/20210524041211.9480-6-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:21:21 -03:00
Bart Van Assche 7ec2e27a3a RDMA/srp: Fix a recently introduced memory leak
Only allocate a memory registration list if it will be used and if it will
be freed.

Link: https://lore.kernel.org/r/20210524041211.9480-5-bvanassche@acm.org
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Fixes: f273ad4f8d ("RDMA/srp: Remove support for FMR memory registration")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:21:21 -03:00
Bart Van Assche c838de1af1 RDMA/srp: Add more structure size checks
Before modifying how the __packed attribute is used, add compile time
size checks for the structures that will be modified.

Link: https://lore.kernel.org/r/20210524041211.9480-3-bvanassche@acm.org
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Cc: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28 20:21:20 -03:00
Yang Li 74ec242473 IB/srpt: Remove redundant assignment to ret
Variable 'ret' is set to -ENOMEM but this value is never read as it is
overwritten with a new value later on, hence it is a redundant assignment
and can be removed.

In 'commit b79fafac70 ("target: make queue_tm_rsp() return void")'
srpt_queue_response() has been changed to return void, so after "goto
out", there is no need to return ret.

Clean up the following clang-analyzer warning:

drivers/infiniband/ulp/srpt/ib_srpt.c:2860:3: warning: Value stored to
'ret' is never read [clang-analyzer-deadcode.DeadStores]

Fixes: b99f8e4d7b ("IB/srpt: convert to the generic RDMA READ/WRITE API")
Link: https://lore.kernel.org/r/1620296105-121964-1-git-send-email-yang.lee@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-11 13:42:17 -03:00
Linus Torvalds bd313968fd block-5.13-2021-05-07

Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - dasd spelling fixes (Bhaskar)

 - Limit bio max size on multi-page bvecs to the hardware limit, to
   avoid overly large bio's (and hence latencies). Originally queued for
   the merge window, but needed a fix and was dropped from the initial
   pull (Changheun)

 - NVMe pull request (Christoph):
      - reset the bdev to ns head when failover (Daniel Wagner)
      - remove unsupported command noise (Keith Busch)
      - misc passthrough improvements (Kanchan Joshi)
      - fix controller ioctl through ns_head (Minwoo Im)
      - fix controller timeouts during reset (Tao Chiu)

 - rnbd fixes/cleanups (Gioh, Md, Dima)

 - Fix iov_iter re-expansion (yangerkun)

* tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block:
  block: reexpand iov_iter after read/write
  nvmet: remove unsupported command noise
  nvme-multipath: reset bdev to ns head when failover
  nvme-pci: fix controller reset hang when racing with nvme_timeout
  nvme: move the fabrics queue ready check routines to core
  nvme: avoid memset for passthrough requests
  nvme: add nvme_get_ns helper
  nvme: fix controller ioctl through ns_head
  bio: limit bio max size
  RDMA/rtrs: fix uninitialized symbol 'cnt'
  s390: dasd: Mundane spelling fixes
  block/rnbd: Remove all likely and unlikely
  block/rnbd-clt: Check the return value of the function rtrs_clt_query
  block/rnbd: Fix style issues
  block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
2021-05-07 11:35:12 -07:00
Gioh Kim c646790a1f RDMA/rtrs: fix uninitialized symbol 'cnt'
rtrs_clt_rdma_cq_direct returns an uninitialized value in cnt
if there is no session. This patch makes rtrs_clt_rdma_cq_direct
return a negative value so that the block layer does not try again.
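
A minimal sketch of the pattern, not the rtrs code itself: the result variable is given a negative default so that the function returns a defined value when the loop over sessions never runs. The names demo_path and demo_poll_direct, and the choice of -1, are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a session with pending completions. */
struct demo_path {
    int completions;
};

/*
 * Return the number of completions found on the first path that has any,
 * or a negative value when there is no path at all.
 */
int demo_poll_direct(const struct demo_path *paths, size_t nr_paths)
{
    int cnt = -1; /* defined result even if the loop body never executes */
    size_t i;

    for (i = 0; i < nr_paths; i++) {
        cnt = paths[i].completions;
        if (cnt)
            break;
    }
    return cnt;
}
```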

Fixes: 2958a995ed ("block/rnbd-clt: Support polling mode for IO latency optimization")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210429092741.266533-1-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-03 11:00:11 -06:00
Linus Torvalds f34b2cf178 RDMA merge window pull request
This is significantly bug fixes and general cleanups. The noteworthy new
 features are fairly small:
 
 - XRC support for HNS and improves RQ operations
 
 - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and
   qib
 
 - Quite a few general cleanups on spelling, error handling, static checker
   detections, etc
 
 - Increase the number of device ports supported beyond 255. High port
   count software switches now exist
 
 - Several bug fixes for rtrs
 
 - mlx5 Device Memory support for host controlled atomics
 
 - Report SRQ tables through to rdma-tool

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is significantly bug fixes and general cleanups. The noteworthy
  new features are fairly small:

   - XRC support for HNS and improves RQ operations

   - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
     and qib

   - Quite a few general cleanups on spelling, error handling, static
     checker detections, etc

   - Increase the number of device ports supported beyond 255. High port
     count software switches now exist

   - Several bug fixes for rtrs

   - mlx5 Device Memory support for host controlled atomics

   - Report SRQ tables through to rdma-tool"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
  IB/qib: Remove redundant assignment to ret
  RDMA/nldev: Add copy-on-fork attribute to get sys command
  RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
  RDMA/siw: Fix a use after free in siw_alloc_mr
  IB/hfi1: Remove redundant variable rcd
  RDMA/nldev: Add QP numbers to SRQ information
  RDMA/nldev: Return SRQ information
  RDMA/restrack: Add support to get resource tracking for SRQ
  RDMA/nldev: Return context information
  RDMA/core: Add CM to restrack after successful attachment to a device
  RDMA/cma: Skip device which doesn't support CM
  RDMA/rxe: Fix a bug in rxe_fill_ip_info()
  RDMA/mlx5: Expose private query port
  RDMA/mlx4: Remove an unused variable
  RDMA/mlx5: Fix type assignment for ICM DM
  IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
  RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
  RDMA/cxgb4: add missing qpid increment
  IB/ipoib: Remove unnecessary struct declaration
  RDMA/bnxt_re: Get rid of custom module reference counting
  ...
2021-05-01 09:15:05 -07:00
Linus Torvalds d72cd4ad41 SCSI misc on 20210428
This series consists of the usual driver updates (ufs, target, tcmu,
 smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).  The major core
 change is using a sbitmap instead of an atomic for queue tracking.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, target, tcmu,
  smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).

  The major core change is using a sbitmap instead of an atomic for
  queue tracking"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
  scsi: target: tcm_fc: Fix a kernel-doc header
  scsi: target: Shorten ALUA error messages
  scsi: target: Fix two format specifiers
  scsi: target: Compare explicitly with SAM_STAT_GOOD
  scsi: sd: Introduce a new local variable in sd_check_events()
  scsi: dc395x: Open-code status_byte(u8) calls
  scsi: 53c700: Open-code status_byte(u8) calls
  scsi: smartpqi: Remove unused functions
  scsi: qla4xxx: Remove an unused function
  scsi: myrs: Remove unused functions
  scsi: myrb: Remove unused functions
  scsi: mpt3sas: Fix two kernel-doc headers
  scsi: fcoe: Suppress a compiler warning
  scsi: libfc: Fix a format specifier
  scsi: aacraid: Remove an unused function
  scsi: core: Introduce enum scsi_disposition
  scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
  scsi: core: Rename scsi_softirq_done() into scsi_complete()
  scsi: core: Remove an incorrect comment
  scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
  ...
2021-04-28 17:22:10 -07:00
Linus Torvalds fc05860628 for-5.13/drivers-2021-04-27

Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - MD changes via Song:
        - raid5 POWER fix
        - raid1 failure fix
        - UAF fix for md cluster
        - mddev_find_or_alloc() clean up
        - Fix NULL pointer deref with external bitmap
        - Performance improvement for raid10 discard requests
        - Fix missing information of /proc/mdstat

 - rsxx const qualifier removal (Arnd)

 - Expose allocated brd pages (Calvin)

 - rnbd via Gioh Kim:
        - Change maintainer
        - Change domain address of maintainers' email
        - Add polling IO mode and document update
        - Fix memory leak and some bug detected by static code analysis
          tools
        - Code refactoring

 - Series of floppy cleanups/fixes (Denis)

 - s390 dasd fixes (Julian)

 - kerneldoc fixes (Lee)

 - null_blk double free (Lv)

 - null_blk virtual boundary addition (Max)

 - Remove xsysace driver (Michal)

 - umem driver removal (Davidlohr)

 - ataflop fixes (Dan)

 - Revalidate disk removal (Christoph)

 - Bounce buffer cleanups (Christoph)

 - Mark lightnvm as deprecated (Christoph)

 - mtip32xx init cleanups (Shixin)

 - Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang)

* tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits)
  async_xor: increase src_offs when dropping destination page
  drivers/block/null_blk/main: Fix a double free in null_init.
  md/raid1: properly indicate failure when ending a failed write request
  md-cluster: fix use-after-free issue when removing rdev
  nvme: introduce generic per-namespace chardev
  nvme: cleanup nvme_configure_apst
  nvme: do not try to reconfigure APST when the controller is not live
  nvme: add 'kato' sysfs attribute
  nvme: sanitize KATO setting
  nvmet: avoid queuing keep-alive timer if it is disabled
  brd: expose number of allocated pages in debugfs
  ataflop: fix off by one in ataflop_probe()
  ataflop: potential out of bounds in do_format()
  drbd: Fix fall-through warnings for Clang
  block/rnbd: Use strscpy instead of strlcpy
  block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name
  block/rnbd-clt: Remove max_segment_size
  block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes
  block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev
  Documentation/ABI/rnbd-clt: Add description for nr_poll_queues
  ...
2021-04-28 14:39:37 -07:00
Jack Wang 503438a4f2 block/rnbd-clt: Remove max_segment_size
We always map with SZ_4K, so do not need max_segment_size.

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-18-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20 08:59:04 -06:00
Gioh Kim c81cba8551 block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev
struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so
remove the argument.
The rdma_ev function pointer in rtrs_srv_ops is changed accordingly.

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-16-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20 08:59:04 -06:00
Gioh Kim 2958a995ed block/rnbd-clt: Support polling mode for IO latency optimization
RNBD can create two sets of queues, one for irq-mode and one for poll-mode.
For example, on a 4-CPU system 8 request queues are created,
4 for irq-mode and 4 for poll-mode.
If the IO has the HIPRI flag, the block layer calls the .poll function
of RNBD, and the IO is sent to the poll-mode queue.
Add an optional nr_poll_queues argument for the map_devices interface.

To support polling in RNBD, the RTRS client creates connections
for both irq-mode and direct-poll-mode.

For example, on a 4-CPU system it previously created 5 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq

After this patch, it can create 9 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
con[5:8] => DIRECT-POLL cq
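
The connection layout above can be sketched as simple index arithmetic. This is only an illustration of the numbering in this message; the names demo_con_num and demo_cq_kind_of are hypothetical and do not appear in rtrs.

```c
/* Kind of completion queue a connection index maps to in this sketch. */
enum demo_cq_kind {
    DEMO_CQ_USER_MSG,    /* con[0] */
    DEMO_CQ_SOFTIRQ,     /* con[1 .. nr_cpus] */
    DEMO_CQ_DIRECT_POLL  /* con[nr_cpus + 1 ..] */
};

/* One user-message connection, one per CPU, plus the poll connections. */
unsigned int demo_con_num(unsigned int nr_cpus, unsigned int nr_poll_queues)
{
    return 1 + nr_cpus + nr_poll_queues;
}

/* Classify a connection index according to the layout above. */
enum demo_cq_kind demo_cq_kind_of(unsigned int idx, unsigned int nr_cpus)
{
    if (idx == 0)
        return DEMO_CQ_USER_MSG;
    if (idx <= nr_cpus)
        return DEMO_CQ_SOFTIRQ;
    return DEMO_CQ_DIRECT_POLL;
}
```

With 4 CPUs and no poll queues this gives 5 connections; with 4 poll queues it gives 9, matching the two listings.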

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-14-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20 08:59:04 -06:00
Gioh Kim 9f455eeafd block/rnbd-clt: Replace {NO_WAIT,WAIT} with RTRS_PERMIT_{WAIT,NOWAIT}
They are defined with the same value and similar meaning, let's remove
one of them, then we can remove {WAIT,NOWAIT}.

Also change the type of 'wait' from 'int' to 'enum wait_type' to make
it clear.

Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-9-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-20 08:59:04 -06:00
Wan Jiabing 9480fd557b IB/ipoib: Remove unnecessary struct declaration
struct ipoib_cm_tx is defined at line 245, and the definition is
independent of the macro.  The declaration here is unnecessary. Remove it.

Link: https://lore.kernel.org/r/20210415092124.27684-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-20 09:34:39 -03:00
Manjunath Patil 338a010cb6 IB/ipoib: Improve latency in ipoib/cm connection formation
Currently IPoIB connected mode queries the device to get the pkey table
entry during connection formation. This will increase the time taken to
form the connection, especially when limited pkeys are in use.  This gets
worse when multiple connection attempts are done in parallel.

Since ipoib interfaces are locked to a single pkey, use the pkey index
that was determined at link up time instead of searching for anything.

This improved the latency from 500ms to 1ms on an internal setup.

Link: https://lore.kernel.org/r/1618338965-16717-1-git-send-email-manjunath.b.patil@oracle.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-19 14:55:46 -03:00
Jack Wang 3ccbd9333f RDMA/ipoib: Print a message if only child interface is UP
When "enhanced IPoIB" is enabled for CX-5 devices it requires the parent
device to be UP, otherwise the child devices won't work.

Thus add a debug message to give admin a hint when only the child
interface is UP but parent interface is not.

Link: https://lore.kernel.org/r/20210408093215.24023-1-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 20:06:34 -03:00
Gioh Kim 7c71f0d12e RDMA/rtrs-clt: Simplify error message
Two error messages differ only in their text but share the code that
generates the path string.

Link: https://lore.kernel.org/r/20210406123639.202899-4-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 19:51:34 -03:00
Gioh Kim 42cdc1909d RDMA/rtrs-srv: More debugging info when fail to send reply
An error message alone does not help debugging if it carries no
information about the session and connection on which the error
happened.

Link: https://lore.kernel.org/r/20210406123639.202899-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 19:51:34 -03:00
Gioh Kim 2f37b01725 RDMA/rtrs-clt: Print more info when an error happens
The client prints only the error value, which is not enough for debugging.

1. When the client receives an error from the server: the client prints not
   only the error value but also more information about the server connection.

2. When the client fails to send IO: the client gets an error from the RDMA
   layer. It also prints more information about the server connection.

Link: https://lore.kernel.org/r/20210406123639.202899-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 19:51:34 -03:00
Gioh Kim cc85392bcd RDMA/rtrs-clt: New sysfs attribute to print the latency of each path
It shows the latest latency that the client checked when sending the
heart-beat.

Link: https://lore.kernel.org/r/20210407113444.150961-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 19:44:54 -03:00
Gioh Kim dc3b66a0ce RDMA/rtrs-clt: Add a minimum latency multipath policy
This patch adds a new multipath policy: min-latency.  The client checks the
latency of each path when it sends the heart-beat, and it sends IO to the
path with the minimum latency.
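
The selection step can be sketched as follows. This is an illustration under assumptions, not the rtrs implementation: demo_path, its latency_ns and connected fields, and demo_min_latency_path are hypothetical names, and the sketch ignores locking and RCU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-path state for this sketch. */
struct demo_path {
    uint64_t latency_ns; /* latency measured at the last heart-beat */
    int connected;       /* non-zero if the path can carry IO */
};

/*
 * Return the index of the connected path with the smallest recorded
 * latency, or -1 if no path is connected.
 */
int demo_min_latency_path(const struct demo_path *paths, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!paths[i].connected)
            continue;
        if (best < 0 || paths[i].latency_ns < paths[best].latency_ns)
            best = (int)i;
    }
    return best;
}
```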

Link: https://lore.kernel.org/r/20210407113444.150961-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-13 19:44:54 -03:00
Gioh Kim 7f4a8592ff RDMA/rtrs-clt: destroy sysfs after removing session from active list
A session can be removed dynamically by sysfs interface "remove_path" that
eventually calls rtrs_clt_remove_path_from_sysfs function.  The current
rtrs_clt_remove_path_from_sysfs first removes the sysfs interfaces and
frees sess->stats object. Second it removes the session from the active
list.

Therefore some functions could access a non-connected session and the
freed sess->stats object even if they check the session status before
accessing the session.

For instance rtrs_clt_request and get_next_path_min_inflight check the
session status and try to send IO to the session.  The session status
could change while they are trying to send IO; they would not notice
the change, would update the statistics in the sess->stats object,
and would cause a use-after-free.
(see: "RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its
stats")

This patch changes the rtrs_clt_remove_path_from_sysfs to remove the
session from the active session list and then destroy the sysfs
interfaces.

Each function still should check the session status because closing or
error recovery paths can change the status.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210412084002.33582-1-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12 20:19:17 -03:00
Wang Wensheng 6bc950beff RDMA/srpt: Fix error return code in srpt_cm_req_recv()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Fixes: db7683d7de ("IB/srpt: Fix login-related race conditions")
Link: https://lore.kernel.org/r/20210408113132.87250-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12 20:18:35 -03:00
Mike Marciniszyn 042a00f93a IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev
The current rdma_netdev handling in ipoib hooks the tx_timeout handler,
but prints out a totally useless message that prevents effective debugging
especially when multiple transmit queues are being used.

Add a tx_timeout rdma_netdev hook and implement the callback in the hfi1
to print additional information.

The existing non-helpful message is avoided when the driver has presented
a callback.
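
The dispatch described above can be sketched as an optional function pointer with a generic fallback. All names here (demo_netdev, demo_tx_timeout, demo_driver_tx_timeout) are hypothetical, and the messages are placeholders rather than the text printed by ipoib or hfi1.

```c
#include <stdio.h>

/* Hypothetical device with an optional driver-provided timeout hook. */
struct demo_netdev {
    const char *name;
    void (*tx_timeout)(struct demo_netdev *dev, unsigned int txqueue);
    unsigned int generic_msgs; /* how often the fallback message was used */
    unsigned int last_txqueue; /* recorded by the example driver hook */
};

/* Example driver hook: reports which transmit queue timed out. */
void demo_driver_tx_timeout(struct demo_netdev *dev, unsigned int txqueue)
{
    dev->last_txqueue = txqueue;
    fprintf(stderr, "%s: tx timeout on queue %u\n", dev->name, txqueue);
}

/* Prefer the driver hook; print the generic message only without one. */
void demo_tx_timeout(struct demo_netdev *dev, unsigned int txqueue)
{
    if (dev->tx_timeout) {
        dev->tx_timeout(dev, txqueue);
        return;
    }
    dev->generic_msgs++;
    fprintf(stderr, "%s: transmit timeout\n", dev->name);
}
```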

Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Jack Wang 0633e23771 RDMA/rtrs-clt: Cap max_io_size
Max io size is limited by both remote buffer size and the max fr pages per
mr.
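
As a rough sketch of the cap, assuming the limit is simply the smaller of the two values: demo_max_io_size and its parameters are hypothetical, and the page size is passed in rather than fixed, since this message does not state it.

```c
#include <stddef.h>

/*
 * The largest IO is bounded both by the remote buffer size and by how much
 * memory one MR can cover (pages per MR times the mapping page size).
 */
size_t demo_max_io_size(size_t remote_buf_size, size_t max_pages_per_mr,
                        size_t page_size)
{
    size_t mr_limit = max_pages_per_mr * page_size;

    return remote_buf_size < mr_limit ? remote_buf_size : mr_limit;
}
```

For example, a 128 KiB remote buffer with 16 pages of 4 KiB per MR is capped at 64 KiB by the MR limit.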

Link: https://lore.kernel.org/r/20210325153308.1214057-20-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:59:46 -03:00
Jack Wang 11b74cbf8e RDMA/rtrs: Cleanup unused 's' variable in __alloc_sess
Link: https://lore.kernel.org/r/20210325153308.1214057-18-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:59:46 -03:00
Gioh Kim 88e2f10564 RDMA/rtrs-srv: Report temporary sessname for error message
Before receiving the session name, the error message cannot include any
information about which connection generates the error.

This patch stores the addresses of source and target in the sessname field
to show which connection generated the error. That field will be
overwritten when the session name is received from the client.

Link: https://lore.kernel.org/r/20210325153308.1214057-17-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:59:46 -03:00
Gioh Kim 8e86499e6c RDMA/rtrs: New function converting rtrs_addr to string
There is common code converting addresses of source machine and
destination machine to a string.  We already have a struct rtrs_addr to
store two addresses.  This patch introduces a new function that converts
two addresses into one string with struct rtrs_addr.
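
A userspace sketch of such a helper is shown below. It is not the function added by this patch: demo_addr_pair holds the addresses as already-formatted strings, and the "src@dst" output format is an assumption made for the example.

```c
#include <stdio.h>

/* Hypothetical pair of already-formatted source and destination addresses. */
struct demo_addr_pair {
    const char *src;
    const char *dst;
};

/*
 * Write "src@dst" into buf. Returns the length the full string needs,
 * as snprintf() does, so callers can detect truncation.
 */
int demo_addr_to_str(const struct demo_addr_pair *addr, char *buf, size_t len)
{
    return snprintf(buf, len, "%s@%s", addr->src, addr->dst);
}
```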

Link: https://lore.kernel.org/r/20210325153308.1214057-14-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:49:47 -03:00
Md Haris Iqbal 7582207b10 RDMA/rtrs-clt: Close rtrs client conn before destroying rtrs clt session files
KASAN detected the following BUG:

  BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
  Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0

  CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O      5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
  Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
  Call Trace:
   <IRQ>
   dump_stack+0x96/0xe0
   print_address_description.constprop.4+0x1f/0x300
   ? irq_work_claim+0x2e/0x50
   __kasan_report.cold.8+0x78/0x92
   ? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
   kasan_report+0x10/0x20
   rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
   rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client]
   ? lockdep_hardirqs_on+0x1a8/0x290
   ? process_io_rsp+0xb0/0xb0 [rtrs_client]
   ? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib]
   ? add_interrupt_randomness+0x1a2/0x340
   __ib_process_cq+0x97/0x100 [ib_core]
   ib_poll_handler+0x41/0xb0 [ib_core]
   irq_poll_softirq+0xe0/0x260
   __do_softirq+0x127/0x672
   irq_exit+0xd1/0xe0
   do_IRQ+0xa3/0x1d0
   common_interrupt+0xf/0xf
   </IRQ>
  RIP: 0010:cpuidle_enter_state+0xea/0x780
  Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00
  RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca
  RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298
  RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414
  RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
  R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0
   ? lockdep_hardirqs_on+0x1a8/0x290
   ? cpuidle_enter_state+0xe5/0x780
   cpuidle_enter+0x3c/0x60
   do_idle+0x2fb/0x390
   ? arch_cpu_idle_exit+0x40/0x40
   ? schedule+0x94/0x120
   cpu_startup_entry+0x19/0x1b
   start_kernel+0x5da/0x61b
   ? thread_stack_cache_init+0x6/0x6
   ? load_ucode_amd_bsp+0x6f/0xc4
   ? init_amd_microcode+0xa6/0xa6
   ? x86_family+0x5/0x20
   ? load_ucode_bsp+0x182/0x1fd
   secondary_startup_64+0xa4/0xb0

  Allocated by task 5730:
   save_stack+0x19/0x80
   __kasan_kmalloc.constprop.9+0xc1/0xd0
   kmem_cache_alloc_trace+0x15b/0x350
   alloc_sess+0xf4/0x570 [rtrs_client]
   rtrs_clt_open+0x3b4/0x780 [rtrs_client]
   find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client]
   rnbd_clt_map_device+0xd7/0xf50 [rnbd_client]
   rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client]
   kernfs_fop_write+0x141/0x240
   vfs_write+0xf3/0x280
   ksys_write+0xba/0x150
   do_syscall_64+0x68/0x270
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

  Freed by task 5822:
   save_stack+0x19/0x80
   __kasan_slab_free+0x125/0x170
   kfree+0xe7/0x3f0
   kobject_put+0xd3/0x240
   rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client]
   rtrs_clt_close+0x3c/0x80 [rtrs_client]
   close_rtrs+0x45/0x80 [rnbd_client]
   rnbd_client_exit+0x10f/0x2bd [rnbd_client]
   __x64_sys_delete_module+0x27b/0x340
   do_syscall_64+0x68/0x270
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

When rtrs_clt_close is triggered, it iterates over all the present
rtrs_clt_sess and triggers close on them. However, the call to
rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This
is incorrect since during the initialization phase we allocate
rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it.

If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it
may so happen that an inflight IO completion would trigger the function
rtrs_clt_rdma_done, which would lead to the above UAF case.

Hence close the rtrs_clt_con connections first, and then trigger the
destruction of session files.
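
The ordering can be sketched in a single-threaded userspace model: stop everything that can generate completions first, and only then free the state the completion handler touches. The names demo_sess, demo_stats, demo_rdma_done and demo_close are hypothetical, and the sketch leaves out the synchronization real kernel code needs.

```c
#include <stdlib.h>

/* Hypothetical statistics object touched by the completion handler. */
struct demo_stats {
    unsigned long completions;
};

struct demo_sess {
    struct demo_stats *stats;
    int conns_open; /* non-zero while connections can deliver completions */
};

/* Completion handler: updates statistics only while connections are open. */
void demo_rdma_done(struct demo_sess *sess)
{
    if (sess->conns_open)
        sess->stats->completions++;
}

/* Teardown in the safe order. */
void demo_close(struct demo_sess *sess)
{
    sess->conns_open = 0; /* 1. close connections, no more completions */
    free(sess->stats);    /* 2. now release what completions would touch */
    sess->stats = NULL;
}
```

Reversing the two steps in demo_close() would reproduce the bug pattern: a completion arriving between them would dereference freed memory.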

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:41:05 -03:00
Guoqing Jiang 57dae8baa6 RDMA/rtrs: Cleanup the code in rtrs_srv_rdma_cm_handler
Let these cases share the same path since all of them need to close
session.

Link: https://lore.kernel.org/r/20210325153308.1214057-11-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:40:43 -03:00
Guoqing Jiang 4cd5261df9 RDMA/rtrs: Remove sessname and sess_kobj from rtrs_attrs
The two members are not used in the code, so remove them.

Link: https://lore.kernel.org/r/20210325153308.1214057-10-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:40:28 -03:00
Guoqing Jiang 4a58ac5440 RDMA/rtrs: Kill the put label in rtrs_srv_create_once_sysfs_root_folders
We can remove the label after moving put_device to the right place.

Link: https://lore.kernel.org/r/20210325153308.1214057-9-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:40:17 -03:00
Guoqing Jiang 44930991f2 RDMA/rtrs-clt: Remove redundant code from rtrs_clt_read_req
There is no need to dereference 's' from 'sess', since we have "sess =
to_clt_sess(s)" before.

And we can dereference 'dev' from 's' earlier.

Link: https://lore.kernel.org/r/20210325153308.1214057-8-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:39:55 -03:00
Wan Jiabing 7f13e0be36 RDMA/iser: struct iscsi_iser_task is declared twice
struct iscsi_iser_task has been declared at line 201. Remove the
duplicate.

Link: https://lore.kernel.org/r/20210326113347.903976-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-30 19:53:08 -03:00
Lv Yunlong adb76a520d IB/isert: Fix a use after free in isert_connect_request
The device is obtained by isert_device_get() with a refcount of 1, and is
assigned to isert_conn by
  isert_conn->device = device.

When isert_create_qp() fails, the device is freed with
isert_device_put().

Later, the device is used in isert_free_login_buf(isert_conn) by the
isert_conn->device->ib_device statement.

Free the device in the correct order.
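
The ordering problem can be sketched in plain userspace C. This is only an
illustration under stated assumptions: struct dev, struct conn, dev_put(),
free_login_buf() and connect_error_path() are hypothetical stand-ins, not the
isert code itself. The point is that the last user of the device runs before
the reference is dropped.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted device. */
struct dev {
	int refcount;
	int released;	/* set once the last reference is dropped */
};

/* Hypothetical stand-in for a connection that borrows the device. */
struct conn {
	struct dev *device;
	int login_buf_freed;
};

/* Drop one reference; the device counts as released at zero. */
static void dev_put(struct dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		d->released = 1;
}

/* Needs a live device, like a cleanup step that reads conn->device. */
static void free_login_buf(struct conn *c)
{
	assert(!c->device->released);	/* would be a use after free */
	c->login_buf_freed = 1;
}

/* Error path in the safe order: use the device first, then put it. */
static void connect_error_path(struct conn *c)
{
	free_login_buf(c);
	dev_put(c->device);
}
```

Swapping the two calls in connect_error_path() would trip the assertion in
free_login_buf(), which is the userspace analogue of the reported bug.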

Fixes: ae9ea9ed38 ("iser-target: Split some logic in isert_connect_request to routines")
Link: https://lore.kernel.org/r/20210322161325.7491-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 14:15:40 -03:00
Mark Bloch 1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.
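
A small userspace sketch of why the type matters, assuming nothing beyond
standard C. The helper names truncate_to_u8() and uverbs_port_count() and the
macro PORT_COUNT_RESERVED are hypothetical and only illustrate the behaviour
described above: a u8 silently wraps port numbers above 255, while a u32 core
value can be clamped to 255 where the UAPI still uses u8.

```c
#include <assert.h>
#include <stdint.h>

/* The text reserves U32_MAX; this macro name is hypothetical. */
#define PORT_COUNT_RESERVED UINT32_MAX

/* What happens when a u32 port number is stored in a u8: it wraps. */
static uint8_t truncate_to_u8(uint32_t port)
{
	return (uint8_t)port;
}

/* Clamp the core's u32 port count to what a u8 interface can report. */
static uint8_t uverbs_port_count(uint32_t core_port_count)
{
	assert(core_port_count != PORT_COUNT_RESERVED);
	return core_port_count > UINT8_MAX ? UINT8_MAX
					   : (uint8_t)core_port_count;
}
```

For example, port 300 stored in a u8 becomes 44, whereas a device with 1000
ports is reported as having 255 through the clamped path.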

The verbs interface is not changed yet because the IBTA spec limits the
port size to u8 in too many places, and applications that rely on
verbs won't be able to cope with this change. At this stage, we are
extending only the interfaces that use the vendor channel.

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other
ULPs aren't affected by this change and their sysfs/interfaces that are
exposed to userspace can remain unchanged.

While here, clean up some alignment issues and remove unneeded sanity
checks (mainly in rdmavt).

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Jack Wang c33d516a1c RDMA/rtrs-clt: Use rdma_event_msg in log
A string is easier to understand than an enum value.

Link: https://lore.kernel.org/r/20210222141551.54345-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 14:50:41 -04:00
Jack Wang 3b89e92c2a RDMA/rtrs: Use new shared CQ mechanism
Have the driver use shared CQs, which provides a ~10%-20% improvement
during testing.

Instead of opening a CQ for each QP per connection, a CQ for each QP will
be provided by the RDMA core driver and shared between the QPs on that
core, reducing interrupt overhead.

Link: https://lore.kernel.org/r/20210222141551.54345-1-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 14:50:15 -04:00
Mike Christie 0869419947 scsi: target: core: Add gfp_t arg to target_cmd_init_cdb()
tcm_loop could be used like a normal block device, so we can't use
GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to
target_cmd_init_cdb() and converts the users. For every driver but loop
GFP_KERNEL is kept.

This will also be useful in subsequent patches where loop needs to do
target_submit_prep() from interrupt context to get a ref to the se_device,
and so it will need to use GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:02 -05:00
Mike Christie 50ab9c47f5 scsi: target: srpt: Convert to new submission API
target_submit_cmd_map_sgls() is being removed, so convert srpt to the new
submission API.

srpt uses target_stop_session() to sync session shutdown with LIO core, so
we use target_init_cmd()/target_submit_prep()/target_submit(), because
target_init_cmd() will detect the target_stop_session() call and return an
error.

Link: https://lore.kernel.org/r/20210227170006.5077-6-michael.christie@oracle.com
Cc: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
Jack Wang ed40852967 RDMA/rtrs-srv: Do not pass a valid pointer to PTR_ERR()
smatch gives the warning:

  drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR'

Which is trying to say smatch has shown that srv is not an error pointer
and thus cannot be passed to PTR_ERR.

The solution is to move the list_add() down after full initialization of
rtrs_srv. To avoid holding the srv_mutex too long, only hold it during the
list operation as suggested by Leon.

Fixes: 03e9b33a0f ("RDMA/rtrs: Only allow addition of path to an already established session")
Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 20:01:56 -04:00
Nicolas Morey-Chaisemartin 2b5715fc17 RDMA/srp: Fix support for unpopulated and unbalanced NUMA nodes
The current code computes a number of channels per SRP target and spreads
them equally across all online NUMA nodes.  Each channel is then assigned
a CPU within this node.

In the case of unbalanced, or even unpopulated nodes, some channels do not
get a CPU associated and thus do not get connected.  This causes the SRP
connection to fail.

This patch solves the issue by rewriting channel computation and
allocation:

- Drop channel to node/CPU association as it had no real effect on
  locality but added unnecessary complexity.

- Tweak the number of channels allocated to reduce CPU contention when
  possible:
  - Up to one channel per CPU (instead of up to 4 by node)
  - At least 4 channels per node, unless ch_count module parameter is
    used.

Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 20:01:50 -04:00
Gioh Kim e2853c4947 RDMA/rtrs-srv-sysfs: fix missing put_device
put_device() decreases the ref-count and then the device will be
cleaned up. While at it, also add the missing put_device() in
rtrs_srv_create_once_sysfs_root_folders.

This patch solves a kmemleak error as below:

  unreferenced object 0xffff88809a7a0710 (size 8):
    comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s)
    hex dump (first 8 bytes):
      62 6c 61 00 6b 6b 6b a5                          bla.kkk.
    backtrace:
      [<0000000054413611>] kstrdup+0x2e/0x60
      [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
      [<00000000f1a17a6b>] dev_set_name+0xab/0xe0
      [<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server]
      [<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server]
      [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
      [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
      [<00000000cfc376be>] process_one_work+0x4bc/0x980
      [<0000000016e5c96a>] worker_thread+0x78/0x5c0
      [<00000000c20b8be0>] kthread+0x191/0x1e0
      [<000000006c9c0003>] ret_from_fork+0x3a/0x50

Fixes: baa5b28b7a ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Link: https://lore.kernel.org/r/20210212134525.103456-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:59 -04:00
Gioh Kim f7452a7e96 RDMA/rtrs-srv: fix memory leak by missing kobject free
kmemleak reported an error as below:

  unreferenced object 0xffff8880674b7640 (size 64):
    comm "kworker/4:1H", pid 113, jiffies 4296403507 (age 507.840s)
    hex dump (first 32 bytes):
      69 70 3a 31 39 32 2e 31 36 38 2e 31 32 32 2e 31  ip:192.168.122.1
      31 30 40 69 70 3a 31 39 32 2e 31 36 38 2e 31 32  10@ip:192.168.12
    backtrace:
      [<0000000054413611>] kstrdup+0x2e/0x60
      [<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
      [<00000000ca2be3ee>] kobject_init_and_add+0xb0/0x120
      [<0000000062ba5e78>] rtrs_srv_create_sess_files+0x14c/0x314 [rtrs_server]
      [<00000000b45b7217>] rtrs_srv_info_req_done+0x5b1/0x800 [rtrs_server]
      [<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
      [<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
      [<00000000cfc376be>] process_one_work+0x4bc/0x980
      [<0000000016e5c96a>] worker_thread+0x78/0x5c0
      [<00000000c20b8be0>] kthread+0x191/0x1e0
      [<000000006c9c0003>] ret_from_fork+0x3a/0x50

It is caused by the not-freed kobject of rtrs_srv_sess.  The kobject
embedded in rtrs_srv_sess has ref-counter 2 after calling
process_info_req(). Therefore it must call kobject_put twice.  Currently
it calls kobject_put only once at rtrs_srv_destroy_sess_files because
kobject_del removes the state_in_sysfs flag and then kobject_put in
free_sess() is not called.

This patch moves kobject_del() into free_sess() so that the kobject of
rtrs_srv_sess can be freed. And also this patch adds the missing call of
sysfs_remove_group() to clean-up the sysfs directory.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:59 -04:00
Md Haris Iqbal 03e9b33a0f RDMA/rtrs: Only allow addition of path to an already established session
While adding a path from the client side to an already established
session, it was possible to provide the destination IP to a different
server. This is dangerous.

This commit adds an extra member to the rtrs_msg_conn_req structure, named
first_conn; which is supposed to notify if the connection request is the
first for that session or not.

On the server side, if a session does not exist but the first_conn
received inside the rtrs_msg_conn_req structure is 1, the connection
request is failed. This signifies that the connection request is for an
already existing session, and since the server did not find one, it is a
wrong connection request.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-3-jinpu.wang@cloud.ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:59 -04:00
Jack Wang e6daa8f61d RDMA/rtrs-srv: Fix stack-out-of-bounds
BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
  Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565

  CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G           O      5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
  Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00       09/04/2012
  Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
  Call Trace:
   dump_stack+0x96/0xe0
   print_address_description.constprop.4+0x1f/0x300
   ? irq_work_claim+0x2e/0x50
   __kasan_report.cold.8+0x78/0x92
   ? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
   kasan_report+0x10/0x20
   _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
   ? check_chain_key+0x1d7/0x2e0
   ? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib]
   ? lockdep_hardirqs_on+0x1a8/0x290
   ? stack_depot_save+0x218/0x56e
   ? do_profile_hits.isra.6.cold.13+0x1d/0x1d
   ? check_chain_key+0x1d7/0x2e0
   ? save_stack+0x4d/0x80
   ? save_stack+0x19/0x80
   ? __kasan_slab_free+0x125/0x170
   ? kfree+0xe7/0x3b0
   rdma_write_sg+0x5b0/0x950 [rtrs_server]

The problem is that when we send imm_wr, its type should be ib_rdma_wr so
that a hw driver like mlx4 can call rdma_wr(wr). Fix it by using ib_rdma_wr
as the type for imm_wr.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:59 -04:00
Christoph Lameter 633d610212 RDMA/ipoib: Remove racy Subnet Manager sendonly join checks
When a system receives a REREG event from the SM, then the SM information
in the kernel is marked as invalid and a request is sent to the SM to
update the information. The SM information is invalid in that time period.

However, receiving a REREG also occurs simultaneously in user space
applications that are now trying to rejoin the multicast groups. Some of
those may be sendonly multicast groups which are then failing.

If the SM information is invalid then ib_sa_sendonly_fullmem_support()
returns false. That is wrong because it just means that we do not know yet
if the potentially new SM supports sendonly joins.

Sendonly join was introduced in 2015 and all the Subnet managers have
supported it ever since. So there is no point in checking if a subnet
manager supports it.

Should an old opensm get a request for a sendonly join then the request
will fail. The code that is removed here accommodated that situation and
fell back to a full join.

Falling back to a full join is problematic in itself. The reason to use
the sendonly join was to reduce the traffic on the InfiniBand fabric;
otherwise one could have just stayed with the regular join.  So this patch
may cause users of very old opensms to discover that lots of traffic
needlessly crosses their IB fabrics.

Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2101281845160.13303@www.lameter.com
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Max Gurtovoy 877745b477 IB/iser: Simplify prot_caps setting
Reduce the number of instructions made for setting protection caps. No
need to do bitwise OR with 0 since we can zero the return value in the
beginning of the function.

Link: https://lore.kernel.org/r/20210111145754.56727-5-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:02:07 -04:00
Max Gurtovoy 6bd898baf2 IB/iser: Enforce iser_max_sectors to be greater than 0
A value of 0 will cause the driver to fail to establish a valid connection
to the remote target.

The following can be seen in the log in this case:

  iser: iser_connect: connecting to: 1.1.1.88:3260
  iser: iser_cma_handler: address resolved (0): status 0 conn 00000000090aa4de id 00000000167d3b5a
  iser: iser_cma_handler: route resolved  (2): status 0 conn 00000000090aa4de id 00000000167d3b5a
  iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 0
  iser: iser_create_ib_conn_res: setting conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660 max_send_wr 4619
  iser_cma_handler: established (9): status 0 conn 00000000090aa4de id 00000000167d3b5a
  iser: iser_connected_handler: remote qpn:1c7 my qpn:1c6
  iser: iser_connected_handler: conn 00000000090aa4de: negotiated remote invalidation
  iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 1
  scsi host10: iSCSI Initiator over iSER
  mlx5_core 0000:07:00.0: mlx5_cmd_check:769:(pid 616473): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f)
  iser: iser_create_fastreg_desc: Failed to allocate ib_fast_reg_mr err=-22
  iser: iser_alloc_rx_descriptors: failed allocating rx descriptors / data buffers
  iser: iscsi_iser_ep_disconnect: ep 00000000d2040785 iser conn 00000000090aa4de
  iser: iser_conn_terminate: iser_conn 00000000090aa4de state 3
  iser: iser_free_ib_conn_res: freeing conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660
  iser: iser_device_try_release: device 00000000dc871b1b refcount 0

Link: https://lore.kernel.org/r/20210111145754.56727-4-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:02:07 -04:00
Max Gurtovoy 429c76133f IB/iser: Protect iscsi_max_lun module param using callback
Remove the check from the module_init function.

Link: https://lore.kernel.org/r/20210111145754.56727-3-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:02:06 -04:00
Max Gurtovoy 5bf0e4b80b IB/iser: Remove unneeded semicolons
No need to add semicolon after closing bracket.

Link: https://lore.kernel.org/r/20210111145754.56727-2-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-19 20:02:06 -04:00
Max Gurtovoy a6dc16b699 IB/isert: Simplify signature cap check
Use if/else clause instead of "condition ? val1 : val2" to make the code
cleaner and simpler.

Link: https://lore.kernel.org/r/20210110111903.486681-3-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:59:04 -04:00
Max Gurtovoy ec53a2a654 IB/isert: Remove unneeded semicolon
No need to add semicolon after closing bracket.

Link: https://lore.kernel.org/r/20210110111903.486681-2-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:59:04 -04:00
Max Gurtovoy 8ebe0e2a7e IB/isert: Remove unneeded new lines
The Linux convention is to have only 1 new line between functions.

Link: https://lore.kernel.org/r/20210110111903.486681-1-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:59:03 -04:00
Jack Wang 7fbc3c373e RDMA/rtrs: Fix KASAN: stack-out-of-bounds bug
When KASAN is enabled, we notice warning below:
[  483.436975] ==================================================================
[  483.437234] BUG: KASAN: stack-out-of-bounds in _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.437430] Read of size 4 at addr ffff88a195fd7d30 by task kworker/1:3/6954

[  483.437731] CPU: 1 PID: 6954 Comm: kworker/1:3 Kdump: loaded Tainted: G           O      5.4.82-pserver #5.4.82-1+feature+linux+5.4.y+dbg+20201210.1532+987e7a6~deb10
[  483.437976] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[  483.438168] Workqueue: rtrs_server_wq hb_work [rtrs_core]
[  483.438323] Call Trace:
[  483.438486]  dump_stack+0x96/0xe0
[  483.438646]  ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.438802]  print_address_description.constprop.6+0x1b/0x220
[  483.438966]  ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.439133]  ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.439285]  __kasan_report.cold.9+0x1a/0x32
[  483.439444]  ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.439597]  kasan_report+0x10/0x20
[  483.439752]  _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[  483.439910]  ? update_sd_lb_stats+0xfb1/0xfc0
[  483.440073]  ? set_reg_wr+0x520/0x520 [mlx5_ib]
[  483.440222]  ? update_group_capacity+0x340/0x340
[  483.440377]  ? find_busiest_group+0x314/0x870
[  483.440526]  ? update_sd_lb_stats+0xfc0/0xfc0
[  483.440683]  ? __bitmap_and+0x6f/0x100
[  483.440832]  ? __lock_acquire+0xa2/0x2150
[  483.440979]  ? __lock_acquire+0xa2/0x2150
[  483.441128]  ? __lock_acquire+0xa2/0x2150
[  483.441279]  ? debug_lockdep_rcu_enabled+0x23/0x60
[  483.441430]  ? lock_downgrade+0x390/0x390
[  483.441582]  ? __lock_acquire+0xa2/0x2150
[  483.441729]  ? __lock_acquire+0xa2/0x2150
[  483.441876]  ? newidle_balance+0x425/0x8f0
[  483.442024]  ? __lock_acquire+0xa2/0x2150
[  483.442172]  ? debug_lockdep_rcu_enabled+0x23/0x60
[  483.442330]  hb_work+0x15d/0x1d0 [rtrs_core]
[  483.442479]  ? schedule_hb+0x50/0x50 [rtrs_core]
[  483.442627]  ? lock_downgrade+0x390/0x390
[  483.442781]  ? process_one_work+0x40d/0xa50
[  483.442931]  process_one_work+0x4ee/0xa50
[  483.443082]  ? pwq_dec_nr_in_flight+0x110/0x110
[  483.443231]  ? do_raw_spin_lock+0x119/0x1d0
[  483.443383]  worker_thread+0x65/0x5c0
[  483.443532]  ? process_one_work+0xa50/0xa50
[  483.451839]  kthread+0x1e2/0x200
[  483.451983]  ? kthread_create_on_node+0xc0/0xc0
[  483.452139]  ret_from_fork+0x3a/0x50

The problem is that we use the wrong type when sending the wr. The hw driver
expects a wr with opcode IB_WR_RDMA_WRITE_WITH_IMM to be an ib_rdma_wr and
uses container_of to access its members. The fix is simply to use ib_rdma_wr
instead of ib_send_wr.
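
A userspace sketch of the container_of pattern behind this report. The struct
layouts below are simplified, hypothetical stand-ins (struct send_wr and
struct rdma_wr are not the real ib_send_wr and ib_rdma_wr), and
driver_read_rkey() only models what a driver does for an RDMA write opcode.
It shows why the caller must allocate the larger, wrapping type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified base work request; the field set is illustrative only. */
struct send_wr {
	int opcode;
	unsigned int imm_data;
};

/* Simplified RDMA work request that embeds the base request. */
struct rdma_wr {
	struct send_wr wr;
	unsigned long long remote_addr;
	unsigned int rkey;
};

/* Same idea as the kernel macro: step back from member to container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models a driver that assumes the wr is embedded in a struct rdma_wr. */
static unsigned int driver_read_rkey(struct send_wr *wr)
{
	struct rdma_wr *rwr;

	assert(wr != NULL);
	rwr = container_of(wr, struct rdma_wr, wr);
	return rwr->rkey;	/* reads past *wr unless it is embedded */
}

/* Correct caller: allocate the full rdma_wr and pass its embedded wr. */
static unsigned int post_imm_correct(unsigned int rkey)
{
	struct rdma_wr imm_wr = { .wr = { .opcode = 1 }, .rkey = rkey };

	return driver_read_rkey(&imm_wr.wr);
}
```

If the caller instead declared a bare struct send_wr on its stack and passed
that, driver_read_rkey() would read beyond the object, which is the kind of
access KASAN flags as stack-out-of-bounds.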

Fixes: c0894b3ea6 ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-20-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:10 -04:00
Jack Wang 6f5d1b3016 RDMA/rtrs-srv: Init wr_cnt as 1
Fix up wr_avail accounting. If wr_cnt is 0, then we do SIGNAL for the first
wr, and in the completion we add queue_depth back, which is not right in the
sense of tracking available wrs.

So fix it by initializing wr_cnt to 1.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-19-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:10 -04:00
Jack Wang e8ae7ddb48 RDMA/rtrs-srv: Do not signal REG_MR
We do not need to wait for REG_MR completion, so remove the
SIGNAL flag.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-18-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Jack Wang aaed465f76 RDMA/rtrs-clt: Use bitmask to check sess->flags
We may want to add new flags, so it's better to use bitmask to check flags.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-17-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Jack Wang b38041d50a RDMA/rtrs: Do not signal for heatbeat
For HB, there is no need to generate signal for completion.

Also remove a comment accordingly.

Fixes: c0894b3ea6 ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-16-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reported-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang eab0982466 RDMA/rtrs-clt: Refactor the failure cases in alloc_clt
Make all failure cases go to the common path to avoid duplicate code.
This also fixes some issues that existed before:

1. clt needs to be freed to avoid a memory leak.

2. Return ERR_PTR(-ENOMEM) if kobject_create_and_add fails, because
   rtrs_clt_open checks the return value by calling "IS_ERR(clt)".
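
The two points above can be sketched in userspace C. The ERR_PTR(), PTR_ERR()
and IS_ERR() helpers below are simplified re-implementations of the kernel
idiom, and struct clt, alloc_clt_sketch() and open_sketch() are hypothetical
names used only for illustration, not the rtrs code. The sketch shows a single
error label that frees the object and an error pointer that the caller can
test with IS_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct clt {
	int id;
};

/* 'fail_setup' models a later setup step (e.g. a kobject) failing. */
static struct clt *alloc_clt_sketch(int fail_setup)
{
	struct clt *clt = malloc(sizeof(*clt));
	int err;

	if (!clt)
		return ERR_PTR(-ENOMEM);
	if (fail_setup) {
		err = -ENOMEM;
		goto err_free;	/* common failure path */
	}
	clt->id = 1;
	return clt;

err_free:
	free(clt);		/* point 1: do not leak clt */
	return ERR_PTR(err);	/* point 2: never return NULL here */
}

/* Caller that, like the text describes, only checks IS_ERR(). */
static long open_sketch(int fail_setup)
{
	struct clt *clt = alloc_clt_sketch(fail_setup);

	if (IS_ERR(clt))
		return PTR_ERR(clt);
	assert(clt != NULL);
	free(clt);
	return 0;
}
```

Returning NULL from the failing step would slip past the IS_ERR() check in the
caller, since NULL is not in the error-pointer range.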

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-15-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Jack Wang 8537f2de65 RDMA/rtrs-srv: Fix missing wr_cqe
We had a few places where wr_cqe was not set, which could lead to a NULL
pointer deref or GPF in the error case.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-14-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 7a8732a6f9 RDMA/rtrs-clt: Rename __rtrs_clt_change_state to rtrs_clt_change_state
Let's rename it to rtrs_clt_change_state since the previous one is
killed.

Also update the comment to make it clearer.

Link: https://lore.kernel.org/r/20201217141915.56989-13-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 11f7b3940d RDMA/rtrs-clt: Kill rtrs_clt_change_state
It is just a wrapper of rtrs_clt_change_state_get_old, and we can reuse
rtrs_clt_change_state_get_old by adding a check of whether 'old_state' is
valid or not.

Link: https://lore.kernel.org/r/20201217141915.56989-12-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 88a8c54db9 RDMA/rtrs-clt: Remove unnecessary 'goto out'
This is not needed since the label is just after the place.

Link: https://lore.kernel.org/r/20201217141915.56989-11-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 25a033f5a7 RDMA/rtrs-clt: Kill wait_for_inflight_permits
Let's wait for the inflight permits before freeing them.

Link: https://lore.kernel.org/r/20201217141915.56989-10-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 7b47b27fcb RDMA/rtrs-clt: Consolidate rtrs_clt_destroy_sysfs_root_{folder,files}
Since the two functions are called together, let's consolidate them in
a new function rtrs_clt_destroy_sysfs_root.

Link: https://lore.kernel.org/r/20201217141915.56989-9-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang 424774c9f3 RDMA/rtrs: Call kobject_put in the failure path
Per the comment of kobject_init_and_add, we need to free the memory
by calling kobject_put.

Fixes: 215378b838 ("RDMA/rtrs: client: sysfs interface functions")
Fixes: 91b11610af ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-8-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Guoqing Jiang f77c4839ee RDMA/rtrs-srv: Jump to dereg_mr label if allocate iu fails
The rtrs_iu_free is called in rtrs_iu_alloc if memory is limited, so we
don't need to free the same iu again.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-7-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:09 -04:00
Jack Wang f47e4e3e71 RDMA/rtrs-clt: Set mininum limit when create QP
Currently rtrs uses coarse numbers (bigger in general) when calling
create_qp, which leads the hardware to create more resources that only
waste memory with no benefit.

- SERVICE con,
For max_send_wr/max_recv_wr, it's 2 times SERVICE_CON_QUEUE_DEPTH + 2

- IO con
For max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1
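
The arithmetic above can be worked through with a short sketch. The value of
SERVICE_CON_QUEUE_DEPTH (512 here) and the clamping against a device limit are
assumptions made for illustration, and service_con_wr() and io_con_wr() are
hypothetical helper names rather than functions from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the text does not state it. */
#define SERVICE_CON_QUEUE_DEPTH 512u

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* SERVICE con: 2 * SERVICE_CON_QUEUE_DEPTH + 2, capped by the device. */
static uint32_t service_con_wr(uint32_t dev_max_qp_wr)
{
	return min_u32(dev_max_qp_wr, SERVICE_CON_QUEUE_DEPTH * 2 + 2);
}

/* IO con: queue_depth * 3 + 1, capped by the device. */
static uint32_t io_con_wr(uint32_t dev_max_qp_wr, uint32_t queue_depth)
{
	assert(queue_depth > 0);
	return min_u32(dev_max_qp_wr, queue_depth * 3 + 1);
}
```

With the assumed depth of 512 and a generous device limit, a service
connection asks for 1026 work requests and an IO connection with
queue_depth 512 asks for 1537.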

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-6-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:08 -04:00
Jack Wang f991fdac81 RDMA/rtrs-srv: Use sysfs_remove_file_self for disconnect
Remove self first to avoid deadlock, we don't want to
use close_work to remove sess sysfs.

Fixes: 91b11610af ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:08 -04:00
Jack Wang 99f0c38079 RDMA/rtrs-srv: Release lock before call into close_sess
In this error case, we don't need to hold the mutex to call close_sess.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:25:08 -04:00
Jack Wang 7490fd1fe8 RDMA/rtrs: Extend ibtrs_cq_qp_create
rtrs does not have the same limit for both max_send_wr and max_recv_wr.
To allow client and server to set different values, expose each as a
separate parameter of rtrs_cq_qp_create.

Also fix the type accordingly, u32 should be used instead of u16.

Fixes: c0894b3ea6 ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-15 15:24:50 -04:00
Zheng Yongjun 90eef9f712 RDMA: Convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Link: https://lore.kernel.org/r/20201214134118.4349-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134146.4456-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134218.4510-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134243.4563-1-zhengyongjun3@huawei.com
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-07 13:53:41 -04:00
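To illustrate the "Convert comma to semicolon" change above, here is a minimal
userspace sketch. The struct and function names are hypothetical and are not
taken from the patched drivers. Both versions behave the same; the comma
operator merely chains two assignments into a single statement, which is easy
to misread.

```c
/* Hypothetical example type, not from the kernel sources. */
struct cfg {
	int a;
	int b;
};

/* Before: the comma operator joins both assignments into one statement. */
static void init_with_comma(struct cfg *c)
{
	c->a = 1,
	c->b = 2;
}

/* After: two independent expression statements, as the commit prefers. */
static void init_with_semicolon(struct cfg *c)
{
	c->a = 1;
	c->b = 2;
}
```

The result is identical in both cases, so the change is a readability cleanup
rather than a behavior change.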
Gioh Kim 9aaf9a2aba block/rnbd-clt: Does not request pdu to rtrs-clt
Previously the rnbd client requested the rtrs to allocate rnbd_iu
just after the rtrs_iu. So the rnbd client passes the size of
rnbd_iu for rtrs_clt_open() and rtrs creates an array of
rnbd_iu and rtrs_iu.

For IO handling, rnbd_iu exists after the request because we pass
the size of rnbd_iu when setting the tag-set. Therefore we do not
use the rnbd_iu allocated by rtrs for IO handling.
We only use the rnbd_iu allocated by rtrs when doing session
initialization. Almost all rnbd_iu allocated by rtrs are wasted.

With this patch, the rnbd client no longer requests rnbd_iu allocation
from rtrs but allocates it itself when doing session initialization.

Also remove unused rtrs_permit_to_pdu from rtrs.

Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-16 14:56:09 -07:00
Linus Torvalds 009bd55dfc RDMA 5.11 pull request
A smaller set of patches, nothing stands out as being particularly major
 this cycle:
 
 - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw, cxgb4,
   mlx4 and mlx5
 
 - Bug fixes and polishing for the new rtrs ULP
 
 - Cleanup of uverbs checking for allowed driver operations
 
 - Use sysfs_emit all over the place
 
 - Lots of bug fixes and clarity improvements for hns
 
 - hip09 support for hns
 
 - NDR and 50/100Gb signaling rates
 
 - Remove dma_virt_ops and go back to using the IB DMA wrappers
 
 - mlx5 optimizations for contiguous DMA regions

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A smaller set of patches, nothing stands out as being particularly
  major this cycle. The biggest item would be the new HIP09 HW support
  from HNS, otherwise it was pretty quiet for new work here:

   - Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw,
     cxgb4, mlx4 and mlx5

   - Bug fixes and polishing for the new rtrs ULP

   - Cleanup of uverbs checking for allowed driver operations

   - Use sysfs_emit all over the place

   - Lots of bug fixes and clarity improvements for hns

   - hip09 support for hns

   - NDR and 50/100Gb signaling rates

   - Remove dma_virt_ops and go back to using the IB DMA wrappers

   - mlx5 optimizations for contiguous DMA regions"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits)
  RDMA/cma: Don't overwrite sgid_attr after device is released
  RDMA/mlx5: Fix MR cache memory leak
  RDMA/rxe: Use acquire/release for memory ordering
  RDMA/hns: Simplify AEQE process for different types of queue
  RDMA/hns: Fix inaccurate prints
  RDMA/hns: Fix incorrect symbol types
  RDMA/hns: Clear redundant variable initialization
  RDMA/hns: Fix coding style issues
  RDMA/hns: Remove unnecessary access right set during INIT2INIT
  RDMA/hns: WARN_ON if get a reserved sl from users
  RDMA/hns: Avoid filling sl in high 3 bits of vlan_id
  RDMA/hns: Do shift on traffic class when using RoCEv2
  RDMA/hns: Normalization the judgment of some features
  RDMA/hns: Limit the length of data copied between kernel and userspace
  RDMA/mlx4: Remove bogus dev_base_lock usage
  RDMA/uverbs: Fix incorrect variable type
  RDMA/core: Do not indicate device ready when device enablement fails
  RDMA/core: Clean up cq pool mechanism
  RDMA/core: Update kernel documentation for ib_create_named_qp()
  MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address
  ...
2020-12-16 13:42:26 -08:00
Linus Torvalds 60f7c503d9 SCSI misc on 20201216
This series consists of the usual driver updates (ufs, qla2xxx,
 smartpqi, target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of
 cleanups, a major power management rework and a load of assorted minor
 updates.  There are a few core updates (formatting fixes being the big
 one) but nothing major this cycle.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, qla2xxx, smartpqi,
  target, zfcp, fnic, mpt3sas, ibmvfc) plus a load of cleanups, a major
  power management rework and a load of assorted minor updates.

  There are a few core updates (formatting fixes being the big one) but
  nothing major this cycle"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
  scsi: mpt3sas: Update driver version to 36.100.00.00
  scsi: mpt3sas: Handle trigger page after firmware update
  scsi: mpt3sas: Add persistent MPI trigger page
  scsi: mpt3sas: Add persistent SCSI sense trigger page
  scsi: mpt3sas: Add persistent Event trigger page
  scsi: mpt3sas: Add persistent Master trigger page
  scsi: mpt3sas: Add persistent trigger pages support
  scsi: mpt3sas: Sync time periodically between driver and firmware
  scsi: qla2xxx: Update version to 10.02.00.104-k
  scsi: qla2xxx: Fix device loss on 4G and older HBAs
  scsi: qla2xxx: If fcport is undergoing deletion complete I/O with retry
  scsi: qla2xxx: Fix the call trace for flush workqueue
  scsi: qla2xxx: Fix flash update in 28XX adapters on big endian machines
  scsi: qla2xxx: Handle aborts correctly for port undergoing deletion
  scsi: qla2xxx: Fix N2N and NVMe connect retry failure
  scsi: qla2xxx: Fix FW initialization error on big endian machines
  scsi: qla2xxx: Fix crash during driver load on big endian machines
  scsi: qla2xxx: Fix compilation issue in PPC systems
  scsi: qla2xxx: Don't check for fw_started while posting NVMe command
  scsi: qla2xxx: Tear down session if FW say it is down
  ...
2020-12-16 13:34:31 -08:00
Sebastian Andrzej Siewior 0583531bb9 RDMA/iser: Remove in_interrupt() usage
iser_initialize_task_headers() uses in_interrupt() to find out if it is
safe to acquire a mutex.

in_interrupt() is deprecated as it is ill-defined and does not provide
what it suggests. Aside from that, it covers only some of the contexts in
which a mutex may not be acquired.

The following callchains exist:

iscsi_queuecommand() *locks* iscsi_session::frwd_lock
-> iscsi_prep_scsi_cmd_pdu()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()
-> iscsi_iser_task_xmit() (iscsi_transport::xmit_task)
  -> iscsi_iser_task_xmit_unsol_data()
    -> iser_send_data_out()
      -> iser_initialize_task_headers()

iscsi_data_xmit() *locks* iscsi_session::frwd_lock
-> iscsi_prep_mgmt_task()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()
-> iscsi_prep_scsi_cmd_pdu()
   -> session->tt->init_task() (iscsi_iser_task_init())
      -> iser_initialize_task_headers()

__iscsi_conn_send_pdu() caller has iscsi_session::frwd_lock
  -> iscsi_prep_mgmt_task()
     -> session->tt->init_task() (iscsi_iser_task_init())
        -> iser_initialize_task_headers()
  -> session->tt->xmit_task() (

The only callchain that comes close to being invoked in preemptible context:
iscsi_xmitworker() worker
-> iscsi_data_xmit()
   -> iscsi_xmit_task()
      -> conn->session->tt->xmit_task() (iscsi_iser_task_xmit())

In iscsi_iser_task_xmit() there is this check:
   if (!task->sc)
      return iscsi_iser_mtask_xmit(conn, task);

so it does end up in iser_initialize_task_headers() and
iser_initialize_task_headers() relies on iscsi_task::sc == NULL.

Remove conditional locking of iser_conn::state_mutex because there is no
call chain to do so. Remove the goto label and return early now that there
is no clean up needed.

Link: https://lore.kernel.org/r/20201204174256.62xfcvudndt7oufl@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@nvidia.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07 16:05:12 -04:00
Mauro Carvalho Chehab 2988ca08ba IB: Fix kernel-doc markups
Some functions have different names between their prototypes and the
kernel-doc markup.

Others need to be fixed, as kernel-doc markups should use this format:
        identifier - description

Link: https://lore.kernel.org/r/78b98c41a5a0f4c0106433d305b143028a4168b0.1606823973.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07 15:45:00 -04:00
Jack Wang d024f27de1 RDMA/ipoib: Distribute cq completion vector better
Currently ipoib chooses the cq completion vector based on the port number.
When the HCA has only one port, all interface recv queue completions are
bound to cq completion vector 0.

To better distribute the load, use the same method as __ib_alloc_cq_any to
choose the completion vector. With this change, each interface now uses a
different completion vector.

Link: https://lore.kernel.org/r/20201013074342.15867-1-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-20 16:18:59 -04:00
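The ipoib commit above spreads completion queues over vectors the way
__ib_alloc_cq_any does. A rough userspace sketch of that idea follows. It
assumes the selection is a shared counter taken modulo the smaller of the
vector count and the online CPU count; the function name and parameters are
hypothetical, and the real kernel code may differ in detail.

```c
#include <stdatomic.h>

/*
 * Pick a completion vector in round-robin fashion.
 * Assumption: the vector index is a shared counter modulo
 * min(num_comp_vectors, online_cpus). All names are hypothetical.
 */
static int pick_comp_vector(atomic_uint *counter, int num_comp_vectors,
			    int online_cpus)
{
	int limit = num_comp_vectors < online_cpus ?
		    num_comp_vectors : online_cpus;

	/* With a single usable vector there is nothing to distribute. */
	if (limit <= 1)
		return 0;

	/* Increment first, then use the new value, so callers rotate. */
	return (int)((atomic_fetch_add(counter, 1u) + 1u) % (unsigned int)limit);
}
```

With four vectors, successive callers land on vectors 1, 2, 3, 0 and so on,
instead of every single-port interface sharing vector 0.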
Jason Gunthorpe bf3b7b7ba9 Merge branch 'for-rc' into rdma.git
From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git

The rc RDMA branch is needed due to dependencies on the next patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-17 15:20:26 -04:00
Zou Wei f7a95c902b IB/isert: Do not excplicitly check == false for bool
This is not kernel style; the warning was reported by coccicheck:

./ib_isert.c:1104:12-24: WARNING: Comparison to bool

Link: https://lore.kernel.org/r/1604404674-32998-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12 12:01:37 -04:00
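The coccicheck warning in the isert commit above concerns comparing a bool
against a literal. A minimal sketch of the two forms, using hypothetical
function names rather than the actual ib_isert.c code:

```c
#include <stdbool.h>

/* Before: explicit comparison against false, flagged by coccicheck. */
static int check_before(bool ready)
{
	if (ready == false)
		return -1;
	return 0;
}

/* After: test the bool directly, as kernel style prefers. */
static int check_after(bool ready)
{
	if (!ready)
		return -1;
	return 0;
}
```

Both functions return the same value for every input; only the spelling of the
condition changes.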
Linus Torvalds 6f3f374ac0 RDMA 5.10 second rc pull request
A few more merge window regressions that didn't make rc1:
 
 - New validation in the DMA layer triggers wrong use of the DMA layer in
   rxe, siw and rdmavt
 
 - Accidental change of a hypervisor facing ABI when widening the port
   speed u8 to u16 in vmw_pvrdma
 
 - Memory leak on error unwind in SRP target

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A few more merge window regressions that didn't make rc1:

   - New validation in the DMA layer triggers wrong use of the DMA layer
     in rxe, siw and rdmavt

   - Accidental change of a hypervisor facing ABI when widening the port
     speed u8 to u16 in vmw_pvrdma

   - Memory leak on error unwind in SRP target"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/srpt: Fix typo in srpt_unregister_mad_agent docstring
  RDMA/vmw_pvrdma: Fix the active_speed and phys_state value
  IB/srpt: Fix memory leak in srpt_add_one
  RDMA: Fix software RDMA drivers for dma mapping error
2020-11-05 11:25:02 -08:00
Jason Gunthorpe 21fcdeec09 RDMA/srpt: Fix typo in srpt_unregister_mad_agent docstring
htmldocs fails with:

drivers/infiniband/ulp/srpt/ib_srpt.c:630: warning: Function parameter or member 'port_cnt' not described in 'srpt_unregister_mad_agent'

Fixes: 372a178628 ("IB/srpt: Fix memory leak in srpt_add_one")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-05 11:38:29 -04:00
Mike Christie 6f55b06f9b scsi: target: Drop sess_cmd_lock from I/O path
Drop the sess_cmd_lock by:

 - Removing the sess_cmd_list use from LIO core, because it's been
   moved to qla2xxx.

 - Removing sess_tearing_down check in the I/O path. Instead of using that
   bit and the sess_cmd_lock, we rely on the cmd_count percpu ref. To do
   this we switch to percpu_ref_kill_and_confirm/percpu_ref_tryget_live.

Link: https://lore.kernel.org/r/1604257174-4524-7-git-send-email-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-04 22:39:37 -05:00
David Disseldorp 8dd992fb67 scsi: target: Rename cmd.bad_sector to cmd.sense_info
cmd.bad_sector currently gets packed into the sense INFORMATION field for
TCM_LOGICAL_BLOCK_{GUARD,APP_TAG,REF_TAG}_CHECK_FAILED errors, which carry
an .add_sector_info flag in the sense_detail_table to ensure this.

In preparation for propagating a byte offset on COMPARE AND WRITE
TCM_MISCOMPARE_VERIFY error, rename cmd.bad_sector to cmd.sense_info and
sense_detail.add_sector_info to sense_detail.add_sense_info so that it
better reflects the sense INFORMATION field destination.

[ddiss: update previously overlooked ib_isert]

Link: https://lore.kernel.org/r/20201031233211.5207-3-ddiss@suse.de
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-04 22:02:19 -05:00
Linus Torvalds e6b0bd61a7 This pull contains a series of warning fixes from Mauro; once applied, the
number of warnings from the once-noisy docs build process is nearly zero.
 Getting to this point has required a lot of work; once there, hopefully we
 can keep things that way.
 
 I have packaged this as a separate pull because it does a fair amount of
 reaching outside of Documentation/.  The changes are all in comments and in
 code placement.  It's all been in linux-next since last week.

Merge tag 'docs-5.10-warnings' of git://git.lwn.net/linux

Pull documentation build warning fixes from Jonathan Corbet:
 "This contains a series of warning fixes from Mauro; once applied, the
  number of warnings from the once-noisy docs build process is nearly
  zero.

  Getting to this point has required a lot of work; once there,
  hopefully we can keep things that way.

  I have packaged this as a separate pull because it does a fair amount
  of reaching outside of Documentation/. The changes are all in comments
  and in code placement. It's all been in linux-next since last week"

* tag 'docs-5.10-warnings' of git://git.lwn.net/linux: (24 commits)
  docs: SafeSetID: fix a warning
  amdgpu: fix a few kernel-doc markup issues
  selftests: kselftest_harness.h: fix kernel-doc markups
  drm: amdgpu_dm: fix a typo
  gpu: docs: amdgpu.rst: get rid of wrong kernel-doc markups
  drm: amdgpu: kernel-doc: update some adev parameters
  docs: fs: api-summary.rst: get rid of kernel-doc include
  IB/srpt: docs: add a description for cq_size member
  locking/refcount: move kernel-doc markups to the proper place
  docs: lockdep-design: fix some warning issues
  MAINTAINERS: fix broken doc refs due to yaml conversion
  ice: docs fix a devlink info that broke a table
  crypto: sun8x-ce*: update entries to its documentation
  net: phy: remove kernel-doc duplication
  mm: pagemap.h: fix two kernel-doc markups
  blk-mq: docs: add kernel-doc description for a new struct member
  docs: userspace-api: add iommu.rst to the index file
  docs: hwmon: mp2975.rst: address some html build warnings
  docs: net: statistics.rst: remove a duplicated kernel-doc
  docs: kasan.rst: add two missing blank lines
  ...
2020-11-03 13:14:14 -08:00
Meir Lichtinger 235b6ac306 RDMA/ipoib: Add 50Gb and 100Gb link speeds to ethtool
The IBTA specification has new speeds, HDR and NDR, supporting signaling
rates of 50Gb and 100Gb respectively. The ethtool support of the ipoib driver
translates IB speed to signaling rate. Add translation of the HDR and NDR IB
types to the 50Gb and 100Gb ethernet speeds.

Link: https://lore.kernel.org/r/20201026132904.1338526-1-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02 15:48:56 -04:00
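The speed translation described in the ipoib ethtool commit above can be
sketched as a simple lookup. The enum and function below are hypothetical
stand-ins, not the kernel's identifiers, and only the two rates named in the
commit message are filled in.

```c
/* Hypothetical stand-ins for the kernel's IB speed identifiers. */
enum ib_speed_sketch {
	SKETCH_IB_SPEED_HDR,
	SKETCH_IB_SPEED_NDR,
	SKETCH_IB_SPEED_OTHER,
};

/* Returns the signaling rate in Mb/s, or -1 if the speed is not mapped. */
static int ib_speed_to_mbps(enum ib_speed_sketch speed)
{
	switch (speed) {
	case SKETCH_IB_SPEED_HDR:
		return 50000;	/* 50Gb, per the commit message */
	case SKETCH_IB_SPEED_NDR:
		return 100000;	/* 100Gb, per the commit message */
	default:
		return -1;	/* older speeds are omitted from this sketch */
	}
}
```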
Maor Gottlieb 372a178628 IB/srpt: Fix memory leak in srpt_add_one
Failure in srpt_refresh_port() for the second port leaves the MAD agent
registered for the first one. However, srpt_add_one() is marked as
"failed", so SRPT leaks the resources of the first port, which is
registered but neither used nor released.

Unregister the MAD agent for all ports in case of failure.

Fixes: a42d985bd5 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Link: https://lore.kernel.org/r/20201028065051.112430-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02 15:23:55 -04:00
Joe Perches e28bf1f03b RDMA: Convert various random sprintf sysfs _show uses to sysfs_emit
Manual changes for sysfs_emit as cocci scripts can't easily convert them.

Link: https://lore.kernel.org/r/ecde7791467cddb570c6f6d2c908ffbab9145cac.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30 21:03:52 -03:00
Joe Perches 45808361d4 RDMA: Manual changes for sysfs_emit and neatening
Make changes to use sysfs_emit in the RDMA code as cocci scripts cannot
be written to handle _all_ the possible variants of various sprintf family
uses in sysfs show functions.

While there, make the code more legible and update its style to be more
like the typical kernel styles.

Miscellanea:

o Use intermediate pointers for dereferences
o Add and use string lookup functions
o return early when any intermediate call fails so normal return is
  at the bottom of the function
o mlx4/mcg.c:sysfs_show_group: use scnprintf to format intermediate strings

Link: https://lore.kernel.org/r/f5c9e4c9d8dafca1b7b70bd597ee7f8f219c31c8.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-30 21:03:52 -03:00
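The two sysfs_emit commits above replace sprintf-family calls in show
functions with a helper that knows the output buffer is one page long. The
sketch below is a userspace approximation of that idea only: emit_sketch and
PAGE_SIZE_ASSUMED are hypothetical names, and the real sysfs_emit() lives in
the kernel and performs additional checks.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define PAGE_SIZE_ASSUMED 4096

/*
 * Format into a page-sized buffer and return the number of characters
 * actually stored, never more than the buffer can hold.
 */
static int emit_sketch(char *buf, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, PAGE_SIZE_ASSUMED, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;			/* formatting error */
	if (n >= PAGE_SIZE_ASSUMED)
		n = PAGE_SIZE_ASSUMED - 1;	/* output was truncated */
	return n;
}
```

The point of the conversion is that callers no longer pass a size by hand, so
a show function cannot overrun its page by getting the length wrong.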
Mauro Carvalho Chehab 1166eb3d52 IB/srpt: docs: add a description for cq_size member
Changeset c804af2c1d ("IB/srpt: use new shared CQ mechanism")
added a new member for struct srpt_rdma_ch, but didn't add the
corresponding kernel-doc markup, as reported when doing
"make htmldocs":

	./drivers/infiniband/ulp/srpt/ib_srpt.h:331: warning: Function parameter or member 'cq_size' not described in 'srpt_rdma_ch'

Add a description for it.

Fixes: c804af2c1d ("IB/srpt: use new shared CQ mechanism")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Link: https://lore.kernel.org/r/df0e5f0e866b91724299ef569a2da8115e48c0cf.1603791716.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2020-10-28 11:41:15 -06:00
Guoqing Jiang 3f4e3d962d RDMA/rtrs-clt: Remove 'addr' from rtrs_clt_add_path_to_arr
Remove the argument since it is not used in the function.

Link: https://lore.kernel.org/r/20201023074353.21946-13-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:41 -03:00
Guoqing Jiang e6ab8cf50f RDMA/rtrs: Introduce rtrs_post_send
Since the three functions share similar logic, introduce one
common function for them.

Link: https://lore.kernel.org/r/20201023074353.21946-12-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:41 -03:00
Guoqing Jiang ffea6ad133 RDMA/rtrs-srv: Kill rtrs_srv_change_state_get_old
This function isn't needed since no caller checks the old_state of sess.

Link: https://lore.kernel.org/r/20201023074353.21946-11-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:41 -03:00
Gioh Kim c3b16b67d1 RDMA/rtrs-clt: Remove duplicated code
process_info_rsp checks that sg_cnt is zero twice.

Link: https://lore.kernel.org/r/20201023074353.21946-10-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:41 -03:00
Gioh Kim 16101b60e7 RDMA/rtrs-clt: Remove duplicated switch-case handling for CM error events
The events returning the same error value are put together.

Link: https://lore.kernel.org/r/20201023074353.21946-9-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:40 -03:00
Gioh Kim 8bd372ace3 RDMA/rtrs: Remove unnecessary argument dir of rtrs_iu_free
The direction of DMA operation is already in the rtrs_iu

Link: https://lore.kernel.org/r/20201023074353.21946-8-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:40 -03:00
Guoqing Jiang 3c8483f5a4 RDMA/rtrs-srv: Fix typo
It should mean region here.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-7-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:40 -03:00
Guoqing Jiang d715ff8acb RDMA/rtrs-srv: Don't guard the whole __alloc_srv with srv_mutex
The purpose of srv_mutex is to protect srv_list, as in put_srv, so there is
no need to hold it while allocating memory for srv, which can be time
consuming.

Otherwise, on a machine with limited memory, rtrs_srv_close_work could be
blocked for a long time because the mutex is held by get_or_create_srv
while it waits for memory.

  INFO: task kworker/1:1:27478 blocked for more than 120 seconds.
        Tainted: G           O    4.14.171-1-storage #4.14.171-1.3~deb9
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  kworker/1:1     D    0 27478      2 0x80000000
  Workqueue: rtrs_server_wq rtrs_srv_close_work [rtrs_server]
  Call Trace:
   ? __schedule+0x38c/0x7e0
   schedule+0x32/0x80
   schedule_preempt_disabled+0xa/0x10
   __mutex_lock.isra.2+0x25e/0x4d0
   ? put_srv+0x44/0x100 [rtrs_server]
   put_srv+0x44/0x100 [rtrs_server]
   rtrs_srv_close_work+0x16c/0x280 [rtrs_server]
   process_one_work+0x1c5/0x3c0
   worker_thread+0x47/0x3e0
   kthread+0xfc/0x130
   ? trace_event_raw_event_workqueue_execute_start+0xa0/0xa0
   ? kthread_create_on_node+0x70/0x70
   ret_from_fork+0x1f/0x30

Move all the logic from __find_srv_and_get and __alloc_srv into
get_or_create_srv, and remove the two functions. It is then safe
for multiple processes to access the same srv since it is protected by
srv_mutex.

And since we don't want to allocate chunks with srv_mutex held, check
srv->refcount after getting srv, because the chunks may not have been
allocated yet.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-6-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:40 -03:00
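The srv_mutex commit above is about keeping slow allocations out of a critical
section. The following userspace sketch shows one generic way to do that:
look up under the lock, allocate with the lock dropped, then re-check before
publishing. It is not the rtrs patch itself; the struct layout, the integer id
and the pthread mutex are assumptions made for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a server object kept on a shared list. */
struct srv {
	int id;
	int refcount;		/* protected by srv_mutex in this sketch */
	void *chunks;		/* the expensive allocation */
	struct srv *next;
};

static pthread_mutex_t srv_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct srv *srv_list;	/* protected by srv_mutex */

/* Caller must hold srv_mutex. */
static struct srv *find_srv_locked(int id)
{
	struct srv *s;

	for (s = srv_list; s; s = s->next)
		if (s->id == id)
			return s;
	return NULL;
}

static struct srv *get_or_create_srv(int id, size_t chunk_bytes)
{
	struct srv *srv, *fresh;

	/* Fast path: the object already exists, just take a reference. */
	pthread_mutex_lock(&srv_mutex);
	srv = find_srv_locked(id);
	if (srv)
		srv->refcount++;
	pthread_mutex_unlock(&srv_mutex);
	if (srv)
		return srv;

	/* Slow allocations happen without the lock held. */
	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;
	fresh->chunks = malloc(chunk_bytes);
	if (!fresh->chunks) {
		free(fresh);
		return NULL;
	}
	fresh->id = id;
	fresh->refcount = 1;

	/* Re-check: another thread may have inserted the same id meanwhile. */
	pthread_mutex_lock(&srv_mutex);
	srv = find_srv_locked(id);
	if (srv) {
		srv->refcount++;
	} else {
		fresh->next = srv_list;
		srv_list = fresh;
		srv = fresh;
		fresh = NULL;
	}
	pthread_mutex_unlock(&srv_mutex);

	/* We lost the race: discard the unused copy. */
	if (fresh) {
		free(fresh->chunks);
		free(fresh);
	}
	return srv;
}
```

The trade-off is an occasional wasted allocation when two callers race, in
exchange for never blocking other lock users on the allocator.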
Gioh Kim f553e7601d RDMA/rtrs-clt: Missing error from rtrs_rdma_conn_established
When rtrs_rdma_conn_established returns an error (non-zero value), the error
value is stored in con->cm_err and it cannot trigger
rtrs_rdma_error_recovery. As a result, the error from
rtrs_rdma_conn_established is lost.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:40 -03:00
Jack Wang fcf2959da6 RDMA/rtrs-clt: Avoid run destroy_con_cq_qp/create_con_cq_qp in parallel
Two kworkers can race with each other:

        CPU0                             CPU1
    addr_resolver kworker           reconnect kworker
    rtrs_clt_rdma_cm_handler
    rtrs_rdma_addr_resolved
    create_con_cq_qp: s.dev_ref++
    "s.dev_ref is 1"
                                    wait in create_cm fails with TIMEOUT
                                    destroy_con_cq_qp: --s.dev_ref
                                    "s.dev_ref is 0"
                                    destroy_con_cq_qp: sess->s.dev = NULL
     rtrs_cq_qp_create -> create_qp(con, sess->dev->ib_pd...)
    sess->dev is NULL, panic.

To fix the problem, use a mutex to serialize create_con_cq_qp and
destroy_con_cq_qp.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:39 -03:00
Jack Wang 73385fdbc4 RDMA/rtrs-clt: Remove outdated comment in create_con_cq_qp
Running destroy_con_cq_qp multiple times doesn't work, so remove the comment.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-3-jinpu.wang@cloud.ionos.com
Suggested-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:39 -03:00
Danil Kipnis 2b3062e4d9 RDMA/rtrs-clt: Remove destroy_con_cq_qp in case route resolving failed
We call destroy_con_cq_qp(con) in rtrs_rdma_addr_resolved() in case route
couldn't be resolved and then again in create_cm() because nothing
happens.

Don't call destroy_con_cq_qp from rtrs_rdma_addr_resolved, create_cm()
does the clean up already.

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:17:39 -03:00
Max Gurtovoy dae7a75f1f IB/isert: add module param to set sg_tablesize for IO cmd
Currently, the iser target supports a max IO size of 16MiB by default. For some
adapters, allocating this amount of resources might reduce the total
number of possible connections that can be created. For those adapters,
it's preferred to reduce the max IO size to be able to create more
connections. Since there is no handshake procedure for max IO size in iser
protocol, set the default max IO size to 1MiB and add a module parameter
for enabling the option to control it for suitable adapters.

Fixes: 317000b926 ("IB/isert: allocate RW ctxs according to max IO size")
Link: https://lore.kernel.org/r/20201019094628.17202-1-mgurtovoy@nvidia.com
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 13:00:49 -03:00
Jason Gunthorpe 071ba4cc55 RDMA: Add rdma_connect_locked()
There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
handler triggers a completion and another thread does rdma_connect() or
the handler directly calls rdma_connect().

In all cases rdma_connect() needs to hold the handler_mutex, but when
handlers are invoked this is already held by the core code. This causes
ULPs using the 2nd method to deadlock.

Provide a rdma_connect_locked() and have all ULPs call it from their
handlers.

Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Fixes: 2a7cec5381 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-28 09:14:49 -03:00
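The rdma_connect_locked() commit above follows a common locking convention:
one entry point takes the lock itself, and a "_locked" variant is for callers
that already hold it. A userspace sketch of the convention, with a
hypothetical struct conn and pthread mutex standing in for the RDMA CM types:

```c
#include <pthread.h>

/* Hypothetical connection object; not the kernel's rdma_cm_id. */
struct conn {
	pthread_mutex_t handler_mutex;
	int connected;
};

/* Caller must already hold handler_mutex. */
static int conn_connect_locked(struct conn *c)
{
	c->connected = 1;
	return 0;
}

/* For callers outside the handler: take the mutex, then do the work. */
static int conn_connect(struct conn *c)
{
	int ret;

	pthread_mutex_lock(&c->handler_mutex);
	ret = conn_connect_locked(c);
	pthread_mutex_unlock(&c->handler_mutex);
	return ret;
}

/*
 * Event handler, invoked by the core with handler_mutex held.
 * Calling conn_connect() here would try to take the mutex a second
 * time and deadlock, so the handler uses the _locked variant.
 */
static int on_route_resolved(struct conn *c)
{
	return conn_connect_locked(c);
}
```

A thread that waits for a completion and connects later would call
conn_connect(), which matches the first flow described in the commit message.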
Joe Perches 3c6bff3cf9 RDMA: Convert sysfs kobject * show functions to use sysfs_emit()
Done with cocci script:

@@
identifier k_show;
identifier arg1, arg2, arg3;
@@
ssize_t k_show(struct kobject *
-	arg1
+	kobj
	, struct kobj_attribute *
-	arg2
+	attr
	, char *
-	arg3
+	buf
	)
{
	...
(
-	arg1
+	kobj
|
-	arg2
+	attr
|
-	arg3
+	buf
)
	...
}

@@
identifier k_show;
identifier kobj, attr, buf;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	return
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier k_show;
identifier kobj, attr, buf;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	return
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier k_show;
identifier kobj, attr, buf;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	return
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier k_show;
identifier kobj, attr, buf;
expression chr;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	return
-	strcpy(buf, chr);
+	sysfs_emit(buf, chr);
	...>
}

@@
identifier k_show;
identifier kobj, attr, buf;
identifier len;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	len =
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier k_show;
identifier kobj, attr, buf;
identifier len;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	len =
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier k_show;
identifier kobj, attr, buf;
identifier len;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
	len =
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier k_show;
identifier kobj, attr, buf;
identifier len;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	<...
-	len += scnprintf(buf + len, PAGE_SIZE - len,
+	len += sysfs_emit_at(buf, len,
	...);
	...>
	return len;
}

@@
identifier k_show;
identifier kobj, attr, buf;
expression chr;
@@

ssize_t k_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
{
	...
-	strcpy(buf, chr);
-	return strlen(buf);
+	return sysfs_emit(buf, chr);
}
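
As a rough illustration of what the conversion buys, the sketch below
models the two helpers in userspace. demo_emit() and demo_emit_at() are
hypothetical stand-ins written for this note, and the 4096-byte page size
is an assumption; the point is only that the page-sized bound lives inside
the helper instead of being repeated at every call site.

```c
#include <stdarg.h>
#include <stdio.h>

#define DEMO_PAGE_SIZE 4096 /* assumed page size for this sketch */

/* Userspace stand-in for sysfs_emit(): format into a page-sized buffer. */
static int demo_emit(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, DEMO_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	/* Report the bytes actually stored, not the bytes wanted. */
	return len >= DEMO_PAGE_SIZE ? DEMO_PAGE_SIZE - 1 : len;
}

/* Stand-in for sysfs_emit_at(): append at offset 'at' within the page. */
static int demo_emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int room, len;

	if (at < 0 || at >= DEMO_PAGE_SIZE)
		return 0;
	room = DEMO_PAGE_SIZE - at;

	va_start(args, fmt);
	len = vsnprintf(buf + at, room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	return len >= room ? room - 1 : len;
}
```

A show function would then return demo_emit(buf, "%d\n", value), or
accumulate with len += demo_emit_at(buf, len, ...) and return len.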

Link: https://lore.kernel.org/r/7761c1efaebb96c432c85171d58405c25a824ccd.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 20:00:03 -03:00
Joe Perches 1c7fd72687 RDMA: Convert sysfs device * show functions to use sysfs_emit()
Done with cocci script:

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	strcpy(buf, chr);
+	sysfs_emit(buf, chr);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
-	len += scnprintf(buf + len, PAGE_SIZE - len,
+	len += sysfs_emit_at(buf, len,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	...
-	strcpy(buf, chr);
-	return strlen(buf);
+	return sysfs_emit(buf, chr);
}

Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:53:21 -03:00
Linus Torvalds a1e16bc7d5 RDMA 5.10 pull request
The typical set of driver updates across the subsystem:
 
  - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
    usnic, qib, qedr, cxgb4, hns, bnxt_re
 
  - Various rtrs fixes and updates
 
  - Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
    wasn't working right
 
  - Use tracepoints instead of pr_debug in the CM code
 
  - Scrub the locking in ucma and cma to close more syzkaller bugs
 
  - Use tasklet_setup in the subsystem
 
  - Revert the idea that 'destroy' operations are not allowed to fail at
    the driver level. This proved unworkable from a HW perspective.
 
  - Revise how the umem API works so drivers make fewer mistakes using it
 
  - XRC support for qedr
 
  - Convert uverbs objects RWQ and MW to the new allocation scheme
 
  - Large queue entry sizes for hns
 
  - Use hmm_range_fault() for mlx5 On Demand Paging
 
  - uverbs APIs to inspect the GID table instead of sysfs
 
  - Move some of the RDMA code for building large page SGLs into
    lib/scatterlist

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A usual cycle for RDMA with a typical mix of driver and core subsystem
  updates:

   - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
     hns, usnic, qib, qedr, cxgb4, hns, bnxt_re

   - Various rtrs fixes and updates

   - Bug fix for mlx4 CM emulation for virtualization scenarios where
     MRA wasn't working right

   - Use tracepoints instead of pr_debug in the CM code

   - Scrub the locking in ucma and cma to close more syzkaller bugs

   - Use tasklet_setup in the subsystem

   - Revert the idea that 'destroy' operations are not allowed to fail
     at the driver level. This proved unworkable from a HW perspective.

   - Revise how the umem API works so drivers make fewer mistakes using
     it

   - XRC support for qedr

   - Convert uverbs objects RWQ and MW to the new allocation scheme

   - Large queue entry sizes for hns

   - Use hmm_range_fault() for mlx5 On Demand Paging

   - uverbs APIs to inspect the GID table instead of sysfs

   - Move some of the RDMA code for building large page SGLs into
     lib/scatterlist"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
  RDMA/ucma: Fix use after free in destroy id flow
  RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
  RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
  RDMA: Explicitly pass in the dma_device to ib_register_device
  lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
  IB/mlx4: Convert rej_tmout radix-tree to XArray
  RDMA/rxe: Fix bug rejecting all multicast packets
  RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
  RDMA/rxe: Remove duplicate entries in struct rxe_mr
  IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
  IB/rdmavt: Fix sizeof mismatch
  MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
  RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
  RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Move to allocate SG table from pages
  lib/scatterlist: Add support in dynamic allocation of SG table from pages
  tools/testing/scatterlist: Show errors in human readable form
  tools/testing/scatterlist: Rejuvenate bit-rotten test
  RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
  RDMA/uverbs: Expose the new GID query API to user space
  ...
2020-10-17 11:18:18 -07:00
Kamal Heib 5ce2dced8e RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
Report the "ipoib pkey", "mode" and "umcast" netlink attributes for every
IPoIB interface type, not just children created with 'ip link add'.

After setting the rtnl_link_ops for the parent interface, implement the
dellink() callback to block users from trying to remove it.

Fixes: 862096a8bb ("IB/ipoib: Add more rtnl_link_ops callbacks")
Link: https://lore.kernel.org/r/20201004132948.26669-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-05 15:05:45 -03:00
Rikard Falkeborn 3c4e919b48 RDMA/rtrs: Constify static struct attribute_group
The only usage of these is to pass their address to sysfs_create_group()
and sysfs_remove_group(), both of which take const pointers. Make them
const to allow the compiler to put them in read-only memory.

Link: https://lore.kernel.org/r/20200930224004.24279-3-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-01 20:44:52 -03:00
Gioh Kim 220aee3021 RDMA/rtrs: Remove unused field of rtrs_iu
list field is not used anywhere

Link: https://lore.kernel.org/r/20200930131407.6438-1-gi-oh.kim@clous.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-30 15:21:16 -03:00
Taehee Yoo eff7423365 net: core: introduce struct netdev_nested_priv for nested interface infrastructure
Functions related to nested interface infrastructure such as
netdev_walk_all_{ upper | lower }_dev() pass both private functions
and "data" pointer to handle their own things.
At this point, the data pointer type is void *.
In order to make it easier to expand common variables and functions,
this new netdev_nested_priv structure is added.

In the following patch, a new member variable will be added into this
struct to fix the lockdep issue.
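
A minimal userspace sketch of the idea, assuming a wrapper struct that
carries the former bare void * pointer. The demo_* names and the device
chain are hypothetical and only mirror the shape of the change, not the
kernel's netdev API.

```c
#include <stddef.h>

/* Hypothetical analog of the wrapper struct; the field set is an assumption. */
struct demo_nested_priv {
	void *data; /* caller's private pointer, previously passed bare */
};

struct demo_dev {
	int id;
	struct demo_dev *lower; /* next lower device, NULL at the bottom */
};

typedef int (*demo_walk_fn)(struct demo_dev *dev,
			    struct demo_nested_priv *priv);

/* Walk all lower devices; stop early if the callback returns nonzero. */
static int demo_walk_all_lower(struct demo_dev *dev, demo_walk_fn fn,
			       struct demo_nested_priv *priv)
{
	struct demo_dev *cur;
	int ret;

	for (cur = dev->lower; cur; cur = cur->lower) {
		ret = fn(cur, priv);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the visited devices through priv->data. */
static int demo_count(struct demo_dev *dev, struct demo_nested_priv *priv)
{
	(void)dev;
	++*(int *)priv->data;
	return 0;
}
```

Because callbacks receive the struct rather than a raw pointer, a later
change can add members to it without touching every callback signature.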

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:00:15 -07:00
Jason Gunthorpe 5dee5872f8 Merge branch 'mlx5_active_speed' into rdma.git for-next
Leon Romanovsky says:

====================
IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.
====================
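
To illustrate the size mismatch described in the cover letter, the sketch
below stores a 16-bit value in 8 bits. DEMO_SPEED_WIDE is a hypothetical
value picked only because it does not fit in one byte; it is not an IBTA
speed code.

```c
#include <stdint.h>

/* Hypothetical 16-bit speed code that needs more than 8 bits. */
#define DEMO_SPEED_WIDE 0x0100u

/* 8-bit storage: the upper byte is silently dropped. */
static uint8_t demo_store_u8(uint16_t speed)
{
	return (uint8_t)speed;
}

/* 16-bit storage keeps the full value. */
static uint16_t demo_store_u16(uint16_t speed)
{
	return speed;
}
```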

Based on the mlx5-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_active_speed':
  RDMA: Fix link active_speed size
  RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
  net/mlx5: Refactor query port speed functions
2020-09-18 10:31:45 -03:00
Liu Shixin 3cc30e8dfc RDMA/ipoib: Convert to use DEFINE_SEQ_ATTRIBUTE macro
Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.

Link: https://lore.kernel.org/r/20200916025022.3992627-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-16 13:46:18 -03:00
Linus Torvalds b1df2a0783 RDMA second 5.9-rc pull request
A number of driver bug fixes and a few recent regressions:
 
 - Several bug fixes for bnxt_re. Crashing, incorrect data reported,
   and corruption on new HW
 
 - Memory leak and crash in rxe
 
 - Fix sysfs corruption in rxe if the netdev name is too long
 
 - Fix a crash on error unwind in the new cq_pool code
 
 - Fix kobject panics in rtrs by working device lifetime properly
 
 - Fix a data corruption bug in iser target related to misaligned buffers

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A number of driver bug fixes and a few recent regressions:

   - Several bug fixes for bnxt_re. Crashing, incorrect data reported,
     and corruption on new HW

   - Memory leak and crash in rxe

   - Fix sysfs corruption in rxe if the netdev name is too long

   - Fix a crash on error unwind in the new cq_pool code

   - Fix kobject panics in rtrs by working device lifetime properly

   - Fix a data corruption bug in iser target related to misaligned
     buffers"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/isert: Fix unaligned immediate-data handling
  RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
  RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
  RDMA/core: Fix reported speed and width
  RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ
  RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
  RDMA/bnxt_re: Restrict the max_gids to 256
  RDMA/bnxt_re: Static NQ depth allocation
  RDMA/bnxt_re: Fix the qp table indexing
  RDMA/bnxt_re: Do not report transparent vlan from QP1
  RDMA/mlx4: Read pkey table length instead of hardcoded value
  RDMA/rxe: Fix panic when calling kmem_cache_create()
  RDMA/rxe: Fix memleak in rxe_mem_init_user
  RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars
  RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
2020-09-11 10:02:36 -07:00
Leon Romanovsky 119181d1d4 RDMA: Restore ability to fail on SRQ destroy
In a similar way to other IB objects, restore the ability to return an
error on SRQ destroy. Strictly speaking, this change is not necessary; it
is provided here to ensure a symmetrical interface like other destroy
functions.

Fixes: 68e326dea1 ("RDMA: Handle SRQ allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 14:14:24 -03:00
Sagi Grimberg 0b089c1ef7 IB/isert: Fix unaligned immediate-data handling
Currently we allocate rx buffers in a single contiguous buffer for
headers (iser and iscsi) and data trailer. This means that most likely the
data starting offset is aligned to 76 bytes (size of both headers).

This worked fine for years, but at some point this broke, resulting in
data corruptions in isert when a command comes with immediate data and the
underlying backend device assumes 512 bytes buffer alignment.

We assume a hard-requirement for all direct I/O buffers to be 512 bytes
aligned. To fix this, we should avoid passing unaligned buffers for I/O.

Instead, we allocate our recv buffers with some extra space such that we
can have the data portion align to a 512-byte boundary. This also means
that we cannot reference headers or data using a structure but rather
through accessors (as they may move based on alignment). Also, get rid of the
wrong __packed annotation from iser_rx_desc as this has only harmful
effects (not aligned to anything).

This affects the rx descriptors for iscsi login and data plane.
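
The accessor idea can be sketched as follows: place the 76 bytes of
headers so that the data right after them lands on a 512-byte boundary.
The demo_* helpers are hypothetical names for this sketch, and the buffer
is assumed to have enough slack (up to 511 extra bytes) for the shift.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_HDRS_LEN   76u  /* iser + iscsi headers, per the text above */
#define DEMO_DATA_ALIGN 512u /* alignment the backend is assumed to need */

/* Round x up to the next multiple of a (a must be a power of two). */
static uintptr_t demo_align_up(uintptr_t x, uintptr_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Where the headers must start so the data after them is aligned. */
static unsigned char *demo_hdrs_start(unsigned char *buf)
{
	uintptr_t data = demo_align_up((uintptr_t)buf + DEMO_HDRS_LEN,
				       DEMO_DATA_ALIGN);

	return (unsigned char *)(data - DEMO_HDRS_LEN);
}

/* Start of the data portion; always a multiple of DEMO_DATA_ALIGN. */
static unsigned char *demo_data_start(unsigned char *buf)
{
	return demo_hdrs_start(buf) + DEMO_HDRS_LEN;
}
```

Since the header position depends on the buffer address, callers go
through the accessors instead of overlaying a fixed struct on the buffer.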

Fixes: 3d75ca0ade ("block: introduce multi-page bvec helpers")
Link: https://lore.kernel.org/r/20200904195039.31687-1-sagi@grimberg.me
Reported-by: Stephen Rust <srust@blockbridge.com>
Tested-by: Doug Dumitru <doug@dumitru.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:46:03 -03:00
Md Haris Iqbal 558d52b297 RDMA/rtrs-srv: Incorporate ib_register_client into rtrs server init
The rnbd_server module's communication manager (cm) initialization depends
on the registration of the "network namespace subsystem" of the RDMA CM
agent module. As such, when the kernel is configured to load the
rnbd_server and the RDMA cma module during initialization; and if the
rnbd_server module is initialized before RDMA cma module, a null ptr
dereference occurs during the RDMA bind operation.

Call trace:

  Call Trace:
   ? xas_load+0xd/0x80
   xa_load+0x47/0x80
   cma_ps_find+0x44/0x70
   rdma_bind_addr+0x782/0x8b0
   ? get_random_bytes+0x35/0x40
   rtrs_srv_cm_init+0x50/0x80
   rtrs_srv_open+0x102/0x180
   ? rnbd_client_init+0x6e/0x6e
   rnbd_srv_init_module+0x34/0x84
   ? rnbd_client_init+0x6e/0x6e
   do_one_initcall+0x4a/0x200
   kernel_init_freeable+0x1f1/0x26e
   ? rest_init+0xb0/0xb0
   kernel_init+0xe/0x100
   ret_from_fork+0x22/0x30
  Modules linked in:
  CR2: 0000000000000015

All this happens because the cm init is in the call chain of the module
init, which is not a preferred practice.

So remove the call to rdma_create_id() from the module init call chain.
Instead register rtrs-srv as an ib client, which makes sure that the
rdma_create_id() is called only when an ib device is added.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200907103106.104530-1-haris.iqbal@cloud.ionos.com
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:31:08 -03:00
Md Haris Iqbal 39c2d639ca RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
The device .release function was not being set during the device
initialization. This was leading to the below warning, in error cases when
put_srv was called before device_add was called.

Warning:

Device '(null)' does not have a release() function, it is broken and must
be fixed. See Documentation/kobject.txt.

So, set the device .release function during device initialization in the
__alloc_srv() function.

Fixes: baa5b28b7a ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Link: https://lore.kernel.org/r/20200907102216.104041-1-haris.iqbal@cloud.ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:28:14 -03:00
Jason Gunthorpe 6989aa62d3 Linux 5.9-rc3

Merge tag 'v5.9-rc3' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 12:28:12 -03:00
Md Haris Iqbal baa5b28b7a RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
There are error cases when we will call free_srv before device kobject is
initialized; in such cases calling put_device generates the following
warning:

 kobject: '(null)' (000000009f5445ed): is not initialized, yet
 kobject_put() is being called.

So call device_initialize() only once when the server is allocated. If we
end up calling put_srv() and subsequently free_srv(), our call to
put_device() would result in deletion of the obj. Call device_add() later
when we actually have a connection. Correspondingly, call device_del()
instead of device_unregister() when srv->dev_ref falls to 0.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200811092722.2450-1-haris.iqbal@cloud.ionos.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24 13:44:53 -03:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
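
A small standalone sketch of the pseudo-keyword in use, assuming GCC or
Clang. The macro definition below is written for this example and falls
back to an empty statement when the attribute is unavailable; demo_units()
is a hypothetical function.

```c
/* Assumes GCC or Clang; the attribute spelling is compiler-specific. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: level 2 deliberately also runs the level 1 code. */
static int demo_units(int level)
{
	int units = 0;

	switch (level) {
	case 2:
		units += 10;
		fallthrough;
	case 1:
		units += 1;
		break;
	default:
		break;
	}
	return units;
}
```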

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Max Gurtovoy c97119b6d3 IB/isert: remove duplicated error prints
The isert_post_recv function prints an error in case of failures, so no
need for the callers to add another print.

Link: https://lore.kernel.org/r/20200805121231.166162-2-maxg@mellanox.com
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 15:22:05 -03:00
Jack Wang 03ed5a8cda RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
lockdep triggers a warning from time to time when running a regression
test:

 rnbd_client L685: </dev/nullb0@bla> Device disconnected.
 rnbd_client L1756: Unloading module

 workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
 WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130

The root cause is that the workqueue core expects that a !WQ_MEM_RECLAIM
workqueue is never flushed from a WQ_MEM_RECLAIM workqueue.

In the above case the ib_addr workqueue is created without WQ_MEM_RECLAIM,
but rtrs_wq has WQ_MEM_RECLAIM set.

To avoid the warning, remove the WQ_MEM_RECLAIM flag.
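
The rule behind the warning can be modelled as a simple predicate. This
is a simplified sketch, not the check in kernel/workqueue.c: the flag
value, struct demo_wq and demo_flush_would_warn() are all hypothetical.

```c
#include <stdbool.h>

#define DEMO_WQ_MEM_RECLAIM 0x1u /* hypothetical flag bit for this sketch */

struct demo_wq {
	unsigned int flags;
};

/*
 * True when work running on 'flusher' flushes 'target' in a way that
 * breaks the rule: a reclaim-capable queue must not wait on a queue
 * that has no such guarantee.
 */
static bool demo_flush_would_warn(const struct demo_wq *flusher,
				  const struct demo_wq *target)
{
	if (target->flags & DEMO_WQ_MEM_RECLAIM)
		return false;
	return (flusher->flags & DEMO_WQ_MEM_RECLAIM) != 0;
}
```

With the flag dropped from the flushing queue, the predicate is false for
the case reported above.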

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 14:26:53 -03:00
Danil Kipnis 09e0dbbeed RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
To avoid all the clients reconnecting at the same time, schedule the
reconnect dwork with a random jitter of +[0,8] seconds.
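
A minimal sketch of the jitter computation, assuming millisecond units.
demo_reconnect_delay_ms() is a hypothetical helper; the random number is
passed in by the caller so the function stays deterministic to test.

```c
#define DEMO_MAX_JITTER_MS 8000u /* +[0,8] seconds, per the text above */

/*
 * Delay before the next reconnect attempt: the base delay plus a jitter
 * in the range 0..8000 ms derived from the caller-supplied random number.
 */
static unsigned int demo_reconnect_delay_ms(unsigned int base_ms,
					    unsigned int rnd)
{
	return base_ms + rnd % (DEMO_MAX_JITTER_MS + 1);
}
```

A caller could pass (unsigned int)rand() from <stdlib.h> as rnd; clients
that lose the connection together then spread their retries over the
eight-second window.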

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 14:26:53 -03:00
Yamin Friedman c804af2c1d IB/srpt: use new shared CQ mechanism
Have the driver use shared CQs provided by the rdma core driver.  This
provides the advantage of improved efficiency handling interrupts.

Link: https://lore.kernel.org/r/20200722135629.49467-3-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 09:10:32 -03:00
Yamin Friedman c6e6630723 IB/isert: use new shared CQ mechanism
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed.

Now there is no reason to allocate very large CQs when the driver is
loaded while gaining the advantage of shared CQs. Previously, when a single
connection was opened, a CQ was opened for every core with enough space for
eight connections. This is a very large overhead that in most cases will
not be utilized.

Link: https://lore.kernel.org/r/20200722135629.49467-2-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 09:10:31 -03:00
Yamin Friedman d56a7852ec IB/iser: use new shared CQ mechanism
Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed. Now
there is no reason to allocate very large CQs when the driver is loaded
while gaining the advantage of shared CQs.

Link: https://lore.kernel.org/r/20200722135629.49467-1-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Acked-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 09:10:31 -03:00
Max Gurtovoy 317000b926 IB/isert: allocate RW ctxs according to max IO size
Current iSER target code allocates MR pool budget based on queue size.
Since there is no handshake between iSER initiator and target on max IO
size, we'll set the iSER target to support up to 16MiB IO operations and
allocate the correct number of RDMA ctxs according to the factor of MRs
per IO operation. This would guarantee sufficient size of the MR pool for
the required IO queue depth and IO size.
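
The sizing arithmetic can be sketched as below. Only the 16MiB limit comes
from the text above; the page size, pages-per-MR and queue depth used in
the usage note are hypothetical inputs, and the demo_* helpers are not
the driver's functions.

```c
#include <stdint.h>

#define DEMO_MAX_IO_BYTES (16u * 1024 * 1024) /* 16MiB, per the text above */

/* Integer division rounding up. */
static uint32_t demo_div_round_up(uint32_t n, uint32_t d)
{
	return (n + d - 1) / d;
}

/* MRs needed to cover one maximum-sized IO. */
static uint32_t demo_mrs_per_io(uint32_t page_size, uint32_t pages_per_mr)
{
	uint32_t pages = demo_div_round_up(DEMO_MAX_IO_BYTES, page_size);

	return demo_div_round_up(pages, pages_per_mr);
}

/* RW contexts to budget for a queue of the given depth. */
static uint32_t demo_rw_ctxs(uint32_t queue_depth, uint32_t page_size,
			     uint32_t pages_per_mr)
{
	return queue_depth * demo_mrs_per_io(page_size, pages_per_mr);
}
```

For example, with 4096-byte pages and an assumed 256 pages per MR, one
16MiB IO needs 16 MRs, so a queue depth of 128 budgets 2048 contexts.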

Link: https://lore.kernel.org/r/20200708091908.162263-1-maxg@mellanox.com
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 14:23:22 -03:00
Jason Gunthorpe f11f3f76c7 Merge branch 'mlx5_ipoib_qpn' into rdma.git for-next
Michael Guralnik says:

====================
This series handles IPoIB child interface creation with setting
interface's HW address.

In current implementation, lladdr requested by user is ignored and
overwritten. Child interface gets the same GID as the parent interface and
a QP number which is assigned by the underlying drivers.

In this series we fix this behavior so that user's requested address is
assigned to the newly created interface.

As specific QP number request is not supported for all vendors, QP number
requested by user will still be overwritten when this is not supported.

Behavior of creation of child interfaces through the sysfs mechanism or
without specifying a requested address, stays the same.
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_ipoib_qpn':
  RDMA/ipoib: Handle user-supplied address when creating child
  net/mlx5: Enable QP number request when creating IPoIB underlay QP

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 14:29:58 -03:00
Michael Guralnik 87fb5c1ccb RDMA/ipoib: Handle user-supplied address when creating child
Use the address supplied by user when creating a child interface.

Previously, the address requested by the user was ignored and overridden
with parent's GID and the random QP number assigned to the child.

Link: https://lore.kernel.org/r/20200623110105.1225750-3-leon@kernel.org
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 12:57:58 -03:00
Jason Gunthorpe 65936bf25f RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()
ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that
the only time flush_workqueue() can be called is if it also clears
IPOIB_FLAG_OPER_UP.

Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky
enough, and lockdep doesn't help us to find it early:

          CPU0               CPU1          CPU2
   __ipoib_ib_dev_flush()
      down_read(vlan_rwsem)

                         ipoib_vlan_add()
                           rtnl_trylock()
                           down_write(vlan_rwsem)

				      ipoib_mcast_carrier_on_task()
					 while (!rtnl_trylock())
					      msleep(20);

      ipoib_flush_ah()
	flush_workqueue(priv->wq)

Clean up the ah_reaper related functions and lifecycle to make sense:

 - Start/Stop of the reaper should only be done in open/stop NDOs, not in
   any other places

 - cancel and flush of the reaper should only happen in the stop NDO.
   cancel is only functional when combined with IPOIB_STOP_REAPER.

 - Non-stop places that were flushing the AHs just need to flush out dead
   AHs synchronously and ignore the background task completely. It is fully
   locked and harmless to leave running.

Which ultimately fixes the ABBA deadlock by removing the unnecessary
flush_workqueue() from the problematic place under the vlan_rwsem.

Fixes: efc82eeeae ("IB/ipoib: No longer use flush as a parameter")
Link: https://lore.kernel.org/r/20200625174219.290842-1-kamalheib1@gmail.com
Reported-by: Kamal Heib <kheib@redhat.com>
Tested-by: Kamal Heib <kheib@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-02 10:46:06 -03:00
Kamal Heib 95a5631f6c RDMA/ipoib: Return void from ipoib_ib_dev_stop()
The return value from ipoib_ib_dev_stop() is always 0 - change it to be
void.

Link: https://lore.kernel.org/r/20200623105236.18683-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-24 08:52:31 -03:00
Kamal Heib 90cdff90df RDMA/ipoib: Return void from ipoib_mcast_stop_thread()
The return value from ipoib_mcast_stop_thread() is always 0 - change it to
be void.

Link: https://lore.kernel.org/r/20200622092256.6931-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22 14:49:27 -03:00
Colton Lewis 11708142bc RDMA: Correct trivial kernel-doc inconsistencies
Silence documentation build warnings by correcting kernel-doc comments.

./drivers/infiniband/core/verbs.c:1004: warning: Function parameter or member 'uobject' not described in 'ib_create_srq_user'
./drivers/infiniband/core/verbs.c:1004: warning: Function parameter or member 'udata' not described in 'ib_create_srq_user'
./drivers/infiniband/core/umem_odp.c:161: warning: Function parameter or member 'ops' not described in 'ib_umem_odp_alloc_child'
./drivers/infiniband/core/umem_odp.c:225: warning: Function parameter or member 'ops' not described in 'ib_umem_odp_get'
./drivers/infiniband/sw/rdmavt/ah.c:104: warning: Excess function parameter 'ah_attr' description in 'rvt_create_ah'
./drivers/infiniband/sw/rdmavt/ah.c:104: warning: Excess function parameter 'create_flags' description in 'rvt_create_ah'
./drivers/infiniband/ulp/iser/iscsi_iser.h:363: warning: Function parameter or member 'all_list' not described in 'iser_fr_desc'
./drivers/infiniband/ulp/iser/iscsi_iser.h:377: warning: Function parameter or member 'all_list' not described in 'iser_fr_pool'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:148: warning: Function parameter or member 'rsvd0' not described in 'opa_vesw_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:148: warning: Function parameter or member 'rsvd1' not described in 'opa_vesw_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:148: warning: Function parameter or member 'rsvd2' not described in 'opa_vesw_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:148: warning: Function parameter or member 'rsvd3' not described in 'opa_vesw_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:148: warning: Function parameter or member 'rsvd4' not described in 'opa_vesw_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:205: warning: Function parameter or member 'rsvd0' not described in 'opa_per_veswport_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:205: warning: Function parameter or member 'rsvd1' not described in 'opa_per_veswport_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:205: warning: Function parameter or member 'rsvd2' not described in 'opa_per_veswport_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:205: warning: Function parameter or member 'rsvd3' not described in 'opa_per_veswport_info'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:342: warning: Function parameter or member 'reserved' not described in 'opa_veswport_summary_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd0' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd1' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd2' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd3' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd4' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd5' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd6' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd7' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd8' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:394: warning: Function parameter or member 'rsvd9' not described in 'opa_veswport_error_counters'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:460: warning: Function parameter or member 'reserved' not described in 'opa_vnic_vema_mad'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:485: warning: Function parameter or member 'reserved' not described in 'opa_vnic_notice_attr'
./drivers/infiniband/ulp/opa_vnic/opa_vnic_encap.h:500: warning: Function parameter or member 'reserved' not described in 'opa_vnic_vema_mad_trap'

Link: https://lore.kernel.org/r/5373936.DvuYhMxLoT@laptop.coltonlewis.name
Signed-off-by: Colton Lewis <colton.w.lewis@protonmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-22 11:57:39 -03:00
Jing Xiangfeng a7ca4c3ebe IB/srpt: Remove WARN_ON from srpt_cm_req_recv
The callers pass the pointer '&req' or 'private_data' to
srpt_cm_req_recv(), and 'private_data' is initialized in srp_send_req().
'sdev' is allocated and stored in srpt_add_one(). It's easy to show that
sdev and req are always valid, so remove the unnecessary WARN_ON.

Link: https://lore.kernel.org/r/20200617140803.181333-1-jingxiangfeng@huawei.com
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 15:18:00 -03:00
Masahiro Yamada a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following command:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
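
For readers who want to see the per-line effect of that sed expression, here is a minimal C sketch of the same rewrite. The function name convert_help_line() is hypothetical and is not part of the kernel tree or of this commit; it only mirrors what the regular expression does to one line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the kernel tree): rewrite one Kconfig
 * line the way the sed expression does. A leading run of whitespace
 * followed by "---help---" becomes one tab plus "help"; whatever follows
 * the marker is kept. Any other line is copied unchanged.
 * Returns 1 if the line was converted, 0 otherwise.
 */
static int convert_help_line(const char *in, char *out, size_t out_len)
{
	static const char marker[] = "---help---";
	const char *p = in;

	assert(out_len > 0);

	/* Skip leading whitespace, as ^[[:space:]]* does. */
	while (*p != '\0' && isspace((unsigned char)*p))
		p++;

	if (strncmp(p, marker, strlen(marker)) != 0) {
		snprintf(out, out_len, "%s", in);
		return 0;
	}

	snprintf(out, out_len, "\thelp%s", p + strlen(marker));
	return 1;
}
```

All seven indentation styles listed above (a through g) take the same branch, which is why a single expression was enough for the whole tree.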

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Linus Torvalds 242b233198 RDMA 5.8 merge window pull request
A few large, long discussed works this time. The RNBD block driver has
 been posted for nearly two years now, and the removal of FMR has been a
 recurring discussion theme for a long time. The usual smattering of
 features and bug fixes.
 
 - Various small driver bug fixes in rxe, mlx5, hfi1, and efa
 
 - Continuing driver cleanups in bnxt_re, hns
 
 - Big cleanup of mlx5 QP creation flows
 
 - More consistent use of src port and flow label when LAG is used and a
   mlx5 implementation
 
 - Additional set of cleanups for IB CM
 
 - 'RNBD' network block driver and target. This is a network block RDMA
   device specific to ionos's cloud environment. It brings strong multipath
   and resiliency capabilities.
 
 - Accelerated IPoIB for HFI1
 
 - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple async fds
 
 - Support for exchanging the new IBTA defined ECE data during RDMA CM
   exchanges
 
 - Removal of the very old and insecure FMR interface from all ULPs and
   drivers. FRWR should be preferred for at least a decade now.

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A more active cycle than most of the recent past, with a few large,
  long discussed works this time.

  The RNBD block driver has been posted for nearly two years now, and
  flowing through RDMA due to it also introducing a new ULP.

  The removal of FMR has been a recurring discussion theme for a long
  time.

  And the usual smattering of features and bug fixes.

  Summary:

   - Various small driver bug fixes in rxe, mlx5, hfi1, and efa

   - Continuing driver cleanups in bnxt_re, hns

   - Big cleanup of mlx5 QP creation flows

   - More consistent use of src port and flow label when LAG is used and
     a mlx5 implementation

   - Additional set of cleanups for IB CM

   - 'RNBD' network block driver and target. This is a network block
     RDMA device specific to ionos's cloud environment. It brings strong
     multipath and resiliency capabilities.

   - Accelerated IPoIB for HFI1

   - QP/WQ/SRQ ioctl migration for uverbs, and support for multiple
     async fds

   - Support for exchanging the new IBTA defined ECE data during RDMA CM
     exchanges

   - Removal of the very old and insecure FMR interface from all ULPs
     and drivers. FRWR should be preferred for at least a decade now"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (247 commits)
  RDMA/cm: Spurious WARNING triggered in cm_destroy_id()
  RDMA/mlx5: Return ECE DC support
  RDMA/mlx5: Don't rely on FW to set zeros in ECE response
  RDMA/mlx5: Return an error if copy_to_user fails
  IB/hfi1: Use free_netdev() in hfi1_netdev_free()
  RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr()
  RDMA/core: Move and rename trace_cm_id_create()
  IB/hfi1: Fix hfi1_netdev_rx_init() error handling
  RDMA: Remove 'max_map_per_fmr'
  RDMA: Remove 'max_fmr'
  RDMA/core: Remove FMR device ops
  RDMA/rdmavt: Remove FMR memory registration
  RDMA/mthca: Remove FMR support for memory registration
  RDMA/mlx4: Remove FMR support for memory registration
  RDMA/i40iw: Remove FMR leftovers
  RDMA/bnxt_re: Remove FMR leftovers
  RDMA/mlx5: Remove FMR leftovers
  RDMA/core: Remove FMR pool API
  RDMA/rds: Remove FMR support for memory registration
  RDMA/srp: Remove support for FMR memory registration
  ...
2020-06-05 14:05:57 -07:00
Max Gurtovoy f273ad4f8d RDMA/srp: Remove support for FMR memory registration
FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.

Link: https://lore.kernel.org/r/2-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:53 -03:00
Israel Rukshin 1fc431320a RDMA/iser: Remove support for FMR memory registration
FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.

Link: https://lore.kernel.org/r/1-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:53 -03:00
Bart Van Assche e0cca8b456 RDMA/srpt: Increase max_send_sge
The ib_srpt driver limits max_send_sge to 16. Since that is a workaround
for an mlx4 bug that has been fixed, increase max_send_sge. See also
commit f95ccffc71 ("IB/mlx4: Use 4K pages for kernel QP's WQE buffer").

Link: https://lore.kernel.org/r/20200525172212.14413-5-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 14:49:55 -03:00
Bart Van Assche 66ced2eb2a RDMA/srpt: Reduce max_recv_sge to 1
Since srpt_post_recv() always sets num_sge to 1, reduce the max_recv_sge
parameter that is used at queue pair allocation time to 1.

Link: https://lore.kernel.org/r/20200525172212.14413-4-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 14:49:55 -03:00
Bart Van Assche d4ee7f3a44 RDMA/srpt: Make debug output more detailed
Since the session name by itself is not sufficient to uniquely identify a
queue pair, include the queue pair number. Show the ASCII channel state
name instead of the numeric value. This change makes the ib_srpt debug
output more consistent.

Link: https://lore.kernel.org/r/20200525172212.14413-3-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 14:49:55 -03:00
Bart Van Assche 87fee61c35 RDMA/srp: Make the channel count configurable per target
Increase the flexibility of the SRP initiator driver by making the channel
count configurable per target instead of only providing a kernel module
parameter for configuring the channel count.

Link: https://lore.kernel.org/r/20200525172212.14413-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 14:49:55 -03:00
Valentine Fatiev 1acba6a817 IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode
When connected mode is set, and we have connected and datagram traffic in
parallel, ipoib might crash with double free of datagram skb.

The current mechanism assumes that the order in the completion queue is
the same as the order of sent packets for all QPs. Order is kept only
within a specific QP; with mixed UD and CM traffic we have several QPs (one
UD and a few CMs) in parallel.

The problem:
----------------------------------------------------------

Transmit queue:
-----------------
The UD skb pointer is kept in the queue itself; the CM skb is kept in a
separate queue and uses the transmit queue as a placeholder to count the
total number of transmitted packets.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL ud1 UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
    ^                                  ^
   tail                               head

Completion queue (problematic scenario) - the order is not the same as in
the transmit queue:

  1  2  3  4  5  6  7  8  9
------------------------------------
 ud1 CM1 UD2 ud3 cm2 cm3 ud4 cm4 ud5
------------------------------------

1. CM1 'wc' processing
   - skb freed in cm separate ring.
   - tx_tail of transmit queue increased although UD2 is not freed.
     Now driver assumes UD2 index is already freed and it could be used for
     new transmitted skb.

0   1   2   3   4  5  6  7  8   9  10  11 12 13 .........127
------------------------------------------------------------
NL NL  UD2 CM1 ud3 cm2 cm3 ud4 cm4 ud5 NL NL NL ...........
------------------------------------------------------------
        ^   ^                       ^
      (Bad)tail                    head
(Bad - Could be used for new SKB)

In this case (due to heavy load) the UD2 skb pointer could be replaced by a
newly transmitted packet UD_NEW, as the driver assumes the slot is free. At
this point we will have to process two 'wc' with the same index but we have
only one pointer to free.

During the second attempt to free the same skb we will hit a NULL pointer
exception.

2. UD2 'wc' processing
   - skb freed according to the index we got from 'wc', but it was already
     overwritten by mistake. So the skb that was actually released is the
     skb of the newly transmitted packet and not the original one.

3. UD_NEW 'wc' processing
   - attempt to free an already freed skb. NULL pointer exception.

The fix:
-----------------------------------------------------------------------

The fix is to stop using the UD ring as a placeholder for CM packets. The
cyclic ring variables tx_head and tx_tail manage only the UD tx_ring, and
new cyclic variables global_tx_head and global_tx_tail are introduced to
count the overall outstanding sent packets. The send queue is then stopped
and woken based on these global variables only.

Note that no locking is needed since global_tx_head is updated in the xmit
flow and global_tx_tail is updated in the NAPI flow only.  A previous
attempt tried to use one variable to count the outstanding sent packets,
but it did not work since xmit and NAPI flows can run at the same time and
the counter will be updated wrongly. Thus, we use the same simple cyclic
head and tail scheme that we have today for the UD tx_ring.
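
The head/tail scheme described above can be sketched in plain C as follows. This is an illustration under assumptions, not the driver's code: the counter names come from the commit message, while struct tx_counters, queue_size and the helper functions are invented here, and the sketch is single-threaded so it shows only the arithmetic, not the xmit/NAPI concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative only: the counter names mirror the commit message, but
 * this struct and these helpers are assumptions, not ipoib code.
 */
struct tx_counters {
	unsigned int global_tx_head; /* advanced only by the xmit path */
	unsigned int global_tx_tail; /* advanced only by the completion path */
	unsigned int queue_size;     /* total send slots, e.g. 128 */
};

/* Unsigned subtraction stays correct when the counters wrap around. */
static unsigned int tx_outstanding(const struct tx_counters *c)
{
	return c->global_tx_head - c->global_tx_tail;
}

/* The send queue would be stopped while this returns true. */
static bool tx_queue_full(const struct tx_counters *c)
{
	return tx_outstanding(c) >= c->queue_size;
}

/* Called for every posted send, UD or CM alike. */
static void tx_post(struct tx_counters *c)
{
	assert(!tx_queue_full(c));
	c->global_tx_head++;
}

/* Called for every send completion, UD or CM alike. */
static void tx_complete(struct tx_counters *c)
{
	assert(tx_outstanding(c) > 0);
	c->global_tx_tail++;
}
```

Because each counter has exactly one writer, neither side ever performs a read-modify-write on a variable the other side updates, which is the property the commit message relies on to avoid a lock.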

Fixes: 2c104ea683 ("IB/ipoib: Get rid of the tx_outstanding variable in all modes")
Link: https://lore.kernel.org/r/20200527134705.480068-1-leon@kernel.org
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 21:14:09 -03:00
Leon Romanovsky 8094ba0ace RDMA/cma: Provide ECE reject reason
IBTA declares "vendor option not supported" reject reason in REJ messages
if passive side doesn't want to accept proposed ECE options.

Because ECE is managed by userspace, users need a way to provide this
reject reason.

Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 16:05:05 -03:00
Kamal Heib ebd6e96b33 RDMA/ipoib: Remove can_sleep parameter from ipoib_mcast_alloc
can_sleep is always 0 when ipoib_mcast_alloc() is called, so remove it and
use GFP_ATOMIC instead of GFP_KERNEL.

Link: https://lore.kernel.org/r/20200525130305.171509-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25 15:48:12 -03:00
Danil Kipnis a94dae867c RDMA/rtrs: Get rid of the do_each_path/while_each_path macros
The macros do_each_path/while_each_path lead to a smatch warning:

drivers/infiniband/ulp/rtrs/rtrs-clt.c:1196 rtrs_clt_failover_req() warn: inconsistent indenting
drivers/infiniband/ulp/rtrs/rtrs-clt.c:2890 rtrs_clt_request() warn: inconsistent indenting

Also checkpatch complains:
ERROR: Macros with multiple statements should be enclosed in a do - while loop

The macros are used only in two places: for a normal IO path and for the
failover path triggered after errors.

Get rid of the macros and just use a for loop iterating over the list of
paths in both places. It is easier to read and also fewer lines of code.
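
As a rough illustration of the "plain for loop" approach, the sketch below walks a set of paths once, starting at a given position and wrapping around. It is not the rtrs code: struct path_sketch, pick_path() and the array-based layout are assumptions made for the example, whereas the driver iterates a linked list of paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a path; the real structure is far larger. */
struct path_sketch {
	int id;
	bool connected;
};

/*
 * Visit every path exactly once, starting at 'start' and wrapping around,
 * and return the id of the first connected one, or -1 if none is usable.
 */
static int pick_path(const struct path_sketch *paths, size_t n, size_t start)
{
	assert(n == 0 || start < n);

	for (size_t i = 0; i < n; i++) {
		const struct path_sketch *p = &paths[(start + i) % n];

		if (p->connected)
			return p->id;
	}
	return -1;
}
```

A single for statement keeps the loop header, body and exit condition in one place, which is what avoids both the indentation warning and the multi-statement macro complaint quoted above.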

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200522053924.528980-1-danil.kipnis@cloud.ionos.com
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22 15:50:22 -03:00
Md Haris Iqbal e172037be7 RDMA/rtrs: server: Use already dereferenced rtrs_sess structure
The rtrs_sess structure has already been extracted above from the
rtrs_srv_sess structure. Use that to avoid redundant dereferencing.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200522082833.1480551-1-haris.phnx@gmail.com
Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com>
Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22 15:50:22 -03:00
Gary Leshner 8f149b6847 IB/ipoib: Add capability to switch between datagram and connected mode
This is the prerequisite modification to the ipoib ulp to allow a
rdma netdev to obtain the default ndo ops for init/uninit/open/close.

This is accomplished by setting the netdev ops field within the
callback function passed to the netdev allocation routine which
in turn was passed into the rdma netdev allocation routine.

This allows the rdma netdev to call back into the ulp to create the
resources required for connected mode operation.

Additionally as the ulp is not re-entrant, when switching modes,
the number of real tx queues is set to 1 for the connected mode.

For datagram mode the number of real tx queues is set to the
actual number of tx queues specified at the netdev's allocation.

For the internal ulp netdev the number of tx queues defaults to 1.

It is up to the rdma netdev to specify the actual number it can support.

When the driver does not support a rdma netdev for acceleration,
(-ENOTSUPPORTED return code or the verbs function for allocation is
NULL) the ipoib ulp functions are unaffected by using the internal
netdev allocated by the ipoib ulp.

Link: https://lore.kernel.org/r/20200511160706.173205.19086.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 11:23:57 -03:00
Gary Leshner b7e159eb00 IB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu size
When in connected mode ipoib sent broadcast pings which exceeded the mtu
size for broadcast addresses.

Add an mtu attribute to the rdma_netdev structure which ipoib sets to its
mcast mtu size.

The RDMA netdev uses this value to determine if the skb length is too long
for the mtu specified and if it is, drops the packet and logs an error
about the errant packet.

Link: https://lore.kernel.org/r/20200511160655.173205.14546.stgit@awfm-01.aw.intel.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 11:23:57 -03:00
Kaike Wan 6d72344cf6 IB/ipoib: Increase ipoib Datagram mode MTU's upper limit
Currently the ipoib UD mtu is restricted to 4K bytes. Remove this
limitation so that the IPOIB module can potentially use an MTU (in UD
mode) that is bounded by the MTU of the underlying device. A field is
added to the ib_port_attr structure to indicate the maximum physical
MTU the underlying device supports.

Link: https://lore.kernel.org/r/20200511160618.173205.23053.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 11:23:55 -03:00
Gary Leshner 7f90a5a069 IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPs
Adds capability to create a qpn to be recognized as an accelerated
UD QP for ipoib.

This is accomplished by reserving 0x81 in byte[0] of the qpn as the
prefix for these qp types and reserving qpns between 0x810000 and
0x81ffff.

The hfi1 capability mask already contained a flag for the VNIC netdev.
This has been renamed and extended to include both VNIC and ipoib.

The rvt code to allocate qps now recognizes this flag and sets 0x81
into byte[0] of the qpn.

The code to allocate qpns is modified to reset the qpn numbering when it
is detected that a value is located in byte[0] for a UD QP and it is a
qpn being requested for net dev use. If it is a regular UD QP then it is
allowable to have bits set in byte[0] of the qpn and provide the
previously normal behavior.

The code to free the qpn now checks for the AIP prefix value of 0x81 and
removes it from the qpn before being freed so that the lower 16 bit
number can be reused.
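
The numbering scheme can be sketched as below. The values 0x81 and the range 0x810000 to 0x81ffff come from the text above; the macro and function names are made up for this example, and "byte[0]" is read here as the most significant byte of the 24-bit QPN, which is the reading consistent with that range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the commit message; all names here are hypothetical. */
#define AIP_QPN_PREFIX 0x81u
#define AIP_QPN_SHIFT  16
#define AIP_QPN_MASK   0xffffu

/* Build a 24-bit QPN in the reserved range 0x810000..0x81ffff. */
static uint32_t aip_qpn_make(uint32_t low16)
{
	assert(low16 <= AIP_QPN_MASK);
	return (AIP_QPN_PREFIX << AIP_QPN_SHIFT) | low16;
}

/* True if the top byte of the 24-bit QPN carries the 0x81 prefix. */
static bool aip_qpn_is_accelerated(uint32_t qpn)
{
	return ((qpn >> AIP_QPN_SHIFT) & 0xffu) == AIP_QPN_PREFIX;
}

/* Strip the prefix so the lower 16-bit number can go back to the allocator. */
static uint32_t aip_qpn_strip(uint32_t qpn)
{
	return qpn & AIP_QPN_MASK;
}
```

Keeping the prefix out of the allocator's bookkeeping means the same 16-bit pool can back both the reserved range and its reuse after a free.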

This patch requires minor changes in the IB core and ipoib to facilitate
the creation of accelerated UD QPs.

Link: https://lore.kernel.org/r/20200511160607.173205.11757.stgit@awfm-01.aw.intel.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 11:23:54 -03:00
Danil Kipnis d6ea395072 rnbd/rtrs: Pass max segment size from blk user to the rdma library
When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined.
The rtrs is a transport library and should compile independently of the
block layer. The desired max segment size should be passed down by the
user.

Introduce max_segment_size parameter for the rtrs_clt_open() call.

Fixes: f7a7a5c228 ("block/rnbd: client: main functionality")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Fixes: cb80329c94 ("RDMA/rtrs: client: private header with client structs and functions")
Fixes: b5c27cdb09 ("RDMA/rtrs: public interface header to establish RDMA connections")
Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 20:43:26 -03:00
Wei Yongjun 6b31afcef5 RDMA/rtrs: server: Fix some error return code
Fix to return the negative error code -ENOMEM from some error handling
cases instead of 0, as done elsewhere in this function.

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Fixes: 91b11610af ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20200519091912.134358-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 20:42:25 -03:00
Gustavo A. R. Silva e198408670 RDMA/rtrs: client: Fix function return on success
Remove the if-statement and return the value contained in _err_,
unconditionally.

Link: https://lore.kernel.org/r/20200519163612.GA6043@embeddedor
Addresses-Coverity-ID: 1493753 ("Identical code for different branches")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 20:41:41 -03:00
Dan Carpenter bf1d8edb38 RDMA/rtrs: Fix a couple off by one bugs in rtrs_srv_rdma_done()
These > comparisons should be >= to prevent accessing one element beyond
the end of the buffer.
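
A minimal sketch of this class of bug, assuming a hypothetical buffer of NBUF elements (the constant and both function names are invented for illustration and do not appear in rtrs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUF 8 /* hypothetical number of buffer elements */

/* Buggy check: idx == NBUF passes, which is one element past the end. */
static bool idx_ok_buggy(size_t idx)
{
	return !(idx > NBUF);
}

/* Fixed check: only 0..NBUF-1 pass. */
static bool idx_ok_fixed(size_t idx)
{
	return !(idx >= NBUF);
}
```

The two checks differ only for idx equal to NBUF, so the bug stays invisible until a peer sends exactly that boundary value.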

Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200519154525.GA66801@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 20:40:21 -03:00
Dan Carpenter b386cd65d9 RDMA/rtrs: Fix some signedness bugs in error handling
The problem is that "req->sg_cnt" is an unsigned int so if "nr" is
negative, it gets type promoted to a high positive value and the condition
is false.  This patch fixes it by handling negatives separately.
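
The pitfall can be reproduced outside the kernel with the sketch below. The function names are hypothetical and the code only mirrors the shape of the problem described above; the cast in the buggy variant spells out the conversion the compiler applies implicitly when an int is compared with an unsigned int.

```c
#include <assert.h>
#include <errno.h>

/*
 * Buggy shape: a negative 'nr' is converted to a huge unsigned value, so
 * the outer condition is false and the error is silently treated as
 * success.
 */
static int check_mapped_buggy(int nr, unsigned int sg_cnt)
{
	if ((unsigned int)nr < sg_cnt) { /* what "nr < sg_cnt" really does */
		if (nr < 0)
			return nr; /* unreachable for negative nr */
		return -EINVAL;
	}
	return 0;
}

/* Fixed shape: handle negatives first, then compare the counts. */
static int check_mapped_fixed(int nr, unsigned int sg_cnt)
{
	if (nr < 0)
		return nr;
	if ((unsigned int)nr < sg_cnt)
		return -EINVAL;
	return 0;
}
```

With nr equal to -ENOMEM and sg_cnt equal to 4, the buggy variant returns 0 while the fixed one propagates -ENOMEM.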

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200519133223.GN2078@kadam
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-19 20:40:20 -03:00
Kamal Heib 23bbd5818e RDMA/srpt: Fix disabling device management
Avoid disabling device management for devices that don't support
management datagrams (MADs) by checking if the "mad_agent" pointer is
initialized before calling ib_modify_port(). Also fix the error flow in
srpt_refresh_port() to disable device management if
ib_register_mad_agent() fails.

Fixes: 09f8a1486d ("RDMA/srpt: Fix handling of SR-IOV and iWARP ports")
Link: https://lore.kernel.org/r/20200514114720.141139-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17 20:43:19 -03:00
Xiongfeng Wang a8f5c1f1a5 RDMA/srpt: Add a newline when printing parameter 'srpt_service_guid' by sysfs
When I cat module parameter 'srpt_service_guid', it displays as follows.
It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/ib_srpt/parameters/srpt_service_guid
0x0205cdfffe8346b9[root@hulk-202 ~]#
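
The change amounts to ending the formatted value with a newline. A userspace sketch of that formatting is shown below; format_guid() is a hypothetical name, and the kernel's actual parameter getter is not reproduced in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: print a 64-bit GUID as 0x plus 16 hex digits,
 * terminated by a newline so the shell prompt starts on its own line.
 * Returns the number of characters written, as snprintf() does.
 */
static int format_guid(char *buf, size_t len, unsigned long long guid)
{
	return snprintf(buf, len, "0x%016llx\n", guid);
}
```

With the trailing newline, the cat output above would end after the GUID and the prompt would no longer be glued to it.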

Link: https://lore.kernel.org/r/1589182629-27743-1-git-send-email-wangxiongfeng2@huawei.com
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17 20:37:52 -03:00
Jack Wang 745b6a3d4a RDMA/rtrs: a bit of documentation
README with a description of the major sysfs entries; the sysfs
documentation has been moved to the ABI dir as suggested by Bart.

Link: https://lore.kernel.org/r/20200511135131.27580-15-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17 18:57:15 -03:00
Jack Wang c013fbc1fd RDMA/rtrs: include client and server modules into kernel compilation
Add rtrs Makefile, Kconfig and also corresponding lines into upper layer
infiniband/ulp files.

Link: https://lore.kernel.org/r/20200511135131.27580-14-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17 18:57:15 -03:00
Jack Wang 91b11610af RDMA/rtrs: server: sysfs interface functions
This is the sysfs interface to rtrs sessions on server side:

  /sys/class/rtrs-server/<SESS-NAME>/
    *** rtrs session accepted from a client peer
    |
    |- paths/<SRC@DST>/
       *** established paths from a client in a session
       |
       |- disconnect
       |  *** disconnect path
       |
       |- hca_name
       |  *** HCA name
       |
       |- hca_port
       |  *** HCA port
       |
       |- stats/
          *** current path statistics
          |
          |- rdma

Link: https://lore.kernel.org/r/20200511135131.27580-13-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-17 18:57:15 -03:00