linux-sg2042/net
Chuck Lever 431af645cf xprtrdma: Fix client lock-up after application signal fires
After a signal, the RPC client aborts synchronous RPCs running on
behalf of the signaled application.

The server is still executing those RPCs, and will write the results
back into the client's memory when it's done. By the time the server
writes the results, that memory is likely being used for other
purposes. Therefore xprtrdma has to immediately invalidate all
memory regions used by those aborted RPCs to prevent the server's
writes from clobbering that re-used memory.

With FMR memory registration, invalidation takes a relatively long
time. In fact, the invalidation is often still running when the
server tries to write the results into the memory regions that are
being invalidated.

This sets up a race between two processes:

1.  After the signal, xprt_rdma_free calls ro_unmap_safe.
2.  While ro_unmap_safe is still running, the server replies and
    rpcrdma_reply_handler runs, calling ro_unmap_sync.

Both processes invoke ib_unmap_fmr on the same FMR.

The mlx4 driver allows two ib_unmap_fmr calls on the same FMR at
the same time, but HCAs generally don't tolerate this. Sometimes
this can result in a system crash.

If the HCA happens to survive, rpcrdma_reply_handler continues. It
removes the rpc_rqst from rq_list and releases the transport_lock.
This enables xprt_rdma_free to run in another process, and the
rpc_rqst is released while rpcrdma_reply_handler is still waiting
for the ib_unmap_fmr call to finish.

But further down in rpcrdma_reply_handler, the transport_lock is
taken again, and "rqst" is dereferenced. If "rqst" has already been
released, this triggers a general protection fault. Since bottom-
halves are disabled, the system locks up.

Address both issues by reversing the order of the xprt_lookup_rqst
call and the ro_unmap_sync call. Introduce a separate lookup
mechanism for rpcrdma_req's to enable calling ro_unmap_sync before
xprt_lookup_rqst. Now the handler takes the transport_lock once
and holds it for the XID lookup and RPC completion.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=305
Fixes: 68791649a7 ('xprtrdma: Invalidate in the RPC reply ... ')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13 16:00:11 -04:00
..
6lowpan 6lowpan: Don't set IFF_NO_QUEUE 2017-04-12 22:02:40 +02:00
9p xen: fixes for 4.12 rc2 2017-05-19 15:06:48 -07:00
802 Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
8021q vlan: Keep NETIF_F_HW_CSUM similar to other software devices 2017-05-08 14:39:19 -04:00
appletalk lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
atm neighbour: fix nlmsg_pid in notifications 2017-03-22 10:48:49 -07:00
ax25 net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
batman-adv This feature/cleanup patchset includes the following patches: 2017-04-06 14:37:50 -07:00
bluetooth Bluetooth: Add selftest for ECDH key generation 2017-04-30 16:52:43 +03:00
bpf bpf: Align packet data properly in program testing framework. 2017-05-02 11:46:28 -04:00
bridge net: bridge: fix a null pointer dereference in br_afspec 2017-06-06 16:05:31 -04:00
caif sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
can can: fix CAN BCM build with CONFIG_PROC_FS disabled 2017-04-27 09:34:13 +02:00
ceph libceph: cleanup old messages according to reconnect seq 2017-05-24 18:10:51 +02:00
core devlink: fix potential memort leak 2017-06-05 11:24:28 -04:00
dcb net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-05-15 15:50:49 -07:00
decnet Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-05-09 15:42:31 -07:00
dns_resolver Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
dsa net: dsa: Fix stale cpu_switch reference after unbind then bind 2017-06-04 22:55:17 -04:00
ethernet Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2017-02-16 21:25:49 -05:00
hsr netlink: extended ACK reporting 2017-04-13 13:58:20 -04:00
ieee802154 netlink: pass extended ACK struct where available 2017-04-13 13:58:22 -04:00
ife net: Introduce ife encapsulation module 2017-02-03 15:16:45 -05:00
ipv4 net: ping: do not abuse udp_poll() 2017-06-04 22:56:55 -04:00
ipv6 net/ipv6: Fix CALIPSO causing GPF with datagram support 2017-06-06 15:18:20 -04:00
ipx ipx: call ipxitf_put() in ioctl error path 2017-05-02 15:34:53 -04:00
irda net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
iucv net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
kcm kcm: remove a useless copy_from_user() 2017-04-17 13:28:48 -04:00
key af_key: Fix slab-out-of-bounds in pfkey_compile_policy. 2017-05-08 08:03:01 +02:00
l2tp l2tp: remove useless device duplication test in l2tp_eth_create() 2017-04-27 16:32:13 -04:00
l3mdev
lapb Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
llc net: llc: add lock_sock in llc_ui_bind to avoid a race condition 2017-05-26 14:20:29 -04:00
mac80211 mac80211: fix dropped counter in multiqueue RX 2017-06-01 21:26:03 +02:00
mac802154 drivers: add explicit interrupt.h includes 2017-03-30 11:05:34 -07:00
mpls mpls: fix clearing of dead nh_flags on link up 2017-05-31 14:48:24 -04:00
ncsi
netfilter netfilter: ctnetlink: fix incorrect nf_ct_put during hash resize 2017-05-24 11:26:01 +02:00
netlabel netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
netlink netlink: don't send unknown nsid 2017-06-01 11:49:39 -04:00
netrom net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
nfc NFC 4.12 pull request 2017-04-21 15:29:40 -04:00
openvswitch netfilter: introduce nf_conntrack_helper_put helper function 2017-05-15 12:42:29 +02:00
packet net/packet: fix missing net_device reference release 2017-05-15 14:22:12 -04:00
phonet net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
psample net: Introduce psample, a new genetlink channel for packet sampling 2017-01-24 13:44:28 -05:00
qrtr Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-21 20:23:53 -07:00
rds Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
rfkill rfkill: remove rfkill-regulator 2017-01-24 11:07:35 +01:00
rose net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
rxrpc rxrpc: Trace client call connection 2017-04-06 11:10:41 +01:00
sched net: sched: cls_matchall: fix null pointer dereference 2017-05-22 14:54:16 -04:00
sctp sctp: fix ICMP processing if skb is non-linear 2017-05-26 14:40:46 -04:00
smc net/smc: Add warning about remote memory exposure 2017-05-16 14:49:43 -04:00
strparser strparser: destroy workqueue on module exit 2017-03-03 20:43:26 -08:00
sunrpc xprtrdma: Fix client lock-up after application signal fires 2017-07-13 16:00:11 -04:00
switchdev netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
tipc tipc: make macro tipc_wait_for_cond() smp safe 2017-05-11 22:19:30 -04:00
unix af_unix: Use designated initializers 2017-04-06 12:43:04 -07:00
vmw_vsock vsock: use new wait API for vsock_stream_sendmsg() 2017-05-22 14:39:36 -04:00
wimax
wireless cfg80211: make cfg80211_sched_scan_results() work from atomic context 2017-05-23 14:36:46 +02:00
x25 net: x25: fix one potential use-after-free issue 2017-05-18 10:05:40 -04:00
xfrm xfrm: fix state migration copy replay sequence numbers 2017-05-19 12:49:13 +02:00
Kconfig bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
Makefile bpf: introduce BPF_PROG_TEST_RUN command 2017-04-01 12:45:57 -07:00
compat.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
socket.c l2tp: device MTU setup, tunnel socket needs a lock 2017-04-17 13:01:48 -04:00
sysctl_net.c sysctl: Remove dead register_sysctl_root 2017-04-16 23:42:49 -05:00