Go to file
Wen Gu 341adeec9a net/smc: Forward wakeup to smc socket waitqueue after fallback
When we replace TCP with SMC and a fallback occurs, there may be
some socket waitqueue entries remaining in smc socket->wq, such
as eppoll_entries inserted by userspace applications.

After the fallback, data flows over TCP/IP and only clcsocket->wq
will be woken up. Applications can't be notified by the entries
which were inserted in smc socket->wq before fallback. So we need
a mechanism to wake up smc socket->wq at the same time if some
entries remaining in it.

The current workaround is to transfer the entries from smc socket->wq
to clcsock->wq during the fallback. But this may cause a crash
like this:

 general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP PTI
 CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G E     5.16.0+ #107
 RIP: 0010:__wake_up_common+0x65/0x170
 Call Trace:
  <IRQ>
  __wake_up_common_lock+0x7a/0xc0
  sock_def_readable+0x3c/0x70
  tcp_data_queue+0x4a7/0xc40
  tcp_rcv_established+0x32f/0x660
  ? sk_filter_trim_cap+0xcb/0x2e0
  tcp_v4_do_rcv+0x10b/0x260
  tcp_v4_rcv+0xd2a/0xde0
  ip_protocol_deliver_rcu+0x3b/0x1d0
  ip_local_deliver_finish+0x54/0x60
  ip_local_deliver+0x6a/0x110
  ? tcp_v4_early_demux+0xa2/0x140
  ? tcp_v4_early_demux+0x10d/0x140
  ip_sublist_rcv_finish+0x49/0x60
  ip_sublist_rcv+0x19d/0x230
  ip_list_rcv+0x13e/0x170
  __netif_receive_skb_list_core+0x1c2/0x240
  netif_receive_skb_list_internal+0x1e6/0x320
  napi_complete_done+0x11d/0x190
  mlx5e_napi_poll+0x163/0x6b0 [mlx5_core]
  __napi_poll+0x3c/0x1b0
  net_rx_action+0x27c/0x300
  __do_softirq+0x114/0x2d2
  irq_exit_rcu+0xb4/0xe0
  common_interrupt+0xba/0xe0
  </IRQ>
  <TASK>

The crash is caused by privately transferring waitqueue entries from
smc socket->wq to clcsock->wq. The owners of these entries, such as
epoll, have no idea that the entries have been transferred to a
different socket wait queue and still use original waitqueue spinlock
(smc socket->wq.wait.lock) to make the entries operation exclusive,
but it doesn't work. The operations to the entries, such as removing
from the waitqueue (now is clcsock->wq after fallback), may cause a
crash when clcsock waitqueue is being iterated over at the moment.

This patch tries to fix this by no longer transferring wait queue
entries privately, but introducing own implementations of clcsock's
callback functions in fallback situation. The callback functions will
forward the wakeup to smc socket->wq if clcsock->wq is actually woken
up and smc socket->wq has remaining entries.

Fixes: 2153bd1e3d ("net/smc: Transfer remaining wait queue entries during fallback")
Suggested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31 11:07:13 +00:00
Documentation Networking fixes for 5.17-rc2, including fixes from netfilter and can. 2022-01-27 20:58:39 +02:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
arch ARM fixes for 5.17-rc: 2022-01-25 08:02:46 +02:00
block bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
certs certs: Fix build error when CONFIG_MODULE_SIG_KEY is empty 2022-01-23 00:08:44 +09:00
crypto lib/crypto: add prompts back to crypto libraries 2022-01-18 13:03:55 +01:00
drivers net: stmmac: properly handle with runtime pm in stmmac_dvr_remove() 2022-01-28 15:13:22 +00:00
fs NFS Client Updates for Linux 5.17 2022-01-25 20:16:03 +02:00
include ax25: add refcount in ax25_dev to avoid UAF bugs 2022-01-28 14:56:47 +00:00
init lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() 2022-01-22 08:33:37 +02:00
ipc proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
kernel powerpc fixes for 5.17 #2 2022-01-23 17:52:42 +02:00
lib bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
mm bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
net net/smc: Forward wakeup to smc socket waitqueue after fallback 2022-01-31 11:07:13 +00:00
samples Merge branch 'akpm' (patches from Andrew) 2022-01-20 10:41:01 +02:00
scripts Devicetree fixes for v5.17, take 1: 2022-01-22 09:52:17 +02:00
security fs.idmapped.v5.17 2022-01-11 14:26:55 -08:00
sound proc: remove PDE_DATA() completely 2022-01-22 08:33:37 +02:00
tools Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf 2022-01-27 18:53:02 -08:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-01-22 21:48:45 +09:00
virt Generic: 2022-01-22 09:40:01 +02:00
.clang-format genirq/msi: Make interrupt allocation less convoluted 2021-12-16 22:22:20 +01:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: update email address of Brian Silverman 2022-01-24 18:27:23 +01:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Removing Ohad from remoteproc/rpmsg maintenance 2021-12-08 10:09:40 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge tag 'ieee802154-for-net-2022-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan 2022-01-28 15:10:45 +00:00
Makefile Linux 5.17-rc1 2022-01-23 10:12:53 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.