Go to file
Peter Oberparleiter cf245e831a s390/cio: fix invalid -EBUSY on ccw_device_start
commit 5ef1dc40ffa6a6cb968b0fdc43c3a61727a9e950 upstream.

The s390 common I/O layer (CIO) returns an unexpected -EBUSY return code
when drivers try to start I/O while a path-verification (PV) process is
pending. This can lead to failed device initialization attempts with
symptoms like broken network connectivity after boot.

Fix this by replacing the -EBUSY return code with a deferred condition
code 1 reply to make path-verification handling consistent from a
driver's point of view.

The problem can be reproduced semi-regularly using the following process,
while repeating steps 2-3 as necessary (example assumes an OSA device
with bus-IDs 0.0.a000-0.0.a002 on CHPID 0.02):

1. echo 0.0.a000,0.0.a001,0.0.a002 >/sys/bus/ccwgroup/drivers/qeth/group
2. echo 0 > /sys/bus/ccwgroup/devices/0.0.a000/online
3. echo 1 > /sys/bus/ccwgroup/devices/0.0.a000/online ; \
   echo on > /sys/devices/css0/chp0.02/status

Background information:

The common I/O layer starts path-verification I/Os when it receives
indications about changes in a device path's availability. This occurs
for example when hardware events indicate a change in channel-path
status, or when a manual operation such as a CHPID vary or configure
operation is performed.

If a driver attempts to start I/O while a PV is running, CIO reports a
successful I/O start (ccw_device_start() return code 0). Then, after
completion of PV, CIO synthesizes an interrupt response that indicates
an asynchronous status condition that prevented the start of the I/O
(deferred condition code 1).

If a PV indication arrives while a device is busy with driver-owned I/O,
PV is delayed until after I/O completion was reported to the driver's
interrupt handler. To ensure that PV can be started eventually, CIO
reports a device busy condition (ccw_device_start() return code -EBUSY)
if a driver tries to start another I/O while PV is pending.

In some cases this -EBUSY return code causes device drivers to consider
a device not operational, resulting in failed device initialization.

Note: The code that introduced the problem was added in 2003. Symptoms
started appearing with the following CIO commit that causes a PV
indication when a device is removed from the cio_ignore list after the
associated parent subchannel device was probed, but before online
processing of the CCW device has started:

2297791c92 ("s390/cio: dont unregister subchannel from child-drivers")

During boot, the cio_ignore list is modified by the cio_ignore dracut
module [1] as well as Linux vendor-specific systemd service scripts[2].
When combined, this commit and boot scripts cause a frequent occurrence
of the problem during boot.

[1] https://github.com/dracutdevs/dracut/tree/master/modules.d/81cio_ignore
[2] https://github.com/SUSE/s390-tools/blob/master/cio_ignore.service

Cc: stable@vger.kernel.org # v5.15+
Fixes: 2297791c92 ("s390/cio: dont unregister subchannel from child-drivers")
Tested-By: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-01 13:34:58 +01:00
Documentation docs: Instruct LaTeX to cope with deeper nesting 2024-03-01 13:34:58 +01:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
arch LoongArch: Update cpu_sibling_map when disabling nonboot CPUs 2024-03-01 13:34:58 +01:00
block block: Fix WARNING in _copy_from_iter 2024-03-01 13:34:49 +01:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto crypto: algif_hash - Remove bogus SGL free on zero-length error path 2024-02-23 09:25:11 +01:00
drivers s390/cio: fix invalid -EBUSY on ccw_device_start 2024-03-01 13:34:58 +01:00
fs btrfs: defrag: avoid unnecessary defrag caused by incorrect extent size 2024-03-01 13:34:58 +01:00
include xen/events: reduce externally visible helper functions 2024-03-01 13:34:57 +01:00
init update workarounds for gcc "asm goto" issue 2024-02-23 09:24:47 +01:00
io_uring io_uring/net: fix multishot accept overflow handling 2024-02-23 09:25:10 +01:00
ipc Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
kernel sched/rt: Disallow writing invalid values to sched_rt_period_us 2024-03-01 13:34:47 +01:00
lib Revert "kobject: Remove redundant checks for whether ktype is NULL" 2024-02-23 09:24:58 +01:00
mm blk-wbt: Fix detection of dirty-throttled tasks 2024-02-23 09:25:16 +01:00
net mptcp: corner case locking for rx path fields initialization 2024-03-01 13:34:57 +01:00
rust rust: upgrade to Rust 1.73.0 2024-02-16 19:10:43 +01:00
samples work around gcc bugs with 'asm goto' with outputs 2024-02-23 09:24:47 +01:00
scripts modpost: Add '.ltext' and '.ltext.*' to TEXT_SECTIONS 2024-02-23 09:25:03 +01:00
security lsm: fix the logic in security_inode_getsecctx() 2024-02-23 09:25:02 +01:00
sound ALSA: usb-audio: Ignore clock selector errors for single connection 2024-03-01 13:34:52 +01:00
tools tools: selftests: riscv: Fix compile warnings in mm tests 2024-03-01 13:34:48 +01:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: rename binkernel.spec to kernel.spec 2023-07-25 00:59:33 +09:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: add Catherine as xfs maintainer for 6.6.y 2024-02-16 19:10:43 +01:00
Makefile Linux 6.6.18 2024-02-23 09:25:28 +01:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.