OpenCloudOS-Kernel/kernel
Petr Pavlu 0254127ab9 module: Don't wait for GOING modules
During a system boot, it can happen that the kernel receives a burst of
requests to insert the same module but loading it eventually fails
during its init call. For instance, udev can make a request to insert
a frequency module for each individual CPU when another frequency module
is already loaded which causes the init function of the new module to
return an error.

Since commit 6e6de3dee5 ("kernel/module.c: Only return -EEXIST for
modules that have finished loading"), the kernel waits for modules in
MODULE_STATE_GOING state to finish unloading before making another
attempt to load the same module.

This creates unnecessary work in the described scenario and delays the
boot. In the worst case, it can prevent udev from loading drivers for
other devices and might cause timeouts of services waiting on them and
subsequently a failed boot.

This patch attempts a different solution for the problem 6e6de3dee5
was trying to solve. Rather than waiting for the unloading to complete,
it returns a different error code (-EBUSY) for modules in the GOING
state. This should avoid the error situation that was described in
6e6de3dee5 (user space attempting to load a dependent module because
the -EEXIST error code would suggest to user space that the first module
had been loaded successfully), while avoiding the delay situation too.

This has been tested on linux-next since December 2022 and passes
all kmod selftests except test 0009 with module compression enabled
but it has been confirmed that this issue has existed and has gone
unnoticed since prior to this commit and can also be reproduced without
module compression with a simple usleep(5000000) on tools/modprobe.c [0].
These failures are caused by hitting the kernel mod_concurrent_max and can
happen either due to a self inflicted kernel module auto-loead DoS somehow
or on a system with large CPU count and each CPU count incorrectly triggering
many module auto-loads. Both of those issues need to be fixed in-kernel.

[0] https://lore.kernel.org/all/Y9A4fiobL6IHp%2F%2FP@bombadil.infradead.org/

Fixes: 6e6de3dee5 ("kernel/module.c: Only return -EEXIST for modules that have finished loading")
Co-developed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Cc: stable@vger.kernel.org
Reviewed-by: Petr Mladek <pmladek@suse.com>
[mcgrof: enhance commit log with testing and kmod test result interpretation ]
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-01-24 12:52:52 -08:00
..
bpf bpf: Fix pointer-leak due to insufficient speculative store bypass mitigation 2023-01-13 17:18:35 +01:00
cgroup MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
configs mm, slob: rename CONFIG_SLOB to CONFIG_SLOB_DEPRECATED 2022-12-01 00:09:20 +01:00
debug kdb: use srcu console list iterator 2022-12-02 11:25:00 +01:00
dma dma-mapping: reject GFP_COMP for noncoherent allocations 2022-12-21 08:45:38 +01:00
entry entry: kmsan: introduce kmsan_unpoison_entry_regs() 2022-10-03 14:03:25 -07:00
events perf/core: Call LSM hook after copying perf_event_attr 2022-12-27 12:44:01 +01:00
futex - Prevent the leaking of a debug timer in futex_waitv() 2023-01-01 11:15:05 -08:00
gcov gcov: add support for checksum field 2022-12-21 14:31:52 -08:00
irq genirq/msi: Return MSI_XA_DOMAIN_SIZE as the maximum MSI index when no domain is present 2022-12-16 14:04:04 +00:00
kcsan kcsan: test: don't put the expect array on the stack 2023-01-02 08:59:33 -08:00
livepatch modules changes for v6.2-rc1 2022-12-13 14:05:39 -08:00
locking - Prevent the leaking of a debug timer in futex_waitv() 2023-01-01 11:15:05 -08:00
module module: Don't wait for GOING modules 2023-01-24 12:52:52 -08:00
power PM: sleep: Refine error message in try_to_freeze_tasks() 2022-12-06 12:04:34 +01:00
printk Merge branch 'rework/console-list-lock' into for-linus 2023-01-19 14:56:38 +01:00
rcu Urgent RCU pull request for v6.2 2022-12-21 07:59:57 -08:00
sched sched/core: Fix NULL pointer access fault in sched_setaffinity() with non-SMP configs 2023-01-16 10:07:25 +01:00
time time: Fix various kernel-doc problems 2023-01-03 11:07:58 +01:00
trace bpf: Skip task with pid=1 in send_signal_common() 2023-01-06 22:48:58 +01:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels" 2022-03-31 10:36:55 +02:00
Makefile kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
acct.c acct: fix potential integer overflow in encode_comp_t() 2022-11-30 16:13:18 -08:00
async.c Revert "module, async: async_synchronize_full() on module init iff async is used" 2022-02-03 11:20:34 -08:00
audit.c audit: use time_after to compare time 2022-08-29 19:47:03 -04:00
audit.h audit: remove selinux_audit_rule_update() declaration 2022-09-07 11:30:15 -04:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-08-22 18:50:06 -04:00
audit_tree.c audit: use fsnotify group lock helpers 2022-04-25 14:37:28 +02:00
audit_watch.c audit_init_parent(): constify path 2022-09-01 17:39:30 -04:00
auditfilter.c audit/stable-5.17 PR 20220110 2022-01-11 13:08:21 -08:00
auditsc.c audit: unify audit_filter_{uring(), inode_name(), syscall()} 2022-10-17 14:24:42 -04:00
backtracetest.c
bounds.c mm: multi-gen LRU: minimal implementation 2022-09-26 19:46:09 -07:00
capability.c caps: use type safe idmapping helpers 2022-10-26 10:02:39 +02:00
cfi.c cfi: Switch to -fsanitize=kcfi 2022-09-26 10:13:13 -07:00
compat.c
configs.c
context_tracking.c MAINTAINERS: Add Paul as context tracking maintainer 2022-07-05 13:33:00 -07:00
cpu.c cpu/hotplug: Do not bail-out in DYING/STARTING sections 2022-12-02 12:43:02 +01:00
cpu_pm.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
crash_core.c vmcoreinfo: warn if we exceed vmcoreinfo data size 2022-11-30 16:13:17 -08:00
crash_dump.c
cred.c cred: Do not default to init_cred in prepare_kernel_cred() 2022-11-01 10:04:52 -07:00
delayacct.c delayacct: support re-entrance detection of thrashing accounting 2022-09-26 19:46:07 -07:00
dma.c
exec_domain.c
exit.c exit: Use READ_ONCE() for all oops/warn limit reads 2022-12-16 12:26:57 -08:00
extable.c context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
fail_function.c fail_function: fix wrong use of fei_attr_remove() 2022-09-11 21:55:11 -07:00
fork.c New Feature: 2022-12-17 14:06:53 -06:00
freezer.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
gen_kheaders.sh kheaders: explicitly validate existence of cpio command 2023-01-18 16:34:04 +09:00
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c sched: Fix more TASK_state comparisons 2022-09-30 16:50:39 +02:00
iomem.c
irq_work.c irq_work: use kasan_record_aux_stack_noalloc() record callstack 2022-04-15 14:49:55 -07:00
jump_label.c jump_label: Prevent key->enabled int overflow 2022-12-01 15:53:05 -08:00
kallsyms.c kallsyms: Add self-test facility 2022-11-15 00:42:02 -08:00
kallsyms_internal.h kallsyms: Reduce the memory occupied by kallsyms_seqs_of_names[] 2022-11-12 18:47:36 -08:00
kallsyms_selftest.c kallsyms: Fix scheduling with interrupts disabled in self-test 2023-01-13 15:09:08 -08:00
kallsyms_selftest.h kallsyms: Add self-test facility 2022-11-15 00:42:02 -08:00
kcmp.c
kcov.c kcov: kmsan: unpoison area->list in kcov_remote_area_put() 2022-10-03 14:03:23 -07:00
kexec.c panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kexec_core.c kexec: remove the unneeded result variable 2022-11-18 13:55:07 -08:00
kexec_elf.c
kexec_file.c kexec: replace crash_mem_range with range 2022-11-18 13:55:07 -08:00
kexec_internal.h panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kheaders.c
kmod.c
kprobes.c kprobes: kretprobe events missing on 2-core KVM guest 2022-12-15 08:48:40 +09:00
ksysfs.c kernel/ksysfs.c: export kernel cpu byteorder 2022-11-10 19:07:31 +01:00
kthread.c signal: break out of wait loops on kthread_stop() 2022-10-09 16:01:59 -07:00
latencytop.c latencytop: use the last element of latency_record of system 2022-09-11 21:55:12 -07:00
module_signature.c
notifier.c notifier: repair slips in kernel-doc comments 2022-11-30 19:32:30 +01:00
nsproxy.c fs/exec: switch timens when a task gets a new mm 2022-10-25 15:15:52 -07:00
padata.c Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
panic.c kernel hardening fixes for v6.2-rc1 2022-12-23 12:00:24 -08:00
params.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
pid.c gfs2: Add glockfd debugfs file 2022-06-29 13:07:16 +02:00
pid_namespace.c kernel: pid_namespace: use NULL instead of using plain integer as pointer 2022-04-29 14:38:00 -07:00
profile.c kernel/profile.c: simplify duplicated code in profile_setup() 2022-09-11 21:55:12 -07:00
ptrace.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
range.c
reboot.c kernel/reboot: Add SYS_OFF_MODE_RESTART_PREPARE mode 2022-10-04 15:59:36 +02:00
regset.c
relay.c relay: fix type mismatch when allocating memory in relay_create_buf() 2022-12-11 19:30:19 -08:00
resource.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
resource_kunit.c
rseq.c rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered 2022-11-14 09:58:32 +01:00
scftorture.c scftorture: Fix distribution of short handler delays 2022-04-11 17:07:29 -07:00
scs.c scs: add support for dynamic shadow call stacks 2022-11-09 18:06:35 +00:00
seccomp.c seccomp: Add wait_killable semantic to seccomp user notifier 2022-05-03 14:11:58 -07:00
signal.c hardening updates for v6.2-rc1 2022-12-14 12:20:00 -08:00
smp.c bitmap patches for v6.1-rc1 2022-10-10 12:49:34 -07:00
smpboot.c smpboot: use atomic_try_cmpxchg in cpu_wait_death and cpu_report_death 2022-09-11 21:55:10 -07:00
smpboot.h
softirq.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
stackleak.c stackleak: add on/off stack variants 2022-05-08 01:33:09 -07:00
stacktrace.c uaccess: remove CONFIG_SET_FS 2022-02-25 09:36:06 +01:00
static_call.c static_call: Don't make __static_call_return0 static 2022-04-05 09:59:38 +02:00
static_call_inline.c static_call: Add call depth tracking support 2022-10-17 16:41:16 +02:00
stop_machine.c Scheduler changes in this cycle were: 2022-05-24 11:11:13 -07:00
sys.c prlimit: do_prlimit needs to have a speculation check 2023-01-21 16:14:17 +01:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-08-20 15:17:45 -07:00
sysctl-test.c kernel/sysctl-test: use SYSCTL_{ZERO/ONE_HUNDRED} instead of i_{zero/one_hundred} 2022-09-08 16:56:45 -07:00
sysctl.c MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
task_work.c task_work: use try_cmpxchg in task_work_add, task_work_cancel_match and task_work_run 2022-09-11 21:55:10 -07:00
taskstats.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
torture.c torture: Wake up kthreads after storing task_struct pointer 2022-02-01 17:24:39 -08:00
tracepoint.c tracepoint: Optimize the critical region of mutex_lock in tracepoint_module_coming() 2022-09-26 13:01:18 -04:00
tsacct.c taskstats: version 12 with thread group and exe info 2022-04-29 14:38:03 -07:00
ucount.c ucounts: Split rlimit and ucount values and max values 2022-05-18 18:24:57 -05:00
uid16.c
uid16.h
umh.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
up.c
user-return-notifier.c
user.c kernel/user: Allow user_struct::locked_vm to be usable for iommufd 2022-11-30 20:16:49 -04:00
user_namespace.c ucounts: Split rlimit and ucount values and max values 2022-10-09 16:24:05 -07:00
usermode_driver.c blob_to_mnt(): kern_unmount() is needed to undo kern_mount() 2022-05-19 23:25:47 -04:00
utsname.c
utsname_sysctl.c kernel/utsname_sysctl.c: Fix hostname polling 2022-10-23 12:01:01 -07:00
watch_queue.c This was a moderately busy cycle for documentation, but nothing all that 2022-08-02 19:24:24 -07:00
watchdog.c powerpc updates for 6.0 2022-08-06 16:38:17 -07:00
watchdog_hld.c Revert "printk: add functions to prefer direct printing" 2022-06-23 18:41:40 +02:00
workqueue.c workqueue: Make queue_rcu_work() use call_rcu_hurry() 2022-11-30 13:17:05 -08:00
workqueue_internal.h