OpenCloudOS-Kernel/fs
David Howells 8715fe2fc1 afs: Fix refcount underflow from error handling race
[ Upstream commit 52bf9f6c09fca8c74388cd41cc24e5d1bff812a9 ]

If an AFS cell that has an unreachable (eg. ENETUNREACH) server listed (VL
server or fileserver), an asynchronous probe to one of its addresses may
fail immediately because sendmsg() returns an error.  When this happens, a
refcount underflow can happen if certain events hit a very small window.

The way this occurs is:

 (1) There are two levels of "call" object, the afs_call and the
     rxrpc_call.  Each of them can be transitioned to a "completed" state
     in the event of success or failure.

 (2) Asynchronous afs_calls are self-referential whilst they are active to
     prevent them from evaporating when they're not being processed.  This
     reference is disposed of when the afs_call is completed.

     Note that an afs_call may only be completed once; once completed
     completing it again will do nothing.

 (3) When a call transmission is made, the app-side rxrpc code queues a Tx
     buffer for the rxrpc I/O thread to transmit.  The I/O thread invokes
     sendmsg() to transmit it - and in the case of failure, it transitions
     the rxrpc_call to the completed state.

 (4) When an rxrpc_call is completed, the app layer is notified.  In this
     case, the app is kafs and it schedules a work item to process events
     pertaining to an afs_call.

 (5) When the afs_call event processor is run, it goes down through the
     RPC-specific handler to afs_extract_data() to retrieve data from rxrpc
     - and, in this case, it picks up the error from the rxrpc_call and
     returns it.

     The error is then propagated to the afs_call and that is completed
     too.  At this point the self-reference is released.

 (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the
     window between rxrpc_send_data() queuing the request packet and
     checking for call completion on the way out, then
     rxrpc_kernel_send_data() will return the error from sendmsg() to the
     app.

 (7) Then afs_make_call() will see an error and will jump to the error
     handling path which will attempt to clean up the afs_call.

 (8) The problem comes when the error handling path in afs_make_call()
     tries to unconditionally drop an async afs_call's self-reference.
     This self-reference, however, may already have been dropped by
     afs_extract_data() completing the afs_call

 (9) The refcount underflows when we return to afs_do_probe_vlserver() and
     that tries to drop its reference on the afs_call.

Fix this by making afs_make_call() attempt to complete the afs_call rather
than unconditionally putting it.  That way, if afs_extract_data() manages
to complete the call first, afs_make_call() won't do anything.

The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and
sticking an msleep() in rxrpc_send_data() after the 'success:' label to
widen the race window.

The error message looks something like:

    refcount_t: underflow; use-after-free.
    WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
    ...
    RIP: 0010:refcount_warn_saturate+0xba/0x110
    ...
    afs_put_call+0x1dc/0x1f0 [kafs]
    afs_fs_get_capabilities+0x8b/0xe0 [kafs]
    afs_fs_probe_fileserver+0x188/0x1e0 [kafs]
    afs_lookup_server+0x3bf/0x3f0 [kafs]
    afs_alloc_server_list+0x130/0x2e0 [kafs]
    afs_create_volume+0x162/0x400 [kafs]
    afs_get_tree+0x266/0x410 [kafs]
    vfs_get_tree+0x25/0xc0
    fc_mount+0xe/0x40
    afs_d_automount+0x1b3/0x390 [kafs]
    __traverse_mounts+0x8f/0x210
    step_into+0x340/0x760
    path_openat+0x13a/0x1260
    do_filp_open+0xaf/0x160
    do_sys_openat2+0xaf/0x170

or something like:

    refcount_t: underflow; use-after-free.
    ...
    RIP: 0010:refcount_warn_saturate+0x99/0xda
    ...
    afs_put_call+0x4a/0x175
    afs_send_vl_probes+0x108/0x172
    afs_select_vlserver+0xd6/0x311
    afs_do_cell_detect_alias+0x5e/0x1e9
    afs_cell_detect_alias+0x44/0x92
    afs_validate_fc+0x9d/0x134
    afs_get_tree+0x20/0x2e6
    vfs_get_tree+0x1d/0xc9
    fc_mount+0xe/0x33
    afs_d_automount+0x48/0x9d
    __traverse_mounts+0xe0/0x166
    step_into+0x140/0x274
    open_last_lookups+0x1c1/0x1df
    path_openat+0x138/0x1c3
    do_filp_open+0x55/0xb4
    do_sys_openat2+0x6c/0xb6

Fixes: 34fa47612b ("afs: Fix race in async call refcounting")
Reported-by: Bill MacAllister <bill@ca-zephyr.org>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304
Suggested-by: Jeffrey E Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-20 17:01:43 +01:00
..
9p 9p: v9fs_listxattr: fix %s null argument warning 2023-11-28 17:19:46 +00:00
adfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
affs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
afs afs: Fix refcount underflow from error handling race 2023-12-20 17:01:43 +01:00
autofs v6.6-vfs.autofs 2023-08-28 11:39:14 -07:00
befs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
bfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
btrfs btrfs: fix 64bit compat send ioctl arguments not initializing version member 2023-12-08 08:52:20 +01:00
cachefiles - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
ceph assorted fixes all over the place 2023-10-27 16:44:58 -10:00
coda v6.6-vfs.ctime 2023-08-28 09:31:32 -07:00
configfs configfs: convert to ctime accessor functions 2023-07-13 10:28:05 +02:00
cramfs v6.6-vfs.super 2023-08-28 11:04:18 -07:00
crypto fscrypt: Replace 1-element array with flexible array 2023-05-23 19:46:09 -07:00
debugfs debugfs: Fix __rcu type comparison warning 2023-11-20 11:59:26 +01:00
devpts v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
dlm fs: dlm: Simplify buffer size computation in dlm_create_debug_file() 2023-11-20 11:59:37 +01:00
ecryptfs fs: Pass AT_GETATTR_NOSEC flag to getattr interface function 2023-12-03 07:33:03 +01:00
efivarfs efivarfs: fix statfs() on efivarfs 2023-09-11 09:10:02 +00:00
efs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
erofs erofs: fix erofs_insert_workgroup() lockref usage 2023-11-20 11:59:23 +01:00
exfat exfat: support handle zero-size directory 2023-11-28 17:19:44 +00:00
exportfs exportfs: remove kernel-doc warnings in exportfs 2023-08-29 17:45:22 -04:00
ext2 ext2: Fix ki_pos update for DIO buffered-io fallback case 2023-12-08 08:52:19 +01:00
ext4 ext4: fix warning in ext4_dio_write_end_io() 2023-12-20 17:01:42 +01:00
f2fs f2fs: split initial and dynamic conditions for extent_cache 2023-11-28 17:20:11 +00:00
fat for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
freevxfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
fscache
fuse fuse update for 6.6 2023-09-05 12:45:55 -07:00
gfs2 gfs2: don't withdraw if init_threads() got interrupted 2023-11-28 17:20:11 +00:00
hfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hfsplus for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hostfs hostfs: convert to ctime accessor functions 2023-07-24 10:30:00 +02:00
hpfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
hugetlbfs fs: use nth_page() in place of direct struct page manipulation 2023-11-28 17:20:05 +00:00
iomap iomap: fix short copy in iomap_write_iter() 2023-10-19 09:41:36 -07:00
isofs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
jbd2 jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev 2023-11-28 17:20:04 +00:00
jffs2 jffs2: convert to ctime accessor functions 2023-07-24 10:30:01 +02:00
jfs jfs: fix array-index-out-of-bounds in diAlloc 2023-11-28 17:19:43 +00:00
kernfs Driver core changes for 6.6-rc1 2023-09-01 09:43:18 -07:00
lockd SUNRPC: Add enum svc_auth_status 2023-08-29 17:45:22 -04:00
minix for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-09-18 12:03:46 -07:00
nfs NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO 2023-11-28 17:19:49 +00:00
nfs_common
nfsd NFSD: Fix checksum mismatches in the duplicate reply cache 2023-12-03 07:33:02 +01:00
nilfs2 nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() 2023-12-13 18:45:22 +01:00
nls nls: Hide new NLS_UCS2_UTILS 2023-08-31 12:07:34 -05:00
notify fanotify: limit reporting of event with non-decodeable file handles 2023-10-19 16:19:20 +02:00
ntfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
ntfs3 driver ntfs3 for linux 6.6 2023-10-19 09:10:18 -07:00
ocfs2 Many ext4 and jbd2 cleanups and bug fixes for v6.6-rc1. 2023-08-31 15:18:15 -07:00
omfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
openpromfs openpromfs: convert to ctime accessor functions 2023-07-24 10:30:03 +02:00
orangefs fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
overlayfs fs: Pass AT_GETATTR_NOSEC flag to getattr interface function 2023-12-03 07:33:03 +01:00
proc watchdog: move softlockup_panic back to early_param 2023-11-28 17:19:57 +00:00
pstore pstore/platform: Add check for kstrdup 2023-11-20 11:58:53 +01:00
qnx4 for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
qnx6 for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
quota quota: explicitly forbid quota files from being encrypted 2023-11-28 17:20:04 +00:00
ramfs ramfs: convert to ctime accessor functions 2023-07-24 10:30:04 +02:00
reiserfs reiserfs: Replace 1-element array with C99 style flex-array 2023-09-11 14:07:46 +02:00
romfs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
smb ksmbd: fix memory leak in smb2_lock() 2023-12-20 17:01:43 +01:00
squashfs squashfs: convert to ctime accessor functions 2023-07-24 10:30:05 +02:00
sysfs sysfs: Skip empty folders creation 2023-06-15 13:37:53 +02:00
sysv for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
tracefs eventfs: Check for NULL ef in eventfs_set_attr() 2023-11-20 11:59:38 +01:00
ubifs fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
udf \n 2023-08-30 12:10:50 -07:00
ufs for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
unicode
vboxsf v6.6-vfs.ctime 2023-08-28 09:31:32 -07:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-08-20 10:33:43 -07:00
xfs xfs: recovery should not clear di_flushiter unconditionally 2023-11-28 17:20:09 +00:00
zonefs New code for 6.6: 2023-08-28 11:59:52 -07:00
Kconfig for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
Kconfig.binfmt riscv: support the elf-fdpic binfmt loader 2023-08-23 14:17:43 -07:00
Makefile fs: add CONFIG_BUFFER_HEAD 2023-08-02 09:13:09 -06:00
aio.c aio: Annotate struct kioctx_table with __counted_by 2023-09-20 14:22:01 +02:00
anon_inodes.c
attr.c v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
bad_inode.c fs: drop the timespec64 argument from update_time 2023-08-11 09:04:57 +02:00
binfmt_elf.c Merge branch 'expand-stack' 2023-06-28 20:35:21 -07:00
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-09-29 17:20:45 -07:00
binfmt_elf_test.c
binfmt_flat.c
binfmt_misc.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
binfmt_script.c
buffer.c iomap: add a workaround for racy i_size updates on block devices 2023-09-25 08:55:00 -07:00
char_dev.c
compat_binfmt_elf.c
coredump.c v6.5/vfs.misc 2023-06-26 09:50:21 -07:00
d_path.c
dax.c mm: remove enum page_entry_size 2023-08-24 16:20:30 -07:00
dcache.c fs/dcache: Replace printk and WARN_ON by WARN 2023-08-19 13:41:11 +02:00
direct-io.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
drop_caches.c fs: drop_caches: draining pages before dropping caches 2023-08-18 10:12:11 -07:00
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-07-11 11:41:34 +02:00
eventpoll.c epoll: simplify ep_alloc() 2023-07-26 14:56:07 +02:00
exec.c - An extensive rework of kexec and crash Kconfig from Eric DeVolder 2023-08-29 14:53:51 -07:00
fcntl.c fcntl: Cast commands with int args explicitly 2023-07-10 14:36:11 +02:00
fhandle.c fsnotify: move fsnotify_open() hook into do_dentry_open() 2023-06-12 10:43:45 +02:00
file.c v6.6-vfs.misc 2023-08-28 10:17:14 -07:00
file_table.c fs: use __fput_sync in close(2) 2023-08-08 19:36:51 +02:00
filesystems.c
fs-writeback.c writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs 2023-11-20 11:58:52 +01:00
fs_context.c fs: factor out vfs_parse_monolithic_sep() helper 2023-10-12 18:53:36 +03:00
fs_parser.c
fs_pin.c
fs_struct.c kill do_each_thread() 2023-08-21 13:46:25 -07:00
fs_types.c
fsopen.c fs: add FSCONFIG_CMD_CREATE_EXCL 2023-08-14 18:48:02 +02:00
init.c
inode.c filemap: add a per-mapping stable writes flag 2023-12-03 07:33:03 +01:00
internal.h for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
ioctl.c v6.6-vfs.super 2023-08-28 11:04:18 -07:00
kernel_read_file.c fs: Fix kernel-doc warnings 2023-08-19 12:12:12 +02:00
libfs.c libfs: getdents() should return 0 after reaching EOD 2023-12-03 07:33:03 +01:00
locks.c NFSD 6.6 Release Notes 2023-08-31 15:32:18 -07:00
mbcache.c
mnt_idmapping.c
mount.h
mpage.c
namei.c audit,io_uring: io_uring openat triggers audit reference count underflow 2023-10-13 18:34:46 +02:00
namespace.c v6.5/vfs.mount 2023-06-26 10:27:04 -07:00
nsfs.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
open.c v6.6-vfs.fchmodat2 2023-08-28 11:25:27 -07:00
pipe.c fs/pipe: remove duplicate "offset" initializer 2023-09-20 14:22:01 +02:00
pnode.c fs: allow to mount beneath top mount 2023-05-19 04:30:22 +02:00
pnode.h fs: allow to mount beneath top mount 2023-05-19 04:30:22 +02:00
posix_acl.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
proc_namespace.c tty, proc, kernfs, random: Use copy_splice_read() 2023-05-24 08:42:16 -06:00
read_write.c fs: Fix one kernel-doc comment 2023-08-15 08:32:45 +02:00
readdir.c vfs: get rid of old '->iterate' directory operation 2023-08-06 15:08:35 +02:00
remap_range.c fs: use UB-safe check for signed addition overflow in remap_verify_area 2023-05-24 11:03:59 +02:00
select.c
seq_file.c
signalfd.c
splice.c - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
stack.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
stat.c fs: Pass AT_GETATTR_NOSEC flag to getattr interface function 2023-12-03 07:33:03 +01:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-17 15:20:17 +02:00
super.c fs: export sget_dev() 2023-08-31 12:47:15 +02:00
sync.c
sysctls.c sysctl: Refactor base paths registrations 2023-05-23 21:43:26 -07:00
timerfd.c
userfaultfd.c mm: userfaultfd: remove stale comment about core dump locking 2023-08-24 16:20:27 -07:00
utimes.c
xattr.c tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs 2023-08-22 10:57:46 +02:00