Go to file
Al Viro a8023f8b55 close_range(): fix the logics in descriptor table trimming
commit 678379e1d4f7443b170939525d3312cfc37bf86b upstream.

Cloning a descriptor table picks the size that would cover all currently
opened files.  That's fine for clone() and unshare(), but for close_range()
there's an additional twist - we clone before we close, and it would be
a shame to have
	close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
leave us with a huge descriptor table when we are not going to keep
anything past stderr, just because some large file descriptor used to
be open before our call has taken it out.

Unfortunately, it had been dealt with in an inherently racy way -
sane_fdtable_size() gets a "don't copy anything past that" argument
(passed via unshare_fd() and dup_fd()), close_range() decides how much
should be trimmed and passes that to unshare_fd().

The problem is, a range that used to extend to the end of descriptor
table back when close_range() had looked at it might very well have stuff
grown after it by the time dup_fd() has allocated a new files_struct
and started to figure out the capacity of fdtable to be attached to that.

That leads to interesting pathological cases; at the very least it's a
QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
a snapshot of descriptor table one might have observed at some point.
Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
of unshare(CLONE_FILES) with plain close_range(), ending up with a
weird state that would never occur with unshare(2) is confusing, to put
it mildly.

It's not hard to get rid of - all it takes is passing both ends of the
range down to sane_fdtable_size().  There we are under ->files_lock,
so the race is trivially avoided.

So we do the following:
	* switch close_files() from calling unshare_fd() to calling
dup_fd().
	* undo the calling convention change done to unshare_fd() in
60997c3d45 "close_range: add CLOSE_RANGE_UNSHARE"
	* introduce struct fd_range, pass a pointer to that to dup_fd()
and sane_fdtable_size() instead of "trim everything past that point"
they are currently getting.  NULL means "we are not going to be punching
any holes"; NR_OPEN_MAX is gone.
	* make sane_fdtable_size() use find_last_bit() instead of
open-coding it; it's easier to follow that way.
	* while we are at it, have dup_fd() report errors by returning
ERR_PTR(), no need to use a separate int *errorp argument.

Fixes: 60997c3d45 "close_range: add CLOSE_RANGE_UNSHARE"
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-10 11:58:00 +02:00
Documentation arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386 2024-10-10 11:57:52 +02:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
arch riscv: Fix kernel stack size when KASAN is enabled 2024-10-10 11:57:53 +02:00
block blk_iocost: fix more out of bound shifts 2024-10-10 11:57:24 +02:00
certs certs: Reference revocation list for all keyrings 2023-08-17 20:12:41 +00:00
crypto crypto: simd - Do not call crypto_alloc_tfm during registration 2024-10-10 11:57:26 +02:00
drivers net: pcs: xpcs: fix the wrong register that was written back 2024-10-10 11:57:58 +02:00
fs close_range(): fix the logics in descriptor table trimming 2024-10-10 11:58:00 +02:00
include close_range(): fix the logics in descriptor table trimming 2024-10-10 11:58:00 +02:00
init rust: fix the default format for CONFIG_{RUSTC,BINDGEN}_VERSION_TEXT 2024-08-29 17:33:29 +02:00
io_uring io_uring/sqpoll: do not put cpumask on stack 2024-10-04 16:29:44 +02:00
ipc sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table) 2024-08-11 12:47:13 +02:00
kernel close_range(): fix the logics in descriptor table trimming 2024-10-10 11:58:00 +02:00
lib mm/filemap: optimize filemap folio adding 2024-10-04 16:30:02 +02:00
mm mm: krealloc: consider spare memory for __GFP_ZERO 2024-10-10 11:57:50 +02:00
net mac802154: Fix potential RCU dereference issue in mac802154_scan_worker 2024-10-10 11:57:59 +02:00
rust rust: sync: require `T: Sync` for `LockedBy::access` 2024-10-10 11:57:44 +02:00
samples samples/bpf: Fix compilation errors with cf-protection option 2024-10-04 16:29:19 +02:00
scripts scripts: kconfig: merge_config: config files: add a trailing newline 2024-09-18 19:24:05 +02:00
security tomoyo: fallback to realpath if symlink's pathname does not exist 2024-10-10 11:57:57 +02:00
sound ALSA: hda/realtek: Add a quirk for HP Pavilion 15z-ec200 2024-10-10 11:57:46 +02:00
tools rtla: Fix the help text in osnoise and timerlat top tools 2024-10-10 11:58:00 +02:00
usr initramfs: Encode dependency on KBUILD_BUILD_TIMESTAMP 2023-06-06 17:54:49 +09:00
virt KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock 2024-10-04 16:29:47 +02:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Remove *.orig pattern from .gitignore 2024-10-04 16:29:44 +02:00
.mailmap 20 hotfixes. 12 are cc:stable and the remainder address post-6.5 issues 2023-10-24 09:52:16 -10:00
.rustfmt.toml rust: add `.rustfmt.toml` 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS membarrier: riscv: Add full memory barrier in switch_mm() 2024-09-12 11:11:45 +02:00
Makefile Linux 6.6.54 2024-10-04 16:30:05 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.