OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
YiFei Zhu	445247b023	xtensa: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for xtensa. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/79669648ba167d668ea6ffb4884250abcd5ed254.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:35 -08:00
YiFei Zhu	4c18bc054b	sh: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for sh. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/61ae084cd4783b9b50860d9dedb4a348cf1b7b6f.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:35 -08:00
YiFei Zhu	c09058eda2	s390: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for s390. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/a381b10aa2c5b1e583642f3cd46ced842d9d4ce5.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:35 -08:00
YiFei Zhu	673a11a7e4	riscv: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for riscv. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/58ef925d00505cbb77478fa6bd2b48ab2d902460.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:35 -08:00
YiFei Zhu	e7bcb4622d	powerpc: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for powerpc. __LITTLE_ENDIAN__ is used here instead of CONFIG_CPU_LITTLE_ENDIAN to keep it consistent with asm/syscall.h. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/0b64925362671cdaa26d01bfe50b3ba5e164adfd.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:35 -08:00
YiFei Zhu	6aa7923c87	parisc: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for parisc. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/9bb86c546eda753adf5270425e7353202dbce87c.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
YiFei Zhu	6e9ae6f988	csky: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for csky. Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/f9219026d4803b22f3e57e3768b4e42e004ef236.1605101222.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
Kees Cook	424c9102fa	arm: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for arm. Signed-off-by: Kees Cook <keescook@chromium.org>	2020-11-20 11:16:34 -08:00
Kees Cook	ffde703470	arm64: Enable seccomp architecture tracking To enable seccomp constant action bitmaps, we need to have a static mapping to the audit architecture and system call table size. Add these for arm64. Signed-off-by: Kees Cook <keescook@chromium.org>	2020-11-20 11:16:34 -08:00
Kees Cook	192cf32243	selftests/seccomp: Compare bitmap vs filter overhead As part of the seccomp benchmarking, include the expectations with regard to the timing behavior of the constant action bitmaps, and report inconsistencies better. Example output with constant action bitmaps on x86: $ sudo ./seccomp_benchmark 100000000 Current BPF sysctl settings: net.core.bpf_jit_enable = 1 net.core.bpf_jit_harden = 0 Benchmarking 200000000 syscalls... 129.359381409 - 0.008724424 = 129350656985 (129.4s) getpid native: 646 ns 264.385890006 - 129.360453229 = 135025436777 (135.0s) getpid RET_ALLOW 1 filter (bitmap): 675 ns 399.400511893 - 264.387045901 = 135013465992 (135.0s) getpid RET_ALLOW 2 filters (bitmap): 675 ns 545.872866260 - 399.401718327 = 146471147933 (146.5s) getpid RET_ALLOW 3 filters (full): 732 ns 696.337101319 - 545.874097681 = 150463003638 (150.5s) getpid RET_ALLOW 4 filters (full): 752 ns Estimated total seccomp overhead for 1 bitmapped filter: 29 ns Estimated total seccomp overhead for 2 bitmapped filters: 29 ns Estimated total seccomp overhead for 3 full filters: 86 ns Estimated total seccomp overhead for 4 full filters: 106 ns Estimated seccomp entry overhead: 29 ns Estimated seccomp per-filter overhead (last 2 diff): 20 ns Estimated seccomp per-filter overhead (filters / 4): 19 ns Expectations: native ≤ 1 bitmap (646 ≤ 675): ✔️ native ≤ 1 filter (646 ≤ 732): ✔️ per-filter (last 2 diff) ≈ per-filter (filters / 4) (20 ≈ 19): ✔️ 1 bitmapped ≈ 2 bitmapped (29 ≈ 29): ✔️ entry ≈ 1 bitmapped (29 ≈ 29): ✔️ entry ≈ 2 bitmapped (29 ≈ 29): ✔️ native + entry + (per filter * 4) ≈ 4 filters total (755 ≈ 752): ✔️ [YiFei: Changed commit message to show stats for this patch series] Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/1b61df3db85c5f7f1b9202722c45e7b39df73ef2.1602431034.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
Kees Cook	25db91209a	x86: Enable seccomp architecture tracking Provide seccomp internals with the details to calculate which syscall table the running kernel is expecting to deal with. This allows for efficient architecture pinning and paves the way for constant-action bitmaps. Co-developed-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/da58c3733d95c4f2115dd94225dfbe2573ba4d87.1602431034.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
YiFei Zhu	8e01b51a31	seccomp/cache: Add "emulator" to check if filter is constant allow SECCOMP_CACHE will only operate on syscalls that do not access any syscall arguments or instruction pointer. To facilitate this we need a static analyser to know whether a filter will return allow regardless of syscall arguments for a given architecture number / syscall number pair. This is implemented here with a pseudo-emulator, and stored in a per-filter bitmap. In order to build this bitmap at filter attach time, each filter is emulated for every syscall (under each possible architecture), and checked for any accesses of struct seccomp_data that are not the "arch" nor "nr" (syscall) members. If only "arch" and "nr" are examined, and the program returns allow, then we can be sure that the filter must return allow independent from syscall arguments. Nearly all seccomp filters are built from these cBPF instructions: BPF_LD \| BPF_W \| BPF_ABS BPF_JMP \| BPF_JEQ \| BPF_K BPF_JMP \| BPF_JGE \| BPF_K BPF_JMP \| BPF_JGT \| BPF_K BPF_JMP \| BPF_JSET \| BPF_K BPF_JMP \| BPF_JA BPF_RET \| BPF_K BPF_ALU \| BPF_AND \| BPF_K Each of these instructions are emulated. Any weirdness or loading from a syscall argument will cause the emulator to bail. The emulation is also halted if it reaches a return. In that case, if it returns an SECCOMP_RET_ALLOW, the syscall is marked as good. Emulator structure and comments are from Kees [1] and Jann [2]. Emulation is done at attach time. If a filter depends on more filters, and if the dependee does not guarantee to allow the syscall, then we skip the emulation of this syscall. [1] https://lore.kernel.org/lkml/20200923232923.3142503-5-keescook@chromium.org/ [2] https://lore.kernel.org/lkml/CAG48ez1p=dR_2ikKq=xVxkoGg0fYpTBpkhJSv1w-6BG=76PAvw@mail.gmail.com/ Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Reviewed-by: Jann Horn <jannh@google.com> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/71c7be2db5ee08905f41c3be5c1ad6e2601ce88f.1602431034.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
YiFei Zhu	f9d480b6ff	seccomp/cache: Lookup syscall allowlist bitmap for fast path The overhead of running Seccomp filters has been part of some past discussions [1][2][3]. Oftentimes, the filters have a large number of instructions that check syscall numbers one by one and jump based on that. Some users chain BPF filters which further enlarge the overhead. A recent work [6] comprehensively measures the Seccomp overhead and shows that the overhead is non-negligible and has a non-trivial impact on application performance. We observed some common filters, such as docker's [4] or systemd's [5], will make most decisions based only on the syscall numbers, and as past discussions considered, a bitmap where each bit represents a syscall makes most sense for these filters. The fast (common) path for seccomp should be that the filter permits the syscall to pass through, and failing seccomp is expected to be an exceptional case; it is not expected for userspace to call a denylisted syscall over and over. When it can be concluded that an allow must occur for the given architecture and syscall pair (this determination is introduced in the next commit), seccomp will immediately allow the syscall, bypassing further BPF execution. Each architecture number has its own bitmap. The architecture number in seccomp_data is checked against the defined architecture number constant before proceeding to test the bit against the bitmap with the syscall number as the index of the bit in the bitmap, and if the bit is set, seccomp returns allow. The bitmaps are all clear in this patch and will be initialized in the next commit. When only one architecture exists, the check against architecture number is skipped, suggested by Kees Cook [7]. [1] https://lore.kernel.org/linux-security-module/c22a6c3cefc2412cad00ae14c1371711@huawei.com/T/ [2] https://lore.kernel.org/lkml/202005181120.971232B7B@keescook/T/ [3] https://github.com/seccomp/libseccomp/issues/116 [4] `ae0ef82b90/profiles/seccomp/default.json` [5] `6743a1caf4/src/shared/seccomp-util.c (L270)` [6] Draco: Architectural and Operating System Support for System Call Security https://tianyin.github.io/pub/draco.pdf, MICRO-53, Oct. 2020 [7] https://lore.kernel.org/bpf/202010091614.8BB0EB64@keescook/ Co-developed-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu> Signed-off-by: Dimitrios Skarlatos <dskarlat@cs.cmu.edu> Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> Reviewed-by: Jann Horn <jannh@google.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/10f91a367ec4fcdea7fc3f086de3f5f13a4a7436.1602431034.git.yifeifz2@illinois.edu	2020-11-20 11:16:34 -08:00
Linus Torvalds	09162bc32c	Linux 5.10-rc4	2020-11-15 16:44:31 -08:00
Linus Torvalds	a6af8718b9	drm nouveau fixes for 5.10-rc4 nouveau: - atomic modesetting regression fix - ttm pre-nv50 fix - connector NULL ptr deref fix -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJfsZIJAAoJEAx081l5xIa+2HkP/AzENGcpenWcpJf+qabAGS7B 3ofu7AOFjLsnXT3PFGEPsoWCk4v0Eu9o0E9V2xevVVwDdoNue+fZ9cHkIoNsD6cL iDOmZPWcmyuAKDJedBESAP9ivjzmRVOwCPaTWINuhkOqFBgEmBXT/npLyg5iT36l vYWx1MCFGNvFadTfiBbwc+rNi1qNhPX3+TEUD0Tki7UUkB6Q+Yzifc5KSQAHmxnq ACdeKB6uHWvQzzw4dLYYm5I2iUfqpnC++otqjAtpjhiIx2Iuus1vZWiyTK2WfAEy Q7R4DkI4u9r/BiiAxoAjiSceyaAxL2dlbDyMr6dfoGQrffzaiM6UOHwB9FaSfe2b G5s7VJEQj1Bl42OvVuH+X8iZlzkPhh4SXfP02/nDhnYhz6agcLOFndcLGCeHp4AX Om5RPMH23p7bHJ92YqvTHzgCZHtG4SXc/fUG8KLFlRd+6Xgbl7mprfO/+H8Jvyt/ nrLzITNksvXY3zPXOrMERzNo0658sj4KuxOOXwz/eRDTah2DQe0xxu5tMRzUiclm plC4zEmRLusIcyrM7Aa8APh5GxsJ4eyOj2wx5V508eUhFIEyZ/ia83K/7olUG7/+ HGXTHtF3qd3CejGtya27ejgiNXRUTqniUvFFT8VQqISocEJ8R2zEO6FH0cdk+rnL 7baFeSvHzSfEEc3Wmls8 =bjmv -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2020-11-16' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Nouveau fixes: - atomic modesetting regression fix - ttm pre-nv50 fix - connector NULL ptr deref fix" * tag 'drm-fixes-2020-11-16' of git://anongit.freedesktop.org/drm/drm: drm/nouveau/kms/nv50-: Use atomic encoder callbacks everywhere drm/nouveau/ttm: avoid using nouveau_drm.ttm.type_vram prior to nv50 drm/nouveau/kms: Fix NULL pointer dereference in nouveau_connector_detect_depth	2020-11-15 13:07:36 -08:00
Dave Airlie	8f598d15ee	Merge branch 'linux-5.10' of git://github.com/skeggsb/linux into drm-fixes - atomic modesetting regression fix - ttm pre-nv50 fix - connector NULL ptr deref fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Ben Skeggs <skeggsb@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv5D9p78MNN0OxVeRZxN8LDqcadJEGUEFCgWJQ6+_rjPuw@mail.gmail.com	2020-11-16 06:36:31 +10:00
Linus Torvalds	9cfd9c4599	Char/Misc driver fixes for 5.10-rc4 Here are some small char/misc/whatever driver fixes for 5.10-rc4. Nothing huge, lots of small fixes for reported issues: - habanalabs driver fixes - speakup driver fixes - uio driver fixes - virtio driver fix - other tiny driver fixes Full details are in the shortlog. All of these have been in linux-next for a full week with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX7E69w8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykCzgCgmxhxo/A/fnBiZxVgIQjL9KK791wAnjjcypF4 yLivbpFLSMLTUb46sxSL =bFxo -----END PGP SIGNATURE----- Merge tag 'char-misc-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc/whatever driver fixes for 5.10-rc4. Nothing huge, lots of small fixes for reported issues: - habanalabs driver fixes - speakup driver fixes - uio driver fixes - virtio driver fix - other tiny driver fixes Full details are in the shortlog. All of these have been in linux-next for a full week with no reported issues" * tag 'char-misc-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: uio: Fix use-after-free in uio_unregister_device() firmware: xilinx: fix out-of-bounds access nitro_enclaves: Fixup type and simplify logic of the poll mask setup speakup ttyio: Do not schedule() in ttyio_in_nowait speakup: Fix clearing selection in safe context speakup: Fix var_id_t values and thus keymap virtio: virtio_console: fix DMA memory allocation for rproc serial habanalabs/gaudi: mask WDT error in QMAN habanalabs/gaudi: move coresight mmu config habanalabs: fix kernel pointer type mei: protect mei_cl_mtu from null dereference	2020-11-15 10:15:17 -08:00
Linus Torvalds	281b3ec3a7	USB/Thunderbolt fixes for 5.10-rc4 Here are some small Thunderbolt and USB driver fixes for 5.10-rc4 to solve some reported issues. Nothing huge in here, just small things: - thunderbolt memory leaks fixed and new device ids added - revert of problem patch for the musb driver - new quirks added for USB devices - typec power supply fixes to resolve much reported problems about charging notifications not working anymore All except the cdc-acm driver quirk addition have been in linux-next with no reported issues (the quirk patch was applied on Friday, and is self-contained.) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX7E6EA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymHNwCePOBlfVmcH3eEqVTByPdAG+L5m7MAnRLRqmMw aPpNB/a0CRSPxH+5Z+nV =Vi/z -----END PGP SIGNATURE----- Merge tag 'usb-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and Thunderbolt fixes from Greg KH: "Here are some small Thunderbolt and USB driver fixes for 5.10-rc4 to solve some reported issues. Nothing huge in here, just small things: - thunderbolt memory leaks fixed and new device ids added - revert of problem patch for the musb driver - new quirks added for USB devices - typec power supply fixes to resolve much reported problems about charging notifications not working anymore All except the cdc-acm driver quirk addition have been in linux-next with no reported issues (the quirk patch was applied on Friday, and is self-contained)" * tag 'usb-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: cdc-acm: Add DISABLE_ECHO for Renesas USB Download mode MAINTAINERS: add usb raw gadget entry usb: typec: ucsi: Report power supply changes xhci: hisilicon: fix refercence leak in xhci_histb_probe Revert "usb: musb: convert to devm_platform_ioremap_resource_byname" thunderbolt: Add support for Intel Tiger Lake-H thunderbolt: Only configure USB4 wake for lane 0 adapters thunderbolt: Add uaccess dependency to debugfs interface thunderbolt: Fix memory leak if ida_simple_get() fails in enumerate_services() thunderbolt: Add the missed ida_simple_remove() in ring_request_msix()	2020-11-15 10:02:41 -08:00
Linus Torvalds	0062442ecf	Fixes for ARM and x86, the latter especially for old processors without two-dimensional paging (EPT/NPT). -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+xQ54UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMzeQf+JP9NpXgeB7dhiODhmO5SyLdw0u9j kVOM6+kHcEvG6o0yU1uUZr2ZPh9vIAwIjXi8Luiodcazdp6jvxvJ32CeMYJz2lel y+3Gjp3WS2+FExOjBephBztaMHLihlWQt3E0EKuCc7StyfMhaZooiTRMpvrmiLWe HQ/epM9oLMyrCqG9MKkvTwH0lDyB5CprV1BNt6YyKjt7d5swEqC75A6lOXnmdAah utgx1agSIVQPv6vDF9HLaQaoelHT7ucudx+zIkvOAmoQ56AJMPfCr0+Af3ZVW+f/ I5tXVfBhoOV3BVSIsJS7Px0HcZt7siVtl6ISZZos8ox85S4ysjWm2vXFcQ== =MiOr -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "Fixes for ARM and x86, the latter especially for old processors without two-dimensional paging (EPT/NPT)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use KVM: SVM: Update cr3_lm_rsvd_bits for AMD SEV guests KVM: x86: Introduce cr3_lm_rsvd_bits in kvm_vcpu_arch KVM: x86: clflushopt should be treated as a no-op by emulation KVM: arm64: Handle SCXTNUM_ELx traps KVM: arm64: Unify trap handlers injecting an UNDEF KVM: arm64: Allow setting of ID_AA64PFR0_EL1.CSV2 from userspace	2020-11-15 09:57:58 -08:00
Linus Torvalds	326fd6db61	A small set of fixes for x86: - Cure the fallout from the MSI irqdomain overhaul which missed that the Intel IOMMU does not register virtual function devices and therefore never reaches the point where the MSI interrupt domain is assigned. This makes the VF devices use the non-remapped MSI domain which is trapped by the IOMMU/remap unit. - Remove an extra space in the SGI_UV architecture type procfs output for UV5. - Remove a unused function which was missed when removing the UV BAU TLB shootdown handler. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJi0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoVWxD/9Tq4W6Kniln7mtoEWHRvHRceiiGcS3 MocvqurhoJwirH4F2gkvCegTBy0r3FdUORy3OMmChVs6nb8XpPpso84SANCRePWp JZezpVwLSNC4O1/ZCg1Kjj4eUpzLB/UjUUQV9RsjL5wyQEhfCZgb1D40yLM/2dj5 SkVm/EAqWuQNtYe/jqAOwTX/7mV+k2QEmKCNOigM13R9EWgu6a4J8ta1gtNSbwvN jWMW+M1KjZ76pfRK+y4OpbuFixteSzhSWYPITSGwQz4IpQ+Ty2Rv0zzjidmDnAR+ Q73cup0dretdVnVDRpMwDc06dBCmt/rbN50w4yGU0YFRFDgjGc8sIbQzuIP81nEQ XY4l4rcBgyVufFsLrRpQxu1iYPFrcgU38W1kRkkJ3Kl/rY1a2ZU7sLE4kt4Oh55W A9KCmsfqP1PCYppjAQ0QT4NOp4YtecPvAU4UcBOb722DDBd8TfhLWWGw2yG57Q/d Wnu8xCJGy7BaLHLGGGseAft+D4aNnCjKC3jgMyvNtRDXaV2cK2Kdd6ehMlWVUapD xfLlKXE+igXMyoWJIWjTXQJs4dpKu6QpJCPiorwEZ8rmNaRfxsWEJVbeYwEkmUke bMoBBSCbZT86WVOYhI8WtrIemraY0mMYrrcE03M96HU3eYB8BV92KrIzZWThupcQ ZqkZbqCZm3vfHA== =X/P+ -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of fixes for x86: - Cure the fallout from the MSI irqdomain overhaul which missed that the Intel IOMMU does not register virtual function devices and therefore never reaches the point where the MSI interrupt domain is assigned. This made the VF devices use the non-remapped MSI domain which is trapped by the IOMMU/remap unit - Remove an extra space in the SGI_UV architecture type procfs output for UV5 - Remove a unused function which was missed when removing the UV BAU TLB shootdown handler" * tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: iommu/vt-d: Cure VF irqdomain hickup x86/platform/uv: Fix copied UV5 output archtype x86/platform/uv: Drop last traces of uv_flush_tlb_others	2020-11-15 09:49:56 -08:00
Linus Torvalds	64b609d6a6	A set of fixes for perf: - A set of commits which reduce the stack usage of various perf event handling functions which allocated large data structs on stack causing stack overflows in the worst case. - Use the proper mechanism for detecting soft interrupts in the recursion protection. - Make the resursion protection simpler and more robust. - Simplify the scheduling of event groups to make the code more robust and prepare for fixing the issues vs. scheduling of exclusive event groups. - Prevent event multiplexing and rotation for exclusive event groups - Correct the perf event attribute exclusive semantics to take pinned events, e.g. the PMU watchdog, into account - Make the anythread filtering conditional for Intel's generic PMU counters as it is not longer guaranteed to be supported on newer CPUs. Check the corresponding CPUID leaf to make sure. - Fixup a duplicate initialization in an array which was probably cause by the usual copy & paste - forgot to edit mishap. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xIi0THHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYofixD/4+4gc8DhOmAkMrN0Z9tiW8ebgMKmb9 wZRkMr5Osi0GzLJOPZ6SdY6jd0A3rMN/sW6P1DT6pDtcty4bKFoW5VZBuUDIAhel BC4C93L3y1En/GEZu1GTy3LvsBwLBQTOoY4goDjbdAbk60S/0RTHOGyQsRsOQFe6 fVs3iXozAFuaR6I6N3dlxuJAE51zvr8MyBWaUoByNDB//1+lLNW+JfClaAOG1oXx qZIg/niatBVGzSGgKNRUyh3g8G1HJtabsA/NZ4PH8ZHuYABfmj4lmmUPR77ICLfV wMITEBG7eaktB8EqM9hvaoOZLA5kpXHO2JbCFSs4c4x11mlC8g7QMV3poCw33YoN a5TmT1A3muri1riy1/Ee9lXACOq7/tf2+Xfn9o6dvDdBwd6s5pzlhLGR8gILp2lF 2bcg3IwYvHT/Kiurb/WGNpbCqQIPJpcUcfs3tNBCCtKegahUQNnGjxN3NVo9RCit zfL6xIJ8eZiYnsxXx4NKm744AukWiql3aRNgRkOdBP5WC68xt6VLcxG1YZKUoDhy jRSOCD/DuPSMSvAAgN7S8OWlPsKWBxVxxWYV+K8FpwhgzbQ3WbS3UDiYkhgjeOxu OlM692oWpllKvQWlvYthr2Be6oPCRRi1vvADNNbTKzgHk5i61bwympsGl1EZx3Pz 2ROp7NJFRESnqw== =FzCf -----END PGP SIGNATURE----- Merge tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of fixes for perf: - A set of commits which reduce the stack usage of various perf event handling functions which allocated large data structs on stack causing stack overflows in the worst case - Use the proper mechanism for detecting soft interrupts in the recursion protection - Make the resursion protection simpler and more robust - Simplify the scheduling of event groups to make the code more robust and prepare for fixing the issues vs. scheduling of exclusive event groups - Prevent event multiplexing and rotation for exclusive event groups - Correct the perf event attribute exclusive semantics to take pinned events, e.g. the PMU watchdog, into account - Make the anythread filtering conditional for Intel's generic PMU counters as it is not longer guaranteed to be supported on newer CPUs. Check the corresponding CPUID leaf to make sure - Fixup a duplicate initialization in an array which was probably caused by the usual 'copy & paste - forgot to edit' mishap" * tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Fix Add BW copypasta perf/x86/intel: Make anythread filter support conditional perf: Tweak perf_event_attr::exclusive semantics perf: Fix event multiplexing for exclusive groups perf: Simplify group_sched_in() perf: Simplify group_sched_out() perf/x86: Make dummy_iregs static perf/arch: Remove perf_sample_data::regs_user_copy perf: Optimize get_recursion_context() perf: Fix get_recursion_context() perf/x86: Reduce stack usage for x86_pmu::drain_pebs() perf: Reduce stack usage of perf_output_begin()	2020-11-15 09:46:36 -08:00
Linus Torvalds	d0a37fd57f	A set of scheduler fixes: - Address a load balancer regression by making the load balancer use the same logic as the wakeup path to spread tasks in the LLC domain. - Prefer the CPU on which a task run last over the local CPU in the fast wakeup path for asymmetric CPU capacity systems to align with the symmetric case. This ensures more locality and prevents massive migration overhead on those asymetric systems - Fix a memory corruption bug in the scheduler debug code caused by handing a modified buffer pointer to kfree(). -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xJIoTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYofyGD/9rUnLlC1h7jEufVa4yPG94DcEqiXT7 8B/zNRKnOmqQePCYUm+DS8njSFqpF9VjR+5zpos3bgYqwn7DyfV+hpxbbgS9NDh/ qRg5gxhTrR4uMyZN62Fex5JS4bP8mKO7oc0usgV2Ytsg3e4H+9DqYhuaA5GrJAxC J3d1Hv/YBW2Uo+RZpB20aaJr0srN7bswTtPMxeeqo8q3Qh4pFcI+rmA4WphVAgHF jQWaNP4YVTgNjqxy7nBp7zFHlSdRbLohldZFtueYmRo1mjmkyQ34Cg7etfBvN1Uf iVYZLaInr0YPr0qR4FrQ3yI8ln/HESxshs0ARzMReYVT71mV//o5wftE18uCULQB rRu9vYz+LBVhkdgx118jJdNJqyqk6Ca6h9ZLqyBKuckj9a39289bwWiS6D/6W51p gurq58YTb2lRzyCnOVEULXehYRJkDI8EToiWppRVm9gy43OFPNox7n6TvNLW6BLS I8msTVdqDYXXj4U1o4Mf9K5LBKlda+ARuBu87r7kH1BJLxXHnOHcEkmeN8O9k7eu jdWfeDzDDjBjt/TU+X4f4RNjudUZrSPQrrESE5+XhfM4CwqcPXa2M/dGtPekW/ED 9IqxPvwkau+0Ym6gkuanfnmda+JVR/nLvZV0uFuUGd+2xMcRemZbZE6hTUiYvYPY CAHpOhmeakbr6w== =wFcU -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "A set of scheduler fixes: - Address a load balancer regression by making the load balancer use the same logic as the wakeup path to spread tasks in the LLC domain - Prefer the CPU on which a task run last over the local CPU in the fast wakeup path for asymmetric CPU capacity systems to align with the symmetric case. This ensures more locality and prevents massive migration overhead on those asymetric systems - Fix a memory corruption bug in the scheduler debug code caused by handing a modified buffer pointer to kfree()" * tag 'sched-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/debug: Fix memory corruption caused by multiple small reads of flags sched/fair: Prefer prev cpu in asymmetric wakeup path sched/fair: Ensure tasks spreading in LLC during LB	2020-11-15 09:39:35 -08:00
Linus Torvalds	259c2fbef8	Two fixes for the locking subsystem: - Prevent an unconditional interrupt enable in a futex helper function which can be called from contexts which expect interrupts to stay disabled across the call. - Don't modify lockdep chain keys in the validation process as that causes chain inconsistency. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xF7cTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoR9dD/0X7EmbGdTIL9ZgEQfch8fWY79uHqQ7 EXgVQb26KuWDzRutgk0qeSixQTYfhzoT2nPRGNpAQtyYYir0p2fXG2kstJEQZQzq toK5VsL11NJKVhlO/1y5+RcufsgfjTqOpqygbMm1gz+7ejfe7kfJZUAEMwbWzZRn a8HZ/VTQb/Z/F1xv+PuACkCp79ezzosL4hiN5QG0FEyiX47Pf8HuXTHKAD3SJgnc 6ZvCkDkCHOv3jGmQ68sXBQ2m/ciYnDs1D8J/SD9zmggLFs8+R0LKCNxI46HfgRDV 3oqx7OivazDvBmNXlSCFQQG+saIgRlWuVPV8PVTD3Ihmx25DzXhreSyxjyRhl3uL WN7C3ztk6lJv0B/BbtpFceobXjE7IN71CjDIFCwii20dn5UTrgLaJwnW1YrX6qqb +iz3cJs4bNLD1brGAlx8lqZxZ2omyXcPNQOi+vSkdTy8C/OYmypC1xusWSBpBtkp 1+V1uoYFtJWsBzKfmbXAPSSiw9ppIVe/w3J/LrCcFv3CAEaDlCkN0klOF3/ULsga d+hMEUKagIqRXNeBdoEfY78LzSfIqkovy3rLNnWATu8fS2IdoRiwhliRxMCU9Ceh 4WU1F4QAqu42cd0zYVWnVIfDVP3Qx/ENl8/Jd0m7Z8jmmjhOQo5XO7o5PxP8uP3U NvnoT5d5D2KysA== =XuF7 -----END PGP SIGNATURE----- Merge tag 'locking-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two fixes for the locking subsystem: - Prevent an unconditional interrupt enable in a futex helper function which can be called from contexts which expect interrupts to stay disabled across the call - Don't modify lockdep chain keys in the validation process as that causes chain inconsistency" * tag 'locking-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lockdep: Avoid to modify chain keys in validate_chain() futex: Don't enable IRQs unconditionally in put_pi_state()	2020-11-15 09:25:43 -08:00
Linus Torvalds	a50cf15906	Merge branch 'for-5.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu Pull percpu fix and cleanup from Dennis Zhou: "A fix for a Wshadow warning in the asm-generic percpu macros came in and then I tacked on the removal of flexible array initializers in the percpu allocator" * 'for-5.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu: percpu: convert flexible array initializers to use struct_size() asm-generic: percpu: avoid Wshadow warning	2020-11-15 08:57:19 -08:00
Paolo Bonzini	c887c9b9ca	kvm: mmu: fix is_tdp_mmu_check when the TDP MMU is not in use In some cases where shadow paging is in use, the root page will be either mmu->pae_root or vcpu->arch.mmu->lm_root. Then it will not have an associated struct kvm_mmu_page, because it is allocated with alloc_page instead of kvm_mmu_alloc_page. Just return false quickly from is_tdp_mmu_root if the TDP MMU is not in use, which also includes the case where shadow paging is enabled. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-11-15 08:55:43 -05:00
Linus Torvalds	e28c0d7c92	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: mm (migration, vmscan, slub, gup, memcg, hugetlbfs), mailmap, kbuild, reboot, watchdog, panic, and ocfs2" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: ocfs2: initialize ip_next_orphan panic: don't dump stack twice on warn hugetlbfs: fix anon huge page migration race mm: memcontrol: fix missing wakeup polling thread kernel/watchdog: fix watchdog_allowed_mask not used warning reboot: fix overflow parsing reboot cpu number Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint" compiler.h: fix barrier_data() on clang mm/gup: use unpin_user_pages() in __gup_longterm_locked() mm/slub: fix panic in slab_alloc_node() mailmap: fix entry for Dmitry Baryshkov/Eremin-Solenikov mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate mm/compaction: count pages and stop correctly during page isolation	2020-11-14 12:35:11 -08:00
Linus Torvalds	31908a604c	Two small clk driver fixes - Make to_clk_regmap() inline to avoid compiler annoyance - Fix critical clks on i.MX imx8m SoCs -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAl+wMj8RHHNib3lkQGtl cm5lbC5vcmcACgkQrQKIl8bklSVtMw//QPVXMc66oTHs7k4McPeOMQpArc3zXlNA HMtyiXmwJ7NOtztAMth4uZupVFELnP9VHTIiqwKOYbL9gDz+czzdGQw1N/8zwa7o faXj7F7lQDx6Tr03vFdWVA8CQCCq+4nY4qKpDLTM3qRb2D7VpJAtqX2QgMZRMQ6k kW999ETuAN8Aecmxf89wGM1SWNPuuRpWdeX1h03vfpw8vU99QgkFMOL8fkGPOtlJ 1//vlexVk4adic5Go2jd4YAPCxHMcgzrxI8kQGnYANE6aWadtdueJMv/HdpADvIP KSD8NYdajaVAlMZVK8ktL8Bg3kXsoOfDYLVFDlNGNA+kOSBNSbFgaatS1tn5LuUD Pee7Mdhqgg72XrkzH96z0ECzlWpYzdxESiQT344twO8fjdSemXsavOqpf61/bad8 jyZxJtSNgsw/BUjoCRK/S89/GLRP/9sb7PGE/XnSkfyp7QKCpfT2UTVOeblesYj/ jlaILUzgEQ2IEZB8hRVMQpzrn3yiPh4NA1TYrCEQ0QYnkt2PzHHZjG/L6ad1u549 9ZMHN+6NhPoJM5az3UnmuGg9aAnw9texXeeUltqF96EGJW61QgNR8bEEY5PkRCYV MFm60L56yK02Xbu6Ko0gH9URpxzS3RyaJai9sGc+MqlyuSmnCmJiKwbI/SNd/Ayl 6cVzh0ITiII= =5At+ -----END PGP SIGNATURE----- Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two small clk driver fixes: - Make to_clk_regmap() inline to avoid compiler annoyance - Fix critical clks on i.MX imx8m SoCs" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: imx8m: fix bus critical clk registration clk: define to_clk_regmap() as inline function	2020-11-14 12:30:18 -08:00
Linus Torvalds	7e908b7461	hwmon fixes for v5.10-rc4 Fix potential bufer overflow in pmbus/max20730 driver Fix locking issue in pmbus core Fix regression causing timeouts in applesmc driver Fix RPM calculation in pwm-fan driver Restrict counter visibility in amd_energy driver -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAl+u1xAACgkQyx8mb86f mYHqjRAAl7lKDmHbxhe5RS/KatiSdTXimSZTGoXZGDneSrxKd3aer3Ns1VxS6ToT WdJzaY1+NC+EMlYQIGGT9ev+kkwAWzAjoAWI3tSxnM9Qj0QbN1c0iw1vltbirc93 MzLp44aggjXB/CYofFqCVOXeEHF1efkFZzvfAykWOoNtRiCsETN1ssodHiMYiV7y Qfa2knnetBBAMabfXIVAd6kD+Cg7EkmUU9dtMWiLa88UWau/D8nLh0wjcIt2KDEm HWEWk6mvy0FOq4wm34qwYp/C5EXprjJrG3Jt3DWB05yo30JaK/TCqBDLSh94bMdD 5beCRqDc9xW1/2axuZvGiQRTH99gNsbE5/ZHoq+nywg+XIEG3/nhKfFJ6GbP864H uQAuxMw5PFZ45JsjU2Jrtukw/0t9qWbPaAH1iC8WowJ3b1D9w0J2XEgkcJIMhRE+ pikAAbqw2gWl30ELGhpfIgntV3IUWM245ul+N5xf6Rb7qung/tGnXk6Nq2vse2yq M1NLs3AJcOVEulMI01ubfY4LfkmVa4zoxrXIq2FbFjCU65F2XIWiBh0inKJeYWDa OQKdF1ouOmimxP6jC4mzPkHze2wT+a2kN0Ke1xgK45r5O+k5946L4UVr4vB0ql+6 nXn2DS6svXshJfgYpCzovpR0u+xDWnw3Pje6xfSHQYDsR+xx7H8= =0N42 -----END PGP SIGNATURE----- Merge tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Fix potential bufer overflow in pmbus/max20730 driver - Fix locking issue in pmbus core - Fix regression causing timeouts in applesmc driver - Fix RPM calculation in pwm-fan driver - Restrict counter visibility in amd_energy driver * tag 'hwmon-for-v5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (amd_energy) modify the visibility of the counters hwmon: (applesmc) Re-work SMC comms hwmon: (pwm-fan) Fix RPM calculation hwmon: (pmbus) Add mutex locking for sysfs reads hwmon: (pmbus/max20730) use scnprintf() instead of snprintf()	2020-11-14 12:25:39 -08:00
Linus Torvalds	0c0451112b	SCSI fixes on 20201113 Three small fixes, all in the embedded ufs driver subsystem. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX68gjyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishS1yAP0c2nHF MJuoPNYWIMVSzVSyacrIeonh6PxxR3gEMq/AuAD/clrcaDCPm4Crnc4xNOBsLS52 WmYiQpViUGtMY2jXLo0= =dqHV -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small fixes, all in the embedded ufs driver subsystem" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufshcd: Fix missing destroy_workqueue() scsi: ufs: Try to save power mode change and UIC cmd completion timeout scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()	2020-11-14 12:19:21 -08:00
Linus Torvalds	30636a59f4	selinux/stable-5.10 PR 20201113 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl+vD0MUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOHZw//SbZ4c32LGxANDJrASdRWGRYzUsSm cQZrLR373TMMmawZDenvRK0Wfa8vj+BK5V5knBXXTDFq5M+TVtlzhRhED4ZH40RP CmaL+XC24ed1IPsYiZZ0lytXrDnb7EgmGaSjHP7RttBdLG/2ABamm9FNZFpiFJjV y9txgNyJ9NvrdFCO+UoLEm9BFq4x/Ddzu+BAEqu1iSGaKXufiBlq9uQ+3qommJlx Bv+i6HeqCruTj+y8p8tkR7aLma9jsbqCpBlQkG1ja4pgxMXwC2uRbNzz//dlxWZ5 pKtz8HJqi+myla/rwNZAA82uZlkFnEOp7Yy404Deu73WznMdrQ1oJqlGDrXdafKn BJwpcUQnJ7NP2MK1JrMbxdVllUKsZF2ICP0t3CFTuuyJ2CakOuaSW+ERBvbO6NYa G7h9Ll8AO4ADbG3W+/c6nybDt8AkjBLJTg7uXWO3LIjjZ/UJwJxO4YwzYbaMdg6Q 7zVNj7j36YLHhv9Hc60vhCjO53KXYo60Z6TCblkeyClifvM8Gx109JkOwPXWFGIY Bh7KbKtGr2dZoFxU2yS5oQgtbzJhsfJQGYsF8uANHcj76hQ/rh+Ff1NKKofHaRSl umQSgNbekaQjnmq08otVlIHtjdaG+or1PFW2Yep37OwsCXwKrFG4YZRY9FdpwQeG Cjx3JAJTibvjMqg= =qpR+ -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux fix from Paul Moore: "One small SELinux patch to make sure we return an error code when an allocation fails. It passes all of our tests, but given the nature of the patch that isn't surprising" * tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: Fix error return code in sel_ib_pkey_sid_slow()	2020-11-14 12:04:02 -08:00
Linus Torvalds	4aea779d35	This pull request contains the following bug fixes for UML: - Call PMD destructor in __pmd_free_tlb() -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAl+vBSoWHHJpY2hhcmRA c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wflpEACpf2cjKYhSmUxEhfBqTatNVmKh Ug1+JtoAS8r+uK/p70uwRbNeITkUtDMMUyjzPU9eL8U+slBHSgSAHnwaAEi54y8J cHVYh8fzeIzGEDC67BC69n+Oai/hKxC5vi+AaHiF6vbuV0TcFu+zup2iQHb0jSi7 eBFe0IDFp3eyNOit85Cgs/1uM3IqLvFSBj1RYH850pT5DFbbJ07yg+1x1poKGVLL vECTFEA6JkXCSj/FYGDDmRtphiLXppR7DFvvob5U7ckv8/BXd74MfDnwDMWj5QGU B1sGTHcDKtjTNzbdYGMlxySxF4aJExdX6TU9orXGkpzD8A1U6efTbTZd37WZJCgZ dsIodoCbVu0nyn4u8O9pFBAt0jNa7R37Ak3DSxFfZ3q1zkFWopz3sMSkxBXohape sWyj0eG+OU33XqQYgI6rVFUzpxkP+UoU03yEXNTJTAHdv66yrocYjHN/drwxoeIk Z8mpoIVxYssLjtx2I8v7mxmUlDO+AcLueJYATxZ0k5NGIbJfEXwjEqRfZ3RZcFHL zQKXCr6LWM3SN6RqBOoyHTVHgIUE8jKfyrQvEFaNCKSvDo4FMbSfZTYq6GijLANw k77Moiqg81blj+Olh3egcWhbroAPTUBibSbcBoqNYBt+gMXCF/OI6e5EYDn5raWL Eq73YPpKe/a03FYwlA== =Jbrq -----END PGP SIGNATURE----- Merge tag 'for-linus-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull uml fix from Richard Weinberger: "Call PMD destructor in __pmd_free_tlb()" * tag 'for-linus-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: um: Call pgtable_pmd_page_dtor() in __pmd_free_tlb()	2020-11-14 11:56:59 -08:00
David Howells	3ad216ee73	afs: Fix afs_write_end() when called with copied == 0 [ver #3 ] When afs_write_end() is called with copied == 0, it tries to set the dirty region, but there's no way to actually encode a 0-length region in the encoding in page->private. "0,0", for example, indicates a 1-byte region at offset 0. The maths miscalculates this and sets it incorrectly. Fix it to just do nothing but unlock and put the page in this case. We don't actually need to mark the page dirty as nothing presumably changed. Fixes: `65dd2d6072` ("afs: Alter dirty range encoding in page->private") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:51:18 -08:00
Wengang Wang	f5785283dd	ocfs2: initialize ip_next_orphan Though problem if found on a lower 4.1.12 kernel, I think upstream has same issue. In one node in the cluster, there is the following callback trace: # cat /proc/21473/stack __ocfs2_cluster_lock.isra.36+0x336/0x9e0 [ocfs2] ocfs2_inode_lock_full_nested+0x121/0x520 [ocfs2] ocfs2_evict_inode+0x152/0x820 [ocfs2] evict+0xae/0x1a0 iput+0x1c6/0x230 ocfs2_orphan_filldir+0x5d/0x100 [ocfs2] ocfs2_dir_foreach_blk+0x490/0x4f0 [ocfs2] ocfs2_dir_foreach+0x29/0x30 [ocfs2] ocfs2_recover_orphans+0x1b6/0x9a0 [ocfs2] ocfs2_complete_recovery+0x1de/0x5c0 [ocfs2] process_one_work+0x169/0x4a0 worker_thread+0x5b/0x560 kthread+0xcb/0xf0 ret_from_fork+0x61/0x90 The above stack is not reasonable, the final iput shouldn't happen in ocfs2_orphan_filldir() function. Looking at the code, 2067 /* Skip inodes which are already added to recover list, since dio may 2068 * happen concurrently with unlink/rename */ 2069 if (OCFS2_I(iter)->ip_next_orphan) { 2070 iput(iter); 2071 return 0; 2072 } 2073 The logic thinks the inode is already in recover list on seeing ip_next_orphan is non-NULL, so it skip this inode after dropping a reference which incremented in ocfs2_iget(). While, if the inode is already in recover list, it should have another reference and the iput() at line 2070 should not be the final iput (dropping the last reference). So I don't think the inode is really in the recover list (no vmcore to confirm). Note that ocfs2_queue_orphans(), though not shown up in the call back trace, is holding cluster lock on the orphan directory when looking up for unlinked inodes. The on disk inode eviction could involve a lot of IOs which may need long time to finish. That means this node could hold the cluster lock for very long time, that can lead to the lock requests (from other nodes) to the orhpan directory hang for long time. Looking at more on ip_next_orphan, I found it's not initialized when allocating a new ocfs2_inode_info structure. This causes te reflink operations from some nodes hang for very long time waiting for the cluster lock on the orphan directory. Fix: initialize ip_next_orphan as NULL. Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201109171746.27884-1-wen.gang.wang@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:04 -08:00
Christophe Leroy	2f31ad64a9	panic: don't dump stack twice on warn Before commit `3f388f2863` ("panic: dump registers on panic_on_warn"), __warn() was calling show_regs() when regs was not NULL, and show_stack() otherwise. After that commit, show_stack() is called regardless of whether show_regs() has been called or not, leading to duplicated Call Trace: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/nohash/8xx.c:186 mmu_mark_initmem_nx+0x24/0x94 CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092 NIP: c00128b4 LR: c0010228 CTR: 00000000 REGS: c9023e40 TRAP: 0700 Not tainted (5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24000424 XER: 00000000 GPR00: c0010228 c9023ef8 c2100000 0074c000 ffffffff 00000000 c2151000 c07b3880 GPR08: ff000900 0074c000 c8000000 c33b53a8 24000822 00000000 c0003a20 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 GPR24: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00800000 NIP [c00128b4] mmu_mark_initmem_nx+0x24/0x94 LR [c0010228] free_initmem+0x20/0x58 Call Trace: free_initmem+0x20/0x58 kernel_init+0x1c/0x114 ret_from_kernel_thread+0x14/0x1c Instruction dump: 7d291850 7d234b78 4e800020 9421ffe0 7c0802a6 bfc10018 3fe0c060 3bff0000 3fff4080 3bffffff 90010024 57ff0010 <0fe00000> 392001cd 7c3e0b78 953e0008 CPU: 0 PID: 1 Comm: swapper Not tainted 5.10.0-rc2-s3k-dev-01375-gf46ec0d3ecbd-dirty #4092 Call Trace: __warn+0x8c/0xd8 (unreliable) report_bug+0x11c/0x154 program_check_exception+0x1dc/0x6e0 ret_from_except_full+0x0/0x4 --- interrupt: 700 at mmu_mark_initmem_nx+0x24/0x94 LR = free_initmem+0x20/0x58 free_initmem+0x20/0x58 kernel_init+0x1c/0x114 ret_from_kernel_thread+0x14/0x1c ---[ end trace 31702cd2a9570752 ]--- Only call show_stack() when regs is NULL. Fixes: `3f388f2863` ("panic: dump registers on panic_on_warn") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lkml.kernel.org/r/e8c055458b080707f1bc1a98ff8bea79d0cec445.1604748361.git.christophe.leroy@csgroup.eu Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:04 -08:00
Mike Kravetz	336bf30eb7	hugetlbfs: fix anon huge page migration race Qian Cai reported the following BUG in [1] LTP: starting move_pages12 BUG: unable to handle page fault for address: ffffffffffffffe0 ... RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63 Call Trace: rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864 try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763 migrate_pages+0x1005/0x1fb0 move_pages_and_store_status.isra.47+0xd7/0x1a0 __x64_sys_move_pages+0xa5c/0x1100 do_syscall_64+0x5f/0x310 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Hugh Dickins diagnosed this as a migration bug caused by code introduced to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for anon pages as the anon_vma_lock should be held in this case. Further analysis suggested that i_mmap_rwsem was not required to he held at all when calling try_to_unmap for anon pages as an anon page could never be part of a shared pmd mapping. Discussion also revealed that the hack in hugetlb_page_mapping_lock_write to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to keep mapping valid while dropping page lock. This patch does the following: - Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when calling try_to_unmap. - Remove the hacky code in hugetlb_page_mapping_lock_write. The routine will now simply do a 'trylock' while still holding the page lock. If the trylock fails, it will return NULL. This could impact the callers: - migration calling code will receive -EAGAIN and retry up to the hard coded limit (10). - memory error code will treat the page as BUSY. This will force killing (SIGKILL) instead of SIGBUS any mapping tasks. Do note that this change in behavior only happens when there is a race. None of the standard kernel testing suites actually hit this race, but it is possible. [1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/ [2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/ Fixes: `c0d0381ade` ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Reported-by: Qian Cai <cai@lca.pw> Suggested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:04 -08:00
Muchun Song	8b21ca0218	mm: memcontrol: fix missing wakeup polling thread When we poll the swap.events, we can miss being woken up when the swap event occurs. Because we didn't notify. Fixes: `f3a53a3a1e` ("mm, memcontrol: implement memory.swap.events") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Chris Down <chris@chrisdown.name> Cc: Tejun Heo <tj@kernel.org> Link: https://lkml.kernel.org/r/20201105161936.98312-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:04 -08:00
Santosh Sivaraj	e7e046155a	kernel/watchdog: fix watchdog_allowed_mask not used warning Define watchdog_allowed_mask only when SOFTLOCKUP_DETECTOR is enabled. Fixes: `7feeb9cd4f` ("watchdog/sysctl: Clean up sysctl variable name space") Signed-off-by: Santosh Sivaraj <santosh@fossix.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20201106015025.1281561-1-santosh@fossix.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Matteo Croce	df5b0ab3e0	reboot: fix overflow parsing reboot cpu number Limit the CPU number to num_possible_cpus(), because setting it to a value lower than INT_MAX but higher than NR_CPUS produces the following error on reboot and shutdown: BUG: unable to handle page fault for address: ffffffff90ab1bb0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 1c09067 P4D 1c09067 PUD 1c0a063 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 1 Comm: systemd-shutdow Not tainted 5.9.0-rc8-kvm #110 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:migrate_to_reboot_cpu+0xe/0x60 Code: ea ea 00 48 89 fa 48 c7 c7 30 57 f1 81 e9 fa ef ff ff 66 2e 0f 1f 84 00 00 00 00 00 53 8b 1d d5 ea ea 00 e8 14 33 fe ff 89 da <48> 0f a3 15 ea fc bd 00 48 89 d0 73 29 89 c2 c1 e8 06 65 48 8b 3c RSP: 0018:ffffc90000013e08 EFLAGS: 00010246 RAX: ffff88801f0a0000 RBX: 0000000077359400 RCX: 0000000000000000 RDX: 0000000077359400 RSI: 0000000000000002 RDI: ffffffff81c199e0 RBP: ffffffff81c1e3c0 R08: ffff88801f41f000 R09: ffffffff81c1e348 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 00007f32bedf8830 R14: 00000000fee1dead R15: 0000000000000000 FS: 00007f32bedf8980(0000) GS:ffff88801f480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff90ab1bb0 CR3: 000000001d057000 CR4: 00000000000006a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __do_sys_reboot.cold+0x34/0x5b do_syscall_64+0x2d/0x40 Fixes: `1b3a5d02ee` ("reboot: move arch/x86 reboot= handling to generic kernel") Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fabian Frederick <fabf@skynet.be> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201103214025.116799-3-mcroce@linux.microsoft.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Matteo Croce	8b92c4ff44	Revert "kernel/reboot.c: convert simple_strtoul to kstrtoint" Patch series "fix parsing of reboot= cmdline", v3. The parsing of the reboot= cmdline has two major errors: - a missing bound check can crash the system on reboot - parsing of the cpu number only works if specified last Fix both. This patch (of 2): This reverts commit `616feab753`. kstrtoint() and simple_strtoul() have a subtle difference which makes them non interchangeable: if a non digit character is found amid the parsing, the former will return an error, while the latter will just stop parsing, e.g. simple_strtoul("123xyx") = 123. The kernel cmdline reboot= argument allows to specify the CPU used for rebooting, with the syntax `s####` among the other flags, e.g. "reboot=warm,s31,force", so if this flag is not the last given, it's silently ignored as well as the subsequent ones. Fixes: `616feab753` ("kernel/reboot.c: convert simple_strtoul to kstrtoint") Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Petr Mladek <pmladek@suse.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Robin Holt <robinmholt@gmail.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201103214025.116799-2-mcroce@linux.microsoft.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Arvind Sankar	3347acc6fc	compiler.h: fix barrier_data() on clang Commit `815f0ddb34` ("include/linux/compiler.h: make compiler-.h mutually exclusive") neglected to copy barrier_data() from compiler-gcc.h into compiler-clang.h. The definition in compiler-gcc.h was really to work around clang's more aggressive optimization, so this broke barrier_data() on clang, and consequently memzero_explicit() as well. For example, this results in at least the memzero_explicit() call in lib/crypto/sha256.c:sha256_transform() being optimized away by clang. Fix this by moving the definition of barrier_data() into compiler.h. Also move the gcc/clang definition of barrier() into compiler.h, __memory_barrier() is icc-specific (and barrier() is already defined using it in compiler-intel.h) and doesn't belong in compiler.h. [rdunlap@infradead.org: fix ALPHA builds when SMP is not enabled] Link: https://lkml.kernel.org/r/20201101231835.4589-1-rdunlap@infradead.org Fixes: `815f0ddb34` ("include/linux/compiler.h: make compiler-.h mutually exclusive") Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201014212631.207844-1-nivedita@alum.mit.edu Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Jason Gunthorpe	96e1fac162	mm/gup: use unpin_user_pages() in __gup_longterm_locked() When FOLL_PIN is passed to __get_user_pages() the page list must be put back using unpin_user_pages() otherwise the page pin reference persists in a corrupted state. There are two places in the unwind of __gup_longterm_locked() that put the pages back without checking. Normally on error this function would return the partial page list making this the caller's responsibility, but in these two cases the caller is not allowed to see these pages at all. Fixes: `3faa52c03f` ("mm/gup: track FOLL_PIN pages") Reported-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/0-v2-3ae7d9d162e2+2a7-gup_cma_fix_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Laurent Dufour	22e4663e91	mm/slub: fix panic in slab_alloc_node() While doing memory hot-unplug operation on a PowerPC VM running 1024 CPUs with 11TB of ram, I hit the following panic: BUG: Kernel NULL pointer dereference on read at 0x00000007 Faulting instruction address: 0xc000000000456048 Oops: Kernel access of bad area, sig: 11 [#2] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS= 2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1 NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350 REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000 CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0 GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000 GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320 GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000 GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1 GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8 GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000 GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001 NIP [c000000000456048] __kmalloc_node+0x108/0x790 LR [c000000000455fd4] __kmalloc_node+0x94/0x790 Call Trace: kvmalloc_node+0x58/0x110 mem_cgroup_css_online+0x10c/0x270 online_css+0x48/0xd0 cgroup_apply_control_enable+0x2c4/0x470 cgroup_mkdir+0x408/0x5f0 kernfs_iop_mkdir+0x90/0x100 vfs_mkdir+0x138/0x250 do_mkdirat+0x154/0x1c0 system_call_exception+0xf8/0x200 system_call_common+0xf0/0x27c Instruction dump: e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a 2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78 This pointing to the following code: mm/slub.c:2851 if (unlikely(!object \|\| !node_match(page, node))) { c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0 c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114> node_match(): mm/slub.c:2491 if (node != NUMA_NO_NODE && page_to_nid(page) != node) c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330> page_to_nid(): include/linux/mm.h:1294 c000000000456044: 10 00 27 e9 ld r9,16(r7) c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL node_match(): mm/slub.c:2491 c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9 c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330> The panic occurred in slab_alloc_node() when checking for the page's node: object = c->freelist; page = c->page; if (unlikely(!object \|\| !node_match(page, node))) { object = __slab_alloc(s, gfpflags, node, addr, c); stat(s, ALLOC_SLOWPATH); The issue is that object is not NULL while page is NULL which is odd but may happen if the cache flush happened after loading object but before loading page. Thus checking for the page pointer is required too. The cache flush is done through an inter processor interrupt when a piece of memory is off-lined. That interrupt is triggered when a memory hot-unplug operation is initiated and offline_pages() is calling the slub's MEM_GOING_OFFLINE callback slab_mem_going_offline_callback() which is calling flush_cpu_slab(). If that interrupt is caught between the reading of c->freelist and the reading of c->page, this could lead to such a situation. That situation is expected and the later call to this_cpu_cmpxchg_double() will detect the change to c->freelist and redo the whole operation. In commit `6159d0f5c0` ("mm/slub.c: page is always non-NULL in node_match()") check on the page pointer has been removed assuming that page is always valid when it is called. It happens that this is not true in that particular case, so check for page before calling node_match() here. Fixes: `6159d0f5c0` ("mm/slub.c: page is always non-NULL in node_match()") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Dmitry Baryshkov	044747e971	mailmap: fix entry for Dmitry Baryshkov/Eremin-Solenikov Change back surname to new (old) one. Dmitry Baryshkov -> Dmitry Eremin-Solenikov -> Dmitry Baryshkov. Map several odd entries to main identity. Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201103005158.1181426-1-dmitry.baryshkov@linaro.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Nicholas Piggin	2da9f6305f	mm/vmscan: fix NR_ISOLATED_FILE corruption on 64-bit Previously the negated unsigned long would be cast back to signed long which would have the correct negative value. After commit `730ec8c01a` ("mm/vmscan.c: change prototype for shrink_page_list"), the large unsigned int converts to a large positive signed long. Symptoms include CMA allocations hanging forever holding the cma_mutex due to alloc_contig_range->...->isolate_migratepages_block waiting forever in "while (unlikely(too_many_isolated(pgdat)))". [akpm@linux-foundation.org: fix -stat.nr_lazyfree_fail as well, per Michal] Fixes: `730ec8c01a` ("mm/vmscan.c: change prototype for shrink_page_list") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vaneet Narang <v.narang@samsung.com> Cc: Maninder Singh <maninder1.s@samsung.com> Cc: Amit Sahrawat <a.sahrawat@samsung.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201029032320.1448441-1-npiggin@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Zi Yan	d20bdd571e	mm/compaction: stop isolation if too many pages are isolated and we have pages to migrate In isolate_migratepages_block, if we have too many isolated pages and nr_migratepages is not zero, we should try to migrate what we have without wasting time on isolating. In theory it's possible that multiple parallel compactions will cause too_many_isolated() to become true even if each has isolated less than COMPACT_CLUSTER_MAX, and loop forever in the while loop. Bailing immediately prevents that. [vbabka@suse.cz: changelog addition] Fixes: `1da2f328fa` (“mm,thp,compaction,cma: allow THP migration for CMA allocations”) Suggested-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@vger.kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: Yang Shi <shy828301@gmail.com> Link: https://lkml.kernel.org/r/20201030183809.3616803-2-zi.yan@sent.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Zi Yan	38935861d8	mm/compaction: count pages and stop correctly during page isolation In isolate_migratepages_block, when cc->alloc_contig is true, we are able to isolate compound pages. But nr_migratepages and nr_isolated did not count compound pages correctly, causing us to isolate more pages than we thought. So count compound pages as the number of base pages they contain. Otherwise, we might be trapped in too_many_isolated while loop, since the actual isolated pages can go up to COMPACT_CLUSTER_MAX*512=16384, where COMPACT_CLUSTER_MAX is 32, since we stop isolation after cc->nr_migratepages reaches to COMPACT_CLUSTER_MAX. In addition, after we fix the issue above, cc->nr_migratepages could never be equal to COMPACT_CLUSTER_MAX if compound pages are isolated, thus page isolation could not stop as we intended. Change the isolation stop condition to '>='. The issue can be triggered as follows: In a system with 16GB memory and an 8GB CMA region reserved by hugetlb_cma, if we first allocate 10GB THPs and mlock them (so some THPs are allocated in the CMA region and mlocked), reserving 6 1GB hugetlb pages via /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages will get stuck (looping in too_many_isolated function) until we kill either task. With the patch applied, oom will kill the application with 10GB THPs and let hugetlb page reservation finish. [ziy@nvidia.com: v3] Link: https://lkml.kernel.org/r/20201030183809.3616803-1-zi.yan@sent.com Fixes: `1da2f328fa` ("cmm,thp,compaction,cma: allow THP migration for CMA allocations") Signed-off-by: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201029200435.3386066-1-zi.yan@sent.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-11-14 11:26:03 -08:00
Lyude Paul	5c6fb4b28b	drm/nouveau/kms/nv50-: Use atomic encoder callbacks everywhere It turns out that I forgot to go through and make sure that I converted all encoder callbacks to use atomic_enable/atomic_disable(), so let's go and actually do that. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Fixes: `09838c4efe` ("drm/nouveau/kms: Search for encoders' connectors properly") Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2020-11-14 14:19:18 +10:00
Ben Skeggs	6c27ffabeb	drm/nouveau/ttm: avoid using nouveau_drm.ttm.type_vram prior to nv50 Pre-NV50 chipsets don't currently use the MMU subsystem that later chipsets use, and type_vram is negative here, leading to an OOB memory access. This was previously guarded by a chipset check, restore that. Reported-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `5839172f09` ("drm/nouveau: explicitly specify caching to use") Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2020-11-14 14:19:17 +10:00
Alexander Kapshuk	630f512280	drm/nouveau/kms: Fix NULL pointer dereference in nouveau_connector_detect_depth This oops manifests itself on the following hardware: 01:00.0 VGA compatible controller: NVIDIA Corporation G98M [GeForce G 103M] (rev a1) Oct 09 14:17:46 lp-sasha kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oct 09 14:17:46 lp-sasha kernel: #PF: supervisor read access in kernel mode Oct 09 14:17:46 lp-sasha kernel: #PF: error_code(0x0000) - not-present page Oct 09 14:17:46 lp-sasha kernel: PGD 0 P4D 0 Oct 09 14:17:46 lp-sasha kernel: Oops: 0000 [#1] SMP PTI Oct 09 14:17:46 lp-sasha kernel: CPU: 1 PID: 191 Comm: systemd-udevd Not tainted 5.9.0-rc8-next-20201009 #38 Oct 09 14:17:46 lp-sasha kernel: Hardware name: Hewlett-Packard Compaq Presario CQ61 Notebook PC/306A, BIOS F.03 03/23/2009 Oct 09 14:17:46 lp-sasha kernel: RIP: 0010:nouveau_connector_detect_depth+0x71/0xc0 [nouveau] Oct 09 14:17:46 lp-sasha kernel: Code: 0a 00 00 48 8b 49 48 c7 87 b8 00 00 00 06 00 00 00 80 b9 4d 0a 00 00 00 75 1e 83 fa 41 75 05 48 85 c0 75 29 8b 81 10 0d 00 00 <39> 06 7c 25 f6 81 14 0d 00 00 02 75 b7 c3 80 b9 0c 0d 00 00 00 75 Oct 09 14:17:46 lp-sasha kernel: RSP: 0018:ffffc9000028f8c0 EFLAGS: 00010297 Oct 09 14:17:46 lp-sasha kernel: RAX: 0000000000014c08 RBX: ffff8880369d4000 RCX: ffff8880369d3000 Oct 09 14:17:46 lp-sasha kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff8880369d4000 Oct 09 14:17:46 lp-sasha kernel: RBP: ffff88800601cc00 R08: ffff8880051da298 R09: ffffffff8226201a Oct 09 14:17:46 lp-sasha kernel: R10: ffff88800469aa80 R11: ffff888004c84ff8 R12: 0000000000000000 Oct 09 14:17:46 lp-sasha kernel: R13: ffff8880051da000 R14: 0000000000002000 R15: 0000000000000003 Oct 09 14:17:46 lp-sasha kernel: FS: 00007fd0192b3440(0000) GS:ffff8880bc900000(0000) knlGS:0000000000000000 Oct 09 14:17:46 lp-sasha kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Oct 09 14:17:46 lp-sasha kernel: CR2: 0000000000000000 CR3: 0000000004976000 CR4: 00000000000006e0 Oct 09 14:17:46 lp-sasha kernel: Call Trace: Oct 09 14:17:46 lp-sasha kernel: nouveau_connector_get_modes+0x1e6/0x240 [nouveau] Oct 09 14:17:46 lp-sasha kernel: ? kfree+0xb9/0x240 Oct 09 14:17:46 lp-sasha kernel: ? drm_connector_list_iter_next+0x7c/0xa0 Oct 09 14:17:46 lp-sasha kernel: drm_helper_probe_single_connector_modes+0x1ba/0x7c0 Oct 09 14:17:46 lp-sasha kernel: drm_client_modeset_probe+0x27e/0x1360 Oct 09 14:17:46 lp-sasha kernel: ? nvif_object_sclass_put+0xc/0x20 [nouveau] Oct 09 14:17:46 lp-sasha kernel: ? nouveau_cli_init+0x3cc/0x440 [nouveau] Oct 09 14:17:46 lp-sasha kernel: ? ktime_get_mono_fast_ns+0x49/0xa0 Oct 09 14:17:46 lp-sasha kernel: ? nouveau_drm_open+0x4e/0x180 [nouveau] Oct 09 14:17:46 lp-sasha kernel: __drm_fb_helper_initial_config_and_unlock+0x3f/0x4a0 Oct 09 14:17:46 lp-sasha kernel: ? drm_file_alloc+0x18f/0x260 Oct 09 14:17:46 lp-sasha kernel: ? mutex_lock+0x9/0x40 Oct 09 14:17:46 lp-sasha kernel: ? drm_client_init+0x110/0x160 Oct 09 14:17:46 lp-sasha kernel: nouveau_fbcon_init+0x14d/0x1c0 [nouveau] Oct 09 14:17:46 lp-sasha kernel: nouveau_drm_device_init+0x1c0/0x880 [nouveau] Oct 09 14:17:46 lp-sasha kernel: nouveau_drm_probe+0x11a/0x1e0 [nouveau] Oct 09 14:17:46 lp-sasha kernel: pci_device_probe+0xcd/0x140 Oct 09 14:17:46 lp-sasha kernel: really_probe+0xd8/0x400 Oct 09 14:17:46 lp-sasha kernel: driver_probe_device+0x4a/0xa0 Oct 09 14:17:46 lp-sasha kernel: device_driver_attach+0x9c/0xc0 Oct 09 14:17:46 lp-sasha kernel: __driver_attach+0x6f/0x100 Oct 09 14:17:46 lp-sasha kernel: ? device_driver_attach+0xc0/0xc0 Oct 09 14:17:46 lp-sasha kernel: bus_for_each_dev+0x75/0xc0 Oct 09 14:17:46 lp-sasha kernel: bus_add_driver+0x106/0x1c0 Oct 09 14:17:46 lp-sasha kernel: driver_register+0x86/0xe0 Oct 09 14:17:46 lp-sasha kernel: ? 0xffffffffa044e000 Oct 09 14:17:46 lp-sasha kernel: do_one_initcall+0x48/0x1e0 Oct 09 14:17:46 lp-sasha kernel: ? _cond_resched+0x11/0x60 Oct 09 14:17:46 lp-sasha kernel: ? kmem_cache_alloc_trace+0x19c/0x1e0 Oct 09 14:17:46 lp-sasha kernel: do_init_module+0x57/0x220 Oct 09 14:17:46 lp-sasha kernel: __do_sys_finit_module+0xa0/0xe0 Oct 09 14:17:46 lp-sasha kernel: do_syscall_64+0x33/0x40 Oct 09 14:17:46 lp-sasha kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Oct 09 14:17:46 lp-sasha kernel: RIP: 0033:0x7fd01a060d5d Oct 09 14:17:46 lp-sasha kernel: Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e3 70 0c 00 f7 d8 64 89 01 48 Oct 09 14:17:46 lp-sasha kernel: RSP: 002b:00007ffc8ad38a98 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 Oct 09 14:17:46 lp-sasha kernel: RAX: ffffffffffffffda RBX: 0000563f6e7fd530 RCX: 00007fd01a060d5d Oct 09 14:17:46 lp-sasha kernel: RDX: 0000000000000000 RSI: 00007fd01a19f95d RDI: 000000000000000f Oct 09 14:17:46 lp-sasha kernel: RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000007 Oct 09 14:17:46 lp-sasha kernel: R10: 000000000000000f R11: 0000000000000246 R12: 00007fd01a19f95d Oct 09 14:17:46 lp-sasha kernel: R13: 0000000000000000 R14: 0000563f6e7fbc10 R15: 0000563f6e7fd530 Oct 09 14:17:46 lp-sasha kernel: Modules linked in: nouveau(+) ttm xt_string xt_mark xt_LOG vgem v4l2_dv_timings uvcvideo ulpi udf ts_kmp ts_fsm ts_bm snd_aloop sil164 qat_dh895xccvf nf_nat_sip nf_nat_irc nf_nat_ftp nf_nat nf_log_ipv6 nf_log_ipv4 nf_log_common ltc2990 lcd intel_qat input_leds i2c_mux gspca_main videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc drivetemp cuse fuse crc_itu_t coretemp ch7006 ath5k ath algif_hash Oct 09 14:17:46 lp-sasha kernel: CR2: 0000000000000000 Oct 09 14:17:46 lp-sasha kernel: ---[ end trace 0ddafe218ad30017 ]--- Oct 09 14:17:46 lp-sasha kernel: RIP: 0010:nouveau_connector_detect_depth+0x71/0xc0 [nouveau] Oct 09 14:17:46 lp-sasha kernel: Code: 0a 00 00 48 8b 49 48 c7 87 b8 00 00 00 06 00 00 00 80 b9 4d 0a 00 00 00 75 1e 83 fa 41 75 05 48 85 c0 75 29 8b 81 10 0d 00 00 <39> 06 7c 25 f6 81 14 0d 00 00 02 75 b7 c3 80 b9 0c 0d 00 00 00 75 Oct 09 14:17:46 lp-sasha kernel: RSP: 0018:ffffc9000028f8c0 EFLAGS: 00010297 Oct 09 14:17:46 lp-sasha kernel: RAX: 0000000000014c08 RBX: ffff8880369d4000 RCX: ffff8880369d3000 Oct 09 14:17:46 lp-sasha kernel: RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff8880369d4000 Oct 09 14:17:46 lp-sasha kernel: RBP: ffff88800601cc00 R08: ffff8880051da298 R09: ffffffff8226201a Oct 09 14:17:46 lp-sasha kernel: R10: ffff88800469aa80 R11: ffff888004c84ff8 R12: 0000000000000000 Oct 09 14:17:46 lp-sasha kernel: R13: ffff8880051da000 R14: 0000000000002000 R15: 0000000000000003 Oct 09 14:17:46 lp-sasha kernel: FS: 00007fd0192b3440(0000) GS:ffff8880bc900000(0000) knlGS:0000000000000000 Oct 09 14:17:46 lp-sasha kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Oct 09 14:17:46 lp-sasha kernel: CR2: 0000000000000000 CR3: 0000000004976000 CR4: 00000000000006e0 The disassembly: Code: 0a 00 00 48 8b 49 48 c7 87 b8 00 00 00 06 00 00 00 80 b9 4d 0a 00 00 00 75 1e 83 fa 41 75 05 48 85 c0 75 29 8b 81 10 0d 00 00 <39> 06 7c 25 f6 81 14 0d 00 00 02 75 b7 c3 80 b9 0c 0d 00 00 00 75 All code ======== 0: 0a 00 or (%rax),%al 2: 00 48 8b add %cl,-0x75(%rax) 5: 49 rex.WB 6: 48 c7 87 b8 00 00 00 movq $0x6,0xb8(%rdi) d: 06 00 00 00 11: 80 b9 4d 0a 00 00 00 cmpb $0x0,0xa4d(%rcx) 18: 75 1e jne 0x38 1a: 83 fa 41 cmp $0x41,%edx 1d: 75 05 jne 0x24 1f: 48 85 c0 test %rax,%rax 22: 75 29 jne 0x4d 24: 8b 81 10 0d 00 00 mov 0xd10(%rcx),%eax 2a:* 39 06 cmp %eax,(%rsi) <-- trapping instruction 2c: 7c 25 jl 0x53 2e: f6 81 14 0d 00 00 02 testb $0x2,0xd14(%rcx) 35: 75 b7 jne 0xffffffffffffffee 37: c3 retq 38: 80 b9 0c 0d 00 00 00 cmpb $0x0,0xd0c(%rcx) 3f: 75 .byte 0x75 Code starting with the faulting instruction =========================================== 0: 39 06 cmp %eax,(%rsi) 2: 7c 25 jl 0x29 4: f6 81 14 0d 00 00 02 testb $0x2,0xd14(%rcx) b: 75 b7 jne 0xffffffffffffffc4 d: c3 retq e: 80 b9 0c 0d 00 00 00 cmpb $0x0,0xd0c(%rcx) 15: 75 .byte 0x75 objdump -SF --disassemble=nouveau_connector_detect_depth [...] if (nv_connector->edid && c85e1: 83 fa 41 cmp $0x41,%edx c85e4: 75 05 jne c85eb <nouveau_connector_detect_depth+0x6b> (File Offset: 0xc866b) c85e6: 48 85 c0 test %rax,%rax c85e9: 75 29 jne c8614 <nouveau_connector_detect_depth+0x94> (File Offset: 0xc8694) nv_connector->type == DCB_CONNECTOR_LVDS_SPWG) duallink = ((u8 *)nv_connector->edid)[121] == 2; else duallink = mode->clock >= bios->fp.duallink_transition_clk; if ((!duallink && (bios->fp.strapless_is_24bit & 1)) \|\| c85eb: 8b 81 10 0d 00 00 mov 0xd10(%rcx),%eax c85f1: 39 06 cmp %eax,(%rsi) c85f3: 7c 25 jl c861a <nouveau_connector_detect_depth+0x9a> (File Offset: 0xc869a) ( duallink && (bios->fp.strapless_is_24bit & 2))) c85f5: f6 81 14 0d 00 00 02 testb $0x2,0xd14(%rcx) c85fc: 75 b7 jne c85b5 <nouveau_connector_detect_depth+0x35> (File Offset: 0xc8635) connector->display_info.bpc = 8; [...] % scripts/faddr2line /lib/modules/5.9.0-rc8-next-20201009/kernel/drivers/gpu/drm/nouveau/nouveau.ko nouveau_connector_detect_depth+0x71/0xc0 nouveau_connector_detect_depth+0x71/0xc0: nouveau_connector_detect_depth at /home/sasha/linux-next/drivers/gpu/drm/nouveau/nouveau_connector.c:891 It is actually line 889. See the disassembly below. 889 duallink = mode->clock >= bios->fp.duallink_transition_clk; The NULL pointer being dereferenced is mode. Git bisect has identified the following commit as bad: `f28e32d390` drm/nouveau/kms: Don't change EDID when it hasn't actually changed Here is the chain of events that causes the oops. On entry to nouveau_connector_detect_lvds, edid is set to NULL. The call to nouveau_connector_detect sets nv_connector->edid to valid memory, with status set to connector_status_connected and the flow of execution branching to the out label. The subsequent call to nouveau_connector_set_edid erronously clears nv_connector->edid, via the local edid pointer which remains set to NULL. Fix this by setting edid to the value of the just acquired nv_connector->edid and executing the body of nouveau_connector_set_edid only if nv_connector->edid and edid point to different memory addresses thus preventing nv_connector->edid from being turned into a dangling pointer. Fixes: `f28e32d390` ("drm/nouveau/kms: Don't change EDID when it hasn't actually changed") Signed-off-by: Alexander Kapshuk <alexander.kapshuk@gmail.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2020-11-14 14:19:17 +10:00
Linus Torvalds	f01c30de86	More VFS fixes for 5.10-rc4: - Minor cleanups of the sb_start_* fs freeze helpers. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl+sDaIACgkQ+H93GTRK tOu4sw//bIdBw11YfI9sPtMJR/RkK3lm/pU4A/eJYGD65Mzk8J4kNi6jXKuyqQ8e /RpTqKWOwVW05Qg5HlKTxXRyr5Q788+EuBQH2t8VukWVdAgK2TFvNTTXb7QDsNSD SneC7Sox3CEO+vYnBsr7tUjfl7AYH0uFTxLkvpYqSQBn2+jo2x0s7NyKKZSDAASI +Rmhinw4QjjAHYC54nBy6Q47XhrZJj7XCODJdEql81cKSJUvjCo3url3sNvGXXNW oXbs5IO5cVQrQx6n9rQxCfkN1dz9c/CBopYFwdgmg76Bj4VLSzCYVecnMeDl53pV 3jXesNtJcR2dz64e98K1Moof2dHSm0/NP0Q7KnMYEaGEl6tAtyjSx9lL2Qd6npG+ mG460UHd/7RHXoH/BTaCrtHHyA4pApHMqf+w3R2ienxrltKUJAEfGM/5x8o0ikWx laeT0L/m6Yv/dGnDvNthhoF84tCiQUnxg+UeXiKv4R9uFL1bKMFPw5i1zWuXqqaX yZPqUY1tiecQskr89AimOVI64L2MJ4DgBey1JzNL/XzPtw55Qu+LR6MkkaIC08Wu ubGJTm6fPw3Cz8JYgn4WIgKB9Q7yAoKsyl0mGLQh2SJT1FS8WLct+SRPwXcMVfJT VpkgjJW/ak5L+XfQU6Ev39zUasEAqdaxvPoTxUfne6spUiNbgrk= =ZC9a -----END PGP SIGNATURE----- Merge tag 'vfs-5.10-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull fs freeze fix and cleanups from Darrick Wong: "A single vfs fix for 5.10, along with two subsequent cleanups. A very long time ago, a hack was added to the vfs fs freeze protection code to work around lockdep complaints about XFS, which would try to run a transaction (which requires intwrite protection) to finalize an xfs freeze (by which time the vfs had already taken intwrite). Fast forward a few years, and XFS fixed the recursive intwrite problem on its own, and the hack became unnecessary. Fast forward almost a decade, and latent bugs in the code converting this hack from freeze flags to freeze locks combine with lockdep bugs to make this reproduce frequently enough to notice page faults racing with freeze. Since the hack is unnecessary and causes thread race errors, just get rid of it completely. Making this kind of vfs change midway through a cycle makes me nervous, but a large enough number of the usual VFS/ext4/XFS/btrfs suspects have said this looks good and solves a real problem vector. And once that removal is done, __sb_start_write is now simple enough that it becomes possible to refactor the function into smaller, simpler static inline helpers in linux/fs.h. The cleanup is straightforward. Summary: - Finally remove the "convert to trylock" weirdness in the fs freezer code. It was necessary 10 years ago to deal with nested transactions in XFS, but we've long since removed that; and now this is causing subtle race conditions when lockdep goes offline and sb_start_* aren't prepared to retry a trylock failure. - Minor cleanups of the sb_start_* fs freeze helpers" * tag 'vfs-5.10-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: vfs: move __sb_{start,end}_write* to fs.h vfs: separate __sb_start_write into blocking and non-blocking helpers vfs: remove lockdep bogosity in __sb_start_write	2020-11-13 16:07:53 -08:00

1 2 3 4 5 ...

967373 Commits All Branches Search

967373 Commits

All Branches