OpenCloudOS-Kernel/arch
Michael Ellerman 98d0219e04 powerpc/64s/radix: Fix crash with unaligned relocated kernel
If a relocatable kernel is loaded at an address that is not 2MB aligned
and told not to relocate to zero, the kernel can crash due to
mark_rodata_ro() incorrectly changing some read-write data to read-only.

Scenarios where the misalignment can occur are when the kernel is
loaded by kdump or using the RELOCATABLE_TEST config option.

Example crash with the kernel loaded at 5MB:

  Run /sbin/init as init process
  BUG: Unable to handle kernel data access on write at 0xc000000000452000
  Faulting instruction address: 0xc0000000005b6730
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
  ...
  NIP memset+0x68/0x104
  LR  zero_user_segments.constprop.0+0xa8/0xf0
  Call Trace:
    ext4_mpage_readpages+0x7f8/0x830
    ext4_readahead+0x48/0x60
    read_pages+0xb8/0x380
    page_cache_ra_unbounded+0x19c/0x250
    filemap_fault+0x58c/0xae0
    __do_fault+0x60/0x100
    __handle_mm_fault+0x1230/0x1a40
    handle_mm_fault+0x120/0x300
    ___do_page_fault+0x20c/0xa80
    do_page_fault+0x30/0xc0
    data_access_common_virt+0x210/0x220

This happens because mark_rodata_ro() tries to change permissions on the
range _stext..__end_rodata, but _stext sits in the middle of the 2MB
page from 4MB to 6MB:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)

The logic that changes the permissions assumes the linear mapping was
split correctly at boot, so it marks the entire 2MB page read-only. That
leads to the write fault above.
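
The spillover can be checked with a little arithmetic. The following is a minimal sketch, not kernel code: spillover() and align_down() are hypothetical helpers, and PAGE_OFFSET is assumed to be the usual 64-bit powerpc linear map base of 0xc000000000000000. It computes the part of the 2MB page that lies below _stext and confirms that the faulting address (DAR) from the oops falls inside it:

```python
SZ_2M = 2 * 1024 * 1024
PAGE_OFFSET = 0xC000000000000000  # assumption: base of the linear mapping


def align_down(addr, size):
    """Round addr down to a multiple of size (size must be a power of two)."""
    return addr & ~(size - 1)


def spillover(stext_phys, huge=SZ_2M):
    """Return the physical range below _stext that shares its huge page."""
    return (align_down(stext_phys, huge), stext_phys)


stext_phys = 5 * 1024 * 1024                    # kernel loaded at 5MB
lo, hi = spillover(stext_phys)
fault_phys = 0xC000000000452000 - PAGE_OFFSET   # DAR from the oops above

print(hex(lo), hex(hi), lo <= fault_phys < hi)  # 0x400000 0x500000 True
```

The range 4MB..5MB is ordinary read-write memory that only became read-only because it shares a 2MB page with the start of kernel text.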

To fix it, the boot time mapping logic needs to consider that if the
kernel is running at a non-zero address then _stext is a boundary where
it must split the mapping.

That leads to the mapping being split correctly, allowing the rodata
permission change to happen correctly, with no spillover:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)
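
The effect of treating _stext as a boundary can be sketched as follows. This is a simplified model, not the kernel implementation: map_range() and this next_boundary() are illustrative helpers that only consider 64KB and 2MB page sizes, ignore the exec flag, and assume all addresses are 64KB aligned. A 2MB page is used only when the address is 2MB aligned and at least 2MB remains before the next boundary:

```python
SZ_64K = 64 * 1024
SZ_2M = 2 * 1024 * 1024


def next_boundary(addr, end, boundaries):
    """Return the first boundary strictly above addr, or end if none."""
    for b in sorted(boundaries):
        if addr < b < end:
            return b
    return end


def map_range(start, end, boundaries=()):
    """Return a list of (start, end, page_size) runs covering start..end."""
    runs = []
    addr = start
    while addr < end:
        gap = next_boundary(addr, end, boundaries) - addr
        # Use a huge page only if it fits entirely before the boundary.
        if addr % SZ_2M == 0 and gap >= SZ_2M:
            size = SZ_2M
        else:
            size = SZ_64K
        # Extend the previous run unless the page size changed or we
        # have just crossed a boundary.
        if runs and runs[-1][2] == size and addr not in boundaries:
            runs[-1] = (runs[-1][0], addr + size, size)
        else:
            runs.append((addr, addr + size, size))
        addr += size
    return runs


# Kernel at 5MB: _stext (0x500000) is a boundary inside the 4MB..6MB page.
for lo, hi, size in map_range(0x400000, 0x2400000, (0x500000,)):
    print(f"{lo:#x}-{hi:#x} with {size // 1024} KiB pages")
```

With the boundary at 0x500000 the model splits 4MB..6MB into two runs of 64KB pages and resumes 2MB pages at 6MB, matching the shape of the log above. Without the boundary it maps the whole range with 2MB pages.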

If the kernel is loaded at a 2MB aligned address, the mapping continues
to use 2MB pages as before:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages

Fixes: c55d7b5e64 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230110124753.1325426-1-mpe@ellerman.id.au
2023-01-31 21:37:39 +11:00
alpha MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
arc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
arm ARM: SoC fixes for 6.2 2022-12-19 16:07:59 -06:00
arm64 remoteproc updates for v6.2 2022-12-21 09:37:14 -08:00
csky arch/csky patches for 6.2-rc1 2022-12-19 07:51:30 -06:00
hexagon MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
ia64 - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
loongarch LoongArch changes for v6.2 2022-12-19 08:23:27 -06:00
m68k m68k: remove broken strcmp implementation 2022-12-21 08:56:43 -08:00
microblaze MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
mips Fixes due to DT changes 2022-12-23 10:49:45 -08:00
nios2 MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
openrisc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
parisc parisc architecture fixes for kernel v6.2-rc1: 2022-12-20 08:43:53 -06:00
powerpc powerpc/64s/radix: Fix crash with unaligned relocated kernel 2023-01-31 21:37:39 +11:00
riscv KVM/riscv changes for 6.2 2022-12-21 18:52:15 -08:00
s390 random: do not include <asm/archrandom.h> from random.h 2022-12-20 03:13:45 +01:00
sh treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
sparc MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
um New Feature: 2022-12-17 14:06:53 -06:00
x86 - Pass only an initialized perf event attribute to the LSM hook 2023-01-01 11:27:00 -08:00
xtensa MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
.gitignore
Kconfig arm64 fixes for -rc1 2022-12-16 13:46:41 -06:00