Commit Graph

347 Commits

Author SHA1 Message Date
Anup Patel d5fdade933
RISC-V: mm: Fix set_satp_mode() for platform not having Sv57
When Sv57 is not available the satp.MODE test in set_satp_mode() will
fail and lead to pgdir re-programming for Sv48. The pgdir re-programming
will fail as well due to pre-existing pgdir entry used for Sv57 and as
a result kernel fails to boot on RISC-V platform not having Sv57.

To fix above issue, we should clear the pgdir memory in set_satp_mode()
before re-programming.

Fixes: 011f09d120 ("riscv: mm: Set sv57 on defaultly")
Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-21 15:10:39 -07:00
Chuanhua Han 7da9ca3f5b
riscv: mm: Remove the copy operation of pmd
Since all processes share the kernel address space,
we only need to copy pgd in case of a vmalloc page
fault exception, the other levels of page tables are
shared, so the operation of copying pmd is unnecessary.

Signed-off-by: Chuanhua Han <hanchuanhua@oppo.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-20 17:36:58 -07:00
Christoph Hellwig c6af2aa9ff swiotlb: make the swiotlb_init interface more useful
Pass a boolean flag to indicate if swiotlb needs to be enabled based on
the addressing needs, and replace the verbose argument with a set of
flags, including one to force enable bounce buffering.

Note that this patch removes the possibility to force xen-swiotlb use
with the swiotlb=force parameter on the command line on x86 (arm and
arm64 never supported that), but this interface will be restored shortly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-04-18 07:21:11 +02:00
Linus Torvalds aa5b537b0e RISC-V Patches for the 5.18 Merge Window, Part 1
* Support for Sv57-based virtual memory.
 * Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.
 * An improved memmove() implementation.
 * Support for the new Ssconfpmf and SBI PMU extensions, which allows for
   a much more useful perf implementation on RISC-V systems.
 * Support for restartable sequences.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmI96FcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQBFD/425+6xmoOru6Wiki3Ja0fqQToNrQyW
 IbmE/8AxUP7UxMvJSNzvQm8deXgklzvmegXCtnjwZZins971vMzzDSI83k/zn8I7
 m5thVC9z01BjodV+pvIp/44hS6FesolOLzkVHksX0Zh6h0iidrc34Qf5HrqvvNfN
 CZ/4K1+E9ig5r9qZp4WdvocCXj+FzwF/30GjKoW9vwA599CEG/dCo+TNN9GKD6XS
 k+xiUGwlIRA+kCLSPFCi7ev9XPr1tCmQB7uB8Igcvr7Y3mWl8HKfajQVXBnXNRC3
 ifbDxpx1elJiLPyf7Rza8jIDwDhLQdxBiwPgDgP9h9R4x0uF4efq8PzLzFlFmaE+
 9Z9thfykBb5dXYDFDje9bAOXvKnGk7Iqxdsz0qWo/ChEQawX1+11bJb0TNN8QTT9
 YvlQfUXgb1dmEcj5yG2uVE1Y8L7YNLRMsZU3W3FbmPJZoavSOuU4x0yCGeLyv597
 76af3nuBJ5v80Db97gu6St+HIACeevKflsZUf/8GS/p7d1DlvmrWzQUMEycxPTG9
 UZpZak58jh7AqQ9JbLnavhwmeacY50vpZOw6QHGAHSN+8daCPlOHDG7Ver7Z+kNj
 +srJ7iKMvLnnaEjGNgavfxdqTOme1gv4LWs/JdHYMkpphqVN92xBDJnhXTPRVZiQ
 0x39vK86qtB46A==
 =Omc6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for Sv57-based virtual memory.

 - Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.

 - An improved memmove() implementation.

 - Support for the new Ssconfpmf and SBI PMU extensions, which allows
   for a much more useful perf implementation on RISC-V systems.

 - Support for restartable sequences.

* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
  rseq/selftests: Add support for RISC-V
  RISC-V: Add support for restartable sequence
  MAINTAINERS: Add entry for RISC-V PMU drivers
  Documentation: riscv: Remove the old documentation
  RISC-V: Add sscofpmf extension support
  RISC-V: Add perf platform driver based on SBI PMU extension
  RISC-V: Add RISC-V SBI PMU extension definitions
  RISC-V: Add a simple platform driver for RISC-V legacy perf
  RISC-V: Add a perf core library for pmu drivers
  RISC-V: Add CSR encodings for all HPMCOUNTERS
  RISC-V: Remove the current perf implementation
  RISC-V: Improve /proc/cpuinfo output for ISA extensions
  RISC-V: Do no continue isa string parsing without correct XLEN
  RISC-V: Implement multi-letter ISA extension probing framework
  RISC-V: Extract multi-letter extension names from "riscv, isa"
  RISC-V: Minimal parser for "riscv, isa" strings
  RISC-V: Correctly print supported extensions
  riscv: Fixed misaligned memory access. Fixed pointer comparison.
  MAINTAINERS: update riscv/microchip entry
  riscv: dts: microchip: add new peripherals to icicle kit device tree
  ...
2022-03-25 10:11:38 -07:00
Jisheng Zhang d414cb379a riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
Replace the conditional compilation using "#ifdef CONFIG_KEXEC_CORE" by a
check for "IS_ENABLED(CONFIG_KEXEC_CORE)", to simplify the code and
increase compile coverage.

Link: https://lkml.kernel.org/r/20211206160514.2000-3-jszhang@kernel.org
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23 19:00:34 -07:00
Alexandre Ghiti e4fcfe6eca
riscv: Fix kasan pud population
In sv48, the kasan inner regions are not aligned on PGDIR_SIZE and then
when we populate the kasan linear mapping region, we clear the kasan
vmalloc region which is in the same PGD.

Fix this by copying the content of the kasan early pud after allocating a
new PGD for the first time.

Fixes: e8a62cc26d ("riscv: Implement sv48 support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:29 -08:00
Alexandre Ghiti 625e24a550
riscv: Move high_memory initialization to setup_bootmem
high_memory used to be initialized in mem_init, way after setup_bootmem.
But a call to dma_contiguous_reserve in this function gives rise to the
below warning because high_memory is equal to 0 and is used at the very
beginning at cma_declare_contiguous_nid.

It went unnoticed since the move of the kasan region redefined
KERN_VIRT_SIZE so that it does not encompass -1 anymore.

Fix this by initializing high_memory in setup_bootmem.

------------[ cut here ]------------
virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff)
WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27
Hardware name: riscv-virtio,qemu (DT)
epc : __virt_to_phys+0xac/0x1b8
 ra : __virt_to_phys+0xac/0x1b8
epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30
 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657
 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70
 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000
 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747
 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828
 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000
 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000
 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8
 t5 : fffffffef09406e9 t6 : ffffffff84a03758
status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003
[<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a
[<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4
[<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e
[<ffffffff83208fc2>] paging_init+0x12c/0x35e
[<ffffffff83206bd2>] setup_arch+0x120/0x74e
[<ffffffff83201416>] start_kernel+0xce/0x68c
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<0000000000000000>] 0x0
softirqs last  enabled at (0): [<0000000000000000>] 0x0
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace 0000000000000000 ]---

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:34:12 -08:00
Alexandre Ghiti c648c4bb7d
riscv: Fix config KASAN && DEBUG_VIRTUAL
__virt_to_phys function is called very early in the boot process (ie
kasan_early_init) so it should not be instrumented by KASAN otherwise it
bugs.

Fix this by declaring phys_addr.c as non-kasan instrumentable.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 (riscv: Add KASAN support)
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:41 -08:00
Alexandre Ghiti 5f763b3b59
riscv: Fix DEBUG_VIRTUAL false warnings
KERN_VIRT_SIZE used to encompass the kernel mapping before it was
redefined when moving the kasan mapping next to the kernel mapping to only
match the maximum amount of physical memory.

Then, kernel mapping addresses that go through __virt_to_phys are now
declared as wrong which is not true, one can use __virt_to_phys on such
addresses.

Fix this by redefining the condition that matches wrong addresses.

Fixes: f7ae02333d ("riscv: Move KASAN mapping next to the kernel mapping")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 15:32:04 -08:00
Alexandre Ghiti a3d3280378
riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP
In order to get the pfn of a struct page* when sparsemem is enabled
without vmemmap, the mem_section structures need to be initialized which
happens in sparse_init.

But kasan_early_init calls pfn_to_page way before sparse_init is called,
which then tries to dereference a null mem_section pointer.

Fix this by removing the usage of this function in kasan_early_init.

Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-03 13:11:30 -08:00
Palmer Dabbelt 9195c294bc
RISC-V: Add Sv57 page table support
This implements Sv57 support at runtime. The kernel will try to boot
with 5-level page table firstly , and will fallback to 4-level if the HW
does not support it. And it will finally fallback to 3-level if the HW
alse does not support sv48.

* riscv-sv57:
  riscv: mm: Support kasan for sv57
  riscv: mm: Set sv57 on defaultly
  riscv: mm: Prepare pt_ops helper functions for sv57
  riscv: mm: Control p4d's folding by pgtable_l5_enabled
2022-02-22 09:40:52 -08:00
Qinglin Pan 8fbdccd2b1
riscv: mm: Support kasan for sv57
This patchset add kasan_populate and kasan_shallow_populate for sv57,
and is tested on both qemu and unmatched with CONFIG_KASAN and
CONFIG_KASAN_VMALLOC on.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:47 -08:00
Qinglin Pan 011f09d120
riscv: mm: Set sv57 on defaultly
This patch sets sv57 on defaultly if CONFIG_64BIT. And do fallback to try
to set sv48 on boot time if sv57 is not supported in current hardware.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:45 -08:00
Qinglin Pan 677b9eb881
riscv: mm: Prepare pt_ops helper functions for sv57
This patch prepare some pt_ops helper functions which will be used in
creating sv57 mappings during boot time.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:42 -08:00
Qinglin Pan d10efa21a9
riscv: mm: Control p4d's folding by pgtable_l5_enabled
To determine pgtable level at boot time, we can not use helper functions
in include/asm-generic/pgtable-nop4d.h and must implement these
functions. This patch uses pgtable_l5_enabled variable instead of
including pgtable-nop4d.h to controle p4d's folding, and implements
corresponding helper functions.

Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 16:32:39 -08:00
Jisheng Zhang 67ff2f2626
riscv: mm: init: mark satp_mode __ro_after_init
satp_mode is never modified after init, so it can be marked as
__ro_after_init.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-14 15:41:41 -08:00
Jisheng Zhang f81393a5b2
riscv: extable: fix err reg writing in dedicated uaccess handler
Mayuresh reported commit 20802d8d47 ("riscv: extable: add a dedicated
uaccess handler") breaks the writev02 test case in LTP. This is due to
the err reg isn't correctly set with the errno(-EFAULT in writev02
case). First of all, the err and zero regs are reg numbers rather than
reg offsets in struct pt_regs; Secondly, regs_set_gpr() should write
the regs when offset isn't zero(zero means epc)

Fix it by correcting regs_set_gpr() logic and passing the correct reg
offset to it.

Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Fixes: 20802d8d47 ("riscv: extable: add a dedicated uaccess handler")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-08 17:02:47 -08:00
Palmer Dabbelt ca0cb9a60f
riscv/mm: Add XIP_FIXUP for riscv_pfn_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:27:23 -08:00
Palmer Dabbelt 4b1c70aa8e
riscv/mm: Add XIP_FIXUP for phys_ram_base
This manifests as a crash early in boot on VexRiscv.

Signed-off-by: Myrtle Shah <gatecat@ds0.me>
[Palmer: split commit]
Fixes: 6d7f91d914 ("riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-02-04 13:18:56 -08:00
Atish Patra 26fb751ca3
RISC-V: Do not use cpumask data structure for hartid bitmap
Currently, SBI APIs accept a hartmask that is generated from struct
cpumask. Cpumask data structure can hold upto NR_CPUs value. Thus, it
is not the correct data structure for hartids as it can be higher
than NR_CPUs for platforms with sparse or discontguous hartids.

Remove all association between hartid mask and struct cpumask.

Reviewed-by: Anup Patel <anup@brainfault.org> (For Linux RISC-V changes)
Acked-by: Anup Patel <anup@brainfault.org> (For KVM RISC-V changes)
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-20 09:27:22 -08:00
kernel test robot 20aa49541a
riscv: fix boolconv.cocci warnings
arch/riscv/mm/init.c:48:11-16: WARNING: conversion to bool not needed here

 Remove unneeded conversion to bool

Semantic patch information:
 Relational and logical operators evaluate to bool,
 explicit conversion is overly verbose and unneeded.

Generated by: scripts/coccinelle/misc/boolconv.cocci

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:52:30 -08:00
Palmer Dabbelt 0c34e79e52
RISC-V: Introduce sv48 support without relocatable kernel
This patchset allows to have a single kernel for sv39 and sv48 without
being relocatable.

The idea comes from Arnd Bergmann who suggested to do the same as x86,
that is mapping the kernel to the end of the address space, which allows
the kernel to be linked at the same address for both sv39 and sv48 and
then does not require to be relocated at runtime.

This implements sv48 support at runtime. The kernel will try to boot
with 4-level page table and will fallback to 3-level if the HW does not
support it. Folding the 4th level into a 3-level page table has almost
no cost at runtime.

Note that kasan region had to be moved to the end of the address space
since its location must be known at compile-time and then be valid for
both sv39 and sv48 (and sv57 that is coming).

* riscv-sv48-v3:
  riscv: Explicit comment about user virtual address space size
  riscv: Use pgtable_l4_enabled to output mmu_type in cpuinfo
  riscv: Implement sv48 support
  asm-generic: Prepare for riscv use of pud_alloc_one and pud_free
  riscv: Allow to dynamically define VA_BITS
  riscv: Introduce functions to switch pt_ops
  riscv: Split early kasan mapping to prepare sv48 introduction
  riscv: Move KASAN mapping next to the kernel mapping
  riscv: Get rid of MAXPHYSMEM configs

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 19:37:44 -08:00
Alexandre Ghiti e8a62cc26d
riscv: Implement sv48 support
By adding a new 4th level of page table, give the possibility to 64bit
kernel to address 2^48 bytes of virtual address: in practice, that offers
128TB of virtual address space to userspace and allows up to 64TB of
physical memory.

If the underlying hardware does not support sv48, we will automatically
fallback to a standard 3-level page table by folding the new PUD level into
PGDIR level. In order to detect HW capabilities at runtime, we
use SATP feature that ignores writes with an unsupported mode.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:09 -08:00
Alexandre Ghiti 840125a97a
riscv: Introduce functions to switch pt_ops
This simply gathers the different pt_ops initialization in functions
where a comment was added to explain why the page table operations must
be changed along the boot process.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:06 -08:00
Alexandre Ghiti 2efad17e57
riscv: Split early kasan mapping to prepare sv48 introduction
Now that kasan shadow region is next to the kernel, for sv48, this
region won't be aligned on PGDIR_SIZE and then when populating this
region, we'll need to get down to lower levels of the page table. So
instead of reimplementing the page table walk for the early population,
take advantage of the existing functions used for the final population.

Note that kasan swapper initialization must also be split since memblock
is not initialized at this point and as the last PGD is shared with the
kernel, we'd need to allocate a PUD so postpone the kasan final
population after the kernel population is done.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:05 -08:00
Alexandre Ghiti f7ae02333d
riscv: Move KASAN mapping next to the kernel mapping
Now that KASAN_SHADOW_OFFSET is defined at compile time as a config,
this value must remain constant whatever the size of the virtual address
space, which is only possible by pushing this region at the end of the
address space next to the kernel mapping.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 17:54:04 -08:00
Jisheng Zhang 805a3ebed5
riscv: mm: init: try best to remove #ifdef CONFIG_XIP_KERNEL usage
Currently, the #ifdef CONFIG_XIP_KERNEL usage can be divided into the
following three types:

The first one is for functions/declarations only used in XIP case.

The second one is for XIP_FIXUP case. Something as below:
|foo_type foo;
|#ifdef CONFIG_XIP_KERNEL
|#define foo    (*(foo_type *)XIP_FIXUP(&foo))
|#endif

Usually, it's better to let the foo macro sit with the foo var
together. But if various foos are defined adjacently, we can
save some #ifdef CONFIG_XIP_KERNEL usage by grouping them together.

The third one is for different implementations for XIP, usually, this
is a #ifdef...#else...#endif case.

This patch moves the pt_ops macro to adjacent #ifdef CONFIG_XIP_KERNEL
and group first type usage cases into one.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:35 -08:00
Jisheng Zhang fe036db7d8
riscv: mm: init: try IS_ENABLED(CONFIG_XIP_KERNEL) instead of #ifdef
Try our best to replace the conditional compilation using
"#ifdef CONFIG_XIP_KERNEL" with "IS_ENABLED(CONFIG_XIP_KERNEL)", to
simplify the code and to increase compile coverage.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:33 -08:00
Jisheng Zhang 3274a6ef3b
riscv: mm: init: remove _pt_ops and use pt_ops directly
Except "pt_ops", other global vars when CONFIG_XIP_KERNEL=y is defined
as below:

|foo_type foo;
|#ifdef CONFIG_XIP_KERNEL
|#define foo	(*(foo_type *)XIP_FIXUP(&foo))
|#endif

Follow the same way for pt_ops to unify the style and to simplify code.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:32 -08:00
Jisheng Zhang 07aabe8fb6
riscv: mm: init: try best to use IS_ENABLED(CONFIG_64BIT) instead of #ifdef
Try our best to replace the conditional compilation using
"#ifdef CONFIG_64BIT" by a check for "IS_ENABLED(CONFIG_64BIT)", to
simplify the code and to increase compile coverage.

Now we can also remove the __maybe_unused used in max_mapped_addr
declaration.

We also remove the BUG_ON check of mapping the last 4K bytes of the
addressable memory since this is always true for every kernel actually.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:31 -08:00
Jisheng Zhang 902d6364aa
riscv: mm: init: remove unnecessary "#ifdef CONFIG_CRASH_DUMP"
The is_kdump_kernel() returns false for !CRASH_DUMP case, so we don't
need the #ifdef CONFIG_CRASH_DUMP for is_kdump_kernel() checking.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-19 09:56:30 -08:00
Linus Torvalds f1b744f65e RISC-V Patches for the 5.17 Merge Window, Part 1
* Support for the DA9063 as used on the HiFive Unmatched.
 * Support for relative extables, which puts us in line with other
   architectures and save some space in vmlinux.
 * A handful of kexec fixes/improvements, including the ability to run
   crash kernels from PCI-addressable memory on the HiFive Unmatched.
 * Support for the SBI SRST extension, which allows systems that do not
   have an explicit driver in Linux to reboot.
 * A handful of fixes and cleanups, including to the defconfigs and
   device trees.
 
 ---
 This time I do expect to have a part 2, as there's still some smaller
 patches on the list.  I was hoping to get through more of that over the
 weekend, but I got distracted with the ABI issues.  Figured it's better
 to send this sooner rather than waiting.
 
 Included are my merge resolutions against a master from this morning, if
 that helps any:
 
 diff --cc arch/riscv/include/asm/sbi.h
 index 289621da4a2a,9c46dd3ff4a2..000000000000
 --- a/arch/riscv/include/asm/sbi.h
 +++ b/arch/riscv/include/asm/sbi.h
 @@@ -27,7 -27,14 +27,15 @@@ enum sbi_ext_id
         SBI_EXT_IPI = 0x735049,
         SBI_EXT_RFENCE = 0x52464E43,
         SBI_EXT_HSM = 0x48534D,
  +      SBI_EXT_SRST = 0x53525354,
 +
 +       /* Experimentals extensions must lie within this range */
 +       SBI_EXT_EXPERIMENTAL_START = 0x08000000,
 +       SBI_EXT_EXPERIMENTAL_END = 0x08FFFFFF,
 +
 +       /* Vendor extensions must lie within this range */
 +       SBI_EXT_VENDOR_START = 0x09000000,
 +       SBI_EXT_VENDOR_END = 0x09FFFFFF,
   };
 
   enum sbi_ext_base_fid {
 diff --git a/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts b/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 index e03a4c94cf3f..6bfa1f24d3de 100644
 --- a/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 +++ b/arch/riscv/boot/dts/sifive/hifive-unmatched-a00.dts
 @@ -188,14 +188,6 @@ vdd_ldo11: ldo11 {
                                 regulator-always-on;
                         };
                 };
 -
 -               rtc {
 -                       compatible = "dlg,da9063-rtc";
 -               };
 -
 -               wdt {
 -                       compatible = "dlg,da9063-watchdog";
 -               };
         };
  };
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmHnDV4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiaGWD/wOMHVLrkLZDxKHY3lFU7S7FanpFgcU
 L265fgKtoG/QOI9WPuQlN7pYvrC4ssUvtQ23WwZ+iz4pJlUwoMb2TAqBBeTXxEbW
 pVF2QqnlPdv2ZEn95MFxZ0HQB2+xgJKPL5gdD6Iz7oe2378lf7tywSF7MYpxG/AA
 CeHUxzhEPhQJntufTievMhvYpM7ZyhCr19ZAHXRaPoGReJK5ZMCeYHGTrHD4EisG
 hO/Pg2vx/Ynxi/vb/C69kpTBvu4Qsxnbhgfy1SowrO3FhxcZTbyrZ6l8uRxSAHIg
 dA0NLPh/YDQCPXYnphQcLo+Q9Gy4Sz5es7ULnnMyyEOZxoVyy4up3rCAFAL3Ubav
 CNQdk/ZWtrZ+s4chilA1kW97apxocvmq5ULg+7Hi58ZUzk+y7MQBVCClohyONVEU
 /leJzJ3nq3YHFgfo8Uh7L+iPzlNgycfi4gRnGJIkEVRhXBPTfJ/Pc5wjPoPVsFvt
 pjEYT4YaXITZ0QBLdcuPex5h3PXkRsORsZl8eJGnIz8742KA4tfFraZ4BkbrjoqC
 tLsi7Si9hN3JKhLsNgclb76tDkoz4CY7yZ7TT7hRbKdZZJkVRu1XqUq75X18CVQv
 9p7Q7j1b5H3Z+/5KOxwS0UO73y92yvyVvi0cLqBoD2Tkeq3beumxmy50Qy+O+h1D
 Ut7GwcyavzfS8Q==
 =uqtf
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the DA9063 as used on the HiFive Unmatched.

 - Support for relative extables, which puts us in line with other
   architectures and save some space in vmlinux.

 - A handful of kexec fixes/improvements, including the ability to run
   crash kernels from PCI-addressable memory on the HiFive Unmatched.

 - Support for the SBI SRST extension, which allows systems that do not
   have an explicit driver in Linux to reboot.

 - A handful of fixes and cleanups, including to the defconfigs and
   device trees.

* tag 'riscv-for-linus-5.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  RISC-V: Use SBI SRST extension when available
  riscv: mm: fix wrong phys_ram_base value for RV64
  RISC-V: Use common riscv_cpuid_to_hartid_mask() for both SMP=y and SMP=n
  riscv: head: remove useless __PAGE_ALIGNED_BSS and .balign
  riscv: errata: alternative: mark vendor_patch_func __initdata
  riscv: head: make secondary_start_common() static
  riscv: remove cpu_stop()
  riscv: try to allocate crashkern region from 32bit addressible memory
  riscv: use hart id instead of cpu id on machine_kexec
  riscv: Don't use va_pa_offset on kdump
  riscv: dts: sifive: fu540-c000: Fix PLIC node
  riscv: dts: sifive: fu540-c000: Drop bogus soc node compatible values
  riscv: dts: sifive: Group tuples in register properties
  riscv: dts: sifive: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Group tuples in interrupt properties
  riscv: dts: microchip: mpfs: Fix clock controller node
  riscv: dts: microchip: mpfs: Fix reference clock node
  riscv: dts: microchip: mpfs: Fix PLIC node
  riscv: dts: microchip: mpfs: Drop empty chosen node
  riscv: dts: canaan: Group tuples in interrupt properties
  ...
2022-01-19 11:38:21 +02:00
Linus Torvalds 35ce8ae9ae Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal/exit/ptrace updates from Eric Biederman:
 "This set of changes deletes some dead code, makes a lot of cleanups
  which hopefully make the code easier to follow, and fixes bugs found
  along the way.

  The end-game which I have not yet reached yet is for fatal signals
  that generate coredumps to be short-circuit deliverable from
  complete_signal, for force_siginfo_to_task not to require changing
  userspace configured signal delivery state, and for the ptrace stops
  to always happen in locations where we can guarantee on all
  architectures that the all of the registers are saved and available on
  the stack.

  Removal of profile_task_ext, profile_munmap, and profile_handoff_task
  are the big successes for dead code removal this round.

  A bunch of small bug fixes are included, as most of the issues
  reported were small enough that they would not affect bisection so I
  simply added the fixes and did not fold the fixes into the changes
  they were fixing.

  There was a bug that broke coredumps piped to systemd-coredump. I
  dropped the change that caused that bug and replaced it entirely with
  something much more restrained. Unfortunately that required some
  rebasing.

  Some successes after this set of changes: There are few enough calls
  to do_exit to audit in a reasonable amount of time. The lifetime of
  struct kthread now matches the lifetime of struct task, and the
  pointer to struct kthread is no longer stored in set_child_tid. The
  flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
  removed. Issues where task->exit_code was examined with
  signal->group_exit_code should been examined were fixed.

  There are several loosely related changes included because I am
  cleaning up and if I don't include them they will probably get lost.

  The original postings of these changes can be found at:
     https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
     https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org

  I trimmed back the last set of changes to only the obviously correct
  once. Simply because there was less time for review than I had hoped"

* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
  ptrace/m68k: Stop open coding ptrace_report_syscall
  ptrace: Remove unused regs argument from ptrace_report_syscall
  ptrace: Remove second setting of PT_SEIZED in ptrace_attach
  taskstats: Cleanup the use of task->exit_code
  exit: Use the correct exit_code in /proc/<pid>/stat
  exit: Fix the exit_code for wait_task_zombie
  exit: Coredumps reach do_group_exit
  exit: Remove profile_handoff_task
  exit: Remove profile_task_exit & profile_munmap
  signal: clean up kernel-doc comments
  signal: Remove the helper signal_group_exit
  signal: Rename group_exit_task group_exec_task
  coredump: Stop setting signal->group_exit_task
  signal: Remove SIGNAL_GROUP_COREDUMP
  signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
  signal: Make coredump handling explicit in complete_signal
  signal: Have prepare_signal detect coredumps using signal->core_state
  signal: Have the oom killer detect coredumps using signal->core_state
  exit: Move force_uaccess back into do_exit
  exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
  ...
2022-01-17 05:49:30 +02:00
Qi Zheng 36ef159f44 mm: remove redundant check about FAULT_FLAG_ALLOW_RETRY bit
Since commit 4064b98270 ("mm: allow VM_FAULT_RETRY for multiple
times") allowed VM_FAULT_RETRY for multiple times, the
FAULT_FLAG_ALLOW_RETRY bit of fault_flag will not be changed in the page
fault path, so the following check is no longer needed:

	flags & FAULT_FLAG_ALLOW_RETRY

So just remove it.

[akpm@linux-foundation.org: coding style fixes]

Link: https://lkml.kernel.org/r/20211110123358.36511-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-15 16:30:27 +02:00
Jisheng Zhang b0fd4b1bf9
riscv: mm: fix wrong phys_ram_base value for RV64
Currently, if 64BIT and !XIP_KERNEL, the phys_ram_base is always 0,
no matter the real start of dram reported by memblock is.

Fixes: 6d7f91d914 ("riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09 13:51:48 -08:00
Nick Kossifidis decf89f86e
riscv: try to allocate crashkern region from 32bit addressible memory
When allocating crash kernel region without explicitly specifying its
base address/size, memblock_phys_alloc_range will attempt to allocate
memory top to bottom (memblock.bottom_up is false), so the crash
kernel region will end up in highmem on 64bit systems. This way
swiotlb can't work on the crash kernel, since there won't be any
32bit addressible memory available for the bounce buffers.

Try to allocate 32bit addressible memory if available, for the
crash kernel by restricting the top search address to be less
than SZ_4G. If that fails fallback to the previous behavior.

I tested this on HiFive Unmatched where the pci-e controller needs
swiotlb to work, with this patch it's possible to access the pci-e
controller on crash kernel and mount the rootfs from the nvme.

Signed-off-by: Nick Kossifidis <mick@ics.forth.gr>
Fixes: e53d28180d ("RISC-V: Add kdump support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-09 10:40:14 -08:00
Kefeng Wang 5a7ac592c5
riscv: mm: Enable PMD split page table lock for RV64
After commit 1355c31eeb ("asm-generic: pgalloc: provide generic
pmd_alloc_one() and pmd_free_one()"), the main part to support
PMD split page table lock is in asm-generic/pgalloc.h.

The only change is add pgtable_pmd_page_ctor() into alloc_pmd_late(),
then we could enable ARCH_ENABLE_SPLIT_PMD_PTLOCK for RV64.

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 19:31:49 -08:00
Alexandre Ghiti 7cc8c75b54
riscv: Make vmalloc/vmemmap end equal to the start of the next region
We used to define VMALLOC_END equal to the start of the next region
*minus one* which is inconsistent with the use of this define in the
core code (for example, see the definitions of VMALLOC_TOTAL and
is_vmalloc_addr).

And then make the definition of VMEMMAP_END consistent with VMALLOC_END
and all other regions actually.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 19:22:13 -08:00
Jisheng Zhang 20802d8d47
riscv: extable: add a dedicated uaccess handler
Inspired by commit 2e77a62cb3 ("arm64: extable: add a dedicated
uaccess handler"), do similar to riscv to add a dedicated uaccess
exception handler to update registers in exception context and
subsequently return back into the function which faulted, so we remove
the need for fixups specialized to each faulting instruction.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:53:29 -08:00
Jisheng Zhang 2bf847db0c
riscv: extable: add `type` and `data` fields
This is a riscv port of commit d6e2cc5647 ("arm64: extable: add `type`
and `data` fields").

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:54 -08:00
Jisheng Zhang 4c2e7ce8b9
riscv: extable: use `ex` for `exception_table_entry`
The var name "fixup" is a bit confusing, since this is a
exception_table_entry. Use "ex" instead  to refer to an entire entry.
In subsequent patches we'll use `fixup` to refer to the fixup
field specifically.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:34 -08:00
Jisheng Zhang ef127bca11
riscv: extable: make fixup_exception() return bool
The return values of fixup_exception() and riscv_bpf_fixup_exception()
represent a boolean condition rather than an error code, so it's better
to return `bool` rather than `int`.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:29 -08:00
Jisheng Zhang c07935cb3c
riscv: bpf: move rv_bpf_fixup_exception signature to extable.h
This is to group riscv related extable related functions signature
into one file.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:24 -08:00
Jisheng Zhang bb1f85d604
riscv: switch to relative exception tables
Similar as other architectures such as arm64, x86 and so on, use
offsets relative to the exception table entry values rather than
absolute addresses for both the exception locationand the fixup.

However, RISCV label difference will actually produce two relocations,
a pair of R_RISCV_ADD32 and R_RISCV_SUB32. Take below simple code for
example:

$ cat test.S
.section .text
1:
        nop
.section __ex_table,"a"
        .balign 4
        .long (1b - .)
.previous

$ riscv64-linux-gnu-gcc -c test.S
$ riscv64-linux-gnu-readelf -r test.o
Relocation section '.rela__ex_table' at offset 0x100 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000600000023 R_RISCV_ADD32     0000000000000000 .L1^B1 + 0
000000000000  000500000027 R_RISCV_SUB32     0000000000000000 .L0  + 0

The modpost will complain the R_RISCV_SUB32 relocation, so we need to
patch modpost.c to skip this relocation for .rela__ex_table section.

After this patch, the __ex_table section size of defconfig vmlinux is
reduced from 7072 Bytes to 3536 Bytes.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-01-05 17:52:20 -08:00
Eric W. Biederman 0e25498f8c exit: Add and use make_task_dead.
There are two big uses of do_exit.  The first is it's design use to be
the guts of the exit(2) system call.  The second use is to terminate
a task after something catastrophic has happened like a NULL pointer
in kernel code.

Add a function make_task_dead that is initialy exactly the same as
do_exit to cover the cases where do_exit is called to handle
catastrophic failure.  In time this can probably be reduced to just a
light wrapper around do_task_dead. For now keep it exactly the same so
that there will be no behavioral differences introducing this new
concept.

Replace all of the uses of do_exit that use it for catastraphic
task cleanup with make_task_dead to make it clear what the code
is doing.

As part of this rename rewind_stack_do_exit
rewind_stack_and_make_dead.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Linus Torvalds b89f311d7e RISC-V Patches for the 5.16 Merge Window, Part 1
* Support for time namespaces in the VDSO, along with some associated
   cleanups.
 * Support for building rv32 randconfigs.
 * Improvements to the XIP port that allow larger kernels to function
 * Various device tree cleanups for both the SiFive and Microchip boards
 * A handful of defconfig updates, including enabling Nouveau.
 
 There are also various small cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmGN9T4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQEBD/9EvGe5jPb8binNiLN81ExYh3HGBJ9A
 sY+/rhVWnTUX/igBU4M9gNHu5Rts2ITlsBda9YnYRgMunAS1cnZJ/jbvckfsFA+s
 svehPgDZSR+BqBnVDN1hEsY+TrNbFFFZIApGDbImuC5+81np/p3u1zkl3hhO0amJ
 7T46qxRq4iazbuohi47/ba6ufXyLb8XBg3LTUfp+lTnH7/VLbwspPWdzNgiJpMzB
 qEWfYd4au2zfqHC6SYHXidy/tkquhJ6DgKq0G3+XUCugeennMeisDWI9GsxTAzIa
 Pynm9GHA/AuK1/kt9qB/Czsg7bqY8RUreUMLNZEyHzwKA1OWQVxlfCdl5zIpxlkq
 eJkxjbhhPuneVRGUSJgKQqBEUtGlH9yISqO2CFFdKRqzm1ImJtkAJGC+SubpKSbU
 D+ZAGbAyEwDUfl1yyKzKRg6YnzntMxh8sfNdJRYMrOhczk8L//R8yxG6SX6bjw7W
 lTQk7wkpo2yS7L799dukKdlllYbUvqavZwsIHTyb1TsRoavQeqoJq/6jKXJNhFHD
 rA/OQgCUXuHTmpgdDUhHuLZVvJqT5n8Vxigi30nIq24KOgTom8tRRtiAWOJv7G+Z
 b/Y3eq0VE8yjhPSbtsAxooZnpEVgXsVqx3t/lBt0MiJE1a3JLSSchRTHBjWllCiV
 +yvCerCh8PoT8w==
 =Mpen
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for time namespaces in the VDSO, along with some associated
   cleanups.

 - Support for building rv32 randconfigs.

 - Improvements to the XIP port that allow larger kernels to function

 - Various device tree cleanups for both the SiFive and Microchip boards

 - A handful of defconfig updates, including enabling Nouveau.

There are also various small cleanups.

* tag 'riscv-for-linus-5.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: defconfig: enable DRM_NOUVEAU
  riscv/vdso: Drop unneeded part due to merge issue
  riscv: remove .text section size limitation for XIP
  riscv: dts: sifive: add missing compatible for plic
  riscv: dts: microchip: add missing compatibles for clint and plic
  riscv: dts: sifive: drop duplicated nodes and properties in sifive
  riscv: dts: sifive: fix Unleashed board compatible
  riscv: dts: sifive: use only generic JEDEC SPI NOR flash compatible
  riscv: dts: microchip: use vendor compatible for Cadence SD4HC
  riscv: dts: microchip: drop unused pinctrl-names
  riscv: dts: microchip: drop duplicated MMC/SDHC node
  riscv: dts: microchip: fix board compatible
  riscv: dts: microchip: drop duplicated nodes
  dt-bindings: mmc: cdns: document Microchip MPFS MMC/SDHCI controller
  riscv: add rv32 and rv64 randconfig build targets
  riscv: mm: don't advertise 1 num_asid for 0 asid bits
  riscv: set default pm_power_off to NULL
  riscv/vdso: Add support for time namespaces
2021-11-13 09:15:42 -08:00
Björn Töpel f47d4ffe3a riscv, bpf: Fix RV32 broken build, and silence RV64 warning
Commit 252c765bd7 ("riscv, bpf: Add BPF exception tables") only addressed
RV64, and broke the RV32 build [1]. Fix by gating the exception tables code
with CONFIG_ARCH_RV64I.

Further, silence a "-Wmissing-prototypes" warning [2] in the RV64 BPF JIT.

  [1] https://lore.kernel.org/llvm/202111020610.9oy9Rr0G-lkp@intel.com/
  [2] https://lore.kernel.org/llvm/202110290334.2zdMyRq4-lkp@intel.com/

Fixes: 252c765bd7 ("riscv, bpf: Add BPF exception tables")
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/bpf/20211103115453.397209-1-bjorn@kernel.org
2021-11-05 16:52:34 +01:00
Linus Torvalds fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Jakub Kicinski b7b98f8689 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-11-01

We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).

The main changes are:

1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.

2) Parallelize bpf test_progs, from Yucong and Andrii.

3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.

4) Improve bpf selftests on s390, from Ilya.

5) bloomfilter bpf map type, from Joanne.

6) Big improvements to JIT tests especially on Mips, from Johan.

7) Support kernel module function calls from bpf, from Kumar.

8) Support typeless and weak ksym in light skeleton, from Kumar.

9) Disallow unprivileged bpf by default, from Pawan.

10) BTF_KIND_DECL_TAG support, from Yonghong.

11) Various bpftool cleanups, from Quentin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  bpf: Factor out helpers for ctx access checking
  bpf: Factor out a helper to prepare trampoline for struct_ops prog
  selftests, bpf: Fix broken riscv build
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
  tools, build: Add RISC-V to HOSTARCH parsing
  riscv, bpf: Increase the maximum number of iterations
  selftests, bpf: Add one test for sockmap with strparser
  selftests, bpf: Fix test_txmsg_ingress_parser error
  ...
====================

Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:59:46 -07:00
Alexandre Ghiti 54c5639d8f
riscv: Fix asan-stack clang build
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:54:50 -07:00
Alexandre Ghiti cf11d01135
riscv: Do not re-populate shadow memory with kasan_populate_early_shadow
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.

We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.

Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-29 08:53:42 -07:00
Tong Tiangen 252c765bd7 riscv, bpf: Add BPF exception tables
When a tracing BPF program attempts to read memory without using the
bpf_probe_read() helper, the verifier marks the load instruction with
the BPF_PROBE_MEM flag. Since the riscv JIT does not currently recognize
this flag it falls back to the interpreter.

Add support for BPF_PROBE_MEM, by appending an exception table to the
BPF program. If the load instruction causes a data abort, the fixup
infrastructure finds the exception table and fixes up the fault, by
clearing the destination register and jumping over the faulting
instruction.

A more generic solution would add a "handler" field to the table entry,
like on x86 and s390. The same issue in ARM64 is fixed in 8008342853
("bpf, arm64: Add BPF exception tables").

Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Pu Lehui <pulehui@huawei.com>
Tested-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20211027111822.3801679-1-tongtiangen@huawei.com
2021-10-28 01:02:44 +02:00
Vitaly Wool f9ace4ede4
riscv: remove .text section size limitation for XIP
Currently there's a limit of 8MB for the .text section of a RISC-V
image in the XIP case. This breaks compilation of many automatic
builds and is generally inconvenient. This patch removes that
limitation and optimizes XIP image file size at the same time.

Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-26 14:31:15 -07:00
Alexandre Ghiti bb8958d5dc
riscv: Flush current cpu icache before other cpus
On SiFive Unmatched, I recently fell onto the following BUG when booting:

[    0.000000] ftrace: allocating 36610 entries in 144 pages
[    0.000000] Oops - illegal instruction [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.13.1+ #5
[    0.000000] Hardware name: SiFive HiFive Unmatched A00 (DT)
[    0.000000] epc : riscv_cpuid_to_hartid_mask+0x6/0xae
[    0.000000]  ra : __sbi_rfence_v02+0xc8/0x10a
[    0.000000] epc : ffffffff80007240 ra : ffffffff80009964 sp : ffffffff81803e10
[    0.000000]  gp : ffffffff81a1ea70 tp : ffffffff8180f500 t0 : ffffffe07fe30000
[    0.000000]  t1 : 0000000000000004 t2 : 0000000000000000 s0 : ffffffff81803e60
[    0.000000]  s1 : 0000000000000000 a0 : ffffffff81a22238 a1 : ffffffff81803e10
[    0.000000]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
[    0.000000]  a5 : 0000000000000000 a6 : ffffffff8000989c a7 : 0000000052464e43
[    0.000000]  s2 : ffffffff81a220c8 s3 : 0000000000000000 s4 : 0000000000000000
[    0.000000]  s5 : 0000000000000000 s6 : 0000000200000100 s7 : 0000000000000001
[    0.000000]  s8 : ffffffe07fe04040 s9 : ffffffff81a22c80 s10: 0000000000001000
[    0.000000]  s11: 0000000000000004 t3 : 0000000000000001 t4 : 0000000000000008
[    0.000000]  t5 : ffffffcf04000808 t6 : ffffffe3ffddf188
[    0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000002
[    0.000000] [<ffffffff80007240>] riscv_cpuid_to_hartid_mask+0x6/0xae
[    0.000000] [<ffffffff80009474>] sbi_remote_fence_i+0x1e/0x26
[    0.000000] [<ffffffff8000b8f4>] flush_icache_all+0x12/0x1a
[    0.000000] [<ffffffff8000666c>] patch_text_nosync+0x26/0x32
[    0.000000] [<ffffffff8000884e>] ftrace_init_nop+0x52/0x8c
[    0.000000] [<ffffffff800f051e>] ftrace_process_locs.isra.0+0x29c/0x360
[    0.000000] [<ffffffff80a0e3c6>] ftrace_init+0x80/0x130
[    0.000000] [<ffffffff80a00f8c>] start_kernel+0x5c4/0x8f6
[    0.000000] ---[ end trace f67eb9af4d8d492b ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

While ftrace is looping over a list of addresses to patch, it always failed
when patching the same function: riscv_cpuid_to_hartid_mask. Looking at the
backtrace, the illegal instruction is encountered in this same function.
However, patch_text_nosync, after patching the instructions, calls
flush_icache_range. But looking at what happens in this function:

flush_icache_range -> flush_icache_all
                   -> sbi_remote_fence_i
                   -> __sbi_rfence_v02
                   -> riscv_cpuid_to_hartid_mask

The icache and dcache of the current cpu are never synchronized between the
patching of riscv_cpuid_to_hartid_mask and calling this same function.

So fix this by flushing the current cpu's icache before asking for the other
cpus to do the same.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: fab957c11e ("RISC-V: Atomic and Locking Code")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 18:24:15 -07:00
Vineet Gupta 21ccdccd21
riscv: mm: don't advertise 1 num_asid for 0 asid bits
Even if mmu doesn't support ASID, current code calculates @num_asids=1
which is misleading, so avoid setting any asid related variables in such
case.

Also while here, print the number of asid bits discovered even for the
disabled case.

Verified this on Hifive Unmatched.

Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04 14:16:58 -07:00
Linus Torvalds 063df71a57 RISC-V Patches for the 5.15 Merge Window, Part 1
* Support for PC-relative instructions (auipc and branches) in kprobes.
 * Support for forced IRQ threading.
 * Support for the hlt/nohlt kernel command line options, via the generic
   idle loop.
 * Support for showing the edge/level triggered behavior of interrupts in
   /proc/interrupts.
 * A handful of cleanups to our address mapping mechanisms.
 * Support for allocating gigantic hugepages via CMA.
 * Support for the undefined behavior sanitizer.
 * A handful of cleanups to the VDSO that allow the kernel to build with
   LLD.
 * Support for hugepage migration.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmEzxA4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYifuvD/93Tr9oAy6dwkGoa/VIzBc/nklCeTiO
 cv2W9eFQMG+PdRh1Ezi9FrLO6qZoADGOPPcnuc2HTTe58CWWWg0juC862H/N2k3V
 +3LGhHTv3E7NsEZNa4YhQXw6pPmj5OhbHab87HNowHpuNJGpKYE6Q+QbnYl5vorI
 9YS10pRpqvO6tdHUW335Z5N0nW9X3HCHD6NUb1ZV8qd34QzP+XeCJMqVXpoOijNO
 96lhyDYPivM2Rhi+uvtSjUeWzbddnN/P9IywfECddHFJyGvoEtiIROo9d2sR6pla
 lAa1YOw0ybD/UPHtL/diEU+K8ivNVKdRN/YvlzHS4CdCrPyuWSUmBMozsOBMBwf1
 UZCMDwmP2sQkHu1Ht9g1fXwk2R85j1s+qE5cZEXyAEULKfHlmt4vygd1xgm8DTf9
 xgS+np2g0frcmKU8cLPmC7Sr0YI1fPeDMRkBXTYE1SmBF3eHrZCCbeM1H97p2mUC
 zs/Dvyoj7e25HGyYqxN3lHZzhMluqh7wCQSOQMB2YxVwdiA4VjyjxLF4MKAdXL8i
 F1QTETsjLGu6UiMfNIQ4RVs981/fGYSANdOvMfwo447gVH6JaxC51sLo6YIDtXss
 tLzchHNu5IMNqB7XOb1gU8ZKeXruG9lrxwUOyEIYAVi6pHCbkdLV0A8ymwEjj06b
 aGV4xFPaAGvDLQ==
 =rDB3
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - support PC-relative instructions (auipc and branches) in kprobes

 - support for forced IRQ threading

 - support for the hlt/nohlt kernel command line options, via the
   generic idle loop

 - show the edge/level triggered behavior of interrupts
   in /proc/interrupts

 - a handful of cleanups to our address mapping mechanisms

 - support for allocating gigantic hugepages via CMA

 - support for the undefined behavior sanitizer (UBSAN)

 - a handful of cleanups to the VDSO that allow the kernel to build with
   LLD.

 - support for hugepage migration

* tag 'riscv-for-linus-5.15-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  riscv: add support for hugepage migration
  RISC-V: Fix VDSO build for !MMU
  riscv: use strscpy to replace strlcpy
  riscv: explicitly use symbol offsets for VDSO
  riscv: Enable Undefined Behavior Sanitizer UBSAN
  riscv: Keep the riscv Kconfig selects sorted
  riscv: Support allocating gigantic hugepages using CMA
  riscv: fix the global name pfn_base confliction error
  riscv: Move early fdt mapping creation in its own function
  riscv: Simplify BUILTIN_DTB device tree mapping handling
  riscv: Use __maybe_unused instead of #ifdefs around variable declarations
  riscv: Get rid of map_size parameter to create_kernel_page_table
  riscv: Introduce va_kernel_pa_offset for 32-bit kernel
  riscv: Optimize kernel virtual address conversion macro
  dt-bindings: riscv: add starfive jh7100 bindings
  riscv: Enable GENERIC_IRQ_SHOW_LEVEL
  riscv: Enable idle generic idle loop
  riscv: Allow forced irq threading
  riscv: Implement thread_struct whitelist for hardened usercopy
  riscv: kprobes: implement the branch instructions
  ...
2021-09-05 11:31:23 -07:00
Linus Torvalds 14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Mike Rapoport a7259df767 memblock: make memblock_find_in_range method private
There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.

Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.

This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().

Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>			[riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Linus Torvalds 9e5f3ffcf1 Devicetree updates for v5.15:
- Refactor arch kdump DT related code to a common implementation
 
 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties
 
 - Various clean-ups to DT FDT code
 
 - Fix a runtime error for !CONFIG_SYSFS
 
 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.
 
 - Convert a bunch of memory controller bindings to schemas
 
 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.
 
 - Extend nvmem bindings to handle bit offsets in unit-addresses
 
 - Add DT schemas for HiKey 970 PCIe PHY
 
 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings
 
 - Enable dtc pci_device_reg warning by default
 
 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmEuWEsQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw+CtD/45m84GisULb7FFmlo+WY2SbzE8a+MUEXo0
 5ZZoMViSvBchphap9ueFNDdrLMUOHMsFaxHuTCUxXr4tq7EOemM7Br4OLiwiRrM5
 o2CwBvXYu+49c4UKVFMM6RCKFiXvw5NLI4Twpj4Ge8farHvt9Ecwtq+Y+RYWgFk2
 xwXWut7ZK3zBU6B+s4MRBATCFTD5oC4pAJIK3OQUlUPqZEQqdTRBKv5lyg+VUY2k
 eU0Cyzm0dZAmtjAu8ovhVNLfK1pp165QiaFIE1qh5H3ZVZAJlNyqN4jBDx9E4pLj
 BeazrsqfOkC8mZC+T7TgixhwB6D+r6/JW9NiCjYbarXibIsUOKSTKtj8XR8eZF/g
 sLeVDx33U5S+dlj1OB7scwq4Q9sG27ii2rlkvafA5KKBjoR2dzz7o9JesCV1Guha
 goPXmcd08e+KrjINxVc6gk4Y+KG8u+G7qnXnnmSatESJKxiDu1OgU3L16mlTJFaM
 hBmrh5rx1y8EkQnzgceTZIIWh30poSQKKyDB6Ta4Dude5JE+rS30oVURDR7MIrav
 rY70OYOiSq/nCcC7bc0Yu0UxJi+bwH28WvsD0aeCUOBTFsnI4j2uvsPsh3Aq74O0
 UbQmUCMxhpmsDVdIOqlS1IVH8M79I+BrDTPVP6EE96ttoj9FbSi6AgjeGJzVMC99
 EhtWe+gKTQ==
 =28CD
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Refactor arch kdump DT related code to a common implementation

 - Add fw_devlink tracking for 'phy-handle', 'leds', 'backlight',
   'resets', and 'pwm' properties

 - Various clean-ups to DT FDT code

 - Fix a runtime error for !CONFIG_SYSFS

 - Convert Synopsys DW PCI and derivative binding docs to schemas. Add
   Toshiba Visconti PCIe binding.

 - Convert a bunch of memory controller bindings to schemas

 - Covert eeprom-93xx46, Samsung Exynos TRNG, Samsung Exynos IRQ
   combiner, arm-charlcd, img-ascii-lcd, UniPhier eFuse, Xilinx Zynq
   MPSoC FPGA, Xilinx Zynq MPSoC reset, Mediatek mmsys, Gemini boards,
   brcm,iproc-i2c, faraday,ftpci100, and ks8851 net to DT schema.

 - Extend nvmem bindings to handle bit offsets in unit-addresses

 - Add DT schemas for HiKey 970 PCIe PHY

 - Remove unused ZTE, energymicro,efm32-timer, and Exynos SATA bindings

 - Enable dtc pci_device_reg warning by default

 - Fixes for handling 'unevaluatedProperties' in preparation to enable
   pending support in the tooling for jsonschema 2020-12 draft

* tag 'devicetree-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (78 commits)
  dt-bindings: display: remove zte,vou.txt binding doc
  dt-bindings: hwmon: merge max1619 into trivial devices
  dt-bindings: mtd-physmap: Add 'arm,vexpress-flash' compatible
  dt-bindings: PCI: imx6: convert the imx pcie controller to dtschema
  dt-bindings: Use 'enum' instead of 'oneOf' plus 'const' entries
  dt-bindings: Add vendor prefix for Topic Embedded Systems
  of: fdt: Rename reserve_elfcorehdr() to fdt_reserve_elfcorehdr()
  arm64: kdump: Remove custom linux,usable-memory-range handling
  arm64: kdump: Remove custom linux,elfcorehdr handling
  riscv: Remove non-standard linux,elfcorehdr handling
  of: fdt: Use IS_ENABLED(CONFIG_BLK_DEV_INITRD) instead of #ifdef
  of: fdt: Add generic support for handling usable memory range property
  of: fdt: Add generic support for handling elf core headers property
  crash_dump: Make elfcorehdr address/size symbols always visible
  dt-bindings: memory: convert Samsung Exynos DMC to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos PPMU to dtschema
  dt-bindings: devfreq: event: convert Samsung Exynos NoCP to dtschema
  kbuild: Enable dtc 'pci_device_reg' warning by default
  dt-bindings: soc: remove obsolete zte zx header
  dt-bindings: clock: remove obsolete zte zx header
  ...
2021-09-01 18:34:51 -07:00
Geert Uytterhoeven 2931ea847d riscv: Remove non-standard linux,elfcorehdr handling
RISC-V uses platform-specific code to locate the elf core header in
memory.  However, this does not conform to the standard
"linux,elfcorehdr" DT bindings, as it relies on a reserved memory node
with the "linux,elfcorehdr" compatible value, instead of on a
"linux,elfcorehdr" property under the "/chosen" node.

The non-compliant code can just be removed, as the standard behavior is
already implemented by platform-agnostic handling in the FDT core code.

Fixes: 5640975003 ("RISC-V: Add crash kernel support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/41c75d6ee3114ae6304f8afe0051895af91200ee.1628670468.git.geert+renesas@glider.be
2021-08-24 17:09:01 -05:00
Kefeng Wang 8ba1a8b77b
riscv: Support allocating gigantic hugepages using CMA
This patch adds support to allocate gigantic hugepages using CMA by
specifying the hugetlb_cma= kernel parameter.  This is only supported on
RV64.

Reviewed-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-13 21:33:54 -07:00
Kenneth Lee fb31f0a499
riscv: fix the global name pfn_base confliction error
RISCV uses a global variable pfn_base for page/pfn translation. But this
is a common name and will be used elsewhere. In those cases, the
page-pfn macros which refer to this name will be referred to the
local/input variable instead. (such as in vfio_pin_pages_remote). This
make everything wrong.

This patch changes the name from pfn_base to riscv_pfn_base to fix
this problem.

Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-13 15:31:51 -07:00
Alexandre Ghiti fdf3a7a1e0
riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE
The current comment states that we check if the 64-bit kernel mapping
overlaps with the last 4K of the address space that is reserved to
error values in create_kernel_page_table, which is not the case since it
is done in setup_vm. But anyway, remove the reference to any function
and simply note that in 64-bit kernel, the check should be done as soon
as the kernel mapping base address is known.

Fixes: db6b84a368 ("riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE")
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-12 07:16:58 -07:00
Alexandre Ghiti fe45ffa4c5
riscv: Move early fdt mapping creation in its own function
The code that handles the early fdt mapping is hard to read and does not
create the same mapping size depending on the kernel:

- for 64-bit, 2 PMD entries are used which amounts to a 4MB mapping
- for 32-bit, 2 PGDIR entries are used which amounts to a 8MB mapping

So keep using 2 PMD entries for 64-bit and use only one PGD entry for
32-bit needed to cover 4MB. Move that into a new function called
create_fdt_early_page_table which, using the same naming as
create_kernel_page_table.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:38 -07:00
Alexandre Ghiti 977765ce31
riscv: Simplify BUILTIN_DTB device tree mapping handling
__PAGETABLE_PMD_FOLDED defines a 2-level page table that is only used in
32-bit kernel, so there is no need to check for CONFIG_64BIT in #ifndef
__PAGETABLE_PMD_FOLDED and vice-versa.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:34 -07:00
Alexandre Ghiti 6f3e5fd241
riscv: Use __maybe_unused instead of #ifdefs around variable declarations
This allows to simplify the code and make it more readable.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:31 -07:00
Alexandre Ghiti 526f83df1d
riscv: Get rid of map_size parameter to create_kernel_page_table
The kernel must always be mapped using PMD_SIZE, and this is already the
case, this just simplifies create_kernel_page_table.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:20 -07:00
Alexandre Ghiti 0aba691a74
riscv: Introduce va_kernel_pa_offset for 32-bit kernel
va_kernel_pa_offset was only used for 64-bit as the kernel mapping lies
in the linear mapping for 32-bit kernel and then only the offset between
the PAGE_OFFSET and the kernel load address is needed.

But this distinction complexifies the code with #ifdefs and especially
with a separate definition of the address conversions macros.

Simplify the code by defining this variable for both 32-bit and 64-bit.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-11 22:41:11 -07:00
Alexandre Ghiti 6d7f91d914
riscv: Get rid of CONFIG_PHYS_RAM_BASE in kernel physical address conversion
The usage of CONFIG_PHYS_RAM_BASE for all kernel types was a mistake:
this value is implementation-specific and this breaks the genericity of
the RISC-V kernel.

Fix this by introducing a new variable phys_ram_base that holds this
value at runtime and use it in the kernel physical address conversion
macro. Since this value is used only for XIP kernels, evaluate it only if
CONFIG_XIP_KERNEL is set which in addition optimizes this macro for
standard kernels at compile-time.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c9225729 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-08-06 22:41:28 -07:00
Alexandre Ghiti db6b84a368
riscv: Make sure the kernel mapping does not overlap with IS_ERR_VALUE
The check that is done in setup_bootmem currently only works for 32-bit
kernel since the kernel mapping has been moved outside of the linear
mapping for 64-bit kernel. So make sure that for 64-bit kernel, the kernel
mapping does not overlap with the last 4K of the addressable memory.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 21:34:36 -07:00
Alexandre Ghiti c99127c452
riscv: Make sure the linear mapping does not use the kernel mapping
For 64-bit kernel, the end of the address space is occupied by the
kernel mapping and currently, the functions to populate the kernel page
tables (i.e. create_p*d_mapping) do not override existing mapping so we
must make sure the linear mapping does not map memory in the kernel mapping
by clipping the memory above the memory limit.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: c9811e379b ("riscv: Add mem kernel parameter support")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 20:48:04 -07:00
Alexandre Ghiti c09dc9e1cd
riscv: Fix memory_limit for 64-bit kernel
As described in Documentation/riscv/vm-layout.rst, the end of the
virtual address space for 64-bit kernel is occupied by the modules/BPF/
kernel mappings so this actually reduces the amount of memory we are able
to map and then use in the linear mapping. So make sure this limit is
correctly set.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-22 20:29:30 -07:00
Palmer Dabbelt 444818b599
Merge remote-tracking branch 'riscv/riscv-fix-32bit' into fixes
This contains a single fix for 32-bit boot.  It happens this was already
fixed by c9811e379b ("riscv: Add mem kernel parameter support"), but
the bug existed before that feature addition so I've applied the patch
earlier and then merged it in (which results in a conflict, which is
fixed via not changing the resulting tree).

* riscv/riscv-fix-32bit:
  riscv: Fix 32-bit RISC-V boot failure
2021-07-21 22:18:58 -07:00
Bin Meng d0e4dae744
riscv: Fix 32-bit RISC-V boot failure
Commit dd2d082b57 ("riscv: Cleanup setup_bootmem()") adjusted
the calling sequence in setup_bootmem(), which invalidates the fix
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
did for 32-bit RISC-V unfortunately.

So now 32-bit RISC-V does not boot again when testing booting kernel
on QEMU 'virt' with '-m 2G', which was exactly what the original
commit de043da0b9 ("RISC-V: Fix usage of memblock_enforce_memory_limit")
tried to fix.

Fixes: dd2d082b57 ("riscv: Cleanup setup_bootmem()")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-21 22:17:41 -07:00
Linus Torvalds 9b76d71fa8 RISC-V Patches for the 5.14 Merge Window, Part 1
In addition to We have a handful of new features for 5.14:
 
 * Support for transparent huge pages.
 * Support for generic PCI resources mapping.
 * Support for the mem= kernel parameter.
 * Support for KFENCE.
 * A handful of fixes to avoid W+X mappings in the kernel.
 * Support for VMAP_STACK based overflow detection.
 * An optimized copy_{to,from}_user.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmDn3XITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiW4UD/9Z0BJNXG9rOERofyFWwb7EYchT551A
 Coi8BFpuUCZfT9qonuBzQcPrAlXH/T9yMDGiShZio9jh29bnaOIqo5NrvNjB88VU
 LdarNeiumPM4SCQFIsbIBnRrk5OQDtzsPx+dS5xVUQlnHUV26xBakAJo3K3FxG9Y
 bl2JU+LTvP52eBKpKHp0i2pbuC2z0dBEu9Y3d4q8phI+YeolJ0rgOCbMra+maCls
 kk5TeROrDJpTg5INsWpVNvKvRk+sNlh0K9AsZHL1fIA8d2UFyTeCltUNjkWkIZA/
 SBiS+ruxtLD9rTPOrF1i+rvW+rs/gL1LkPjOvoilTMuNm5ltCu5MLsflX6oZ5wX4
 2vAXup6HQzLcJpCCq2QyMIl2s8ORV+gY89nYmRYHLzCoqGtF/WhbSFSKjqNM1+Kf
 /M4C9q3kEopkdlHuKahR8dP4wYz6kP1kEijv83P21ahUFsShSLyLAegYoKO0pNeC
 H/WOWMBqpG+rE80Tsctfo/z3g16Wc51yesYjXsOP5iRYoytG2uGLtzG7fPTjtyuC
 Fa3Ue/9d/W/XyZ7bzolvGRoaeoHqoIXb07MsDLDhO2cHYg6B/wIvHXfPcM7ovR6f
 VT0aRu85dvvewlukuqyH5+rwSPNQfzHo32GIiK7h8QjjIEh1axOFi+7oXGfbcIUw
 EP0u4AZsK9iHZg==
 =uB+H
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
 "We have a handful of new features for 5.14:

   - Support for transparent huge pages.

   - Support for generic PCI resources mapping.

   - Support for the mem= kernel parameter.

   - Support for KFENCE.

   - A handful of fixes to avoid W+X mappings in the kernel.

   - Support for VMAP_STACK based overflow detection.

   - An optimized copy_{to,from}_user"

* tag 'riscv-for-linus-5.14-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (37 commits)
  riscv: xip: Fix duplicate included asm/pgtable.h
  riscv: Fix PTDUMP output now BPF region moved back to module region
  riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall
  riscv: add VMAP_STACK overflow detection
  riscv: ptrace: add argn syntax
  riscv: mm: fix build errors caused by mk_pmd()
  riscv: Introduce structure that group all variables regarding kernel mapping
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Enable KFENCE for riscv64
  RISC-V: Use asm-generic for {in,out}{bwlq}
  riscv: add ASID-based tlbflushing methods
  riscv: pass the mm_struct to __sbi_tlb_flush_range
  riscv: Add mem kernel parameter support
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: Only initialize swiotlb when necessary
  riscv: fix typo in init.c
  riscv: Cleanup unused functions
  riscv: mm: Use better bitmap_zalloc()
  ...
2021-07-09 10:36:29 -07:00
Alexandre Ghiti 7761e36bc7
riscv: Fix PTDUMP output now BPF region moved back to module region
BPF region was moved back to the region below the kernel at the end of
the module region by 3a02764c37 ("riscv: Ensure BPF_JIT_REGION_START
aligned with PMD size"), so reflect this change in kernel page table
output.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 3a02764c37 ("riscv: Ensure BPF_JIT_REGION_START aligned with PMD size")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-06 15:21:27 -07:00
Alexandre Ghiti 658e2c5125
riscv: Introduce structure that group all variables regarding kernel mapping
We have a lot of variables that are used to hold kernel mapping addresses,
offsets between physical and virtual mappings and some others used for XIP
kernels: they are all defined at different places in mm/init.c, so group
them into a single structure with, for some of them, more explicit and concise
names.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-07-05 18:04:00 -07:00
Palmer Dabbelt 01112e5e20
Merge branch 'riscv-wx-mappings' into for-next
This contains both the short-term fix for the W+X boot mappings and the
larger cleanup.

* riscv-wx-mappings:
  riscv: Map the kernel with correct permissions the first time
  riscv: Introduce set_kernel_memory helper
  riscv: Simplify xip and !xip kernel address conversion macros
  riscv: Remove CONFIG_PHYS_RAM_BASE_FIXED
  riscv: mm: Fix W+X mappings at boot
2021-06-30 21:50:32 -07:00
Alexandre Ghiti e5c35fa040
riscv: Map the kernel with correct permissions the first time
For 64-bit kernels, we map all the kernel with write and execute
permissions and afterwards remove writability from text and executability
from data.

For 32-bit kernels, the kernel mapping resides in the linear mapping, so we
map all the linear mapping as writable and executable and afterwards we
remove those properties for unused memory and kernel mapping as
described above.

Change this behavior to directly map the kernel with correct permissions
and avoid going through the whole mapping to fix the permissions.

At the same time, this fixes an issue introduced by commit 2bfc6cd81b
("riscv: Move kernel mapping outside of linear mapping") as reported
here https://github.com/starfive-tech/linux/issues/17.

Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 21:18:58 -07:00
Liu Shixin 47513f243b
riscv: Enable KFENCE for riscv64
Add architecture specific implementation details for KFENCE and enable
KFENCE for the riscv64 architecture. In particular, this implements the
required interface in <asm/kfence.h>.

KFENCE requires that attributes for pages from its memory pool can
individually be set. Therefore, force the kfence pool to be mapped at
page granularity.

Testing this patch using the testcases in kfence_test.c and all passed.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:41 -07:00
Guo Ren 3f1e782998
riscv: add ASID-based tlbflushing methods
Implement optimized version of the tlb flushing routines for systems
using ASIDs. These are behind the use_asid_allocator static branch to
not affect existing systems not using ASIDs.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
[hch: rebased on top of previous cleanups, use the same algorithm as
      the non-ASID based code for local vs global flushes, keep functions
      as local as possible]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:39 -07:00
Christoph Hellwig 70c7605c08
riscv: pass the mm_struct to __sbi_tlb_flush_range
Move the call mm_cpumask from the callers into __sbi_tlb_flush_range to
reduce a bit of duplicate code and prepare for future changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-30 20:55:38 -07:00
Jisheng Zhang 3a02764c37
riscv: Ensure BPF_JIT_REGION_START aligned with PMD size
Andreas reported commit fc8504765e ("riscv: bpf: Avoid breaking W^X")
breaks booting with one kind of defconfig, I reproduced a kernel panic
with the defconfig:

[    0.138553] Unable to handle kernel paging request at virtual address ffffffff81201220
[    0.139159] Oops [#1]
[    0.139303] Modules linked in:
[    0.139601] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc5-default+ #1
[    0.139934] Hardware name: riscv-virtio,qemu (DT)
[    0.140193] epc : __memset+0xc4/0xfc
[    0.140416]  ra : skb_flow_dissector_init+0x1e/0x82
[    0.140609] epc : ffffffff8029806c ra : ffffffff8033be78 sp : ffffffe001647da0
[    0.140878]  gp : ffffffff81134b08 tp : ffffffe001654380 t0 : ffffffff81201158
[    0.141156]  t1 : 0000000000000002 t2 : 0000000000000154 s0 : ffffffe001647dd0
[    0.141424]  s1 : ffffffff80a43250 a0 : ffffffff81201220 a1 : 0000000000000000
[    0.141654]  a2 : 000000000000003c a3 : ffffffff81201258 a4 : 0000000000000064
[    0.141893]  a5 : ffffffff8029806c a6 : 0000000000000040 a7 : ffffffffffffffff
[    0.142126]  s2 : ffffffff81201220 s3 : 0000000000000009 s4 : ffffffff81135088
[    0.142353]  s5 : ffffffff81135038 s6 : ffffffff8080ce80 s7 : ffffffff80800438
[    0.142584]  s8 : ffffffff80bc6578 s9 : 0000000000000008 s10: ffffffff806000ac
[    0.142810]  s11: 0000000000000000 t3 : fffffffffffffffc t4 : 0000000000000000
[    0.143042]  t5 : 0000000000000155 t6 : 00000000000003ff
[    0.143220] status: 0000000000000120 badaddr: ffffffff81201220 cause: 000000000000000f
[    0.143560] [<ffffffff8029806c>] __memset+0xc4/0xfc
[    0.143859] [<ffffffff8061e984>] init_default_flow_dissectors+0x22/0x60
[    0.144092] [<ffffffff800010fc>] do_one_initcall+0x3e/0x168
[    0.144278] [<ffffffff80600df0>] kernel_init_freeable+0x1c8/0x224
[    0.144479] [<ffffffff804868a8>] kernel_init+0x12/0x110
[    0.144658] [<ffffffff800022de>] ret_from_exception+0x0/0xc
[    0.145124] ---[ end trace f1e9643daa46d591 ]---

After some investigation, I think I found the root cause: commit
2bfc6cd81b ("move kernel mapping outside of linear mapping") moves
BPF JIT region after the kernel:

| #define BPF_JIT_REGION_START	PFN_ALIGN((unsigned long)&_end)

The &_end is unlikely aligned with PMD size, so the front bpf jit
region sits with part of kernel .data section in one PMD size mapping.
But kernel is mapped in PMD SIZE, when bpf_jit_binary_lock_ro() is
called to make the first bpf jit prog ROX, we will make part of kernel
.data section RO too, so when we write to, for example memset the
.data section, MMU will trigger a store page fault.

To fix the issue, we need to ensure the BPF JIT region is PMD size
aligned. This patch acchieve this goal by restoring the BPF JIT region
to original position, I.E the 128MB before kernel .text section. The
modification to kasan_init.c is inspired by Alexandre.

Fixes: fc8504765e ("riscv: bpf: Avoid breaking W^X")
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:10:05 -07:00
Jisheng Zhang 314b781706
riscv: kasan: Fix MODULES_VADDR evaluation due to local variables' name
commit 2bfc6cd81b ("riscv: Move kernel mapping outside of linear
mapping") makes use of MODULES_VADDR to populate kernel, BPF, modules
mapping. Currently, MODULES_VADDR is defined as below for RV64:

| #define MODULES_VADDR   (PFN_ALIGN((unsigned long)&_end) - SZ_2G)

But kasan_init() has two local variables which are also named as _start,
_end, so MODULES_VADDR is evaluated with the local variable _end
rather than the global "_end" as we expected. Fix this issue by
renaming the two local variables.

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-18 21:09:56 -07:00
Kefeng Wang c9811e379b
riscv: Add mem kernel parameter support
The memblock_enforce_memory_limit() could change the memblock
range, so move the dram_end assignment after it in bootmem_init(),
then support mem= cmdline.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-15 08:59:04 -07:00
Kefeng Wang ce3aca0465
riscv: Only initialize swiotlb when necessary
The SWIOTLB buffer is not needed unless the physical address space
is beyond the limit of dma, only initialize swiotlb when swiotlb_force
is true or not all system memory is DMA-able.

Also move the swiotlb_init() into mem_init().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-11 13:42:26 -07:00
Vitaly Wool ae3d69bcc4
riscv: fix typo in init.c
Commit 0106235682 introduced a typo in "__initdata" spelling
which led to build breakage for XIP. Fix that.

Fixes: 0106235682 ("riscv: mm: init: Consolidate vars, functions")
Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-08 17:32:24 -07:00
Kefeng Wang 5def4429ae
riscv: mm: Use better bitmap_zalloc()
Use better bitmap_zalloc() to allocate bitmap.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-08 17:12:30 -07:00
Jisheng Zhang 8a4102a0cf
riscv: mm: Fix W+X mappings at boot
When the kernel mapping was moved the last 2GB of the address space,
(__va(PFN_PHYS(max_low_pfn))) is much smaller than the .data section
start address, the last set_memory_nx() in protect_kernel_text_data()
will fail, thus the .data section is still mapped as W+X. This results
in below W+X mapping waring at boot. Fix it by passing the correct
.data section page num to the set_memory_nx().

[    0.396516] ------------[ cut here ]------------
[    0.396889] riscv/mm: Found insecure W+X mapping at address (____ptrval____)/0xffffffff80c00000
[    0.398347] WARNING: CPU: 0 PID: 1 at arch/riscv/mm/ptdump.c:258 note_page+0x244/0x24a
[    0.398964] Modules linked in:
[    0.399459] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc1+ #14
[    0.400003] Hardware name: riscv-virtio,qemu (DT)
[    0.400591] epc : note_page+0x244/0x24a
[    0.401368]  ra : note_page+0x244/0x24a
[    0.401772] epc : ffffffff80007c86 ra : ffffffff80007c86 sp : ffffffe000e7bc30
[    0.402304]  gp : ffffffff80caae88 tp : ffffffe000e70000 t0 : ffffffff80cb80cf
[    0.402800]  t1 : ffffffff80cb80c0 t2 : 0000000000000000 s0 : ffffffe000e7bc80
[    0.403310]  s1 : ffffffe000e7bde8 a0 : 0000000000000053 a1 : ffffffff80c83ff0
[    0.403805]  a2 : 0000000000000010 a3 : 0000000000000000 a4 : 6c7e7a5137233100
[    0.404298]  a5 : 6c7e7a5137233100 a6 : 0000000000000030 a7 : ffffffffffffffff
[    0.404849]  s2 : ffffffff80e00000 s3 : 0000000040000000 s4 : 0000000000000000
[    0.405393]  s5 : 0000000000000000 s6 : 0000000000000003 s7 : ffffffe000e7bd48
[    0.405935]  s8 : ffffffff81000000 s9 : ffffffffc0000000 s10: ffffffe000e7bd48
[    0.406476]  s11: 0000000000001000 t3 : 0000000000000072 t4 : ffffffffffffffff
[    0.407016]  t5 : 0000000000000002 t6 : ffffffe000e7b978
[    0.407435] status: 0000000000000120 badaddr: 0000000000000000 cause: 0000000000000003
[    0.408052] Call Trace:
[    0.408343] [<ffffffff80007c86>] note_page+0x244/0x24a
[    0.408855] [<ffffffff8010c5a6>] ptdump_hole+0x14/0x1e
[    0.409263] [<ffffffff800f65c6>] walk_pgd_range+0x2a0/0x376
[    0.409690] [<ffffffff800f6828>] walk_page_range_novma+0x4e/0x6e
[    0.410146] [<ffffffff8010c5f8>] ptdump_walk_pgd+0x48/0x78
[    0.410570] [<ffffffff80007d66>] ptdump_check_wx+0xb4/0xf8
[    0.410990] [<ffffffff80006738>] mark_rodata_ro+0x26/0x2e
[    0.411407] [<ffffffff8031961e>] kernel_init+0x44/0x108
[    0.411814] [<ffffffff80002312>] ret_from_exception+0x0/0xc
[    0.412309] ---[ end trace 7ec3459f2547ea83 ]---
[    0.413141] Checked W+X mappings: failed, 512 W+X pages found

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-06-01 21:15:09 -07:00
Jisheng Zhang 0106235682
riscv: mm: init: Consolidate vars, functions
Consolidate the following items in init.c

Staticize global vars as much as possible;
Add __initdata mark if the global var isn't needed after init
Add __init mark if the func isn't needed after init
Add __ro_after_init if the global var is read only after init

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 13:51:16 -07:00
Jisheng Zhang 3df952ae2a
riscv: Add __init section marker to some functions again
These functions are not needed after booting, so mark them as __init
to move them to the __init section.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-29 13:39:27 -07:00
Jisheng Zhang 8237c5243a
riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()
Directly passing the cpu to flush_icache_deferred() rather than calling
smp_processor_id() again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
[Palmer: drop the QEMU performance numbers, and update the comment]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:52 -07:00
Kefeng Wang 50bae95e17
riscv: mm: Drop redundant _sdata and _edata declaration
The _sdata/_edata is already in sections.h, drop redundant
declaration.

Also move _xiprom/_exiprom declarations at the beginning of
the file, cleanup one CONFIG_XIP_KERNEL.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:51 -07:00
Kefeng Wang f842f5ff6a
riscv: Move setup_bootmem into paging_init
Make setup_bootmem() static.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:50 -07:00
Jisheng Zhang 8f3e136ff3
riscv: mm: Remove setup_zero_page()
The empty_zero_page sits at .bss..page_aligned section, so will be
cleared to zero during clearing bss, we don't need to clear it again.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-25 22:50:48 -07:00
Nanyong Sun e88b333142
riscv: mm: add THP support on 64-bit
Bring Transparent HugePage support to riscv. A
transparent huge page is always represented as a pmd.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 10:20:02 -07:00
Nanyong Sun c3b2d67046
riscv: mm: add param stride for __sbi_tlb_flush_range
Add a parameter: stride for __sbi_tlb_flush_range(),
represent the page stride between the address of start and end.
Normally, the stride is PAGE_SIZE, and when flush huge page
address, the stride can be the huge page size such as:PMD_SIZE,
then it only need to flush one tlb entry if the address range
within PMD_SIZE.

Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-22 10:19:47 -07:00
Geert Uytterhoeven 8d91b09733
riscv: Consistify protect_kernel_linear_mapping_text_rodata() use
The various uses of protect_kernel_linear_mapping_text_rodata() are
not consistent:
  - Its definition depends on "64BIT && !XIP_KERNEL",
  - Its forward declaration depends on MMU,
  - Its single caller depends on "STRICT_KERNEL_RWX && 64BIT && MMU &&
    !XIP_KERNEL".

Fix this by settling on the dependencies of the caller, which can be
simplified as STRICT_KERNEL_RWX depends on "MMU && !XIP_KERNEL".
Provide a dummy definition, as the caller is protected by
"IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)" instead of "#ifdef
CONFIG_STRICT_KERNEL_RWX".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:15 -07:00
Geert Uytterhoeven 8db6f937f4
riscv: Only extend kernel reservation if mapped read-only
When the kernel mapping was moved outside of the linear mapping, the
kernel memory reservation was increased, to take into account mapping
granularity.  However, this is done unconditionally, regardless of
whether the kernel memory is mapped read-only or not.

If this extension is not needed, up to 2 MiB may be lost, which has a
big impact on e.g. Canaan K210 (64-bit nommu) platforms with only 8 MiB
of RAM.

Reclaim the lost memory by only extending the reserved region when
needed, i.e. depending on a simplified version of the conditional logic
around the call to protect_kernel_linear_mapping_text_rodata().

Fixes: 2bfc6cd81b ("riscv: Move kernel mapping outside of linear mapping")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-05-06 09:40:12 -07:00
Linus Torvalds 939b7cbc00 RISC-V Patches for the 5.13 Merge Window, Part 1
* Support for the memtest= kernel command-line argument.
 * Support for building the kernel with FORTIFY_SOURCE.
 * Support for generic clockevent broadcasts.
 * Support for the buildtar build target.
 * Some build system cleanups to pass more LLVM-friendly arguments.
 * Support for kprobes.
 * A rearranged kernel memory map, the first part of supporting sv48
   systems.
 * Improvements to kexec, along with support for kdump and crash kernels.
 * An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).
 * Support for XIP.
 * A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.
 
 Along with a bunch of cleanups.  There are already a handful of fixes
 on the list so there will likely be a part 2.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmCS4lITHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYieZqEACSihfcOgZ/oyGWN3chca917/yCWimM
 DOu37Zlh81TNPgzzJwbT44IY5sg/lSecwktxs665TChiJjr3JlM4jmz+u64KOTA8
 mTWhqZNr5zT9kFj/m3x0V9yYOVr9g43QRmIlc14d+8JaQDw0N8WeH/yK85/CXDSS
 X5gQK/e9q/yPf/NPyPuPm67jDsFnJERINWaAHI8lhA5fvFyy/xRLmSkuexchysss
 XOGfyxxX590jGLK1vD+5wccX7ZwfwU4jriTaxyah/VBl8QUur/xSPVyspHIdWiMG
 jrNXI1dg6oI861BdjryUpZI0iYJaRe5FRWUx7uTIqHfIyL/MnvYI7USVYOOPb72M
 yZgN903R++5NeUUVTzfXwaigTwfXAPB6USFqZpEfRAf204pgNybmznJWThAVBdYG
 rUixp7GsEMU3aAT2tE/iHR33JQxQfnZq8Tg43/4gB7MoACrzQrYrGcPnj9xssMyV
 F1hnao3dr+5Xjo3MwfkW9JvLPwvDuE3mdrdj+a0XZ45gbTJeuBhYxo3VOsFeijhQ
 gf/VYuoNn5iae9fiMzx5rlmFT9NJDYKDhla+BpAel84/6nRryyfCZCaE5FvDynOO
 CNQynaeJMIMEygPBYR9FVVCwm+EtVsz3NVFKEuo5ilQpgX8ipctxiqy2+moZALLN
 OWlEH6BKEgXqkw==
 =PsA8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the memtest= kernel command-line argument.

 - Support for building the kernel with FORTIFY_SOURCE.

 - Support for generic clockevent broadcasts.

 - Support for the buildtar build target.

 - Some build system cleanups to pass more LLVM-friendly arguments.

 - Support for kprobes.

 - A rearranged kernel memory map, the first part of supporting sv48
   systems.

 - Improvements to kexec, along with support for kdump and crash
   kernels.

 - An alternatives-based errata framework, along with support for
   handling a pair of errata that manifest on some SiFive designs
   (including the HiFive Unmatched).

 - Support for XIP.

 - A device tree for the Microchip PolarFire ICICLE SoC and associated
   dev board.

... along with a bunch of cleanups.  There are already a handful of fixes
on the list so there will likely be a part 2.

* tag 'riscv-for-linus-5.13-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (45 commits)
  RISC-V: Always define XIP_FIXUP
  riscv: Remove 32b kernel mapping from page table dump
  riscv: Fix 32b kernel build with CONFIG_DEBUG_VIRTUAL=y
  RISC-V: Fix error code returned by riscv_hartid_to_cpuid()
  RISC-V: Enable Microchip PolarFire ICICLE SoC
  RISC-V: Initial DTS for Microchip ICICLE board
  dt-bindings: riscv: microchip: Add YAML documentation for the PolarFire SoC
  RISC-V: Add Microchip PolarFire SoC kconfig option
  RISC-V: enable XIP
  RISC-V: Add crash kernel support
  RISC-V: Add kdump support
  RISC-V: Improve init_resources()
  RISC-V: Add kexec support
  RISC-V: Add EM_RISCV to kexec UAPI header
  riscv: vdso: fix and clean-up Makefile
  riscv/mm: Use BUG_ON instead of if condition followed by BUG.
  riscv/kprobe: fix kernel panic when invoking sys_read traced by kprobe
  riscv: Set ARCH_HAS_STRICT_MODULE_RWX if MMU
  riscv: module: Create module allocations without exec permissions
  riscv: bpf: Avoid breaking W^X
  ...
2021-05-06 09:24:18 -07:00