OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Nicholas Piggin	3db8aa10de	powerpc/64e/interrupt: NMI save irq soft-mask state in C 64e non-maskable interrupts save the state of the irq soft-mask in asm. This can be done in C in interrupt wrappers as 64s does. I haven't been able to test this with qemu because it doesn't seem to cause FSL bookE WDT interrupts. This makes WatchdogException an NMI interrupt, which affects 32-bit as well (okay, or create a new handler?) Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-6-npiggin@gmail.com	2021-04-14 23:04:20 +10:00
Nicholas Piggin	0c2472de23	powerpc/64e/interrupt: use new interrupt return Update the new C and asm interrupt return code to account for 64e specifics, switch over to use it. The now-unused old ret_from_except code, that was moved to 64e after the 64s conversion, is removed. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-5-npiggin@gmail.com	2021-04-14 23:04:20 +10:00
Nicholas Piggin	dc6231821a	powerpc/interrupt: update common interrupt code for This makes adjustments to 64-bit asm and common C interrupt return code to be usable by the 64e subarchitecture. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-4-npiggin@gmail.com	2021-04-14 23:04:20 +10:00
Nicholas Piggin	4228b2c3d2	powerpc/64e/interrupt: always save nvgprs on interrupt In order to use the C interrupt return, nvgprs must always be saved. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-3-npiggin@gmail.com	2021-04-14 23:04:19 +10:00
Nicholas Piggin	5a5a893c4a	powerpc/syscall: switch user_exit_irqoff and trace_hardirqs_off order user_exit_irqoff() -> __context_tracking_exit -> vtime_user_exit warns in __seqprop_assert due to lockdep thinking preemption is enabled because trace_hardirqs_off() has not yet been called. Switch the order of these two calls, which matches their ordering in interrupt_enter_prepare. Fixes: `5f0b6ac390` ("powerpc/64/syscall: Reconcile interrupts") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316104206.407354-2-npiggin@gmail.com	2021-04-14 23:04:19 +10:00
Madhavan Srinivasan	2e2a441d2c	powerpc/perf: Infrastructure to support checking of attr.config* Introduce code to support the checking of attr.config* for values which are reserved for a given platform. Performance Monitoring Unit (PMU) configuration registers have fields that are reserved and some specific values for bit fields are reserved. For ex., MMCRA[61:62] is Random Sampling Mode (SM) and value of 0b11 for this field is reserved. Writing non-zero or invalid values in these fields will have unknown behaviours. Patch adds a generic call-back function "check_attr_config" in "struct power_pmu", to be called in event_init to check for attr.config* values for a given platform. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408074504.248211-1-maddy@linux.ibm.com	2021-04-14 23:04:19 +10:00
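The callback approach described in this commit can be sketched in plain C as follows. This is a minimal user-space illustration, not the kernel code: `struct pmu_sketch`, `sketch_check_attr_config()`, `sketch_event_init()` and the `SM_*` macros are hypothetical names, and the placement of the 2-bit sampling-mode field at IBM bits 61:62 of the checked value (shift 1 in a 64-bit word) is an assumption made only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct power_pmu. */
struct pmu_sketch {
	/* Optional hook: returns 0 if config is acceptable, -EINVAL if not. */
	int (*check_attr_config)(uint64_t config);
};

/* Assumed field layout: IBM bits 61:62 of a 64-bit value, i.e. shift 1. */
#define SM_SHIFT    1
#define SM_MASK     0x3ULL
#define SM_RESERVED 0x3ULL /* 0b11 is the reserved encoding */

/* Platform-specific check: reject the reserved sampling-mode value. */
int sketch_check_attr_config(uint64_t config)
{
	uint64_t sm = (config >> SM_SHIFT) & SM_MASK;

	return sm == SM_RESERVED ? -EINVAL : 0;
}

/* Generic event-init path: call the hook only if the platform set one. */
int sketch_event_init(const struct pmu_sketch *pmu, uint64_t config)
{
	if (pmu->check_attr_config)
		return pmu->check_attr_config(config);
	return 0;
}
```

A platform that has no reserved values simply leaves the hook NULL, so the generic path stays unchanged for it.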
Pu Lehui	59fd366b9b	powerpc/fadump: make symbol 'rtas_fadump_set_regval' static Fix sparse warnings: arch/powerpc/platforms/pseries/rtas-fadump.c:250:6: warning: symbol 'rtas_fadump_set_regval' was not declared. Should it be static? Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408062012.85973-1-pulehui@huawei.com	2021-04-14 23:04:19 +10:00
Christophe Leroy	7e9ab144c1	powerpc/mem: Use kmap_local_page() in flushing functions Flushing functions don't rely on preemption being disabled, so use kmap_local_page() instead of kmap_atomic(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b6a880ea0ec7886b51edbb4979c188be549231c0.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:19 +10:00
Christophe Leroy	6c96020882	powerpc/mem: Inline flush_dcache_page() flush_dcache_page() is only a few lines, so it is worth inlining. ia64, csky, mips, openrisc and riscv have a similar flush_dcache_page() and inline it. On pmac32_defconfig, we get a small size reduction. On ppc64_defconfig, we get a very small size increase. In both cases that's in the noise (less than 0.1%). text data bss dec hex filename 18991155 5934744 1497624 26423523 19330e3 vmlinux64.before 18994829 5936732 1497624 26429185 1934701 vmlinux64.after 9150963 2467502 184548 11803013 b41985 vmlinux32.before 9149689 2467302 184548 11801539 b413c3 vmlinux32.after Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/21c417488b70b7629dae316539fb7bb8bdef4fdd.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:19 +10:00
Christophe Leroy	67b8e6af19	powerpc/mem: Help GCC realise __flush_dcache_icache() flushes single pages 'And' the given page address with PAGE_MASK to help GCC. With the patch: 00000024 <__flush_dcache_icache>: 24: 54 63 00 26 rlwinm r3,r3,0,0,19 28: 39 40 00 40 li r10,64 2c: 7c 69 1b 78 mr r9,r3 30: 7d 49 03 a6 mtctr r10 34: 7c 00 48 6c dcbst 0,r9 38: 39 29 00 20 addi r9,r9,32 3c: 7c 00 48 6c dcbst 0,r9 40: 39 29 00 20 addi r9,r9,32 44: 42 00 ff f0 bdnz 34 <__flush_dcache_icache+0x10> 48: 7c 00 04 ac hwsync 4c: 39 20 00 40 li r9,64 50: 7d 29 03 a6 mtctr r9 54: 7c 00 1f ac icbi 0,r3 58: 38 63 00 20 addi r3,r3,32 5c: 7c 00 1f ac icbi 0,r3 60: 38 63 00 20 addi r3,r3,32 64: 42 00 ff f0 bdnz 54 <__flush_dcache_icache+0x30> 68: 7c 00 04 ac hwsync 6c: 4c 00 01 2c isync 70: 4e 80 00 20 blr Without the patch: 00000024 <__flush_dcache_icache>: 24: 54 6a 00 34 rlwinm r10,r3,0,0,26 28: 39 23 10 1f addi r9,r3,4127 2c: 7d 2a 48 50 subf r9,r10,r9 30: 55 29 d9 7f rlwinm. r9,r9,27,5,31 34: 41 82 00 94 beq c8 <__flush_dcache_icache+0xa4> 38: 71 28 00 01 andi. r8,r9,1 3c: 38 c9 ff ff addi r6,r9,-1 40: 7d 48 53 78 mr r8,r10 44: 7d 27 4b 78 mr r7,r9 48: 40 82 00 6c bne b4 <__flush_dcache_icache+0x90> 4c: 54 e7 f8 7e rlwinm r7,r7,31,1,31 50: 7c e9 03 a6 mtctr r7 54: 7c 00 40 6c dcbst 0,r8 58: 39 08 00 20 addi r8,r8,32 5c: 7c 00 40 6c dcbst 0,r8 60: 39 08 00 20 addi r8,r8,32 64: 42 00 ff f0 bdnz 54 <__flush_dcache_icache+0x30> 68: 7c 00 04 ac hwsync 6c: 71 28 00 01 andi. 
r8,r9,1 70: 39 09 ff ff addi r8,r9,-1 74: 40 82 00 2c bne a0 <__flush_dcache_icache+0x7c> 78: 55 29 f8 7e rlwinm r9,r9,31,1,31 7c: 7d 29 03 a6 mtctr r9 80: 7c 00 57 ac icbi 0,r10 84: 39 4a 00 20 addi r10,r10,32 88: 7c 00 57 ac icbi 0,r10 8c: 39 4a 00 20 addi r10,r10,32 90: 42 00 ff f0 bdnz 80 <__flush_dcache_icache+0x5c> 94: 7c 00 04 ac hwsync 98: 4c 00 01 2c isync 9c: 4e 80 00 20 blr a0: 7c 00 57 ac icbi 0,r10 a4: 2c 08 00 00 cmpwi r8,0 a8: 39 4a 00 20 addi r10,r10,32 ac: 40 82 ff cc bne 78 <__flush_dcache_icache+0x54> b0: 4b ff ff e4 b 94 <__flush_dcache_icache+0x70> b4: 7c 00 50 6c dcbst 0,r10 b8: 2c 06 00 00 cmpwi r6,0 bc: 39 0a 00 20 addi r8,r10,32 c0: 40 82 ff 8c bne 4c <__flush_dcache_icache+0x28> c4: 4b ff ff a4 b 68 <__flush_dcache_icache+0x44> c8: 7c 00 04 ac hwsync cc: 7c 00 04 ac hwsync d0: 4c 00 01 2c isync d4: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/23030822ea5cd0a122948b10226abe56602dc027.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
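The effect of masking the address can be illustrated outside the kernel. The sketch below assumes 4 KiB pages and 32-byte cache lines (consistent with the disassembly above, where the loop counter is 64 and each iteration handles two lines); all `SK_*` macros and `sk_*` functions are hypothetical names. Without the mask, the number of cache lines depends on the alignment of the incoming address, so the compiler has to compute it at run time. With the mask, it is always the same constant.

```c
#include <assert.h>

#define SK_PAGE_SHIFT     12
#define SK_PAGE_SIZE      (1UL << SK_PAGE_SHIFT)
#define SK_PAGE_MASK      (~(SK_PAGE_SIZE - 1))
#define SK_L1_CACHE_BYTES 32UL

/* Number of cache lines touched by the byte range [start, stop). */
unsigned long sk_lines_to_flush(unsigned long start, unsigned long stop)
{
	unsigned long first = start & ~(SK_L1_CACHE_BYTES - 1);

	return (stop - first + SK_L1_CACHE_BYTES - 1) / SK_L1_CACHE_BYTES;
}

/* Without the mask: the count varies with the alignment of addr. */
unsigned long sk_lines_unmasked(unsigned long addr)
{
	return sk_lines_to_flush(addr, addr + SK_PAGE_SIZE);
}

/* With the mask: addr is page aligned, so the count is always 128. */
unsigned long sk_lines_masked(unsigned long addr)
{
	addr &= SK_PAGE_MASK;
	return sk_lines_to_flush(addr, addr + SK_PAGE_SIZE);
}
```

For an address that is not cache-line aligned, such as 0x1010, the unmasked variant covers 129 lines while the masked one covers 128.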
Christophe Leroy	52d490437f	powerpc/mem: flush_dcache_icache_phys() is for HIGHMEM pages only __flush_dcache_icache() is usable for non HIGHMEM pages on every platform. It is only for HIGHMEM pages that BOOKE needs kmap() and BOOK3S needs flush_dcache_icache_phys(). So make flush_dcache_icache_phys() dependent on CONFIG_HIGHMEM and call it only when it is a HIGHMEM page. We could make flush_dcache_icache_phys() available at all times, but as it is declared NOKPROBE_SYMBOL(), GCC doesn't optimise it out when it is not used. So define a stub for !CONFIG_HIGHMEM in order to remove the #ifdef in flush_dcache_icache_page() and use IS_ENABLED() instead. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/79ed5d7914f497cd5fcd681ca2f4d50a91719455.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
Christophe Leroy	cd97d9e8b5	powerpc/mem: Optimise flush_dcache_icache_hugepage() flush_dcache_icache_hugepage() is a static function, with only one caller. That caller calls it when PageCompound() is true, so bugging on !PageCompound() is useless if we can trust the compiler a little. Remove the BUG_ON(!PageCompound()). The number of elements of a page won't change over time, but GCC doesn't know about it, so it gets the value at every iteration. To avoid that, call compound_nr() outside the loop and save it in a local variable. Whether the page is a HIGHMEM page or not doesn't change over time. But GCC doesn't know it so it does the test on every iteration. Do the test outside the loop. When the page is not a HIGHMEM page, page_address() will fallback on lowmem_page_address(), so call lowmem_page_address() directly and don't suffer the call to page_address() on every iteration. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ab03712b70105fccfceef095aa03007de9295a40.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
Christophe Leroy	e618c7aea1	powerpc/mem: Call flush_coherent_icache() at higher level flush_coherent_icache() doesn't need the address anymore, so it can be called immediately when entering the public functions and doesn't need to be disseminated among lower level functions. And use page_to_phys() instead of open coding the calculation of phys address to call flush_dcache_icache_phys(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5f063986e325d2efdd404b8f8c5f4bcbd4eb11a6.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
Christophe Leroy	131637a17d	powerpc/mem: Remove address argument to flush_coherent_icache() flush_coherent_icache() can use any valid address as mentioned by the comment. Use PAGE_OFFSET as base address. This allows removing the user access stuff. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/742b6360ae4f344a1c6ecfadcf3b6645f443fa7a.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
Christophe Leroy	bf26e0bbd2	powerpc/mem: Declare __flush_dcache_icache() static __flush_dcache_icache() is only used in mem.c. Move it before the functions that use it and declare it static. And also fix the name of the parameter in the comment. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3fa903eb5a10b2bc7d99a8c559ffdaa05452d8e0.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:18 +10:00
Christophe Leroy	b26e8f2725	powerpc/mem: Move cache flushing functions into mm/cacheflush.c Cache flushing functions are in the middle of completely unrelated stuff in mm/mem.c Create a dedicated mm/cacheflush.c for those functions. Also cleanup the list of included headers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7bf6f1600acad146e541a4e220940062f2e5b03d.1617895813.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:17 +10:00
Bixuan Cui	ff0b4155ae	powerpc/powernv: make symbol 'mpipl_kobj' static The sparse tool complains as follows: arch/powerpc/platforms/powernv/opal-core.c:74:16: warning: symbol 'mpipl_kobj' was not declared. This symbol is not used outside of opal-core.c, so mark it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409063855.57347-1-cuibixuan@huawei.com	2021-04-14 23:04:17 +10:00
Pu Lehui	f234ad405a	powerpc/xmon: Make symbol 'spu_inst_dump' static Fix sparse warning: arch/powerpc/xmon/xmon.c:4216:1: warning: symbol 'spu_inst_dump' was not declared. Should it be static? This symbol is not used outside of xmon.c, so make it static. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409070151.163424-1-pulehui@huawei.com	2021-04-14 23:04:17 +10:00
Bixuan Cui	cc331eee03	powerpc/perf/hv-24x7: Make some symbols static The sparse tool complains as follows: arch/powerpc/perf/hv-24x7.c:229:1: warning: symbol '__pcpu_scope_hv_24x7_txn_flags' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:230:1: warning: symbol '__pcpu_scope_hv_24x7_txn_err' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:236:1: warning: symbol '__pcpu_scope_hv_24x7_hw' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:244:1: warning: symbol '__pcpu_scope_hv_24x7_reqb' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:245:1: warning: symbol '__pcpu_scope_hv_24x7_resb' was not declared. Should it be static? These symbols are not used outside of hv-24x7.c, so this commit marks them static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090124.59492-1-cuibixuan@huawei.com	2021-04-14 23:04:17 +10:00
Bixuan Cui	107dadb046	powerpc/perf: Make symbol 'isa207_pmu_format_attr' static The sparse tool complains as follows: arch/powerpc/perf/isa207-common.c:24:18: warning: symbol 'isa207_pmu_format_attr' was not declared. Should it be static? This symbol is not used outside of isa207-common.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090119.59444-1-cuibixuan@huawei.com	2021-04-14 23:04:17 +10:00
Bixuan Cui	2235dea17d	powerpc/pseries/pmem: Make symbol 'drc_pmem_match' static The sparse tool complains as follows: arch/powerpc/platforms/pseries/pmem.c:142:27: warning: symbol 'drc_pmem_match' was not declared. Should it be static? This symbol is not used outside of pmem.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090114.59396-1-cuibixuan@huawei.com	2021-04-14 23:04:17 +10:00
Bixuan Cui	193e4cd8ed	powerpc/pseries: Make symbol '__pcpu_scope_hcall_stats' static The sparse tool complains as follows: arch/powerpc/platforms/pseries/hvCall_inst.c:29:1: warning: symbol '__pcpu_scope_hcall_stats' was not declared. Should it be static? This symbol is not used outside of hvCall_inst.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090109.59347-1-cuibixuan@huawei.com	2021-04-14 23:04:17 +10:00
Leonardo Bras	472724111f	powerpc/iommu: Enable remaining IOMMU Pagesizes present in LoPAR According to LoPAR, the ibm,query-pe-dma-window output named "IO Page Sizes" will let the OS know all possible pagesizes that can be used for creating a new DDW. Currently Linux will only try using 3 of the 8 available options: 4K, 64K and 16M. According to LoPAR, the hypervisor may also offer 32M, 64M, 128M, 256M and 16G. Enabling bigger pages would be interesting for direct mapping systems with a lot of RAM, while using fewer TCE entries. Signed-off-by: Leonardo Bras <leobras.c@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408201915.174217-1-leobras.c@gmail.com	2021-04-14 23:04:16 +10:00
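One way to pick the largest offered page size from such a bitmask is sketched below. The function name `sk_iommu_get_page_shift()` is hypothetical, and the assignment of mask bits 0 to 7 to the sizes 4K, 64K, 16M, 32M, 64M, 128M, 256M and 16G (in that order) is an assumption for illustration. The commit message lists the sizes but not their bit positions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return log2 of the largest page size advertised in query_page_size,
 * or 0 if no known size is offered. Bit i of the mask is assumed to
 * correspond to shift[i].
 */
int sk_iommu_get_page_shift(uint32_t query_page_size)
{
	/* 4K, 64K, 16M, 32M, 64M, 128M, 256M, 16G */
	static const int shift[] = { 12, 16, 24, 25, 26, 27, 28, 34 };
	int i;

	/* Walk from the largest size down so the biggest match wins. */
	for (i = (int)(sizeof(shift) / sizeof(shift[0])) - 1; i >= 0; i--) {
		if (query_page_size & (1u << i))
			return shift[i];
	}
	return 0;
}
```

With a mask of 0x07 (only the three sizes Linux tried before this change) the result is 24, i.e. 16M. With bit 7 set the result is 34, i.e. 16G.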
Masahiro Yamada	672bff581e	powerpc/syscalls: switch to generic syscallhdr.sh Many architectures duplicate similar shell scripts. This commit converts powerpc to use scripts/syscallhdr.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210301153019.362742-2-masahiroy@kernel.org	2021-04-14 23:04:16 +10:00
Masahiro Yamada	14b3c9d24a	powerpc/syscalls: switch to generic syscalltbl.sh Many architectures duplicate similar shell scripts. This commit converts powerpc to use scripts/syscalltbl.sh. This also unifies syscall_table_32.h and syscall_table_c32.h. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210301153019.362742-1-masahiroy@kernel.org	2021-04-14 23:04:16 +10:00
Nathan Lynch	e5d5676352	powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE RTAS_RMOBUF_MAX doesn't actually describe a "maximum" value in any sense. It represents the size of an area of memory set aside for user space to use as work areas for certain RTAS calls. Rename it to RTAS_USER_REGION_SIZE. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-6-nathanl@linux.ibm.com	2021-04-14 23:04:16 +10:00
Nathan Lynch	0649cdc823	powerpc/rtas: move syscall filter setup into separate function Reduce conditionally compiled sections within rtas_initialize() by moving the filter table initialization into its own function already guarded by CONFIG_PPC_RTAS_FILTER. No behavior change intended. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-5-nathanl@linux.ibm.com	2021-04-14 23:04:16 +10:00
Nathan Lynch	0ab1c929ae	powerpc/rtas: remove ibm_suspend_me_token There's not a compelling reason to cache the value of the token for the ibm,suspend-me function. Just look it up when needed in the RTAS syscall's special case for it. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-4-nathanl@linux.ibm.com	2021-04-14 23:04:16 +10:00
Nathan Lynch	01c1b9984a	powerpc/rtas-proc: remove unused RMO_READ_BUF_MAX This constant is unused. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-3-nathanl@linux.ibm.com	2021-04-14 23:04:16 +10:00
Nathan Lynch	c13ff6f325	powerpc/rtas: improve ppc_rtas_rmo_buf_show documentation Add kerneldoc for ppc_rtas_rmo_buf_show(), the callback for /proc/powerpc/rtas/rmo_buffer, explaining its expected use. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408140630.205502-2-nathanl@linux.ibm.com	2021-04-14 23:04:15 +10:00
Mahesh Salgaonkar	5ae5bc12d0	powerpc/eeh: Fix EEH handling for hugepages in ioremap space. During the EEH MMIO error checking, the current implementation fails to map the (virtual) MMIO address back to the PCI device on radix with hugepage mappings for I/O. This results in a failure to dispatch the EEH event, with no recovery, even when EEH capability has been enabled on the device. eeh_check_failure(token) # token = virtual MMIO address addr = eeh_token_to_phys(token); edev = eeh_addr_cache_get_dev(addr); if (!edev) return 0; eeh_dev_check_failure(edev); <= Dispatch the EEH event In case of hugepage mappings, eeh_token_to_phys() has a bug in virt -> phys translation that results in a wrong physical address, which is then passed to eeh_addr_cache_get_dev() to match it against cached PCI I/O address ranges to get to a PCI device. Hence, it fails to find a match and the EEH event never gets dispatched, leaving the device in a failed state. The commit `3343962068` ("powerpc/eeh: Handle hugepages in ioremap space") introduced the following logic to translate virt to phys for hugepage mappings: eeh_token_to_phys(): + pa = pte_pfn(*ptep); + + /* On radix we can do hugepage mappings for io, so handle that */ + if (hugepage_shift) { + pa <<= hugepage_shift; <= This is wrong + pa |= token & ((1ul << hugepage_shift) - 1); + } This patch fixes the virt -> phys translation in the eeh_token_to_phys() function.
$ cat /sys/kernel/debug/powerpc/eeh_address_cache mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1 mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1 mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0 mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0 mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1 mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0 mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1 mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0 Above is the list of cached io address ranges of pci 0030:01:00.<fn>. Before this patch: Tracing 'arg1' of function eeh_addr_cache_get_dev() during error injection clearly shows that 'addr=' contains wrong physical address: kworker/u16:0-7 [001] .... 108.883775: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510 dmesg shows no EEH recovery messages: [ 108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff) [ 108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295) [ 108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff resource_bit 0x1 [ 108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout [ 108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout <..> After this patch: eeh_addr_cache_get_dev() trace shows correct physical address: <idle>-0 [001] ..s. 
1043.123828: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8 dmesg logs show EEH recovery getting triggered: [ 964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff) [ 964.323991] EEH: Recovering PHB#30-PE#10000 [ 964.324002] EEH: PE location: N/A, PHB location: N/A [ 964.324006] EEH: Frozen PHB#30-PE#10000 detected <..> Fixes: `3343962068` ("powerpc/eeh: Handle hugepages in ioremap space") Cc: stable@vger.kernel.org # v5.3+ Reported-by: Dominic DeMarco <ddemarc@us.ibm.com> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/161821396263.48361.2796709239866588652.stgit@jupiter	2021-04-14 23:04:15 +10:00
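The translation problem described in this commit can be reproduced in a small user-space sketch. The commit message quotes only the faulty logic, so the corrected variant below is one plausible reading rather than the actual patch: a PFN always counts base pages, so it is shifted by the base page shift, and only the offset mask depends on the mapping size. The names `sk_token_to_phys_buggy()`, `sk_token_to_phys_fixed()` and `SK_PAGE_SHIFT` are hypothetical, 64K base pages are assumed, and the PFN and token values in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 16 /* assumed 64K base pages */

/* Faulty logic as quoted above: shifts the PFN by the hugepage shift. */
uint64_t sk_token_to_phys_buggy(uint64_t pfn, uint64_t token,
				unsigned int hugepage_shift)
{
	uint64_t pa = pfn;

	if (hugepage_shift) {
		pa <<= hugepage_shift; /* wrong: a PFN counts base pages */
		pa |= token & ((1ULL << hugepage_shift) - 1);
	} else {
		pa <<= SK_PAGE_SHIFT;
		pa |= token & ((1ULL << SK_PAGE_SHIFT) - 1);
	}
	return pa;
}

/* Sketch of a correction: the PFN shift is always SK_PAGE_SHIFT. */
uint64_t sk_token_to_phys_fixed(uint64_t pfn, uint64_t token,
				unsigned int hugepage_shift)
{
	unsigned int map_shift = hugepage_shift ? hugepage_shift : SK_PAGE_SHIFT;
	uint64_t pa = pfn << SK_PAGE_SHIFT;

	/* The offset within the mapping comes from the virtual address. */
	pa |= token & ((1ULL << map_shift) - 1);
	return pa;
}
```

For a 2M mapping (shift 21) with PFN 0x40080a0 and a token whose low 21 bits are 0x1c7cd8, the fixed variant yields 0x40080bc7cd8, while the buggy one yields a much larger address that would not fall inside any cached range.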
Cédric Le Goater	fd6db2892e	powerpc/xive: Modernize XIVE-IPI domain with an 'alloc' handler Instead of calling irq_create_mapping() to map the IPI for a node, introduce an 'alloc' handler. This is usually an extension to support hierarchy irq_domains which is not exactly the case for XIVE-IPI domain. However, we can now use the irq_domain_alloc_irqs() routine which allocates the IRQ descriptor on the specified node, even better for cache performance on multi node machines. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-10-clg@kaod.org	2021-04-14 23:04:15 +10:00
Cédric Le Goater	7dcc37b3ef	powerpc/xive: Map one IPI interrupt per node ipistorm [*] can be used to benchmark the raw interrupt rate of an interrupt controller by measuring the number of IPIs a system can sustain. When applied to the XIVE interrupt controller of POWER9 and POWER10 systems, a significant drop of the interrupt rate can be observed when crossing the second node boundary. This is due to the fact that a single IPI interrupt is used for all CPUs of the system. The structure is shared and the cache line updates impact greatly the traffic between nodes and the overall IPI performance. As a workaround, the impact can be reduced by deactivating the IRQ lockup detector ("noirqdebug") which does a lot of accounting in the Linux IRQ descriptor structure and is responsible for most of the performance penalty. As a fix, this proposal allocates an IPI interrupt per node, to be shared by all CPUs of that node. It solves the scaling issue, the IRQ lockup detector still has an impact but the XIVE interrupt rate scales linearly. It also improves the "noirqdebug" case as shown in the tables below.
* P9 DD2.2 - 2s * 64 threads

                                         "noirqdebug"
                      Mint/s                Mint/s
chips  cpus       IPI/sys   IPI/chip   IPI/chip    IPI/sys
--------------------------------------------------------------
1      0-15      4.984023   4.875405   4.996536   5.048892
       0-31     10.879164  10.544040  10.757632  11.037859
       0-47     15.345301  14.688764  14.926520  15.310053
       0-63     17.064907  17.066812  17.613416  17.874511
2      0-79     11.768764  21.650749  22.689120  22.566508
       0-95     10.616812  26.878789  28.434703  28.320324
       0-111    10.151693  31.397803  31.771773  32.388122
       0-127     9.948502  33.139336  34.875716  35.224548

* P10 DD1 - 4s (not homogeneous) 352 threads

                                         "noirqdebug"
                      Mint/s                Mint/s
chips  cpus       IPI/sys   IPI/chip   IPI/chip    IPI/sys
--------------------------------------------------------------
1      0-15      2.409402   2.364108   2.383303   2.395091
       0-31      6.028325   6.046075   6.089999   6.073750
       0-47      8.655178   8.644531   8.712830   8.724702
       0-63     11.629652  11.735953  12.088203  12.055979
       0-79     14.392321  14.729959  14.986701  14.973073
       0-95     12.604158  13.004034  17.528748  17.568095
2      0-111     9.767753  13.719831  19.968606  20.024218
       0-127     6.744566  16.418854  22.898066  22.995110
       0-143     6.005699  19.174421  25.425622  25.417541
       0-159     5.649719  21.938836  27.952662  28.059603
       0-175     5.441410  24.109484  31.133915  31.127996
3      0-191     5.318341  24.405322  33.999221  33.775354
       0-207     5.191382  26.449769  36.050161  35.867307
       0-223     5.102790  29.356943  39.544135  39.508169
       0-239     5.035295  31.933051  42.135075  42.071975
       0-255     4.969209  34.477367  44.655395  44.757074
4      0-271     4.907652  35.887016  47.080545  47.318537
       0-287     4.839581  38.076137  50.464307  50.636219
       0-303     4.786031  40.881319  53.478684  53.310759
       0-319     4.743750  43.448424  56.388102  55.973969
       0-335     4.709936  45.623532  59.400930  58.926857
       0-351     4.681413  45.646151  62.035804  61.830057

[*] https://github.com/antonblanchard/ipistorm Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-9-clg@kaod.org	2021-04-14 23:04:15 +10:00
Cédric Le Goater	33e4bc5946	powerpc/xive: Fix xmon command "dxi" When under xmon, the "dxi" command dumps the state of the XIVE interrupts. If an interrupt number is specified, only the state of the associated XIVE interrupt is dumped. This form of the command lacks an irq_data parameter which is nevertheless used by xmon_xive_get_irq_config(), leading to an xmon crash. Fix that by doing a lookup in the system IRQ mapping to query the IRQ descriptor data. Invalid interrupt numbers, or not belonging to the XIVE IRQ domain, OPAL event interrupt number for instance, should be caught by the previous query done at the firmware level. Fixes: `97ef275077` ("powerpc/xive: Fix xmon support on the PowerNV platform") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-8-clg@kaod.org	2021-04-14 23:04:15 +10:00
Cédric Le Goater	6bf66eb8f4	powerpc/xive: Simplify the dump of XIVE interrupts under xmon Move the xmon routine under XIVE subsystem and rework the loop on the interrupts taking into account the xive_irq_domain to filter out IPIs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-7-clg@kaod.org	2021-04-14 23:04:14 +10:00
Cédric Le Goater	a74ce5926b	powerpc/xive: Drop check on irq_data in xive_core_debug_show() When looping on IRQ descriptor, irq_data is always valid. Fixes: `930914b7d5` ("powerpc/xive: Add a debugfs file to dump internal XIVE state") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-6-clg@kaod.org	2021-04-14 23:04:14 +10:00
Cédric Le Goater	5159d98728	powerpc/xive: Simplify xive_core_debug_show() Now that the IPI interrupt has its own domain, the checks on the HW interrupt number XIVE_IPI_HW_IRQ and on the chip can be replaced by a check on the domain. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-5-clg@kaod.org	2021-04-14 23:04:14 +10:00
Cédric Le Goater	1835e72942	powerpc/xive: Remove useless check on XIVE_IPI_HW_IRQ The IPI interrupt has its own domain now. Testing the HW interrupt number is not needed anymore. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-4-clg@kaod.org	2021-04-14 23:04:14 +10:00
Cédric Le Goater	7d34849413	powerpc/xive: Introduce an IPI interrupt domain The IPI interrupt is a special case of the XIVE IRQ domain. When mapping and unmapping the interrupts in the Linux interrupt number space, the HW interrupt number 0 (XIVE_IPI_HW_IRQ) is checked to distinguish the IPI interrupt from other interrupts of the system. Simplify the XIVE interrupt domain by introducing a specific domain for the IPI. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331144514.892250-3-clg@kaod.org	2021-04-14 23:04:14 +10:00
Yu Kuai	078277acbd	powerpc/smp: Make some symbols static The sparse tool complains as follows: arch/powerpc/kernel/smp.c:86:1: warning: symbol '__pcpu_scope_cpu_coregroup_map' was not declared. Should it be static? arch/powerpc/kernel/smp.c:125:1: warning: symbol '__pcpu_scope_thread_group_l1_cache_map' was not declared. Should it be static? arch/powerpc/kernel/smp.c:132:1: warning: symbol '__pcpu_scope_thread_group_l2_cache_map' was not declared. Should it be static? These symbols are not used outside of smp.c, so this commit marks them static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210407125903.4139663-1-yukuai3@huawei.com	2021-04-14 23:04:14 +10:00
Li Huafei	f6f1f48e8b	powerpc/mce: Make symbol 'mce_ue_event_work' static The sparse tool complains as follows: arch/powerpc/kernel/mce.c:43:1: warning: symbol 'mce_ue_event_work' was not declared. Should it be static? This symbol is not used outside of mce.c, so this commit marks it static. Signed-off-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408035802.31853-1-lihuafei1@huawei.com	2021-04-14 23:04:13 +10:00
Li Huafei	7f262b4dcf	powerpc/security: Make symbol 'stf_barrier' static The sparse tool complains as follows: arch/powerpc/kernel/security.c:253:6: warning: symbol 'stf_barrier' was not declared. Should it be static? This symbol is not used outside of security.c, so this commit marks it static. Signed-off-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408033951.28369-1-lihuafei1@huawei.com	2021-04-14 23:04:13 +10:00
Christophe Leroy	80edc68e04	powerpc/32s: Define a MODULE area below kernel text all the time On book3s/32, the segment below kernel text is used for module allocation when CONFIG_STRICT_KERNEL_RWX is defined. In order to benefit from the powerpc-specific module_alloc() function, which allocates modules within 32 Mbytes of the end of kernel text, use that segment below PAGE_OFFSET at all times. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a46dcdd39a9e80b012d86c294c4e5cd8d31665f3.1617283827.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:13 +10:00
Christophe Leroy	9132a2e82a	powerpc/8xx: Define a MODULE area below kernel text On the 8xx, TASK_SIZE is 0x80000000. The space between TASK_SIZE and PAGE_OFFSET is not used. In order to benefit from the powerpc-specific module_alloc() function, which allocates modules within 32 Mbytes of the end of kernel text, define MODULES_VADDR and MODULES_END. Set a 256 Mbyte area just below PAGE_OFFSET, like book3s/32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a225606d5b3a8bc53fe612ad52c855c60b0a0a58.1617283827.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:13 +10:00
Christophe Leroy	2ec13df167	powerpc/modules: Load modules closer to kernel text On book3s/32, when STRICT_KERNEL_RWX is selected, modules are allocated in the segment just before kernel text, i.e. in the 0xb0000000-0xbfffffff range when PAGE_OFFSET is 0xc0000000. On the 8xx, TASK_SIZE is 0x80000000. The space between TASK_SIZE and PAGE_OFFSET is not used and could be used for modules. The idea comes from the ARM architecture. Having modules just below PAGE_OFFSET offers an opportunity to minimise the distance between kernel text and modules and avoid trampolines in modules to access kernel functions or other module functions. When MODULES_VADDR is defined, powerpc has its own module_alloc() function. In that function, first try to allocate the module above the limit defined by '_etext - 32M'. Then, if the allocation fails, fall back to the entire MODULES area. DEBUG logs in module_32.c without the patch: [ 1572.588822] module_32: Applying ADD relocate section 13 to 12 [ 1572.588891] module_32: Doing plt for call to 0xc00671a4 at 0xcae04024 [ 1572.588964] module_32: Initialized plt for 0xc00671a4 at cae04000 [ 1572.589037] module_32: REL24 value = CAE04000. location = CAE04024 [ 1572.589110] module_32: Location before: 48000001. [ 1572.589171] module_32: Location after: 4BFFFFDD. [ 1572.589231] module_32: ie. jump to 03FFFFDC+CAE04024 = CEE04000 [ 1572.589317] module_32: Applying ADD relocate section 15 to 14 [ 1572.589386] module_32: Doing plt for call to 0xc00671a4 at 0xcadfc018 [ 1572.589457] module_32: Initialized plt for 0xc00671a4 at cadfc000 [ 1572.589529] module_32: REL24 value = CADFC000. location = CADFC018 [ 1572.589601] module_32: Location before: 48000000. [ 1572.589661] module_32: Location after: 4BFFFFE8. [ 1572.589723] module_32: ie. jump to 03FFFFE8+CADFC018 = CEDFC000 With the patch: [ 279.404671] module_32: Applying ADD relocate section 13 to 12 [ 279.404741] module_32: REL24 value = C00671B4. location = BF808024 [ 279.404814] module_32: Location before: 48000001. [ 279.404874] module_32: Location after: 4885F191. [ 279.404933] module_32: ie. jump to 0085F190+BF808024 = C00671B4 [ 279.405016] module_32: Applying ADD relocate section 15 to 14 [ 279.405085] module_32: REL24 value = C00671B4. location = BF800018 [ 279.405156] module_32: Location before: 48000000. [ 279.405215] module_32: Location after: 4886719C. [ 279.405275] module_32: ie. jump to 0086719C+BF800018 = C00671B4 We see that with the patch, no plt entries are set. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0c3d5cb8a4dfdf6ca1b8aeb385c01470d6628d55.1617283827.git.christophe.leroy@csgroup.eu	2021-04-14 23:04:13 +10:00
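The REL24 arithmetic in the DEBUG log above can be reproduced with a small userspace sketch. This is an illustration only, not the code in module_32.c: the function names `rel24_in_range` and `rel24_encode` are hypothetical, and the sketch assumes the standard PowerPC I-form branch layout (a 24-bit word offset in the bits covered by mask 0x03fffffc, giving a reach of 32 Mbytes in either direction).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of an I-form branch instruction that hold the relative offset. */
#define REL24_MASK 0x03fffffcu

/*
 * Hypothetical helper: true when `target` can be reached from `location`
 * with a single relative branch, i.e. the byte offset fits in the signed
 * 26-bit range [-32 MiB, +32 MiB - 4]. Out-of-range calls need a PLT entry.
 */
bool rel24_in_range(uint32_t location, uint32_t target)
{
	int64_t offset = (int64_t)target - (int64_t)location;

	return offset >= -0x02000000LL && offset <= 0x01fffffcLL;
}

/*
 * Hypothetical helper: patch the offset field of `insn` so that a branch
 * located at `location` jumps to `target`. Opcode, AA and LK bits are kept.
 */
uint32_t rel24_encode(uint32_t insn, uint32_t location, uint32_t target)
{
	uint32_t offset = target - location; /* wraps modulo 2^32 */

	return (insn & ~REL24_MASK) | (offset & REL24_MASK);
}
```

With the logged values, `rel24_encode(0x48000001, 0xBF808024, 0xC00671B4)` gives 0x4885F191, matching "Location after" in the log with the patch. Without the patch, the module at 0xCAE04024 is more than 32 Mbytes away from 0xc00671a4, so `rel24_in_range` is false and the branch has to go through a PLT entry instead.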
Vaibhav Jain	a5d6a3e73a	powerpc/mm: Add cond_resched() while removing hpte mappings When removing a large number of mappings from the hash page tables on large-memory systems, a soft lockup is reported because of the time spent inside htab_remove_mapping(), like the one below: watchdog: BUG: soft lockup - CPU#8 stuck for 23s! <snip> NIP plpar_hcall+0x38/0x58 LR pSeries_lpar_hpte_invalidate+0x68/0xb0 Call Trace: 0x1fffffffffff000 (unreliable) pSeries_lpar_hpte_removebolted+0x9c/0x230 hash__remove_section_mapping+0xec/0x1c0 remove_section_mapping+0x28/0x3c arch_remove_memory+0xfc/0x150 devm_memremap_pages_release+0x180/0x2f0 devm_action_release+0x30/0x50 release_nodes+0x28c/0x300 device_release_driver_internal+0x16c/0x280 unbind_store+0x124/0x170 drv_attr_store+0x44/0x60 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 __vfs_write+0x3c/0x70 vfs_write+0xd4/0x270 ksys_write+0xdc/0x130 system_call+0x5c/0x70 Fix this by adding a cond_resched() to the loop in htab_remove_mapping() that issues the hcall to remove the hpte mapping. The call to cond_resched() is issued every HZ jiffies, which should prevent the soft lockup from being reported. Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210404163148.321346-1-vaibhav@linux.ibm.com	2021-04-14 23:04:12 +10:00
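The "yield once per interval inside a long loop" pattern described in this commit can be sketched in userspace as follows. This is a simulation under stated assumptions, not the kernel change itself: `struct sim`, `remove_one`, `yield_cpu` and `remove_range` are hypothetical names, each removal is assumed to cost one tick, and the plain `>=` comparison stands in for the wraparound-safe jiffies comparison that kernel code would use.

```c
/* Simulated clock and a counter of how often the loop gave up the CPU. */
struct sim {
	unsigned long ticks;
	unsigned long yields;
};

/* Stand-in for one hcall that removes a mapping; assumed to cost one tick. */
void remove_one(struct sim *s)
{
	s->ticks += 1;
}

/* Stand-in for cond_resched(): here it only records that a yield happened. */
void yield_cpu(struct sim *s)
{
	s->yields += 1;
}

/*
 * Remove `count` mappings, yielding at most once per `interval` ticks.
 * Returns the number of times the loop yielded.
 */
unsigned long remove_range(unsigned long count, unsigned long interval)
{
	struct sim s = { 0, 0 };
	unsigned long next_yield = s.ticks + interval;
	unsigned long i;

	for (i = 0; i < count; i++) {
		remove_one(&s);
		if (s.ticks >= next_yield) {
			yield_cpu(&s);
			next_yield = s.ticks + interval;
		}
	}
	return s.yields;
}
```

Rate-limiting the yield by elapsed time rather than by iteration count keeps the overhead small when each removal is fast, while still bounding how long the loop can hold the CPU when removals are slow.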
Shivaprasad G Bhat	75b7c05ebf	powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall Add support for ND_REGION_ASYNC capability if the device tree indicates 'ibm,hcall-flush-required' property in the NVDIMM node. Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor. If the flush request failed, the hypervisor is expected to reflect the problem in the subsequent nvdimm H_SCM_HEALTH call. This patch prevents mmap of namespaces with MAP_SYNC flag if the nvdimm requires an explicit flush[1]. References: [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Use unsigned long / long instead of uint64_t/int64_t] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/161703936121.36.7260632399582101498.stgit@e1fbed493c87	2021-04-14 23:04:07 +10:00
Michael Walle	83216e3988	of: net: pass the dst buffer to of_get_mac_address() of_get_mac_address() returns a "const void" pointer to a MAC address. Lately, support to fetch the MAC address by an NVMEM provider was added. But this will only work with platform devices. It will not work with PCI devices (e.g. of an integrated root complex) and esp. not with DSA ports. There is an of_ variant of the nvmem binding which works without devices. The returned data of a nvmem_cell_read() has to be freed after use. On the other hand the return of_get_mac_address() points to some static data without a lifetime. The trick for now, was to allocate a device resource managed buffer which is then returned. This will only work if we have an actual device. Change it, so that the caller of of_get_mac_address() has to supply a buffer where the MAC address is written to. Unfortunately, this will touch all drivers which use the of_get_mac_address(). Usually the code looks like: const char *addr; addr = of_get_mac_address(np); if (!IS_ERR(addr)) ether_addr_copy(ndev->dev_addr, addr); This can then be simply rewritten as: of_get_mac_address(np, ndev->dev_addr); Sometimes is_valid_ether_addr() is used to test the MAC address. of_get_mac_address() already makes sure, it just returns a valid MAC address. Thus we can just test its return code. But we have to be careful if there are still other sources for the MAC address before the of_get_mac_address(). In this case we have to keep the is_valid_ether_addr() call. The following coccinelle patch was used to convert common cases to the new style. Afterwards, I've manually gone over the drivers and fixed the return code variable: either used a new one or if one was already available use that. Mansour Moufid, thanks for that coccinelle patch! <spml> @a@ identifier x; expression y, z; @@ - x = of_get_mac_address(y); + x = of_get_mac_address(y, z); <... - ether_addr_copy(z, x); ...> @@ identifier a.x; @@ - if (<+... 
x ...+>) {} @@ identifier a.x; @@ if (<+... x ...+>) { ... } - else {} @@ identifier a.x; expression e; @@ - if (<+... x ...+>@e) - {} - else + if (!(e)) {...} @@ expression x, y, z; @@ - x = of_get_mac_address(y, z); + of_get_mac_address(y, z); ... when != x </spml> All drivers, except drivers/net/ethernet/aeroflex/greth.c, were compile-time tested. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-04-13 14:35:02 -07:00
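The API change described in this commit, from returning a pointer to having the caller supply the destination buffer, can be illustrated with a userspace mock. Everything here is hypothetical (`struct fake_node`, `mac_is_valid`, `fake_get_mac_address`, the `-ENODEV` return value); it only mirrors the shape of the new calling convention and is not the kernel's of_get_mac_address().

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for a device tree node that may carry a MAC address. */
struct fake_node {
	bool has_mac;
	unsigned char mac[MAC_LEN];
};

/* Reject multicast addresses (low bit of the first octet) and all zeroes. */
bool mac_is_valid(const unsigned char *addr)
{
	static const unsigned char zero[MAC_LEN];

	return !(addr[0] & 1) && memcmp(addr, zero, MAC_LEN) != 0;
}

/*
 * New-style lookup: the caller owns `dst`, so the callee has nothing to
 * allocate or keep alive. Returns 0 and fills `dst` only for a valid
 * address; on failure `dst` is left untouched.
 */
int fake_get_mac_address(const struct fake_node *np, unsigned char *dst)
{
	if (!np->has_mac || !mac_is_valid(np->mac))
		return -ENODEV;

	memcpy(dst, np->mac, MAC_LEN);
	return 0;
}
```

Because the function only reports success for a valid address, a caller can test the return code alone, which is the simplification the commit message describes for drivers that previously called is_valid_ether_addr() afterwards.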
Christophe Leroy	af072b1a9d	powerpc/signal32: Fix build failure with CONFIG_SPE Add missing fault exit label in unsafe_copy_from_user() in order to avoid the following build failure with CONFIG_SPE: CC arch/powerpc/kernel/signal_32.o arch/powerpc/kernel/signal_32.c: In function 'restore_user_regs': arch/powerpc/kernel/signal_32.c:565:36: error: macro "unsafe_copy_from_user" requires 4 arguments, but only 3 given 565 | ELF_NEVRREG * sizeof(u32)); | ^ In file included from ./include/linux/uaccess.h:11, from ./include/linux/sched/task.h:11, from ./include/linux/sched/signal.h:9, from ./include/linux/rcuwait.h:6, from ./include/linux/percpu-rwsem.h:7, from ./include/linux/fs.h:33, from ./include/linux/huge_mm.h:8, from ./include/linux/mm.h:707, from arch/powerpc/kernel/signal_32.c:17: ./arch/powerpc/include/asm/uaccess.h:428: note: macro "unsafe_copy_from_user" defined here 428 | #define unsafe_copy_from_user(d, s, l, e) \ | arch/powerpc/kernel/signal_32.c:564:3: error: 'unsafe_copy_from_user' undeclared (first use in this function); did you mean 'raw_copy_from_user'? 564 | unsafe_copy_from_user(current->thread.evr, &sr->mc_vregs, | ^~~~~~~~~~~~~~~~~~~~~ | raw_copy_from_user arch/powerpc/kernel/signal_32.c:564:3: note: each undeclared identifier is reported only once for each function it appears in make[3]: *** [arch/powerpc/kernel/signal_32.o] Error 1 Fixes: `627b72bee8` ("powerpc/signal32: Convert restore_[tm]_user_regs() to user access block") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aad2cb1801a3cc99bc27081022925b9fc18a0dfb.1618159169.git.christophe.leroy@csgroup.eu	2021-04-12 21:28:08 +10:00
Nicholas Piggin	732f21a305	KVM: PPC: Book3S HV: Ensure MSR[HV] is always clear in guest MSR Rather than clear the HV bit from the MSR at guest entry, make it clear that the hypervisor does not allow the guest to set the bit. The HV clear is kept in guest entry for now, but a future patch will warn if it is set. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-13-npiggin@gmail.com	2021-04-12 13:36:24 +10:00
Nicholas Piggin	946cf44ac6	KVM: PPC: Book3S HV: Ensure MSR[ME] is always set in guest MSR Rather than add the ME bit to the MSR at guest entry, make it clear that the hypervisor does not allow the guest to clear the bit. The ME set is kept in guest entry for now, but a future patch will warn if it's not present. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-12-npiggin@gmail.com	2021-04-12 13:36:24 +10:00
Nicholas Piggin	da487a5d1b	powerpc/64s: remove KVM SKIP test from instruction breakpoint handler The code being executed in KVM_GUEST_MODE_SKIP is hypervisor code with MSR[IR]=0, so the faults of concern are the d-side ones caused by access to guest context by the hypervisor. Instruction breakpoint interrupts are not a concern here. It's unlikely any good would come of causing breaks in this code, but skipping the instruction that caused it won't help matters (e.g., skip the mtmsr that sets MSR[DR]=0 or clears KVM_GUEST_MODE_SKIP). [Paul notes: "the 0x1300 interrupt was dropped from the architecture a long time ago and is not generated by P7, P8, P9 or P10." So add a comment about this in the handler code while we're here. ] Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-11-npiggin@gmail.com	2021-04-12 13:36:24 +10:00
Nicholas Piggin	5eee837182	powerpc/64s: Remove KVM handler support from CBE_RAS interrupts Cell does not support KVM. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-10-npiggin@gmail.com	2021-04-12 13:36:24 +10:00
Nicholas Piggin	0fd85cb83f	KVM: PPC: Book3S HV: Fix CONFIG_SPAPR_TCE_IOMMU=n default hcalls This config option causes the warning in init_default_hcalls to fire because the TCE handlers are in the default hcall list but not implemented. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-9-npiggin@gmail.com	2021-04-12 13:36:24 +10:00
Nicholas Piggin	6c12c4376b	KVM: PPC: Book3S HV: remove unused kvmppc_h_protect argument The va argument is not used in the function or set by its asm caller, so remove it to be safe. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-8-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	4b5f0a0d49	KVM: PPC: Book3S HV: Remove redundant mtspr PSPB This SPR is set to 0 twice when exiting the guest. Suggested-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Axtens <dja@axtens.net> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-7-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	72c1528721	KVM: PPC: Book3S HV: Prevent radix guests setting LPCR[TC] Prevent radix guests setting LPCR[TC]. This bit only applies to hash partitions. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-6-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	bcc92a0d6d	KVM: PPC: Book3S HV: Disallow LPCR[AIL] to be set to 1 or 2 These are already disallowed by H_SET_MODE from the guest, also disallow these by updating LPCR directly. AIL modes can affect the host interrupt behaviour while the guest LPCR value is set, so filter it here too. Suggested-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-5-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	67145ef496	KVM: PPC: Book3S HV: Add a function to filter guest LPCR bits Guest LPCR depends on hardware type, and future changes will add restrictions based on errata and guest MMU mode. Move this logic to a common function and use it for the cases where the guest wants to update its LPCR (or the LPCR of a nested guest). This also adds a warning in other places that set or update LPCR if we try to set something that would have been disallowed by the filter, as a sanity check. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-4-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	a19b70abc6	KVM: PPC: Book3S HV: Nested move LPCR sanitising to sanitise_hv_regs This will get a bit more complicated in future patches. Move it into the helper function. This change allows the L1 hypervisor to determine some of the LPCR bits that the L0 is using to run it, which could be a privilege violation (LPCR is HV-privileged), although the same problem exists now for HFSCR for example. Discussion of the HV privilege issue is ongoing and can be resolved with a later change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-3-npiggin@gmail.com	2021-04-12 13:36:23 +10:00
Nicholas Piggin	5088eb4092	KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit The host CTRL (runlatch) value is not restored after guest exit. The host CTRL should always be 1 except in CPU idle code, so this can result in the host running with runlatch clear, and potentially switching to a different vCPU which then runs with runlatch clear as well. This has little effect on P9 machines, CTRL is only responsible for some PMU counter logic in the host and so other than corner cases of software relying on that, or explicitly reading the runlatch value (Linux does not appear to be affected but it's possible non-Linux guests could be), there should be no execution correctness problem, though it could be used as a covert channel between guests. There may be microcontrollers, firmware or monitoring tools that sample the runlatch value out-of-band, however since the register is writable by guests, these values would (should) not be relied upon for correct operation of the host, so suboptimal performance or incorrect reporting should be the worst problem. Fixes: `95a6432ce9` ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210412014845.1517916-2-npiggin@gmail.com	2021-04-12 13:36:22 +10:00
Christophe Leroy	c46bbf5d2d	powerpc/32: Remove powerpc specific definition of 'ptrdiff_t' For an unknown reason, old commit d27dfd388715 ("Import pre2.0.8") changed 'ptrdiff_t' from 'int' to 'long'. GCC expects it as 'int' really, and this leads to the following warning when building KFENCE: CC mm/kfence/report.o In file included from ./include/linux/printk.h:7, from ./include/linux/kernel.h:16, from mm/kfence/report.c:10: mm/kfence/report.c: In function 'kfence_report_error': ./include/linux/kern_levels.h:5:18: warning: format '%td' expects argument of type 'ptrdiff_t', but argument 6 has type 'long int' [-Wformat=] 5 | #define KERN_SOH "\001" /* ASCII Start Of Header */ | ^~~~~~ ./include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH' 11 | #define KERN_ERR KERN_SOH "3" /* error conditions */ | ^~~~~~~~ ./include/linux/printk.h:343:9: note: in expansion of macro 'KERN_ERR' 343 | printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) | ^~~~~~~~ mm/kfence/report.c:213:3: note: in expansion of macro 'pr_err' 213 | pr_err("Out-of-bounds %s at 0x%p (%luB %s of kfence-#%td):\n", | ^~~~~~ <asm-generic/uapi/posix-types.h> defines it as 'int', and defines 'size_t' and 'ssize_t' exactly as powerpc does, so remove the powerpc specific definitions and fall back on the generic ones. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e43d133bf52fa19e577f64f3a3a38cedc570377d.1617616601.git.christophe.leroy@csgroup.eu	2021-04-08 21:17:46 +10:00
Randy Dunlap	b27dadecdf	powerpc: iommu: fix build when neither PCI or IBMVIO is set When neither CONFIG_PCI nor CONFIG_IBMVIO is set/enabled, iommu.c has a build error. The fault injection code is not useful in that kernel config, so make the FAIL_IOMMU option depend on PCI || IBMVIO. Prevents this build error (warning escalated to error): ../arch/powerpc/kernel/iommu.c:178:30: error: 'fail_iommu_bus_notifier' defined but not used [-Werror=unused-variable] 178 | static struct notifier_block fail_iommu_bus_notifier = { Fixes: `d6b9a81b2a` ("powerpc: IOMMU fault injection") Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210404192623.10697-1-rdunlap@infradead.org	2021-04-08 21:17:46 +10:00
Yang Li	01ed051094	powerpc/pseries: remove unneeded semicolon Eliminate the following coccicheck warning: ./arch/powerpc/platforms/pseries/lpar.c:1633:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1617672785-81372-1-git-send-email-yang.lee@linux.alibaba.com	2021-04-08 21:17:45 +10:00
Nicholas Piggin	98db179a78	powerpc/64s: power4 nap fixup in C There is no need for this to be in asm, use the new interrupt entry wrapper. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210406025508.821718-1-npiggin@gmail.com	2021-04-08 21:17:45 +10:00
Athira Rajeev	10f8f96179	powerpc/perf: Fix PMU constraint check for EBB events The power PMU group constraints includes check for EBB events to make sure all events in a group must agree on EBB. This will prevent scheduling EBB and non-EBB events together. But in the existing check, settings for constraint mask and value is interchanged. Patch fixes the same. Before the patch, PMU selftest "cpu_event_pinned_vs_ebb_test" fails with below in dmesg logs. This happens because EBB event gets enabled along with a non-EBB cpu event. [35600.453346] cpu_event_pinne[41326]: illegal instruction (4) at 10004a18 nip 10004a18 lr 100049f8 code 1 in cpu_event_pinned_vs_ebb_test[10000000+10000] Test results after the patch: $ ./pmu/ebb/cpu_event_pinned_vs_ebb_test test: cpu_event_pinned_vs_ebb tags: git_version:v5.12-rc5-93-gf28c3125acd3-dirty Binding to cpu 8 EBB Handler is at 0x100050c8 read error on event 0x7fffe6bd4040! PM_RUN_INST_CMPL: result 9872 running/enabled 37930432 success: cpu_event_pinned_vs_ebb This bug was hidden by other logic until commit `1908dc9117` (perf: Tweak perf_event_attr::exclusive semantics). Fixes: `4df4899911` ("powerpc/perf: Add power8 EBB support") Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> [mpe: Mention commit `1908dc9117`] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1617725761-1464-1-git-send-email-atrajeev@linux.vnet.ibm.com	2021-04-08 21:17:44 +10:00
Jordan Niethe	08a022ad3d	powerpc/powernv/memtrace: Allow mmaping trace buffers Let the memory removed from the linear mapping to be used for the trace buffers be mmaped. This is a useful way of providing cache-inhibited memory for the alignment_handler selftest. Signed-off-by: Jordan Niethe <jniethe5@gmail.com> [mpe: make memtrace_mmap() static as noticed by lkp@intel.com] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210225032108.1458352-1-jniethe5@gmail.com	2021-04-08 21:17:44 +10:00
Michael Ellerman	acd4dfeb49	powerpc/kexec: Don't use .machine ppc64 in trampoline_64.S As best as I can tell the ".machine" directive in trampoline_64.S is no longer, or never was, necessary. It was added in commit `0d97631392` ("powerpc: Add purgatory for kexec_file_load() implementation."), which created the file based on the kexec-tools purgatory. It may be/have-been necessary in the kexec-tools version, but we have a completely different build system, and we already pass the desired CPU flags, eg: gcc ... -m64 -Wl,-a64 -mabi=elfv2 -Wa,-maltivec -Wa,-mpower4 -Wa,-many ... arch/powerpc/purgatory/trampoline_64.S So drop the ".machine" directive and rely on the assembler flags. Reported-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/r/20210315034159.315675-1-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	c6b4c9147f	powerpc/64: Move security code into security.c When the original spectre/meltdown mitigations were merged we put them in setup_64.c for lack of a better place. Since then we created security.c for some of the other mitigation related code. But it should all be in there. This sort of code movement can cause trouble for backports, but hopefully this code is relatively stable these days (famous last words). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210326101201.1973552-1-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	bd573a8131	powerpc/mm/64s: Allow STRICT_KERNEL_RWX again We have now fixed the known bugs in STRICT_KERNEL_RWX for Book3S 64-bit Hash and Radix MMUs, see preceding commits, so allow the option to be selected again. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331003845.216246-6-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	87e65ad7bd	powerpc/mm/64s/hash: Add real-mode change_memory_range() for hash LPAR When we enabled STRICT_KERNEL_RWX we received some reports of boot failures when using the Hash MMU and running under phyp. The crashes are intermittent, and often exhibit as a completely unresponsive system, or possibly an oops. One example, which was caught in xmon: [ 14.068327][ T1] devtmpfs: mounted [ 14.069302][ T1] Freeing unused kernel memory: 5568K [ 14.142060][ T347] BUG: Unable to handle kernel instruction fetch [ 14.142063][ T1] Run /sbin/init as init process [ 14.142074][ T347] Faulting instruction address: 0xc000000000004400 cpu 0x2: Vector: 400 (Instruction Access) at [c00000000c7475e0] pc: c000000000004400: exc_virt_0x4400_instruction_access+0x0/0x80 lr: c0000000001862d4: update_rq_clock+0x44/0x110 sp: c00000000c747880 msr: 8000000040001031 current = 0xc00000000c60d380 paca = 0xc00000001ec9de80 irqmask: 0x03 irq_happened: 0x01 pid = 347, comm = kworker/2:1 ... enter ? for help [c00000000c747880] c0000000001862d4 update_rq_clock+0x44/0x110 (unreliable) [c00000000c7478f0] c000000000198794 update_blocked_averages+0xb4/0x6d0 [c00000000c7479f0] c000000000198e40 update_nohz_stats+0x90/0xd0 [c00000000c747a20] c0000000001a13b4 _nohz_idle_balance+0x164/0x390 [c00000000c747b10] c0000000001a1af8 newidle_balance+0x478/0x610 [c00000000c747be0] c0000000001a1d48 pick_next_task_fair+0x58/0x480 [c00000000c747c40] c000000000eaab5c __schedule+0x12c/0x950 [c00000000c747cd0] c000000000eab3e8 schedule+0x68/0x120 [c00000000c747d00] c00000000016b730 worker_thread+0x130/0x640 [c00000000c747da0] c000000000174d50 kthread+0x1a0/0x1b0 [c00000000c747e10] c00000000000e0f0 ret_from_kernel_thread+0x5c/0x6c This shows that CPU 2, which was idle, woke up and then appears to randomly take an instruction fault on a completely valid area of kernel text. The cause turns out to be the call to hash__mark_rodata_ro(), late in boot. 
Due to the way we lay out text and rodata, that function actually changes the permissions for all of text and rodata to read-only plus execute. To do the permission change we use a hypervisor call, H_PROTECT. On phyp that appears to be implemented by briefly removing the mapping of the kernel text, before putting it back with the updated permissions. If any other CPU is executing during that window, it will see spurious faults on the kernel text and/or data, leading to crashes. To fix it we use stop machine to collect all other CPUs, and then have them drop into real mode (MMU off), while we change the mapping. That way they are unaffected by the mapping temporarily disappearing. We don't see this bug on KVM because KVM always uses VPM=1, where faults are directed to the hypervisor, and the fault will be serialised vs the h_protect() by HPTE_V_HVLOCK. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331003845.216246-5-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	6f223ebe9c	powerpc/mm/64s/hash: Factor out change_memory_range() Pull the loop calling hpte_updateboltedpp() out of hash__change_memory_range() into a helper function. We need it to be a separate function for the next patch. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331003845.216246-4-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	2c02e656a2	powerpc/64s: Use htab_convert_pte_flags() in hash__mark_rodata_ro() In hash__mark_rodata_ro() we pass the raw PP_RXXX value to hash__change_memory_range(). That has the effect of setting the key to zero, because PP_RXXX contains no key value. Fix it by using htab_convert_pte_flags(), which knows how to convert a pgprot into a pp value, including the key. Fixes: `d94b827e89` ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Daniel Axtens <dja@axtens.net> Link: https://lore.kernel.org/r/20210331003845.216246-3-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
Michael Ellerman	b56d55a5aa	powerpc/pseries: Add key to flags in pSeries_lpar_hpte_updateboltedpp() The flags argument to plpar_pte_protect() (aka. H_PROTECT), includes the key in bits 9-13, but currently we always set those bits to zero. In the past that hasn't been a problem because we always used key 0 for the kernel, and updateboltedpp() is only used for kernel mappings. However since commit `d94b827e89` ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation") we are now inadvertently changing the key (to zero) when we call plpar_pte_protect(). That hasn't broken anything because updateboltedpp() is only used for STRICT_KERNEL_RWX, which is currently disabled on 64s due to other bugs. But we want to fix that, so first we need to pass the key correctly to plpar_pte_protect(). We can't pass our newpp value directly in, we have to convert it into the form expected by the hcall. The hcall we're using here is H_PROTECT, which is specified in section 14.5.4.1.6 of LoPAPR v1.1. It takes a `flags` parameter, and the description for flags says: * flags: AVPN, pp0, pp1, pp2, key0-key4, n, and for the CMO option: CMO Option flags as defined in Table 189, If you then go to the start of the parent section, 14.5.4.1, on page 405, it says: Register Linkage (For hcall() tokens 0x04 - 0x18) * On Call * R3 function call token * R4 flags (see Table 178, “Page Frame Table Access flags field definition,” on page 401) Then you have to go to section 14.5.3, and on page 394 there is a list of hcalls and their tokens (table 176), and there you can see that H_PROTECT == 0x18. Finally you can look at table 178, on page 401, where it specifies the layout of the bits for the key: Bit Function ----------------- 50-54 | key0-key4 Those are big-endian bit numbers, converting to normal bit numbers you get bits 9-13, or 0x3e00.
In the kernel we have: #define HPTE_R_KEY_HI ASM_CONST(0x3000000000000000) #define HPTE_R_KEY_LO ASM_CONST(0x0000000000000e00) So the LO bits of newpp are already in the right place, and the HI bits need to be shifted down by 48. Fixes: `d94b827e89` ("powerpc/book3s64/kuap: Use Key 3 for kernel mapping with hash translation") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331003845.216246-2-mpe@ellerman.id.au	2021-04-08 21:17:43 +10:00
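
The bit arithmetic in the b56d55a5aa message above can be checked with a small standalone C sketch. The two HPTE_R_KEY_* masks are the values quoted in the commit message; the helper names be_bit_to_lsb0() and hpte_key_to_hprotect_flags() are hypothetical, chosen for illustration only, and are not claimed to match the code the commit actually adds.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as quoted in the commit message. */
#define HPTE_R_KEY_HI 0x3000000000000000ULL
#define HPTE_R_KEY_LO 0x0000000000000e00ULL

/*
 * Hypothetical helper: convert an IBM-style bit number of a 64-bit
 * register (bit 0 is the most significant bit) into a conventional
 * bit number (bit 0 is the least significant bit).
 */
static inline unsigned int be_bit_to_lsb0(unsigned int be_bit)
{
	assert(be_bit < 64);
	return 63 - be_bit;
}

/*
 * Hypothetical helper: build the key part of the H_PROTECT flags from a
 * pp value. The LO key bits are already in bits 9-11; the HI key bits
 * sit in bits 60-61 and are shifted down by 48 into bits 12-13.
 */
static inline uint64_t hpte_key_to_hprotect_flags(uint64_t newpp)
{
	return ((newpp & HPTE_R_KEY_HI) >> 48) | (newpp & HPTE_R_KEY_LO);
}
```

With every key bit set, the result is 0x3e00, which matches the bits 9-13 derived from table 178 in the message.
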
Michael Ellerman	56bec2f9d4	powerpc/mm/64s: Add _PAGE_KERNEL_ROX In the past we had a fallback definition for _PAGE_KERNEL_ROX, but we removed that in commit `d82fd29c5a` ("powerpc/mm: Distribute platform specific PAGE and PMD flags and definitions") and added definitions for each MMU family. However we missed adding a definition for 64s, which was not really a bug because it's currently not used. But we'd like to use PAGE_KERNEL_ROX in a future patch so add a definition now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210331003845.216246-1-mpe@ellerman.id.au	2021-04-08 21:17:42 +10:00
Jordan Niethe	b8b2f37cf6	powerpc/64s: Fix pte update for kernel memory on radix When adding a PTE a ptesync is needed to order the update of the PTE with subsequent accesses otherwise a spurious fault may be raised. radix__set_pte_at() does not do this for performance gains. For non-kernel memory this is not an issue as any faults of this kind are corrected by the page fault handler. For kernel memory these faults are not handled. The current solution is that there is a ptesync in flush_cache_vmap() which should be called when mapping from the vmalloc region. However, map_kernel_page() does not call flush_cache_vmap(). This is troublesome in particular for code patching with Strict RWX on radix. In do_patch_instruction() the page frame that contains the instruction to be patched is mapped and then immediately patched. With no ordering or synchronization between setting up the PTE and writing to the page it is possible for faults. As the code patching is done using __put_user_asm_goto() the resulting fault is obscured - but using a normal store instead it can be seen: BUG: Unable to handle kernel data access on write at 0xc008000008f24a3c Faulting instruction address: 0xc00000000008bd74 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: nop_module(PO+) [last unloaded: nop_module] CPU: 4 PID: 757 Comm: sh Tainted: P O 5.10.0-rc5-01361-ge3c1b78c8440-dirty #43 NIP: c00000000008bd74 LR: c00000000008bd50 CTR: c000000000025810 REGS: c000000016f634a0 TRAP: 0300 Tainted: P O (5.10.0-rc5-01361-ge3c1b78c8440-dirty) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44002884 XER: 00000000 CFAR: c00000000007c68c DAR: c008000008f24a3c DSISR: 42000000 IRQMASK: 1 This results in the kind of issue reported here: https://lore.kernel.org/linuxppc-dev/15AC5B0E-A221-4B8C-9039-FA96B8EF7C88@lca.pw/ Chris Riedl suggested a reliable way to reproduce the issue: $ mount -t debugfs none /sys/kernel/debug $ (while true; do 
echo function > /sys/kernel/debug/tracing/current_tracer ; echo nop > /sys/kernel/debug/tracing/current_tracer ; done) & Turning ftrace on and off does a large amount of code patching which in usually less then 5min will crash giving a trace like: ftrace-powerpc: (____ptrval____): replaced (4b473b11) != old (60000000) ------------[ ftrace bug ]------------ ftrace failed to modify [<c000000000bf8e5c>] napi_busy_loop+0xc/0x390 actual: 11:3b:47:4b Setting ftrace call site to call ftrace function ftrace record flags: 80000001 (1) expected tramp: c00000000006c96c ------------[ cut here ]------------ WARNING: CPU: 4 PID: 809 at kernel/trace/ftrace.c:2065 ftrace_bug+0x28c/0x2e8 Modules linked in: nop_module(PO-) [last unloaded: nop_module] CPU: 4 PID: 809 Comm: sh Tainted: P O 5.10.0-rc5-01360-gf878ccaf250a #1 NIP: c00000000024f334 LR: c00000000024f330 CTR: c0000000001a5af0 REGS: c000000004c8b760 TRAP: 0700 Tainted: P O (5.10.0-rc5-01360-gf878ccaf250a) MSR: 900000000282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28008848 XER: 20040000 CFAR: c0000000001a9c98 IRQMASK: 0 GPR00: c00000000024f330 c000000004c8b9f0 c000000002770600 0000000000000022 GPR04: 00000000ffff7fff c000000004c8b6d0 0000000000000027 c0000007fe9bcdd8 GPR08: 0000000000000023 ffffffffffffffd8 0000000000000027 c000000002613118 GPR12: 0000000000008000 c0000007fffdca00 0000000000000000 0000000000000000 GPR16: 0000000023ec37c5 0000000000000000 0000000000000000 0000000000000008 GPR20: c000000004c8bc90 c0000000027a2d20 c000000004c8bcd0 c000000002612fe8 GPR24: 0000000000000038 0000000000000030 0000000000000028 0000000000000020 GPR28: c000000000ff1b68 c000000000bf8e5c c00000000312f700 c000000000fbb9b0 NIP ftrace_bug+0x28c/0x2e8 LR ftrace_bug+0x288/0x2e8 Call Trace: ftrace_bug+0x288/0x2e8 (unreliable) ftrace_modify_all_code+0x168/0x210 arch_ftrace_update_code+0x18/0x30 ftrace_run_update_code+0x44/0xc0 ftrace_startup+0xf8/0x1c0 register_ftrace_function+0x4c/0xc0 function_trace_init+0x80/0xb0 
tracing_set_tracer+0x2a4/0x4f0 tracing_set_trace_write+0xd4/0x130 vfs_write+0xf0/0x330 ksys_write+0x84/0x140 system_call_exception+0x14c/0x230 system_call_common+0xf0/0x27c To fix this, use ptesync when updating kernel memory PTEs. Fixes: `f1cb8f9beb` ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags") Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Tidy up change log slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210208032957.1232102-1-jniethe5@gmail.com	2021-04-08 21:17:42 +10:00
Bhaskar Chowdhury	4763d37827	powerpc: Spelling/typo fixes Various spelling/typo fixes. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2021-04-08 21:17:42 +10:00
Christoph Hellwig	4eeb96f6ef	iommu/fsl_pamu: replace DOMAIN_ATTR_FSL_PAMU_STASH with a direct call Add a fsl_pamu_configure_l1_stash API that qman_portal can call directly instead of indirecting through the iommu attr API. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Will Deacon <will@kernel.org> Acked-by: Li Yang <leoyang.li@nxp.com> Link: https://lore.kernel.org/r/20210401155256.298656-8-hch@lst.de Signed-off-by: Joerg Roedel <jroedel@suse.de>	2021-04-07 10:56:52 +02:00
Greg Kroah-Hartman	9594408763	Merge 5.12-rc6 into tty-next We need the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-04-05 08:59:21 +02:00
Christophe Leroy	b0b3b2c78e	powerpc: Switch to relative jump labels Convert powerpc to relative jump labels. Before the patch, pseries_defconfig vmlinux.o has: 9074 __jump_table 0003f2a0 0000000000000000 0000000000000000 01321fa8 20 With the patch, the same config gets: 9074 __jump_table 0002a0e0 0000000000000000 0000000000000000 01321fb4 20 Size is 258720 without the patch, 172256 with the patch. That's a 33% size reduction. Largely copied from commit `c296146c05` ("arm64/kernel: jump_label: Switch to relative references") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/828348da7868eda953ce023994404dfc49603b64.1616514473.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:21 +11:00
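
The size figures quoted in b0b3b2c78e can be sanity-checked with the sketch below. This is standalone C, not kernel code: the two struct layouts are assumptions modelled on absolute and relative jump-table entries for a 64-bit build (three 64-bit words versus two 32-bit offsets plus a 64-bit key), and every name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of an absolute entry on a 64-bit build: 3 x 8 bytes. */
struct abs_jump_entry {
	uint64_t code;
	uint64_t target;
	uint64_t key;
};

/* Assumed layout of a relative entry: two 32-bit offsets and a key. */
struct rel_jump_entry {
	int32_t code;
	int32_t target;
	int64_t key;
};

/* A relative reference is resolved against the address of the field itself. */
static inline uintptr_t rel_jump_entry_code(const struct rel_jump_entry *e)
{
	return (uintptr_t)&e->code + (uintptr_t)(intptr_t)e->code;
}

/* Integer percentage saved when a size shrinks from 'before' to 'after'. */
static inline unsigned int percent_saved(uint64_t before, uint64_t after)
{
	assert(before != 0 && after <= before);
	return (unsigned int)((before - after) * 100 / before);
}
```

The section sizes 0x3f2a0 and 0x2a0e0 are 258720 and 172256 bytes, and both they and the assumed per-entry sizes (24 versus 16 bytes) give the 33% figure stated in the message.
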
Christophe Leroy	40272035e1	powerpc/bpf: Reallocate BPF registers to volatile registers when possible on PPC32 When the BPF routine doesn't call any function, the non volatile registers can be reallocated to volatile registers in order to avoid having to save them/restore on the stack. Before this patch, the test #359 ADD default X is: 0: 7c 64 1b 78 mr r4,r3 4: 38 60 00 00 li r3,0 8: 94 21 ff b0 stwu r1,-80(r1) c: 60 00 00 00 nop 10: 92 e1 00 2c stw r23,44(r1) 14: 93 01 00 30 stw r24,48(r1) 18: 93 21 00 34 stw r25,52(r1) 1c: 93 41 00 38 stw r26,56(r1) 20: 39 80 00 00 li r12,0 24: 39 60 00 00 li r11,0 28: 3b 40 00 00 li r26,0 2c: 3b 20 00 00 li r25,0 30: 7c 98 23 78 mr r24,r4 34: 7c 77 1b 78 mr r23,r3 38: 39 80 00 42 li r12,66 3c: 39 60 00 00 li r11,0 40: 7d 8c d2 14 add r12,r12,r26 44: 39 60 00 00 li r11,0 48: 7d 83 63 78 mr r3,r12 4c: 82 e1 00 2c lwz r23,44(r1) 50: 83 01 00 30 lwz r24,48(r1) 54: 83 21 00 34 lwz r25,52(r1) 58: 83 41 00 38 lwz r26,56(r1) 5c: 38 21 00 50 addi r1,r1,80 60: 4e 80 00 20 blr After this patch, the same test has become: 0: 7c 64 1b 78 mr r4,r3 4: 38 60 00 00 li r3,0 8: 94 21 ff b0 stwu r1,-80(r1) c: 60 00 00 00 nop 10: 39 80 00 00 li r12,0 14: 39 60 00 00 li r11,0 18: 39 00 00 00 li r8,0 1c: 38 e0 00 00 li r7,0 20: 7c 86 23 78 mr r6,r4 24: 7c 65 1b 78 mr r5,r3 28: 39 80 00 42 li r12,66 2c: 39 60 00 00 li r11,0 30: 7d 8c 42 14 add r12,r12,r8 34: 39 60 00 00 li r11,0 38: 7d 83 63 78 mr r3,r12 3c: 38 21 00 50 addi r1,r1,80 40: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b94562d7d2bb21aec89de0c40bb3cd91054b65a2.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:21 +11:00
Christophe Leroy	51c66ad849	powerpc/bpf: Implement extended BPF on PPC32 Implement Extended Berkeley Packet Filter on Powerpc 32 Test result with test_bpf module: test_bpf: Summary: 378 PASSED, 0 FAILED, [354/366 JIT'ed] Registers mapping: [BPF_REG_0] = r11-r12 /* function arguments */ [BPF_REG_1] = r3-r4 [BPF_REG_2] = r5-r6 [BPF_REG_3] = r7-r8 [BPF_REG_4] = r9-r10 [BPF_REG_5] = r21-r22 (Args 9 and 10 come in via the stack) /* non volatile registers */ [BPF_REG_6] = r23-r24 [BPF_REG_7] = r25-r26 [BPF_REG_8] = r27-r28 [BPF_REG_9] = r29-r30 /* frame pointer aka BPF_REG_10 */ [BPF_REG_FP] = r17-r18 /* eBPF jit internal registers */ [BPF_REG_AX] = r19-r20 [TMP_REG] = r31 As PPC32 doesn't have a redzone in the stack, a stack frame must always be set in order to host at least the tail count counter. The stack frame remains for tail calls, it is set by the first callee and freed by the last callee. r0 is used as temporary register as much as possible. It is referenced directly in the code in order to avoid misusing it, because some instructions interpret it as value 0 instead of register r0 (ex: addi, addis, stw, lwz, ...) The following operations are not implemented: case BPF_ALU64 | BPF_DIV | BPF_X: /* dst /= src */ case BPF_ALU64 | BPF_MOD | BPF_X: /* dst %= src */ case BPF_STX | BPF_XADD | BPF_DW: /* *(u64 *)(dst + off) += src */ The following operations are only implemented for power of two constants: case BPF_ALU64 | BPF_MOD | BPF_K: /* dst %= imm */ case BPF_ALU64 | BPF_DIV | BPF_K: /* dst /= imm */ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/61d8b149176ddf99e7d5cef0b6dc1598583ca202.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:21 +11:00
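
The 51c66ad849 message does not say why 64-bit division and modulo by a constant are limited to powers of two. One plausible reading is that for an unsigned 64-bit value, division by 2^k is a right shift and the remainder is a mask, so neither needs a 64-bit divide on a 32-bit CPU. The sketch below shows that identity in plain C; it is not the code the JIT emits, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when imm has exactly one bit set. */
static inline bool is_pow2_u32(uint32_t imm)
{
	return imm != 0 && (imm & (imm - 1)) == 0;
}

/* Position of the single set bit of a power-of-two constant. */
static inline unsigned int log2_pow2_u32(uint32_t imm)
{
	unsigned int shift = 0;

	assert(is_pow2_u32(imm));
	while ((imm >>= 1) != 0)
		shift++;
	return shift;
}

/* dst /= imm for a power-of-two imm: a plain right shift. */
static inline uint64_t div_pow2_u64(uint64_t dst, uint32_t imm)
{
	return dst >> log2_pow2_u32(imm);
}

/* dst %= imm for a power-of-two imm: keep only the low bits. */
static inline uint64_t mod_pow2_u64(uint64_t dst, uint32_t imm)
{
	assert(is_pow2_u32(imm));
	return dst & ((uint64_t)imm - 1);
}
```
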
Christophe Leroy	355a8d26cd	powerpc/asm: Add some opcodes in asm/ppc-opcode.h for PPC32 eBPF The following opcodes will be needed for the implementation of eBPF for PPC32. Add them in asm/ppc-opcode.h PPC_RAW_ADDE PPC_RAW_ADDZE PPC_RAW_ADDME PPC_RAW_MFLR PPC_RAW_ADDIC PPC_RAW_ADDIC_DOT PPC_RAW_SUBFC PPC_RAW_SUBFE PPC_RAW_SUBFIC PPC_RAW_SUBFZE PPC_RAW_ANDIS PPC_RAW_NOR Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f7bd573a368edd78006f8a5af508c726e7ce1ed2.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:21 +11:00
Christophe Leroy	c426810fcf	powerpc/bpf: Change values of SEEN_ flags Because PPC32 will use more non volatile registers, move SEEN_ flags to positions 0-2 which corresponds to special registers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/608faa1dc3ecfead649e15392abd07b00313d2ba.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	4ea76e90a9	powerpc/bpf: Move common functions into bpf_jit_comp.c Move into bpf_jit_comp.c the functions that will remain common to PPC64 and PPC32 when we add support of EBPF for PPC32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2c339d77fb168ef12b213ccddfee3cb6c8ce8ae1.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	f1b1583d5f	powerpc/bpf: Move common helpers into bpf_jit.h Move functions bpf_flush_icache(), bpf_is_seen_register() and bpf_set_seen_register() in order to reuse them in future bpf_jit_comp32.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/28e8d5a75e64807d7e9d39a4b52658755e259f8c.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	ed573b57e7	powerpc/bpf: Change register numbering for bpf_set/is_seen_register() Instead of using BPF register number as input in functions bpf_set_seen_register() and bpf_is_seen_register(), use CPU register number directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0cd2506f598e7095ea43e62dca1f472de5474a0d.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	6944caad78	powerpc/bpf: Remove classical BPF support for PPC32 Currently, PPC32 has classical BPF support. The test_bpf module exhibits some failures: test_bpf: #298 LD_IND byte frag jited:1 ret 202 != 66 FAIL (1 times) test_bpf: #299 LD_IND halfword frag jited:1 ret 51958 != 17220 FAIL (1 times) test_bpf: #301 LD_IND halfword mixed head/frag jited:1 ret 51958 != 1305 FAIL (1 times) test_bpf: #303 LD_ABS byte frag jited:1 ret 202 != 66 FAIL (1 times) test_bpf: #304 LD_ABS halfword frag jited:1 ret 51958 != 17220 FAIL (1 times) test_bpf: #306 LD_ABS halfword mixed head/frag jited:1 ret 51958 != 1305 FAIL (1 times) test_bpf: Summary: 371 PASSED, 7 FAILED, [119/366 JIT'ed] Fixing this is not worth the effort. Instead, remove support for classical BPF and prepare for adding Extended BPF support. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fbc3e4fcc9c8f6131d6c705212530b2aa50149ee.1616430991.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	c7393a71eb	powerpc/signal32: Simplify logging in sigreturn() Same spirit as commit `debf122c77` ("powerpc/signal32: Simplify logging in handle_rt_signal32()"), remove this intermediate 'addr' local var. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/638fa99530beb29f82f94370057d110e91272acc.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:20 +11:00
Christophe Leroy	887f3ceb51	powerpc/signal32: Convert do_setcontext[_tm]() to user access block Add unsafe_get_user_sigset() and transform PPC32 get_sigset_t() into an unsafe version unsafe_get_sigset_t(). Then convert do_setcontext() and do_setcontext_tm() to use user_read_access_begin/end. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9273ba664db769b8d9c7540ae91395e346e4945e.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	627b72bee8	powerpc/signal32: Convert restore_[tm]_user_regs() to user access block Convert restore_user_regs() and restore_tm_user_regs() to use user_read_access_begin/end blocks. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/181adf15a6f644efcd1aeafb355f3578ff1b6bc5.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	036fc2cb1d	powerpc/signal32: Reorder user reads in restore_tm_user_regs() In restore_tm_user_regs(), regroup the reads from 'sr' and the ones from 'tm_sr' together in order to allow two block user accesses in following patch. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7c518b9a4c8e5ae9a3bfb647bc8b20bf820233af.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	362471b319	powerpc/signal32: Perform access_ok() inside restore_user_regs() In preparation of using user_access_begin/end in restore_user_regs(), move the access_ok() inside the function. It makes no difference as the behaviour on a failed access_ok() is the same as on failed restore_user_regs(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c106eb2f37c3040f1fd38b40e50c670feb7cb835.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	ca9e1605cd	powerpc/signal32: Remove ifdefery in middle of if/else in sigreturn() In the same spirit as commit `f1cf4f93de` ("powerpc/signal32: Remove ifdefery in middle of if/else") MSR_TM_ACTIVE() is always defined and returns always 0 when CONFIG_PPC_TRANSACTIONAL_MEM is not selected, so the awful ifdefery in the middle of an if/else can be removed. Make 'msr_hi' a 'long long' to avoid build failure on PPC32 due to the 32 bits left shift. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a4b48b2f0be1ef13fc8e57452b7f8350da28d521.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
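
The 'long long' remark in ca9e1605cd reflects a general C pitfall: shifting a 32-bit value left by 32 is undefined and is usually diagnosed by the compiler, so the operand has to be widened first. A minimal sketch of the idea follows; the function name is hypothetical and does not appear in the commit.

```c
#include <assert.h>
#include <stdint.h>

/* The widening below only helps if the wide type really has 64 bits. */
static_assert(sizeof(unsigned long long) >= 8, "need a 64-bit type");

/*
 * Combine two 32-bit halves into one 64-bit value. 'hi' is widened
 * before the shift, so the 32-bit left shift is well defined even
 * where 'long' is only 32 bits wide, as on PPC32.
 */
static inline uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
	unsigned long long wide_hi = hi;

	return (wide_hi << 32) | lo;
}
```
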
Christophe Leroy	f918a81e20	powerpc/signal32: Rename save_user_regs_unsafe() and save_general_regs_unsafe() Convention is to prefix functions with __unsafe_ instead of suffixing them with _unsafe. Rename save_user_regs_unsafe() and save_general_regs_unsafe() accordingly, that is to __unsafe_save_user_regs() and __unsafe_save_general_regs() respectively. Suggested-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8cef43607e5b35a7fd0829dec812d88beb570df2.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	7c11f8893a	powerpc/signal: Add unsafe_copy_ck{fpr/vsx}_from_user Add unsafe_copy_ckfpr_from_user() and unsafe_copy_ckvsx_from_user() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1040687aa27553d19f749f7fb48f0c07af98ee2d.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:19 +11:00
Christophe Leroy	c1cc1570bc	powerpc/uaccess: Also perform 64 bits copies in unsafe_copy_from_user() on ppc32 Similarly to commit 5cf773fc8f37 ("powerpc/uaccess: Also perform 64 bits copies in unsafe_copy_to_user() on ppc32") ppc32 has an efficient 64 bits unsafe_get_user(), so also use it in order to unroll loops more. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/308e65d9237a14e8c0e3b22919fcf0b5e5592608.1616151715.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:18 +11:00
Christophe Leroy	5cd29b1fd3	powerpc/uaccess: Use asm goto for get_user when compiler supports it clang 11 and future GCC support asm goto with outputs. Use it to implement get_user in order to get better generated code. Note that clang requires x to be set in the default branch of __get_user_size_goto(), otherwise it complains about x not being initialised :puzzled: Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/403745b5aaa1b315bb4e8e46c1ba949e77eecec0.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:16 +11:00
Christophe Leroy	035785ab28	powerpc/uaccess: Introduce __get_user_size_goto() We have got two places doing a goto based on the result of __get_user_size_allowed(). Refactor that into __get_user_size_goto(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/def8a39289e02653cfb1583b3b19837de9efed3a.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:13 +11:00
Christophe Leroy	e72fcdb26c	powerpc/uaccess: Refactor get/put_user() and __get/put_user() Make get_user() do the access_ok() check then call __get_user(). Make put_user() do the access_ok() check then call __put_user(). Then embed __get_user_size() and __put_user_size() in __get_user() and __put_user(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eebc554f6a81f570c46ea3551000ff5b886e4faa.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:10 +11:00
Christophe Leroy	17f8c0bc21	powerpc/uaccess: Rename __get/put_user_check/nocheck __get_user_check() becomes get_user() __put_user_check() becomes put_user() __get_user_nocheck() becomes __get_user() __put_user_nocheck() becomes __put_user() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/41d7e45f4733f0e61e63824e4865b4e049db74d6.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:08 +11:00
Christophe Leroy	f904c22f2a	powerpc/uaccess: Split out __get_user_nocheck() One part of __get_user_nocheck() is used for __get_user(), the other part for unsafe_get_user(). Move the part dedicated to unsafe_get_user() in it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/618fe2e0626b308a5a063d5baac827b968e85c32.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:05 +11:00
Christophe Leroy	9975f852ce	powerpc/uaccess: Remove calls to __get_user_bad() and __put_user_bad() __get_user_bad() and __put_user_bad() are functions that are declared but not defined, in order to make the link fail in case they are called. Nowadays, we have BUILD_BUG() and BUILD_BUG_ON() for that, and they have the advantage to break the build earlier as it breaks it at compile time instead of link time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d7d839e994f49fae4ff7b70fac72bd951272436b.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:22:02 +11:00
Christophe Leroy	028e156168	powerpc/uaccess: Remove __chk_user_ptr() in __get/put_user Commit `d02f6b7dab` ("powerpc/uaccess: Evaluate macro arguments once, before user access is allowed") changed the __chk_user_ptr() argument from the passed ptr pointer to the locally declared __gu_addr. But __gu_addr is locally defined as __user so the check is pointless. During a normal kernel build __chk_user_ptr() expands to a no-op and is only evaluated during sparse checks, so it would have been harmless to leave the original pointer check there. Nevertheless, this check is indeed redundant with the assignment above which casts the ptr pointer to the local __user __gu_addr. In case of mismatch, sparse will detect it there, so __chk_user_ptr() is not needed anywhere other than in access_ok(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/69f17d75046733b891ab2e668dbf464787cdf598.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:59 +11:00
Christophe Leroy	be15a16579	powerpc/uaccess: Remove __unsafe_put_user_goto() __unsafe_put_user_goto() is just an intermediate layer to __put_user_size_goto() without added value other than doing the __user pointer type checking. Do the __user pointer type checking in __put_user_size_goto() and remove __unsafe_put_user_goto(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b6552149209aebd887a6977272b06a41256bdb9f.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:55 +11:00
Christophe Leroy	ed0d9c66f9	powerpc/uaccess: Call might_fault() inconditionaly Commit `6bfd93c32a` ("powerpc: Fix incorrect might_sleep in __get_user/__put_user on kernel addresses") added a check to not call might_sleep() on kernel addresses. This was to enable the use of __get_user() in the alignment exception handler for any address. Then commit `95156f0051` ("lockdep, mm: fix might_fault() annotation") added a check of the address space in might_fault(), based on set_fs() logic. But this didn't solve the powerpc alignment exception case as it didn't call set_fs(KERNEL_DS). Nowadays, set_fs() is gone, previous patch fixed the alignment exception handler and __get_user/__put_user are not supposed to be used anymore to read kernel memory. Therefore the is_kernel_addr() check has become useless and can be removed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e0a980a4dc7a2551183dd5cb30f46eafdbee390c.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:52 +11:00
Christophe Leroy	111631b5e9	powerpc/align: Don't use __get_user_instr() on kernel addresses In the old days, when we didn't have kernel userspace access protection and had set_fs(), it was wise to use __get_user() and friends to read kernel memory. Nowadays, get_user() grants userspace access and is exclusively for userspace access. In the alignment exception handler, use probe_kernel_read_inst() instead of __get_user_instr() for reading instructions in the kernel. This will allow removing the is_kernel_addr() check in __get/put_user() in a following patch. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d9ecbce00178484e66ca7adec2ff210058037704.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:49 +11:00
Christophe Leroy	35506a3e2d	powerpc/uaccess: Move get_user_instr helpers in asm/inst.h Those helpers use get_user helpers but they don't participate in their implementation, so they do not belong to asm/uaccess.h Move them in asm/inst.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2c6e83581b4fa434aa7cf2fa7714c41e98f57007.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:45 +11:00
Christophe Leroy	bad956b8fe	powerpc/uaccess: Remove __get/put_user_inatomic() Powerpc is the only architecture having _inatomic variants of __get_user() and __put_user() accessors. They were introduced by commit `e68c825bb0` ("[POWERPC] Add inatomic versions of __get_user and __put_user"). Those variants expand to the _nosleep macros instead of expanding to the _nocheck macros. The only difference between the _nocheck and the _nosleep macros is the call to might_fault(). Since commit `662bbcb274` ("mm, sched: Allow uaccess in atomic with pagefault_disable()"), __get/put_user() can be used in atomic parts of the code, therefore __get/put_user_inatomic() have become useless. Remove __get_user_inatomic() and __put_user_inatomic(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1e5c895669e8d54a7810b62dc61eb111f33c2c37.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:41 +11:00
Christophe Leroy	3fa3db3295	powerpc/align: Convert emulate_spe() to user_access_begin This patch converts emulate_spe() to using user_access_begin logic. Since commit `662bbcb274` ("mm, sched: Allow uaccess in atomic with pagefault_disable()"), might_fault() doesn't fire when called from sections where pagefaults are disabled, which must be the case when using _inatomic variants of __get_user and __put_user. So the might_fault() in user_access_begin() is not a problem. There was a verification of user_mode() together with the access_ok(), but there is a second verification of user_mode() just after, that leads to immediate return. The access_ok() is now part of the user_access_begin which is called after that other user_mode() verification, so no need to check user_mode() again. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c95a648fdf75992c9d88f3c73cc23e7537fcf2ad.1615555354.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:39 +11:00
Christophe Leroy	9bd68dc5d7	powerpc/uaccess: Define ___get_user_instr() for ppc32 Define simple ___get_user_instr() for ppc32 instead of defining ppc32 versions of the three get_user_instr() helpers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e02f83ec74f26d76df2874f0ce4d5cc69c3469ae.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:32 +11:00
Christophe Leroy	8cdf748d55	powerpc/uaccess: Remove __get_user_allowed() and unsafe_op_wrap() Those two macros have only one user which is unsafe_get_user(). Put everything in one place and remove them. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/439179c5e54c18f2cb8bdf1eea13ea0ef6b98375.1615398265.git.christophe.leroy@csgroup.eu	2021-04-03 21:21:26 +11:00
Christophe Leroy	791f9e3659	powerpc/vdso: Make sure vdso_wrapper.o is rebuilt everytime vdso.so is rebuilt Commit `bce74491c3` ("powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o") moved vdso32_wrapper.o and vdso64_wrapper.o out of arch/powerpc/kernel/vdso[32/64]/ and removed the dependencies in the Makefile. As a result the wrappers are not rebuilt, so the kernel embeds the old vdso library. Add back the missing dependencies to ensure vdso32_wrapper.o and vdso64_wrapper.o are rebuilt when vdso32.so.dbg and vdso64.so.dbg are changed. Fixes: `bce74491c3` ("powerpc/vdso: fix unnecessary rebuilds of vgettimeofday.o") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8bb015bc98c51d8ced581415b7e3d157e18da7c9.1617181918.git.christophe.leroy@csgroup.eu	2021-04-02 00:18:09 +11:00
Christophe Leroy	acca57217c	powerpc/signal32: Fix Oops on sigreturn with unmapped VDSO PPC32 encounters a KUAP fault when trying to handle a signal with VDSO unmapped. Kernel attempted to read user page (7fc07ec0) - exploit attempt? (uid: 0) BUG: Unable to handle kernel data access on read at 0x7fc07ec0 Faulting instruction address: 0xc00111d4 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=16K PREEMPT CMPC885 CPU: 0 PID: 353 Comm: sigreturn_vdso Not tainted 5.12.0-rc4-s3k-dev-01553-gb30c310ea220 #4814 NIP: c00111d4 LR: c0005a28 CTR: 00000000 REGS: cadb3dd0 TRAP: 0300 Not tainted (5.12.0-rc4-s3k-dev-01553-gb30c310ea220) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48000884 XER: 20000000 DAR: 7fc07ec0 DSISR: 88000000 GPR00: c0007788 cadb3e90 c28d4a40 7fc07ec0 7fc07ed0 000004e0 7fc07ce0 00000000 GPR08: 00000001 00000001 7fc07ec0 00000000 28000282 1001b828 100a0920 00000000 GPR16: 100cac0c 100b0000 105c43a4 105c5685 100d0000 100d0000 100d0000 100b2e9e GPR24: ffffffff 105c43c8 00000000 7fc07ec8 cadb3f40 cadb3ec8 c28d4a40 00000000 NIP [c00111d4] flush_icache_range+0x90/0xb4 LR [c0005a28] handle_signal32+0x1bc/0x1c4 Call Trace: [cadb3e90] [100d0000] 0x100d0000 (unreliable) [cadb3ec0] [c0007788] do_notify_resume+0x260/0x314 [cadb3f20] [c000c764] syscall_exit_prepare+0x120/0x184 [cadb3f30] [c00100b4] ret_from_syscall+0xc/0x28 --- interrupt: c00 at 0xfe807f8 NIP: 0fe807f8 LR: 10001060 CTR: c0139378 REGS: cadb3f40 TRAP: 0c00 Not tainted (5.12.0-rc4-s3k-dev-01553-gb30c310ea220) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 28000482 XER: 20000000 GPR00: 00000025 7fc081c0 77bb1690 00000000 0000000a 28000482 00000001 0ff03a38 GPR08: 0000d032 00006de5 c28d4a40 00000009 88000482 1001b828 100a0920 00000000 GPR16: 100cac0c 100b0000 105c43a4 105c5685 100d0000 100d0000 100d0000 100b2e9e GPR24: ffffffff 105c43c8 00000000 77ba7628 10002398 10010000 10002124 00024000 NIP [0fe807f8] 0xfe807f8 LR [10001060] 0x10001060 --- interrupt: c00 Instruction dump: 38630010 7c001fac 38630010 4200fff0 
7c0004ac 4c00012c 4e800020 7c001fac 2c0a0000 38630010 4082ffcc 4bffffe4 <7c00186c> 2c070000 39430010 4082ff8c ---[ end trace 3973fb72b049cb06 ]--- This is because flush_icache_range() is called on user addresses. The same problem was detected some time ago on PPC64. It was fixed by enabling KUAP in commit `59bee45b97` ("powerpc/mm: Fix missing KUAP disable in flush_coherent_icache()"). PPC32 doesn't use flush_coherent_icache() and fallbacks on clean_dcache_range() and invalidate_icache_range(). We could fix it similarly by enabling user access in those functions, but this is overkill for just flushing two instructions. The two instructions are 8 bytes aligned, so a single dcbst/icbi is enough to flush them. Do like __patch_instruction() and inline a dcbst followed by an icbi just after the write of the instructions, while user access is still allowed. The isync is not required because rfi will be used to return to user. icbi() is handled as a read so read-write user access is needed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bde9154e5351a5ac7bca3d59cdb5a5e8edacbb79.1617199569.git.christophe.leroy@csgroup.eu	2021-04-02 00:16:23 +11:00
Christophe Leroy	3618250c83	powerpc/ptrace: Don't return error when getting/setting FP regs without CONFIG_PPC_FPU_REGS An #ifdef CONFIG_PPC_FPU_REGS is missing in arch_ptrace() leading to the following Oops because [REGSET_FPR] entry is not initialised in native_regsets[]. [ 41.917608] BUG: Unable to handle kernel instruction fetch [ 41.922849] Faulting instruction address: 0xff8fd228 [ 41.927760] Oops: Kernel access of bad area, sig: 11 [#1] [ 41.933089] BE PAGE_SIZE=4K PREEMPT CMPC885 [ 41.940753] Modules linked in: [ 41.943768] CPU: 0 PID: 366 Comm: gdb Not tainted 5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty #4835 [ 41.952800] NIP: ff8fd228 LR: c004d9e0 CTR: ff8fd228 [ 41.957790] REGS: caae9df0 TRAP: 0400 Not tainted (5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty) [ 41.966741] MSR: 40009032 <EE,ME,IR,DR,RI> CR: 82004248 XER: 20000000 [ 41.973540] [ 41.973540] GPR00: c004d9b4 caae9eb0 c1b64f60 c1b64520 c0713cd4 caae9eb8 c1bacdfc 00000004 [ 41.973540] GPR08: 00000200 ff8fd228 c1bac700 00001032 28004242 1061aaf4 00000001 106d64a0 [ 41.973540] GPR16: 00000000 00000000 7fa0a774 10610000 7fa0aef9 00000000 10610000 7fa0a538 [ 41.973540] GPR24: 7fa0a580 7fa0a570 c1bacc00 c1b64520 c1bacc00 caae9ee8 00000108 c0713cd4 [ 42.009685] NIP [ff8fd228] 0xff8fd228 [ 42.013300] LR [c004d9e0] __regset_get+0x100/0x124 [ 42.018036] Call Trace: [ 42.020443] [caae9eb0] [c004d9b4] __regset_get+0xd4/0x124 (unreliable) [ 42.026899] [caae9ee0] [c004da94] copy_regset_to_user+0x5c/0xb0 [ 42.032751] [caae9f10] [c002f640] sys_ptrace+0xe4/0x588 [ 42.037915] [caae9f30] [c0011010] ret_from_syscall+0x0/0x28 [ 42.043422] --- interrupt: c00 at 0xfd1f8e4 [ 42.047553] NIP: 0fd1f8e4 LR: 1004a688 CTR: 00000000 [ 42.052544] REGS: caae9f40 TRAP: 0c00 Not tainted (5.12.0-rc5-s3k-dev-01666-g7aac86a0f057-dirty) [ 42.061494] MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 48004442 XER: 00000000 [ 42.068551] [ 42.068551] GPR00: 0000001a 7fa0a040 77dad7e0 0000000e 00000170 00000000 7fa0a078 00000004 [ 42.068551] GPR08: 
00000000 108deb88 108dda40 106d6010 44004442 1061aaf4 00000001 106d64a0 [ 42.068551] GPR16: 00000000 00000000 7fa0a774 10610000 7fa0aef9 00000000 10610000 7fa0a538 [ 42.068551] GPR24: 7fa0a580 7fa0a570 1078fe00 1078fd70 1078fd70 00000170 0fdd3244 0000000d [ 42.104696] NIP [0fd1f8e4] 0xfd1f8e4 [ 42.108225] LR [1004a688] 0x1004a688 [ 42.111753] --- interrupt: c00 [ 42.114768] Instruction dump: [ 42.117698] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX [ 42.125443] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX [ 42.133195] ---[ end trace d35616f22ab2100c ]--- Adding the missing #ifdef is not good because gdb doesn't like getting an error when getting registers. Instead, make ptrace return 0s when CONFIG_PPC_FPU_REGS is not set. Fixes: `b6254ced4d` ("powerpc/signal: Don't manage floating point regs when no FPU") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9121a44a2d50ba1af18d8aa5ada06c9a3bea8afd.1617200085.git.christophe.leroy@csgroup.eu	2021-04-02 00:15:37 +11:00
Aneesh Kumar K.V	937c49d10b	powerpc/mm: Revert "powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc" This reverts commit `675bceb097` ("powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc") All the related issues are fixed as of commit: `f14312e1ed` ("mm/debug_vm_pgtable: avoid doing memory allocation with pgtable_t mapped.") Hence re-enable it. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210318034855.74513-1-aneesh.kumar@linux.ibm.com	2021-03-31 16:46:55 +11:00
Michael Ellerman	11d92156f7	powerpc/pseries: Only register vio drivers if vio bus exists The vio bus is a fake bus, which we use on pseries LPARs (guests) to discover devices provided by the hypervisor. There's no need or sense in creating the vio bus on bare metal systems. Which is why commit `4336b93378` ("powerpc/pseries: Make vio and ibmebus initcalls pseries specific") made the initialisation of the vio bus only happen in LPARs. However as a result of that commit we now see errors at boot on bare metal systems: Driver 'hvc_console' was unable to register with bus_type 'vio' because the bus was not initialized. Driver 'tpm_ibmvtpm' was unable to register with bus_type 'vio' because the bus was not initialized. This happens because those drivers are built-in, and are calling vio_register_driver(). It in turn calls driver_register() with a reference to vio_bus_type, but we haven't registered vio_bus_type with the driver core. Fix it by also guarding vio_register_driver() with a check to see if we are on pseries. Fixes: `4336b93378` ("powerpc/pseries: Make vio and ibmebus initcalls pseries specific") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Link: https://lore.kernel.org/r/20210316010938.525657-1-mpe@ellerman.id.au	2021-03-31 14:32:58 +11:00
dingsenjie	69931cc387	powerpc/powernv: Remove unneeded variable: "rc" Remove unneeded variable: "rc". Signed-off-by: dingsenjie <dingsenjie@yulong.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210326115356.12444-1-dingsenjie@163.com	2021-03-29 13:22:19 +11:00
Chen Huang	4fe529449d	powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration When compiling powerpc with SMP disabled, the build fails: arch/powerpc/kernel/watchdog.c: In function ‘watchdog_smp_panic’: arch/powerpc/kernel/watchdog.c:177:4: error: implicit declaration of function ‘smp_send_nmi_ipi’; did you mean ‘smp_send_stop’? [-Werror=implicit-function-declaration] 177 \| smp_send_nmi_ipi(c, wd_lockup_ipi, 1000000); \| ^~~~~~~~~~~~~~~~ \| smp_send_stop cc1: all warnings being treated as errors make[2]: *** [scripts/Makefile.build:273: arch/powerpc/kernel/watchdog.o] Error 1 make[1]: *** [scripts/Makefile.build:534: arch/powerpc/kernel] Error 2 make: *** [Makefile:1980: arch/powerpc] Error 2 make: *** Waiting for unfinished jobs.... powerpc uses IPIs to implement the hardlockup watchdog, so HAVE_HARDLOCKUP_DETECTOR_ARCH should depend on SMP. Fixes: `2104180a53` ("powerpc/64s: implement arch-specific hardlockup watchdog") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chen Huang <chenhuang5@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210327094900.938555-1-chenhuang5@huawei.com	2021-03-29 13:22:19 +11:00
Daniel Henrique Barboza	d19b3ad02c	powerpc/pseries/hotplug-cpu: Show 'last online CPU' error in dlpar_cpu_offline() One of the reasons that dlpar_cpu_offline can fail is when attempting to offline the last online CPU of the kernel. This can be observed in a pseries QEMU guest that has hotplugged CPUs. If the user offlines all other CPUs of the guest, and a hotplugged CPU is now the last online CPU, trying to reclaim it will fail. The current error message in this situation returns rc with -EBUSY and a generic explanation, e.g.: pseries-hotplug-cpu: Failed to offline CPU PowerPC,POWER9, rc: -16 EBUSY can be caused by other conditions, such as cpu_hotplug_disable being true. Throwing a more specific error message for this case, instead of just "Failed to offline CPU", makes it clearer that the error is in fact a known error situation instead of other generic/unknown cause. This patch adds a 'last online' check in dlpar_cpu_offline() to catch the 'last online CPU' offline error, returning a more informative error message: pseries-hotplug-cpu: Unable to remove last online CPU PowerPC,POWER9 Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210323205056.52768-2-danielhb413@gmail.com	2021-03-29 13:22:18 +11:00
Christophe Leroy	48cf12d889	powerpc/irq: Inline call_do_irq() and call_do_softirq() call_do_irq() and call_do_softirq() are simple enough to be worth inlining. Inlining them avoids an mflr/mtlr pair plus a save/reload on stack. This is inspired by the S390 arch. Several other arches do more or less the same. The way the sparc arch does it seems odd though. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210320122227.345427-1-mpe@ellerman.id.au	2021-03-29 13:22:17 +11:00
He Ying	d2313da4ff	powerpc/setup_64: Fix sparse warnings Sparse warns: warning: symbol 'rfi_flush' was not declared. warning: symbol 'entry_flush' was not declared. warning: symbol 'uaccess_flush' was not declared. Define 'entry_flush' and 'uaccess_flush' as static because they are not referenced outside the file. Include asm/security_features.h in which 'rfi_flush' is declared. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: He Ying <heying24@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316041148.29694-1-heying24@huawei.com	2021-03-29 13:22:17 +11:00
Christophe Leroy	a329ddd472	powerpc/embedded6xx: Remove CONFIG_MV64X60 Commit `92c8c16f34` ("powerpc/embedded6xx: Remove C2K board support") removed the last selector of CONFIG_MV64X60. As it is not a user selectable config, it can be removed. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/19e57d16692dcd1ca67ba880d7273a57fab416aa.1616085654.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:17 +11:00
kernel test robot	bbbe563f84	powerpc/iommu/debug: fix ifnullfree.cocci warnings arch/powerpc/kernel/iommu.c:76:2-16: WARNING: NULL check before some freeing functions is not needed. NULL check before some freeing functions is not needed. Based on checkpatch warning "kfree(NULL) is safe this check is probably not required" and kfreeaddr.cocci by Julia Lawall. Generated by: scripts/coccinelle/free/ifnullfree.cocci Fixes: `691602aab9` ("powerpc/iommu/debug: Add debugfs entries for IOMMU tables") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210318234441.GA63469@f8e20a472e81	2021-03-29 13:22:17 +11:00
Christophe Leroy	a230883688	powerpc: Fix arch_stack_walk() to have running function as first entry It seems like other architectures, namely x86 and arm64 and riscv at least, include the running function as top entry when saving stack trace with save_stack_trace_regs(). Functionnalities like KFENCE expect it. Do the same on powerpc, it allows KFENCE and other users to properly identify the faulting function as depicted below. Before the patch KFENCE was identifying finish_task_switch.isra as the faulting function. [ 14.937370] ================================================================== [ 14.948692] BUG: KFENCE: invalid read in test_invalid_access+0x54/0x108 [ 14.948692] [ 14.956814] Invalid read at 0xdf98800a: [ 14.960664] test_invalid_access+0x54/0x108 [ 14.964876] finish_task_switch.isra.0+0x54/0x23c [ 14.969606] kunit_try_run_case+0x5c/0xd0 [ 14.973658] kunit_generic_run_threadfn_adapter+0x24/0x30 [ 14.979079] kthread+0x15c/0x174 [ 14.982342] ret_from_kernel_thread+0x14/0x1c [ 14.986731] [ 14.988236] CPU: 0 PID: 111 Comm: kunit_try_catch Tainted: G B 5.12.0-rc1-01537-g95f6e2088d7e-dirty #4682 [ 14.999795] NIP: c016ec2c LR: c02f517c CTR: c016ebd8 [ 15.004851] REGS: e2449d90 TRAP: 0301 Tainted: G B (5.12.0-rc1-01537-g95f6e2088d7e-dirty) [ 15.015274] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 22000004 XER: 00000000 [ 15.022043] DAR: df98800a DSISR: 20000000 [ 15.022043] GPR00: c02f517c e2449e50 c1142080 e100dd24 c084b13c 00000008 c084b32b c016ebd8 [ 15.022043] GPR08: c0850000 df988000 c0d10000 e2449eb0 22000288 [ 15.040581] NIP [c016ec2c] test_invalid_access+0x54/0x108 [ 15.046010] LR [c02f517c] kunit_try_run_case+0x5c/0xd0 [ 15.051181] Call Trace: [ 15.053637] [e2449e50] [c005a68c] finish_task_switch.isra.0+0x54/0x23c (unreliable) [ 15.061338] [e2449eb0] [c02f517c] kunit_try_run_case+0x5c/0xd0 [ 15.067215] [e2449ed0] [c02f648c] kunit_generic_run_threadfn_adapter+0x24/0x30 [ 15.074472] [e2449ef0] [c004e7b0] kthread+0x15c/0x174 [ 15.079571] [e2449f30] [c001317c] 
ret_from_kernel_thread+0x14/0x1c [ 15.085798] Instruction dump: [ 15.088784] 8129d608 38e7ebd8 81020280 911f004c 39000000 995f0024 907f0028 90ff001c [ 15.096613] 3949000a 915f0020 3d40c0d1 3d00c085 <8929000a> 3908adb0 812a4b98 3d40c02f [ 15.104612] ================================================================== Fixes: `35de3b1aa1` ("powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Marco Elver <elver@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/21324f9e2f21d1640c8397b4d1d857a9355a2283.1615881400.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:16 +11:00
Christophe Leroy	a1cdef04f2	powerpc: Convert stacktrace to generic ARCH_STACKWALK This patch converts powerpc stacktrace to the generic ARCH_STACKWALK implemented by commit `214d8ca6ee` ("stacktrace: Provide common infrastructure") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/73b36bbb101299760b95ecd2cd3a46554bea8bf9.1615881400.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:16 +11:00
Christophe Leroy	826a307b0a	powerpc: Rename 'tsk' parameter into 'task' To better match generic code, rename 'tsk' to 'task' in some stacktrace functions in preparation of following patch which converts powerpc to generic ARCH_STACKWALK. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/117f0200e11961af6c0fdf85c98373e5dcf96a47.1615881400.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:16 +11:00
Christophe Leroy	accdd093f2	powerpc: Activate HAVE_RELIABLE_STACKTRACE for all CONFIG_HAVE_RELIABLE_STACKTRACE is applicable to all, no reason to limit it to book3s/64le Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/955248c6423cb068c5965923121ba31d4dd2fdde.1615881400.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:15 +11:00
Aneesh Kumar K.V	8b8adeb300	powerpc/book3s64/kuap: Move Kconfig variables to BOOK3S_64 With below two commits: commit `c91435d95c` ("powerpc/book3s64/hash/kuep: Enable KUEP on hash") commit `b2ff33a10c` ("powerpc/book3s64/hash/kuap: Enable kuap on hash") the kernel now supports kuap/kuep with hash translation. Hence select the Kconfig even when radix is disabled. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210318034829.72255-1-aneesh.kumar@linux.ibm.com	2021-03-29 13:22:15 +11:00
Bhaskar Chowdhury	89f7d2927a	powerpc/kernel: Trivial typo fix in kgdb.c s/procesing/processing/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210317090413.120891-1-unixbhaskar@gmail.com	2021-03-29 13:22:15 +11:00
Nicholas Piggin	1479e3d3b7	powerpc/64s: Fix hash fault to use TRAP accessor Hash faults use the trap vector to decide whether this is an instruction or data fault. This should use the TRAP accessor rather than open access regs->trap. This won't cause a problem at the moment because 64s only uses trap flags for system call interrupts (the norestart flag), but that could change if any other trap flags get used in future. Fixes: `a4922f5442` ("powerpc/64s: move the hash fault handling logic to C") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210316105205.407767-1-npiggin@gmail.com	2021-03-29 13:22:15 +11:00
Christophe Leroy	98c26a7275	powerpc/mm: Remove unneeded #ifdef CONFIG_PPC_MEM_KEYS In fault.c, #ifdef CONFIG_PPC_MEM_KEYS is not needed because all functions are always defined, and arch_vma_access_permitted() always returns true when CONFIG_PPC_MEM_KEYS is not defined so access_pkey_error() will return false so bad_access_pkey() will never be called. Include linux/pkeys.h to get a definition of vma_pkeys() for bad_access_pkey(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8038392f38d81f2ad169347efac29146f553b238.1615819955.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:15 +11:00
Michael Ellerman	b77878052a	powerpc/fsl-pci: Fix section mismatch warning Section mismatch in reference from the function .fsl_add_bridge() to the function .init.text:.setup_pci_cmd() fsl_add_bridge() is not __init, and can't be, and is the only caller of setup_pci_cmd(). Fix it by making setup_pci_cmd() non-init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093341.132986-1-mpe@ellerman.id.au	2021-03-29 13:22:14 +11:00
Michael Ellerman	55c2f5574a	powerpc: Fix section mismatch warning in smp_setup_pacas() Section mismatch in reference from the function .smp_setup_pacas() to the function .init.text:.allocate_paca() The only caller of smp_setup_pacas() is setup_arch() which is __init, so mark smp_setup_pacas() __init. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093333.132657-1-mpe@ellerman.id.au	2021-03-29 13:22:14 +11:00
Michael Ellerman	c2a2a5d027	powerpc/64s: Fold update_current_thread_[i]amr() into their only callers lkp reported warnings in some configurations due to update_current_thread_amr() being unused: arch/powerpc/mm/book3s64/pkeys.c:284:20: error: unused function 'update_current_thread_amr' static inline void update_current_thread_amr(u64 value) Which is because its only use is inside an ifdef. We could move it inside the ifdef, but it's a single line function and only has one caller, so just fold it in. Similarly update_current_thread_iamr() is small and only called once, so fold it in also. Fixes: `48a8ab4eeb` ("powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093320.132331-1-mpe@ellerman.id.au	2021-03-29 13:22:14 +11:00
Michael Ellerman	7a7685acd2	powerpc/eeh: Fix build failure with CONFIG_PROC_FS=n The build fails with CONFIG_PROC_FS=n: arch/powerpc/kernel/eeh.c:1571:12: error: ‘proc_eeh_show’ defined but not used 1571 \| static int proc_eeh_show(struct seq_file *m, void *v) Wrap proc_eeh_show() in an ifdef to avoid it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314093300.131998-1-mpe@ellerman.id.au	2021-03-29 13:22:14 +11:00
Jiapeng Chong	7a0fdc19f2	powerpc/pci: fix warning comparing pointer to 0 Fix the following coccicheck warning: ./arch/powerpc/platforms/maple/pci.c:37:16-17: WARNING comparing pointer to 0. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1615793724-97015-1-git-send-email-jiapeng.chong@linux.alibaba.com	2021-03-29 13:22:13 +11:00
Yang Li	9214cf0f48	powerpc/xive: use true and false for bool variable fixed the following coccicheck: ./arch/powerpc/sysdev/xive/spapr.c:552:8-9: WARNING: return of 0/1 in function 'xive_spapr_match' with return type bool Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1615793096-83758-1-git-send-email-yang.lee@linux.alibaba.com	2021-03-29 13:22:13 +11:00
Christophe Leroy	6eeca7a113	powerpc/asm-offsets: GPR14 is not needed either Commit `aac6a91fea` ("powerpc/asm: Remove unused symbols in asm-offsets.c") removed GPR15 to GPR31 but kept GPR14, probably because it pops up in a couple of comments when doing a grep. However, it was never used either, so remove it as well. Fixes: `aac6a91fea` ("powerpc/asm: Remove unused symbols in asm-offsets.c") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9881c68fbca004f9ea18fc9473f630e11ccd6417.1615806071.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:13 +11:00
Christophe Leroy	e448e1e774	powerpc/math: Fix missing __user qualifier for get_user() and other sparse warnings Sparse reports the following problems: arch/powerpc/math-emu/math.c:228:21: warning: Using plain integer as NULL pointer arch/powerpc/math-emu/math.c:228:31: warning: Using plain integer as NULL pointer arch/powerpc/math-emu/math.c:228:41: warning: Using plain integer as NULL pointer arch/powerpc/math-emu/math.c:228:51: warning: Using plain integer as NULL pointer arch/powerpc/math-emu/math.c:237:13: warning: incorrect type in initializer (different address spaces) arch/powerpc/math-emu/math.c:237:13: expected unsigned int [noderef] __user *_gu_addr arch/powerpc/math-emu/math.c:237:13: got unsigned int [usertype] * arch/powerpc/math-emu/math.c:226:1: warning: symbol 'do_mathemu' was not declared. Should it be static? Add missing __user qualifier when casting pointer used in get_user() Use NULL instead of 0 to initialise opX local variables. Add a prototype for do_mathemu() (Added in processor.h like sparc) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e4d1aae7604d89c98a52dfd8ce8443462e595670.1615809591.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:12 +11:00
Bhaskar Chowdhury	7a7d744ffe	powerpc/mm/book3s64: Fix a typo in mmu_context.c s/detalis/details/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210312112537.4585-1-unixbhaskar@gmail.com	2021-03-29 13:22:12 +11:00
Bhaskar Chowdhury	f239873fcd	powerpc/64e: Trivial spelling fixes throughout head_fsl_booke.S Trivial spelling fixes throughout the file. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210314220436.3417083-1-unixbhaskar@gmail.com	2021-03-29 13:22:12 +11:00
Christophe Leroy	802b556039	powerpc/Makefile: Remove workaround for gcc versions below 4.9 Commit `6ec4476ac8` ("Raise gcc version requirement to 4.9") made it impossible to build with gcc 4.8 and under. Remove related workaround. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a1e552006b8c51f23edd2f6cabdd9a986c631146.1615380184.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:11 +11:00
Christophe Leroy	c16728835e	powerpc/32: Manage KUAP in C Move all KUAP management in C. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/199365ddb58d579daf724815f2d0acb91cc49d19.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:11 +11:00
Christophe Leroy	0b45359aa2	powerpc/8xx: Create C version of kuap save/restore/check helpers In preparation of porting PPC32 to C syscall entry/exit, create C version of kuap_save_and_lock() and kuap_user_restore() and kuap_kernel_restore() and kuap_assert_locked() and kuap_get_and_assert_locked() on 8xx. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/156a7c4b669d26785391422a5581a1d919544c9a.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:11 +11:00
Christophe Leroy	21eb58ae4f	powerpc/32s: Create C version of kuap save/restore/check helpers In preparation of porting PPC32 to C syscall entry/exit, create C version of kuap_save_and_lock() and kuap_user_restore() and kuap_kernel_restore() and kuap_assert_locked() and kuap_get_and_assert_locked() on book3s/32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2be8fb729da4a0f9863b25e1b9d547174fcd5056.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:11 +11:00
Christophe Leroy	ad2d234477	powerpc/64s: Make kuap_check_amr() and kuap_get_and_check_amr() generic In preparation of porting powerpc32 to C syscall entry/exit, rename kuap_check_amr() and kuap_get_and_check_amr() as kuap_assert_locked() and kuap_get_and_assert_locked(), and move in the generic asm/kup.h the stub for when CONFIG_PPC_KUAP is not selected. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f82614d9b17b83abd739aa18fc08811815d0c2e3.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:11 +11:00
Christophe Leroy	b5efec00b6	powerpc/32s: Move KUEP locking/unlocking in C This can be done in C, do it. Unrolling the loop gains approx. 15% performance. From now on, prepare_transfer_to_handler() is only for interrupts from kernel. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4eadd873927e9a73c3d1dfe2f9497353465514cf.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:10 +11:00
Christophe Leroy	a2b3e09ae4	powerpc/32: Only use prepare_transfer_to_handler function on book3s/32 and e500 Only book3s/32 and e500 have significant work to do in prepare_transfer_to_handler. Other 32-bit platforms have nothing to do at all. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b5e29ca0e557c11340415a13fe8b107189d315e1.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:10 +11:00
Christophe Leroy	a5d33be051	powerpc/32: Return directly from power_save_ppc32_restore() transfer_to_handler_cont: is now just a blr. Directly perform blr in power_save_ppc32_restore(). Also remove useless setting of r11 in e500 version of power_save_ppc32_restore(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e337506e08a4df95b11d2290104b92f0dcdb5548.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:10 +11:00
Christophe Leroy	16db54369d	powerpc/32: Save remaining registers in exception prolog Save non volatile registers, XER, CTR, MSR and NIP in exception prolog. Also assign proper value to r2 and r3 there. For now, recalculate thread pointer in prepare_transfer_to_handler. It will disappear once KUAP is ported to C. And remove the comment which is now completely wrong. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/56f0cde9dd0362edf2ddba4d887552013eee7329.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:10 +11:00
Christophe Leroy	a305597850	powerpc/32: Refactor saving of volatile registers in exception prologs Exception prologs all do the same at the end: - Save trapno in stack - Mark stack with exception marker - Save r0 - Save r3 to r8 Refactor that into a COMMON_EXCEPTION_PROLOG_END macro. At the same time use r1 instead of r11. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e1c45d2e895e0693c42d2a6840df1105a148efea.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:10 +11:00
Christophe Leroy	acc142b623	powerpc/32: Remove the xfer parameter in EXCEPTION() macro The xfer parameter is not used anymore, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/17c7d68bd18f7d2f1ab24a1a20d9ed33bbcda741.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:09 +11:00
Christophe Leroy	4c0104a83f	powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE In order to get more control in exception prolog, dismantle all non standard exception macros, finishing with EXC_XFER_STD and EXC_XFER_LITE and EXC_XFER_TEMPLATE. Also remove transfer_to_handler_full and ret_from_except and ret_from_except_full as they are not used anymore. Last parameter of EXCEPTION() is now ignored, will be removed in a later patch to avoid too much churn. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ca5795d04a220586b7037dbbbe6951dfa9e768eb.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:09 +11:00
Christophe Leroy	8f6ff5bd9b	powerpc/32: Only restore non volatile registers when required Until now, non volatile registers were restored every time they were saved, i.e. using EXC_XFER_STD meant saving and restoring them while EXC_XFER_LITE meant neither saving nor restoring them. Now that they are always saved, EXC_XFER_STD means to restore them and EXC_XFER_LITE means to not restore them. Most of the users of EXC_XFER_STD only need to retrieve the non volatile registers. For them there is no need to restore the non volatile registers as they have not been modified. Only very few exceptions require non volatile registers restore. Opencode the few places which require saving of non volatile registers. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d1cb12d8023cc6afc1f07150565571373c04945c.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:09 +11:00
Christophe Leroy	bce4c26a4e	powerpc/32: Add a prepare_transfer_to_handler macro for exception prologs In order to increase flexibility, add a macro that will for now call transfer_to_handler. As transfer_to_handler doesn't do the actual transfer anymore, also name it prepare_transfer_to_handler. The following patches will progressively remove the use of transfer_to_handler label. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7f757c52518ab1d7b27ad5113b10f860e803f467.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:09 +11:00
Christophe Leroy	719e7e212c	powerpc/32: Save trap number on stack in exception prolog Saving the trap number into the stack goes into the exception prolog, as EXC_XFER_xxx will soon disappear. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2ac7a0c9cde2ec2b23cd79e3a54cfedd816a91ae.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:08 +11:00
Christophe Leroy	af6f2ce84b	powerpc/32: Call bad_page_fault() from do_page_fault() Now that non volatile registers are saved at all times, there is no need to split bad_page_fault() out of do_page_fault(). Remove handle_page_fault() and use do_page_fault() directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cfb95be8863204cc2bf45a22ea44dd1d0dc16b7f.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:08 +11:00
Christophe Leroy	e72915560b	powerpc/32: Set regs parameter in r3 in transfer_to_handler All exception handlers take regs as first parameter. Instead of setting r3 just before each call to a handler, set it in transfer_to_handler. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f994a379bb895a2cbd518cb82460ad3f3d3ccdf5.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:08 +11:00
Christophe Leroy	db297c3b07	powerpc/32: Don't save thread.regs on interrupt entry Since commit `06d67d5474` ("powerpc: make process.c suitable for both 32-bit and 64-bit"), thread.regs is set on task creation, so there is no need to set it again and again at each interrupt entry as it never changes. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20d52c627303d63e461797df13e6890fc04017d0.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:08 +11:00
Christophe Leroy	b96bae3ae2	powerpc/32: Replace ASM exception exit by C exception exit from ppc64 This patch replaces the PPC32 ASM exception exit by C exception exit. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/48f8bae91da899d8e73fc0d75c9af66cc97b4d5b.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:07 +11:00
Christophe Leroy	e9f99704aa	powerpc/32: Always save non volatile registers on exception entry In preparation of handling exception entry and exit in C, in order to simplify the handling, always save non volatile registers when entering an exception. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3ce8ced87a4f1467fa36fcc50763d53b45e466c1.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:07 +11:00
Christophe Leroy	0f2793e33d	powerpc/32: Perform normal function call in exception entry Now that the MMU is re-enabled before calling the transfer function, we no longer need the hack in which the addresses of the handler and the return function sit just after the 'bl' to the transfer function, which retrieves them via a read relative to 'lr'. Do a regular call to the transfer function, then to the handler, then branch to the return function. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/73c00f3361ca280ef8fd7814c291bd1f5b6e2081.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:07 +11:00
Christophe Leroy	32d2ca0e96	powerpc/32: Refactor booke critical registers saving Refactor booke critical registers saving into a few macros and move it into the exception prolog directly. Keep the dedicated transfer_to_handler entry points for the moment although they are empty. They will be removed in a later patch to reduce churn. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/269171496f1f5f22afa621695bded22976c9d48d.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:07 +11:00
Christophe Leroy	8f844c06f4	powerpc/32: Provide a name to exception prolog continuation in virtual mode Now that the prolog continuation is separated in .text, give it a name and mark it _ASM_NOKPROBE_SYMBOL. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d96374218815a6627e1e922ab2aba994050fb87a.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:06 +11:00
Christophe Leroy	dc13b889b5	powerpc/32: Move exception prolog code into .text once MMU is back on The space in the head section is rather constrained by the fact that exception vectors are spread every 0x100 bytes and sometimes we need to have "out of line" code because it doesn't fit. Now that we are enabling MMU early in the prolog, take that opportunity to jump somewhere else in the .text section where we don't have any space constraint. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/38b31ca4bc782a4985bc7952a675404d7ff27c24.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:06 +11:00
Christophe Leroy	7bf1d7e1ab	powerpc/32: Use START_EXCEPTION() as much as possible Everywhere where it is possible, use START_EXCEPTION(). This will help for proper exception init in future patches. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d47c1cc242bbbef8658327503726abdaef9b63ef.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:06 +11:00
Christophe Leroy	5b5e5bc53d	powerpc/32: Add vmap_stack_overflow label inside the macro For consistency, add in the macro the label used by exception prolog to branch to stack overflow processing. While at it, enclose the macro in #ifdef CONFIG_VMAP_STACK on the 8xx as already done on book3s/32. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cf80056f5b946572ad98aea9d915dd25b23beda6.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:06 +11:00
Christophe Leroy	a4719f5bb6	powerpc/32: Statically initialise first emergency context The check of the emergency context initialisation in vmap_stack_overflow is buggy for the SMP case, as it compares r1 with 0 while in the SMP case r1 is offset by the CPU id. Instead of fixing it, just perform static initialisation of the first emergency context. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4a67ba422be75713286dca0c86ee0d3df2eb6dfa.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:06 +11:00
Christophe Leroy	9b6150fb89	powerpc/32: Enable instruction translation at the same time as data translation On 40x and 8xx, kernel text is pinned. On book3s/32, kernel text is mapped by BATs. Enable instruction translation at the same time as data translation, it makes things simpler. In syscall handler, MSR_RI can also be set at the same time because srr0/srr1 are already saved and r1 is set properly. On booke, translation is always on, so at the end all PPC32 have translation on early. Just update msr. Also update comment in power_save_ppc32_restore(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5269c7e5f5d2117358af3a89744d75a116be27b0.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:05 +11:00
Christophe Leroy	5b1c9a0d7f	powerpc/32: Tag DAR in EXCEPTION_PROLOG_2 for the 8xx 8xx requires to tag the DAR with a magic value in order to fixup DAR on faults generated by 'dcbX', as the 8xx forgets to update the DAR for those faults. Do the tagging as early as possible, that is before enabling MMU. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/853a2e28ca7c5fc85617037030f99fe6070c9536.1615552867.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:05 +11:00
Christophe Leroy	7aa8dd67f1	powerpc/32: Always enable data translation in exception prolog If the code can use a stack in vm area, it can also use a stack in linear space. Simplify code by removing old non VMAP stack code on PPC32. That means the data translation is now re-enabled early in exception prolog in all cases, not only when using VMAP stacks. While we are touching EXCEPTION_PROLOG macros, remove the unused for_rtas parameter in EXCEPTION_PROLOG_1. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7cd6440c60a7e8f4f035b245c57720f51e225aae.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:05 +11:00
Christophe Leroy	5747230645	powerpc/32: Remove ksp_limit ksp_limit is there to help detect stack overflows. That is specific to ppc32 as it was removed from ppc64 in commit `cbc9565ee8` ("powerpc: Remove ksp_limit on ppc64"). There are other means for detecting stack overflows. As ppc64 has proven to not need it, ppc32 should be able to do without it too. Let's remove it and simplify exception handling. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d789c3385b22e07bedc997613c0d26074cb513e7.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:05 +11:00
Christophe Leroy	e464d92b29	powerpc/32: Use fast instruction to set MSR RI in exception prolog on 8xx 8xx has registers SPRN_NRI, SPRN_EID and SPRN_EIE for changing MSR EE and RI. Use SPRN_EID in exception prolog to set RI. On an 8xx, it reduces the null_syscall test by 3 cycles. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/65f6bda827c2a2abce71ea7e07543e791163da33.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:04 +11:00
Christophe Leroy	79f4bb17f1	powerpc/32: Handle bookE debugging in C in exception entry The handling of SPRN_DBCR0 and other registers can easily be done in C instead of ASM. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6d6b2497115890b90cfa72a2b3ab1da5f78123c2.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:04 +11:00
Christophe Leroy	f93d866e14	powerpc/32: Entry cpu time accounting in C There is no need for this to be in asm, use the new interrupt entry wrapper. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/daca4c3e05cdfe54d237162a0718b3aaca897662.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:04 +11:00
Christophe Leroy	be39e10506	powerpc/32: Reconcile interrupts in C There is no need for this to be in asm anymore, use the new interrupt entry wrapper. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/602e1ec47e15ca540f7edb9cf6feb6c249911bd6.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:04 +11:00
Christophe Leroy	0512aadd75	powerpc/40x: Prepare normal exception handler for enabling MMU early Ensure normal exception handlers are able to run with the MMU enabled. For that we use CONFIG_VMAP_STACK related code although there is no intention to really activate CONFIG_VMAP_STACK on powerpc 40x for the moment. 40x uses SPRN_DEAR instead of SPRN_DAR and SPRN_ESR instead of SPRN_DSISR. Take it into account in common macros. 40x MSR value doesn't fit in 15 bits, so use LOAD_REG_IMMEDIATE() in common macros that will also be used with 40x. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/01963af2b83037bca270d7bf1336ffcf35da8282.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:04 +11:00
Christophe Leroy	0fc1e93481	powerpc/40x: Prepare for enabling MMU in critical exception prolog In order to enable MMU early in exception prolog, implement CONFIG_VMAP_STACK principles in critical exception prolog. There is no intention to use CONFIG_VMAP_STACK on 40x, but related code will be used to enable MMU early in exception in a later patch. Also address (critirq_ctx - PAGE_OFFSET) directly instead of using tophys() in order to save one instruction. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3fd75ee54c48307119acdbf66cfea966c1463bbd.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:03 +11:00
Christophe Leroy	26c468860c	powerpc/40x: Reorder a few instructions in critical exception prolog In order to ease preparation for CONFIG_VMAP_STACK, reorder a few instructions, especially save r1 into stack frame earlier. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c895ecf958c86d1736bdd2ff6f36626b55f35fd2.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:03 +11:00
Christophe Leroy	fcd4b43c36	powerpc/40x: Save SRR0/SRR1 and r10/r11 earlier in critical exception In order to be able to switch MMU on in exception prolog, save SRR0 and SRR1 earlier. Also save r10 and r11 into stack earlier to better match with the normal exception prolog. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/79a93f253d72dc97ac968c9c62b5066960b688ed.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:03 +11:00
Christophe Leroy	9d3c18a11a	powerpc/40x: Change CRITICAL_EXCEPTION_PROLOG macro to a gas macro Change CRITICAL_EXCEPTION_PROLOG macro to a gas macro to remove the ugly ; and \ on each line. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/73291fb9dc9ec58182c27a40dfc3db204e3f4024.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:03 +11:00
Christophe Leroy	52ae92cc29	powerpc/40x: Don't use SPRN_SPRG_SCRATCH0/1 in TLB miss handlers SPRN_SPRG_SCRATCH5 is used to save SPRN_PID. SPRN_SPRG_SCRATCH6 is already available. SPRN_PID is only 8 bits. We have r12 that contains CR. We only need to preserve CR0, so we have space available in r12 to save PID. Keep PID in r12 and free up SPRN_SPRG_SCRATCH5. Then In TLB miss handlers, instead of using SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1, use SPRN_SPRG_SCRATCH5 and SPRN_SPRG_SCRATCH6 to avoid future conflicts with normal exception prologs. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4cdaa85d38e14d594ba902424060ec55babf2c42.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:03 +11:00
Christophe Leroy	a58cbed683	powerpc/traps: Declare unrecoverable_exception() as __noreturn unrecoverable_exception() is never expected to return; most callers have an infinite loop in case it returns. Ensure it really never returns by terminating it with a BUG(), and declare it __noreturn. It allows GCC to really simplify functions calling it. In the example below, it avoids the stack frame in the likely fast path and avoids code duplication for the exit. With this patch: 00000348 <interrupt_exit_kernel_prepare>: 348: 81 43 00 84 lwz r10,132(r3) 34c: 71 48 00 02 andi. r8,r10,2 350: 41 82 00 2c beq 37c <interrupt_exit_kernel_prepare+0x34> 354: 71 4a 40 00 andi. r10,r10,16384 358: 40 82 00 20 bne 378 <interrupt_exit_kernel_prepare+0x30> 35c: 80 62 00 70 lwz r3,112(r2) 360: 74 63 00 01 andis. r3,r3,1 364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44> 368: 7d 40 00 a6 mfmsr r10 36c: 7c 11 13 a6 mtspr 81,r0 370: 7c 12 13 a6 mtspr 82,r0 374: 4e 80 00 20 blr 378: 48 00 00 00 b 378 <interrupt_exit_kernel_prepare+0x30> 37c: 94 21 ff f0 stwu r1,-16(r1) 380: 7c 08 02 a6 mflr r0 384: 90 01 00 14 stw r0,20(r1) 388: 48 00 00 01 bl 388 <interrupt_exit_kernel_prepare+0x40> 388: R_PPC_REL24 unrecoverable_exception 38c: 38 e2 00 70 addi r7,r2,112 390: 3d 00 00 01 lis r8,1 394: 7c c0 38 28 lwarx r6,0,r7 398: 7c c6 40 78 andc r6,r6,r8 39c: 7c c0 39 2d stwcx. r6,0,r7 3a0: 40 a2 ff f4 bne 394 <interrupt_exit_kernel_prepare+0x4c> 3a4: 38 60 00 01 li r3,1 3a8: 4b ff ff c0 b 368 <interrupt_exit_kernel_prepare+0x20> Without this patch: 00000348 <interrupt_exit_kernel_prepare>: 348: 94 21 ff f0 stwu r1,-16(r1) 34c: 93 e1 00 0c stw r31,12(r1) 350: 7c 7f 1b 78 mr r31,r3 354: 81 23 00 84 lwz r9,132(r3) 358: 71 2a 00 02 andi. r10,r9,2 35c: 41 82 00 34 beq 390 <interrupt_exit_kernel_prepare+0x48> 360: 71 29 40 00 andi. r9,r9,16384 364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44> 368: 80 62 00 70 lwz r3,112(r2) 36c: 74 63 00 01 andis. r3,r3,1 370: 40 82 00 3c bne 3ac <interrupt_exit_kernel_prepare+0x64> 374: 7d 20 00 a6 mfmsr r9 378: 7c 11 13 a6 mtspr 81,r0 37c: 7c 12 13 a6 mtspr 82,r0 380: 83 e1 00 0c lwz r31,12(r1) 384: 38 21 00 10 addi r1,r1,16 388: 4e 80 00 20 blr 38c: 48 00 00 00 b 38c <interrupt_exit_kernel_prepare+0x44> 390: 7c 08 02 a6 mflr r0 394: 90 01 00 14 stw r0,20(r1) 398: 48 00 00 01 bl 398 <interrupt_exit_kernel_prepare+0x50> 398: R_PPC_REL24 unrecoverable_exception 39c: 80 01 00 14 lwz r0,20(r1) 3a0: 81 3f 00 84 lwz r9,132(r31) 3a4: 7c 08 03 a6 mtlr r0 3a8: 4b ff ff b8 b 360 <interrupt_exit_kernel_prepare+0x18> 3ac: 39 02 00 70 addi r8,r2,112 3b0: 3d 40 00 01 lis r10,1 3b4: 7c e0 40 28 lwarx r7,0,r8 3b8: 7c e7 50 78 andc r7,r7,r10 3bc: 7c e0 41 2d stwcx. r7,0,r8 3c0: 40 a2 ff f4 bne 3b4 <interrupt_exit_kernel_prepare+0x6c> 3c4: 38 60 00 01 li r3,1 3c8: 4b ff ff ac b 374 <interrupt_exit_kernel_prepare+0x2c> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1e883e9d93fdb256853d1434c8ad77c257349b2d.1615552866.git.christophe.leroy@csgroup.eu	2021-03-29 13:22:02 +11:00
Ravi Bangoria	d943bc742a	powerpc/uprobes: Validation for prefixed instruction As per ISA 3.1, prefixed instruction should not cross 64-byte boundary. So don't allow Uprobe on such prefixed instruction. There are two ways probed instruction is changed in mapped pages. First, when Uprobe is activated, it searches for all the relevant pages and replace instruction in them. In this case, if that probe is on the 64-byte unaligned prefixed instruction, error out directly. Second, when Uprobe is already active and user maps a relevant page via mmap(), instruction is replaced via mmap() code path. But because Uprobe is invalid, entire mmap() operation can not be stopped. In this case just print an error and continue. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210311091538.368590-1-ravi.bangoria@linux.ibm.com	2021-03-29 12:52:24 +11:00
Christopher M. Riedl	d3ccc97815	powerpc/signal: Use __get_user() to copy sigset_t Usually sigset_t is exactly 8B which is a "trivial" size and does not warrant using __copy_from_user(). Use __get_user() directly in anticipation of future work to remove the trivial size optimizations from __copy_from_user(). The ppc32 implementation of get_sigset_t() previously called copy_from_user() which, unlike __copy_from_user(), calls access_ok(). Replacing this w/ __get_user() (no access_ok()) is fine here since both callsites in signal_32.c are preceded by an earlier access_ok(). Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-11-cmr@codefail.de	2021-03-29 12:52:24 +11:00
Daniel Axtens	0f92433b8f	powerpc/signal64: Rewrite rt_sigreturn() to minimise uaccess switches Add uaccess blocks and use the 'unsafe' versions of functions doing user access where possible to reduce the number of times uaccess has to be opened/closed. Co-developed-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-10-cmr@codefail.de	2021-03-29 12:52:23 +11:00
Daniel Axtens	96d7a4e06f	powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches Add uaccess blocks and use the 'unsafe' versions of functions doing user access where possible to reduce the number of times uaccess has to be opened/closed. There is no 'unsafe' version of copy_siginfo_to_user, so move it slightly to allow for a "longer" uaccess block. Co-developed-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-9-cmr@codefail.de	2021-03-29 12:52:15 +11:00
Christopher M. Riedl	193323e100	powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext() Previously restore_sigcontext() performed a costly KUAP switch on every uaccess operation. These repeated uaccess switches cause a significant drop in signal handling performance. Rewrite restore_sigcontext() to assume that a userspace read access window is open by replacing all uaccess functions with their 'unsafe' versions. Modify the callers to first open, call unsafe_restore_sigcontext(), and then close the uaccess window. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-8-cmr@codefail.de	2021-03-29 12:49:47 +11:00
Christopher M. Riedl	7bb081c8f0	powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext() Previously setup_sigcontext() performed a costly KUAP switch on every uaccess operation. These repeated uaccess switches cause a significant drop in signal handling performance. Rewrite setup_sigcontext() to assume that a userspace write access window is open by replacing all uaccess functions with their 'unsafe' versions. Modify the callers to first open, call unsafe_setup_sigcontext() and then close the uaccess window. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-7-cmr@codefail.de	2021-03-29 12:49:47 +11:00
Christopher M. Riedl	2d19630e20	powerpc/signal64: Remove TM ifdefery in middle of if/else block Both rt_sigreturn() and handle_rt_signal_64() contain TM-related ifdefs which break-up an if/else block. Provide stubs for the ifdef-guarded TM functions and remove the need for an ifdef in rt_sigreturn(). Rework the remaining TM ifdef in handle_rt_signal64() similar to commit `f1cf4f93de` ("powerpc/signal32: Remove ifdefery in middle of if/else"). Unlike in the commit for ppc32, the ifdef can't be removed entirely since uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-6-cmr@codefail.de	2021-03-29 12:49:47 +11:00
Christopher M. Riedl	1a130b67c6	powerpc: Reference parameter in MSR_TM_ACTIVE() macro Unlike the other MSR_TM_* macros, MSR_TM_ACTIVE does not reference or use its parameter unless CONFIG_PPC_TRANSACTIONAL_MEM is defined. This causes an 'unused variable' compile warning unless the variable is also guarded with CONFIG_PPC_TRANSACTIONAL_MEM. Reference but do nothing with the argument in the macro to avoid a potential compile warning. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-5-cmr@codefail.de	2021-03-29 12:49:46 +11:00
Christopher M. Riedl	c6c9645e37	powerpc/signal64: Remove non-inline calls from setup_sigcontext() The majority of setup_sigcontext() can be refactored to execute in an "unsafe" context assuming an open uaccess window except for some non-inline function calls. Move these out into a separate prepare_setup_sigcontext() function which must be called first and before opening up a uaccess window. Non-inline function calls should be avoided during a uaccess window for a few reasons: - KUAP should be enabled for as much kernel code as possible. Opening a uaccess window disables KUAP which means any code executed during this time contributes to a potential attack surface. - Non-inline functions default to traceable which means they are instrumented for ftrace. This adds more code which could run with KUAP disabled. - Powerpc does not currently support the objtool UACCESS checks. All code running with uaccess must be audited manually which means: less code -> less work -> fewer problems (in theory). A follow-up commit converts setup_sigcontext() to be "unsafe". Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-4-cmr@codefail.de	2021-03-29 12:49:46 +11:00
Christopher M. Riedl	609355dfc8	powerpc/signal: Add unsafe_copy_{vsx, fpr}_from_user() Reuse the "safe" implementation from signal.c but call unsafe_get_user() directly in a loop to avoid the intermediate copy into a local buffer. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Reviewed-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-3-cmr@codefail.de	2021-03-29 12:49:46 +11:00
Christopher M. Riedl	9466c1799f	powerpc/uaccess: Add unsafe_copy_from_user() Use the same approach as unsafe_copy_to_user() but instead call unsafe_get_user() in a loop. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210227011259.11992-2-cmr@codefail.de	2021-03-29 12:49:46 +11:00
Davidlohr Bueso	deb9b13eb2	powerpc/qspinlock: Use generic smp_cond_load_relaxed `49a7d46a06` (powerpc: Implement smp_cond_load_relaxed()) added busy-waiting pausing with a preferred SMT priority pattern, lowering the priority (reducing decode cycles) during the whole loop slowpath. However, data shows that while this pattern works well with simple spinlocks, queued spinlocks benefit more being kept in medium priority, with a cpu_relax() instead, being a low+medium combo on powerpc. Data is from three benchmarks on a Power9: 9008-22L 64 CPUs with 2 sockets and 8 threads per core. 1. locktorture. This is data for the lowest and most artificial/pathological level, with increasing thread counts pounding on the lock. Metrics are total ops/minute. Despite some small hits in the 4-8 range, scenarios are either neutral or favorable to this patch. +=========+==========+==========+=======+ \| # tasks \| vanilla \| dirty \| %diff \| +=========+==========+==========+=======+ \| 2 \| 46718565 \| 48751350 \| 4.35 \| +---------+----------+----------+-------+ \| 4 \| 51740198 \| 50369082 \| -2.65 \| +---------+----------+----------+-------+ \| 8 \| 63756510 \| 62568821 \| -1.86 \| +---------+----------+----------+-------+ \| 16 \| 67824531 \| 70966546 \| 4.63 \| +---------+----------+----------+-------+ \| 32 \| 53843519 \| 61155508 \| 13.58 \| +---------+----------+----------+-------+ \| 64 \| 53005778 \| 53104412 \| 0.18 \| +---------+----------+----------+-------+ \| 128 \| 53331980 \| 54606910 \| 2.39 \| +=========+==========+==========+=======+ 2. sockperf (tcp throughput) Here a client will do one-way throughput tests to a localhost server, with increasing message sizes, dealing with the sk_lock. 
This patch shows to put the performance of the qspinlock back to par with that of the simple lock: simple-spinlock vanilla dirty Hmean 14 73.50 ( 0.00%) 54.44 * -25.93%* 73.45 * -0.07%* Hmean 100 654.47 ( 0.00%) 385.61 * -41.08%* 771.43 * 17.87%* Hmean 300 2719.39 ( 0.00%) 2181.67 * -19.77%* 2666.50 * -1.94%* Hmean 500 4400.59 ( 0.00%) 3390.77 * -22.95%* 4322.14 * -1.78%* Hmean 850 6726.21 ( 0.00%) 5264.03 * -21.74%* 6863.12 * 2.04%* 3. dbench (tmpfs) Configured to run with up to ncpusx8 clients, it shows both latency and throughput metrics. For the latency, with the exception of the 64 case, there is really nothing to go by: vanilla dirty Amean latency-1 1.67 ( 0.00%) 1.67 * 0.09%* Amean latency-2 2.15 ( 0.00%) 2.08 * 3.36%* Amean latency-4 2.50 ( 0.00%) 2.56 * -2.27%* Amean latency-8 2.49 ( 0.00%) 2.48 * 0.31%* Amean latency-16 2.69 ( 0.00%) 2.72 * -1.37%* Amean latency-32 2.96 ( 0.00%) 3.04 * -2.60%* Amean latency-64 7.78 ( 0.00%) 8.17 * -5.07%* Amean latency-512 186.91 ( 0.00%) 186.41 * 0.27%* For the dbench4 Throughput (misleading but traditional) there's a small but rather constant improvement: vanilla dirty Hmean 1 849.13 ( 0.00%) 851.51 * 0.28%* Hmean 2 1664.03 ( 0.00%) 1663.94 * -0.01%* Hmean 4 3073.70 ( 0.00%) 3104.29 * 1.00%* Hmean 8 5624.02 ( 0.00%) 5694.16 * 1.25%* Hmean 16 9169.49 ( 0.00%) 9324.43 * 1.69%* Hmean 32 11969.37 ( 0.00%) 12127.09 * 1.32%* Hmean 64 15021.12 ( 0.00%) 15243.14 * 1.48%* Hmean 512 14891.27 ( 0.00%) 15162.11 * 1.82%* Measuring the dbench4 Per-VFS Operation latency, shows some very minor differences within the noise level, around the 0-1% ranges. Fixes: `49a7d46a06` ("powerpc: Implement smp_cond_load_relaxed()") Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210318204702.71417-1-dave@stgolabs.net	2021-03-29 12:48:46 +11:00
Davidlohr Bueso	66f6052213	powerpc/spinlock: Unserialize spin_is_locked `c6f5d02b6a` (locking/spinlocks/arm64: Remove smp_mb() from arch_spin_is_locked()) made it pretty official that the call semantics do not imply any sort of barriers, and any user that gets creative must explicitly do any serialization. This creativity, however, is nowadays pretty limited: 1. spin_unlock_wait() has been removed from the kernel in favor of a lock/unlock combo. Furthermore, queued spinlocks have now for a number of years no longer relied on _Q_LOCKED_VAL for the call, but any non-zero value to indicate a locked state. There were cases where the delayed locked store could lead to breaking mutual exclusion with crossed locking; such as with sysv ipc and netfilter being the most extreme. 2. The auditing Andrea did in verified that remaining spin_is_locked() no longer rely on such semantics. Most callers just use it to assert a lock is taken, in a debug nature. The only user that gets cute is NOLOCK qdisc, as of: `96009c7d50` (sched: replace __QDISC_STATE_RUNNING bit with a spin lock) ... which ironically went in the next day after `c6f5d02b6a`. This change replaces test_bit() with spin_is_locked() to know whether to take the busylock heuristic to reduce contention on the main qdisc lock. So any races against spin_is_locked() for archs that use LL/SC for spin_lock() will be benign and not break any mutual exclusion; furthermore, both the seqlock and busylock have the same scope. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210309015950.27688-3-dave@stgolabs.net	2021-03-26 23:19:43 +11:00
Davidlohr Bueso	2bf3604c41	powerpc/spinlock: Define smp_mb__after_spinlock only once Instead of both queued and simple spinlocks doing it. Move it into the arch's spinlock.h. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210309015950.27688-2-dave@stgolabs.net	2021-03-26 23:19:43 +11:00
Christophe Leroy	93c043e393	powerpc/ptrace: Convert gpr32_set_common() to user access block Use user access block in gpr32_set_common() instead of repetitive __get_user() which imply repetitive KUAP open/close. To get it clean, force inlining of the small set of tiny functions called inside the block. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bdcb8652c3bb4ab5b8b3bfd08147434be8fc04c9.1615398498.git.christophe.leroy@csgroup.eu	2021-03-26 23:19:43 +11:00
Christophe Leroy	870779f40e	powerpc/futex: Switch to user_access block Use user_access_begin() instead of the access_ok/allow_access sequence. This brings the missing might_fault() check. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/6cd202cdc4f939d47822e4ddd3c0856210431a58.1615398498.git.christophe.leroy@csgroup.eu	2021-03-26 23:19:43 +11:00

23582 Commits