Commit Graph

Christophe Leroy 6ca6512c71 powerpc/mm: Properly coalesce pages in ptdump
Commit aaa2295292 ("powerpc/mm: Add physical address to Linux page
table dump") changed range coalescing to only combine ranges that are
both virtually and physically contiguous, in order to avoid erroneous
combination of unrelated mappings in IOREMAP space.

But in the VMALLOC space, mappings almost never have contiguous
physical pages, so the commit mentioned above leads to dumping one
line per page for vmalloc mappings.

Taking into account that vmalloc always leaves a gap between two areas,
we never have two mappings dumped as a single combination even if they
have the exact same flags. The only space that may have encountered
such an issue was the early IOREMAP, which does not use the vmalloc
engine. But previous commits added gaps between early IO mappings, so
it is not an issue anymore.

That commit created some difficulties with KASAN mappings, see
commit cabe8138b2 ("powerpc: dump as a single line areas mapping a
single physical page.") and with huge page, see
commit b00ff6d8c1 ("powerpc/ptdump: Properly handle non standard
page size").

So, almost revert commit aaa2295292 to properly coalesce pages
mapped with the same flags as before; only keep the display of the
first physical address of the range, as it can be useful, especially
for IO mappings.

This brings powerpc back to the same level as other architectures and
simplifies the conversion to GENERIC PTDUMP.
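
As a rough sketch of the rule change (hypothetical names, not the
actual ptdump code), the coalescing test goes back from requiring
physical contiguity to only requiring identical flags:

  /* decide whether the next page extends the range being dumped */
  static bool same_range(unsigned long flags, unsigned long cur_flags,
                         unsigned long phys, unsigned long last_phys,
                         bool require_phys_contig)
  {
          if (flags != cur_flags)
                  return false;
          /* the reverted behaviour additionally required this: */
          if (require_phys_contig && phys != last_phys + PAGE_SIZE)
                  return false;
          return true;
  }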

With the patch:

---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff  0x07000000        16M   huge        rw       present           dirty  accessed
0xf9000000-0xf91fffff  0x01434000         2M               r        present                  accessed
0xf9200000-0xf95affff  0x02104000      3776K               rw       present           dirty  accessed
0xfef5c000-0xfeffffff  0x01434000       656K               r        present                  accessed
---[ kasan shadow mem end ]---

Before:

---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff  0x07000000        16M   huge        rw       present           dirty  accessed
0xf9000000-0xf91fffff  0x01434000        16K               r        present                  accessed
0xf9200000-0xf9203fff  0x02104000        16K               rw       present           dirty  accessed
0xf9204000-0xf9207fff  0x0213c000        16K               rw       present           dirty  accessed
0xf9208000-0xf920bfff  0x02174000        16K               rw       present           dirty  accessed
0xf920c000-0xf920ffff  0x02188000        16K               rw       present           dirty  accessed
0xf9210000-0xf9213fff  0x021dc000        16K               rw       present           dirty  accessed
0xf9214000-0xf9217fff  0x02220000        16K               rw       present           dirty  accessed
0xf9218000-0xf921bfff  0x023c0000        16K               rw       present           dirty  accessed
0xf921c000-0xf921ffff  0x023d4000        16K               rw       present           dirty  accessed
0xf9220000-0xf9227fff  0x023ec000        32K               rw       present           dirty  accessed
...
0xf93b8000-0xf93e3fff  0x02614000       176K               rw       present           dirty  accessed
0xf93e4000-0xf94c3fff  0x027c0000       896K               rw       present           dirty  accessed
0xf94c4000-0xf94c7fff  0x0236c000        16K               rw       present           dirty  accessed
0xf94c8000-0xf94cbfff  0x041f0000        16K               rw       present           dirty  accessed
0xf94cc000-0xf94cffff  0x029c0000        16K               rw       present           dirty  accessed
0xf94d0000-0xf94d3fff  0x041ec000        16K               rw       present           dirty  accessed
0xf94d4000-0xf94d7fff  0x0407c000        16K               rw       present           dirty  accessed
0xf94d8000-0xf94f7fff  0x041c0000       128K               rw       present           dirty  accessed
...
0xf95ac000-0xf95affff  0x042b0000        16K               rw       present           dirty  accessed
0xfef5c000-0xfeffffff  0x01434000        16K               r        present                  accessed
---[ kasan shadow mem end ]---

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c56ce1f5c3c75adc9811b1a5f9c410fa74183a8d.1618828806.git.christophe.leroy@csgroup.eu
2021-06-25 00:07:10 +10:00
Christophe Leroy 57307f1b6e powerpc/mm: Leave a gap between early allocated IO areas
The vmalloc system leaves a gap between allocated areas. This helps
catch overflows.

Do the same for IO areas which are allocated with early_ioremap_range()
until slab_is_available().
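
A minimal sketch of the idea, with hypothetical names (the real logic
lives in the early ioremap allocator):

  /* leave one unmapped guard page after each area so an overflow
   * faults instead of silently hitting the next mapping */
  static unsigned long alloc_early_io_va(unsigned long *cursor,
                                         unsigned long size)
  {
          unsigned long va = *cursor;

          *cursor += size + PAGE_SIZE;    /* size plus guard gap */
          return va;
  }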

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c433e358190fb5d47650463ea1ab755fc7b73e6e.1618828806.git.christophe.leroy@csgroup.eu
2021-06-25 00:07:10 +10:00
Andy Shevchenko 0e8554b5d7 powerpc/papr_scm: Properly handle UUID types and API
Parse to and export from the UUID's own type before dereferencing.
This also fixes a wrong comment (a little-endian UUID is something
else) and should eliminate the direct strict-type assignments.
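
A sketch of the resulting pattern, with hypothetical variable names;
uuid_parse(), export_uuid() and get_unaligned_le64() are existing
kernel helpers:

  uuid_t uuid;
  u8 raw[UUID_SIZE];

  if (uuid_parse(uuid_str, &uuid))
          return -EINVAL;

  /* work on the parsed uuid_t's raw bytes instead of type-punning
   * the device tree string buffer */
  export_uuid(raw, &uuid);
  cookie1 = get_unaligned_le64(&raw[0]);
  cookie2 = get_unaligned_le64(&raw[8]);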

Fixes: 43001c52b6 ("powerpc/papr_scm: Use ibm,unit-guid as the iset cookie")
Fixes: 259a948c4b ("powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210616134303.58185-1-andriy.shevchenko@linux.intel.com
2021-06-25 00:07:10 +10:00
Daniel Henrique Barboza 0e5962b2ec powerpc/pseries: fail quicker in dlpar_memory_add_by_ic()
The validation done at the start of dlpar_memory_add_by_ic() is an
all-or-nothing scenario: if any LMB in the range is marked as RESERVED
we can fail right away.

We can then remove the 'lmbs_available' variable and its check against
'lmbs_to_add', since the whole LMB range was already validated in the
previous step.
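
A sketch of the fail-fast shape of the validation (simplified from the
real loop):

  /* any reserved LMB in the range makes the whole add impossible,
   * so bail out on the first one */
  for_each_drmem_lmb_in_range(lmb, start_lmb, end_lmb) {
          if (lmb->flags & DRCONF_MEM_RESERVED)
                  return -EINVAL;
  }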

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210622133923.295373-4-danielhb413@gmail.com
2021-06-25 00:07:09 +10:00
Daniel Henrique Barboza c2aaddcc65 powerpc/pseries: break early in dlpar_memory_add_by_count() loops
After a successful dlpar_add_lmb() call the LMB is marked as reserved.
Later on, depending on whether we added enough LMBs or not, we rely on
the marked LMBs to see which ones might need to be removed, and we
remove the reservation of all of them.

These are done in for_each_drmem_lmb() loops without any break
condition. This means that we're going to check all LMBs of the partition
even after going through all the reserved ones.

This patch adds break conditions in both loops to avoid this. The
'lmbs_added' variable was renamed to 'lmbs_reserved', and it's now
being decremented each time an LMB reservation is removed, indicating
whether there are still marked LMBs to be processed.
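
A sketch of the resulting loop shape:

  /* stop scanning once every reserved LMB has been handled */
  for_each_drmem_lmb(lmb) {
          if (!drmem_lmb_reserved(lmb))
                  continue;
          /* ... process the LMB ... */
          drmem_remove_lmb_reservation(lmb);
          if (--lmbs_reserved == 0)
                  break;
  }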

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210622133923.295373-3-danielhb413@gmail.com
2021-06-25 00:07:09 +10:00
Daniel Henrique Barboza b3e3b4db7a powerpc/pseries: skip reserved LMBs in dlpar_memory_add_by_count()
The function is counting reserved LMBs as available to be added, but
they aren't. This will cause the function to miscalculate the available
LMBs and can trigger errors later on when executing dlpar_add_lmb().
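
A sketch of the corrected counting:

  /* reserved LMBs cannot be added, so don't count them as available */
  for_each_drmem_lmb(lmb) {
          if (lmb->flags & DRCONF_MEM_RESERVED)
                  continue;
          if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
                  lmbs_available++;
  }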

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210622133923.295373-2-danielhb413@gmail.com
2021-06-25 00:07:09 +10:00
Nicholas Piggin bab26238bb powerpc: Offline CPU in stop_this_cpu()
printk_safe_flush_on_panic() has special lock-breaking code for the case
where we panic()ed with the console lock held. It relies on the panic
IPI causing other CPUs to mark themselves offline.

Do as most other architectures do.
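
A sketch of what that looks like in stop_this_cpu():

  void stop_this_cpu(void *dummy)
  {
          hard_irq_disable();
          /* mark ourselves offline, as most architectures do, so the
           * panic path knows this CPU is not coming back */
          set_cpu_online(smp_processor_id(), false);
          spin_begin();
          while (1)
                  spin_cpu_relax();
  }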

This effectively reverts commit de6e5d3841 ("powerpc: smp_send_stop do
not offline stopped CPUs"). Unfortunately it may result in some false
positive warnings, but the alternative is more situations where we can
crash without getting messages out.

Fixes: de6e5d3841 ("powerpc: smp_send_stop do not offline stopped CPUs")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623041245.865134-1-npiggin@gmail.com
2021-06-25 00:07:09 +10:00
Nicholas Piggin f5f48e8cb9 powerpc: Make PPC_IRQ_SOFT_MASK_DEBUG depend on PPC64
32-bit platforms don't have irq soft masking.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623032909.826010-1-npiggin@gmail.com
2021-06-25 00:07:09 +10:00
Nicholas Piggin 0cdff98b39 powerpc/64s: Remove irq mask workaround in accumulate_stolen_time()
The caller has been moved to C after irq soft-mask state has been
reconciled, and Linux IRQs have been marked as disabled, so this no
longer needs to play games with IRQ internals.

Fixes: 68b34588e2 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623022924.704645-1-npiggin@gmail.com
2021-06-25 00:07:09 +10:00
Nicholas Piggin 633c8e9800 powerpc/pseries: Enable hardlockup watchdog for PowerVM partitions
PowerVM will not arbitrarily oversubscribe or stop guests, page out the
guest kernel text to an NFS volume connected by carrier pigeon to
abacus-based storage, etc., as a KVM host might. So PowerVM guests are not
likely to be killed by the hard lockup watchdog in normal operation,
even with shared processor LPARs which still get a minimum allotment of
CPU time.

Enable the hard lockup detector by default on !KVM guests, which we
will assume are PowerVM. It has been useful in finding problems on bare metal
kernels.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210623021528.702241-1-npiggin@gmail.com
2021-06-25 00:07:09 +10:00
Nicholas Piggin 6eaaf9de35 powerpc/64s/interrupt: Check and fix srr_valid without crashing
The PPC_RFI_SRR_DEBUG check added by patch "powerpc/64s: avoid reloading
(H)SRR registers if they are still valid" has a few deficiencies. It
does not fix the actual problem, it's not enabled by default, and it
causes a program check interrupt which can cause more difficulties.

However there are a lot of paths which may clobber SRRs or change return
regs, and it is difficult to have high confidence that all paths are
covered without wider testing.

Add a relatively low overhead always-enabled check that catches most
such cases, reports once, and fixes it so the kernel can continue.
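
A sketch of the check (paca field name as used by this series):

  /* on exit, verify SRR0/1 still match the state we are returning
   * to; report once and force a reload instead of crashing */
  if (local_paca->srr_valid &&
      (mfspr(SPRN_SRR0) != regs->nip || mfspr(SPRN_SRR1) != regs->msr)) {
          WARN_ONCE(1, "SRRs clobbered while marked valid\n");
          local_paca->srr_valid = 0;  /* exit path rewrites SRR0/1 */
  }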

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Rebase, use switch & INT names, squash in race fix from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2021-06-25 00:06:57 +10:00
Christophe Leroy ae58b1c645 powerpc/interrupt: Remove prep_irq_for_user_exit()
prep_irq_for_user_exit() has only one caller, squash it
inside that caller.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-18-npiggin@gmail.com
2021-06-25 00:06:57 +10:00
Christophe Leroy 61eece2d17 powerpc/interrupt: Refactor prep_irq_for_{user/kernel_enabled}_exit()
prep_irq_for_user_exit() is a superset of
prep_irq_for_kernel_enabled_exit().

Rename prep_irq_for_kernel_enabled_exit() as prep_irq_for_enabled_exit()
and have prep_irq_for_user_exit() use it.
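
A simplified sketch of the resulting structure:

  /* shared tail used by both exit paths */
  static bool prep_irq_for_enabled_exit(void)
  {
          trace_hardirqs_on();    /* plus the irq reconciliation */
          return true;
  }

  static bool prep_irq_for_user_exit(void)
  {
          user_enter_irqoff();    /* the user-exit-only step */
          return prep_irq_for_enabled_exit();
  }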

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-17-npiggin@gmail.com
2021-06-25 00:06:57 +10:00
Christophe Leroy 99f98f849c powerpc/interrupt: Interchange prep_irq_for_{kernel_enabled/user}_exit()
prep_irq_for_user_exit() is a superset of
prep_irq_for_kernel_enabled_exit(). In order to allow refactoring in
the following patch, interchange the two. This will allow
prep_irq_for_user_exit() to call a renamed version of
prep_irq_for_kernel_enabled_exit().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-16-npiggin@gmail.com
2021-06-25 00:06:57 +10:00
Christophe Leroy a214ee8802 powerpc/interrupt: Refactor interrupt_exit_user_prepare()
interrupt_exit_user_prepare() is a superset of
interrupt_exit_user_prepare_main().

Refactor to avoid code duplication.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-15-npiggin@gmail.com
2021-06-25 00:06:57 +10:00
Christophe Leroy f84aa28494 powerpc/interrupt: Rename and lightly change syscall_exit_prepare_main()
Rename syscall_exit_prepare_main() into interrupt_exit_prepare_main()

Pass it 'ret' so that it can OR it in directly instead of ORing
twice, once inside the function and once outside.

Also remove the 'r3' parameter, which is not used.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[np: split out some changes into other patches]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-14-npiggin@gmail.com
2021-06-25 00:06:57 +10:00
Nicholas Piggin 13799748b9 powerpc/64: use interrupt restart table to speed up return from interrupt
Use the restart table facility to return from interrupt or system calls
without disabling MSR[EE] or MSR[RI].

Interrupt return asm is put into the low soft-masked region, to prevent
interrupts being processed here, although they are still taken as masked
interrupts which causes SRRs to be clobbered, and a pending soft-masked
interrupt to require replaying.

The return code uses restart table regions to redirect to a fixup
handler rather than continue with the exit if such an interrupt
happens. The fixup handler reloads r1 for the interrupt stack, reloads
registers, and sets state up to replay the soft-masked interrupt and
try the exit again.

Some types of security exit fallback flushes and barriers are currently
unable to cope with reentrant interrupts, e.g., because they store some
state in the scratch SPR which would be clobbered even by masked
interrupts. For now the interrupts-enabled exits are disabled when these
flushes are used.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Guard unused exit_must_hard_disable() as reported by lkp]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-13-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin 9d1988ca87 powerpc/64: treat low kernel text as irqs soft-masked
Treat code below __end_soft_masked as soft-masked for the purpose
of alternate return. 64s already mostly does this for scv entry.

This will be used to exit from interrupts without disabling MSR[EE].

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-12-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin 862fa56352 powerpc/64: interrupt soft-enable race fix
Prevent interrupt restore from allowing racing hard interrupts going
ahead of previous soft-pending ones, by using the soft-masked restart
handler to allow a store to clear the soft-mask while knowing nothing
is soft-pending.

This probably doesn't matter much in practice, but it's a simple
demonstrator / test case to exercise the restart table logic.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-11-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin f23699c93b powerpc/64: allow alternate return locations for soft-masked interrupts
The exception table fixup adjusts a failed page fault's interrupt return
location if it was taken at an address specified in the exception table,
to a corresponding fixup handler address.

Introduce a variation of that idea which adds a fixup table for NMIs and
soft-masked asynchronous interrupts. This will be used to protect
certain critical sections that are sensitive to being clobbered by
interrupts coming in (due to using the same SPRs and/or irq soft-mask
state).
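
A sketch of the table shape, analogous to the exception table:

  struct restart_entry {
          unsigned long start;    /* first byte of the critical code */
          unsigned long end;      /* first byte past it */
          unsigned long fixup;    /* alternate return location */
  };

  /* if an NMI or soft-masked interrupt hits inside [start, end),
   * return to the fixup address instead of the interrupted address */
  static unsigned long search_restart_table(const struct restart_entry *tbl,
                                            int n, unsigned long nip)
  {
          int i;

          for (i = 0; i < n; i++)
                  if (nip >= tbl[i].start && nip < tbl[i].end)
                          return tbl[i].fixup;
          return 0;
  }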

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-10-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin 63e40806ee powerpc/64s: save one more register in the masked interrupt handler
This frees up one more register (and takes advantage of that to
clean things up a little bit).

This register will be used in the following patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-9-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin dd152f70bd powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]
This extends the MSR[RI]=0 window a little further into the system
call in order to pair RI and EE enabling with a single mtmsrd.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-8-npiggin@gmail.com
2021-06-25 00:06:56 +10:00
Nicholas Piggin e754f4d13e powerpc/64: move interrupt return asm to interrupt_64.S
The next patch would like to move interrupt return assembly code to a low
location before general text, so move it into its own file and include via
head_64.S.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-7-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Nicholas Piggin 59dc5bfca0 powerpc/64s: avoid reloading (H)SRR registers if they are still valid
When an interrupt is taken, the SRR registers are set to return to where
it left off. Unless they are modified in the meantime, or the return
address or MSR are modified, there is no need to reload these registers
when returning from interrupt.

Introduce per-CPU flags that track the validity of SRR and HSRR
registers. These are cleared when returning from interrupt, when
using the registers for something else (e.g., OPAL calls), when
adjusting the return address or MSR of a context, and when context
switching (which changes the return address and MSR).

This improves the performance of interrupt returns.
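
A sketch of how the flag is kept honest when the kernel writes the
registers for another purpose (hypothetical helper name):

  static inline void write_srrs(unsigned long nip, unsigned long msr)
  {
          mtspr(SPRN_SRR0, nip);
          mtspr(SPRN_SRR1, msr);
          local_paca->srr_valid = 0;  /* no longer the return state */
  }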

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fold in fixup patch from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-5-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Nicholas Piggin 1df7d5e4ba powerpc/64s: introduce different functions to return from SRR vs HSRR interrupts
This makes no real difference yet except that HSRR type interrupts will
use hrfid to return. This is important for the next patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-4-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Nicholas Piggin bf9155f197 powerpc: remove interrupt exit helpers unused argument
The msr argument is not used, remove it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-3-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Christophe Leroy 9a3ed7adca powerpc/interrupt: Fix CONFIG ifdef typo
CONFIG_PPC_BOOK3S should be CONFIG_PPC_BOOK3S_64. restore_math is a
no-op for other configurations.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[np: split from another patch]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210617155116.2167984-2-npiggin@gmail.com
2021-06-25 00:06:55 +10:00
Michael Ellerman ffaacd97fd powerpc/prom_init: Pass linux_banner to firmware via option vector 7
Pass the value of linux_banner to firmware via option vector 7.

Option vector 7 is described in "LoPAR" Linux on Power Architecture
Reference v2.9, in table B.7 on page 824:

  An ASCII character formatted null terminated string that describes
  the client operating system. The string shall be human readable and
  may be displayed on the console.

The string can be up to 256 bytes total, including the nul terminator.

linux_banner contains lots of information, and should make it possible
to identify the exact kernel version that is running:

  const char linux_banner[] =
  "Linux version " UTS_RELEASE " (" LINUX_COMPILE_BY "@"
  LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION "\n";

For example:
  Linux version 4.15.0-144-generic (buildd@bos02-ppc64el-018) (gcc
  version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #148-Ubuntu SMP Sat May 8
  02:32:13 UTC 2021 (Ubuntu 4.15.0-144.148-generic 4.15.18)

It's also printed at boot to the console/dmesg, which should make it
possible to correlate what firmware receives with the console/dmesg on
the machine.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621064938.2021419-2-mpe@ellerman.id.au
2021-06-25 00:06:55 +10:00
Michael Ellerman f47d5a4fc2 powerpc/prom_init: Convert prom_strcpy() into prom_strscpy_pad()
In a subsequent patch we'd like to have something like a strscpy_pad()
implementation usable in prom_init.c.

Currently we have a strcpy() implementation with only one caller, so
convert it into strscpy_pad() and update the caller.
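
A sketch of the strscpy_pad() semantics in plain C (prom_init carries
its own freestanding copy; the in-kernel version returns -E2BIG rather
than -1 on truncation):

  /* copy at most n - 1 bytes, always NUL-terminate, zero-pad the
   * rest of the buffer, return the length copied or -1 if truncated */
  static ssize_t strscpy_pad_sketch(char *dest, const char *src, size_t n)
  {
          ssize_t rc = -1;
          size_t i;

          if (n == 0)
                  return rc;

          for (i = 0; i < n - 1 && src[i] != '\0'; i++)
                  dest[i] = src[i];

          if (src[i] == '\0')
                  rc = i;                 /* fitted without truncation */

          memset(dest + i, 0, n - i);     /* terminator plus padding */
          return rc;
  }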

Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621064938.2021419-1-mpe@ellerman.id.au
2021-06-25 00:06:55 +10:00
Michael Ellerman 3018fbc636 powerpc/64s: Fix boot failure with 4K Radix
When using the Radix MMU our PGD is always 64K, and must be naturally
aligned.

For a 4K page size kernel that means page alignment of swapper_pg_dir is
not sufficient, leading to failure to boot.

Use the existing MAX_PTRS_PER_PGD which has the correct value, and
avoids us hard-coding 64K here.
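
The shape of the fix, roughly (alignment attribute spelling is an
assumption, not the exact patch):

  /* Radix wants the PGD naturally aligned to its full 64K size,
   * which page alignment alone does not guarantee on a 4K kernel */
  pgd_t swapper_pg_dir[MAX_PTRS_PER_PGD]
          __aligned(MAX_PTRS_PER_PGD * sizeof(pgd_t));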

Fixes: e72421a085 ("powerpc: Define swapper_pg_dir[] in C")
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210624123420.2784187-1-mpe@ellerman.id.au
2021-06-25 00:06:54 +10:00
Michael Ellerman a736143afd Merge branch 'topic/ppc-kvm' into next
Pull in some more ppc KVM patches we are keeping in our topic branch.

In particular this brings in the series to add H_RPT_INVALIDATE.
2021-06-23 00:19:08 +10:00
Nathan Chancellor 51696f39cb KVM: PPC: Book3S HV: Workaround high stack usage with clang
LLVM does not emit optimal byteswap assembly, which results in high
stack usage in kvmhv_enter_nested_guest() due to the inlining of
byteswap_pt_regs(). With LLVM 12.0.0:

arch/powerpc/kvm/book3s_hv_nested.c:289:6: error: stack frame size of
2512 bytes in function 'kvmhv_enter_nested_guest' [-Werror,-Wframe-larger-than=]
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
     ^
1 error generated.

While this gets fixed in LLVM, mark byteswap_pt_regs() as
noinline_for_stack so that it does not get inlined and break the build
due to -Werror by default in arch/powerpc/. Not inlining saves
approximately 800 bytes with LLVM 12.0.0:

arch/powerpc/kvm/book3s_hv_nested.c:290:6: warning: stack frame size of
1728 bytes in function 'kvmhv_enter_nested_guest' [-Wframe-larger-than=]
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
     ^
1 warning generated.
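
The workaround is a one-word annotation on the helper; sketched here
with a simplified body:

  static void noinline_for_stack byteswap_pt_regs(struct pt_regs *regs)
  {
          unsigned long *addr = (unsigned long *)regs;

          /* out of line, the byteswap-heavy frame no longer lands on
           * kvmhv_enter_nested_guest()'s stack */
          for (; addr < (unsigned long *)(regs + 1); addr++)
                  *addr = swab64(*addr);
  }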

Cc: stable@vger.kernel.org # v4.20+
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/ClangBuiltLinux/linux/issues/1292
Link: https://bugs.llvm.org/show_bug.cgi?id=49610
Link: https://lore.kernel.org/r/202104031853.vDT0Qjqj-lkp@intel.com/
Link: https://gist.github.com/ba710e3703bf45043a31e2806c843ffd
Link: https://lore.kernel.org/r/20210621182440.990242-1-nathan@kernel.org
2021-06-23 00:18:30 +10:00
Bharata B Rao 81468083f3 KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
In the nested KVM case, replace H_TLB_INVALIDATE by the new hcall
H_RPT_INVALIDATE if available. The availability of this hcall
is determined from the "hcall-rpt-invalidate" string in the
ibm,hypertas-functions DT property.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-7-bharata@linux.ibm.com
2021-06-22 23:38:28 +10:00
Bharata B Rao b87cc116c7 KVM: PPC: Book3S HV: Add KVM_CAP_PPC_RPT_INVALIDATE capability
Now that we have H_RPT_INVALIDATE fully implemented, enable support
for it via the KVM_CAP_PPC_RPT_INVALIDATE KVM capability.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-6-bharata@linux.ibm.com
2021-06-22 23:38:28 +10:00
Bharata B Rao 53324b51c5 KVM: PPC: Book3S HV: Nested support in H_RPT_INVALIDATE
Enable support for process-scoped invalidations from nested
guests and partition-scoped invalidations for nested guests.

Process-scoped invalidations for any level of nested guests
are handled by implementing H_RPT_INVALIDATE handler in the
nested guest exit path in L0.

Partition-scoped invalidation requests are forwarded to the
right nested guest, handled there and passed down to L0
for eventual handling.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
[aneesh: Nested guest partition-scoped invalidation changes]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Squash in fixup patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-5-bharata@linux.ibm.com
2021-06-22 23:35:37 +10:00
Bharata B Rao f0c6fbbb90 KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE
H_RPT_INVALIDATE does two types of TLB invalidations:

1. Process-scoped invalidations for guests when LPCR[GTSE]=0.
   This is currently not used in KVM as GTSE is not usually
   disabled in KVM.
2. Partition-scoped invalidations that an L1 hypervisor does on
   behalf of an L2 guest. This is currently handled by the
   H_TLB_INVALIDATE hcall, and the new hcall replaces the old one.

This commit enables process-scoped invalidations for L1 guests.
Support for process-scoped and partition-scoped invalidations
from/for nested guests will be added separately.

Process-scoped tlbie invalidations from L1 and nested guests need the
RS register of the TLBIE instruction to contain both the PID and the
LPID. This patch introduces primitives that execute the tlbie
instruction with both PID and LPID set, in preparation for the
H_RPT_INVALIDATE hcall.

A description of H_RPT_INVALIDATE follows:

int64   /* H_Success: Return code on successful completion */
        /* H_Busy - repeat the call with the same parameters */
        /* H_Parameter, H_P2, H_P3, H_P4, H_P5: Invalid parameters */
hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT translation
                                        lookaside information */
      uint64 id,        /* PID/LPID to invalidate */
      uint64 target,    /* Invalidation target */
      uint64 type,      /* Type of lookaside information */
      uint64 pg_sizes,  /* Page sizes */
      uint64 start,     /* Start of Effective Address (EA)
                           range (inclusive) */
      uint64 end)       /* End of EA range (exclusive) */

Invalidation targets (target)
-----------------------------
Core MMU        0x01 /* All virtual processors in the partition */
Core local MMU  0x02 /* Current virtual processor */
Nest MMU        0x04 /* All nest/accelerator agents in use by the partition */

A combination of the above can be specified,
except core and core local.

Type of translation to invalidate (type)
----------------------------------------
NESTED       0x0001  /* Invalidate nested guest partition-scope */
TLB          0x0002  /* Invalidate TLB */
PWC          0x0004  /* Invalidate Page Walk Cache */
PRT          0x0008  /* Invalidate caching of Process Table Entries if NESTED is clear */
PAT          0x0008  /* Invalidate caching of Partition Table Entries if NESTED is set */

A combination of the above can be specified.

Page size mask (pages)
----------------------
4K              0x01
64K             0x02
2M              0x04
1G              0x08
All sizes       (-1UL)

A combination of the above can be specified.
All page sizes can be selected with -1.

Semantics: Invalidate radix tree lookaside information
           matching the parameters given.
* Return H_P2, H_P3 or H_P4 if the target, type, or pg_sizes parameters
  are different from the defined values.
* Return H_PARAMETER if NESTED is set and pid is not a valid nested
  LPID allocated to this partition.
* Return H_P5 if (start, end) doesn't form a valid range. Start and
  end should be valid Quadrant addresses and end > start.
* Return H_NotSupported if the partition is not running in radix
  translation mode.
* May invalidate more translation information than requested.
* If start = 0 and end = -1, set the range to cover all valid
  addresses. Else start and end should be aligned to 4kB (lower 12
  bits clear).
* If NESTED is clear, then invalidate process-scoped lookaside
  information. Else pid specifies a nested LPID, and the invalidation
  is performed on the nested guest partition table and nested guest
  partition-scoped real addresses.
* If pid = 0 and NESTED is clear, then valid addresses are quadrant 3
  and quadrant 0 spaces. Else valid addresses are quadrant 0.
* Pages which are fully covered by the range are to be invalidated.
  Those which are partially covered are considered outside the
  invalidation range, which allows a caller to optimally invalidate
  ranges that may contain mixed page sizes.
* Return H_SUCCESS on success.
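
For example, a process-scoped flush of everything for one PID could
look like this (constants as added by this series):

  /* flush TLB, page walk cache and process table caching for pid,
   * on all virtual processors, for every page size */
  rc = plpar_hcall_norets(H_RPT_INVALIDATE, pid, H_RPTI_TARGET_CMMU,
                          H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
                          H_RPTI_TYPE_PRT,
                          H_RPTI_PAGE_ALL, 0, -1UL);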

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-4-bharata@linux.ibm.com
2021-06-21 22:54:27 +10:00
Bharata B Rao d6265cb33b powerpc/book3s64/radix: Add H_RPT_INVALIDATE pgsize encodings to mmu_psize_def
Add a field to mmu_psize_def to store the page size encodings
of the H_RPT_INVALIDATE hcall. Initialize this while scanning the radix
AP encodings. This will be used when invalidating with required
page size encoding in the hcall.
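
A sketch of the initialization, with simplified surrounding code:

  /* while scanning the radix AP encodings, also record the encoding
   * that H_RPT_INVALIDATE expects for this page size */
  def = &mmu_psize_defs[idx];
  def->shift = page_shift;
  def->ap = ap;
  def->h_rpt_pgsize = psize_to_rpti_pgsize(idx);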

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-3-bharata@linux.ibm.com
2021-06-21 22:48:18 +10:00
Aneesh Kumar K.V f09216a190 KVM: PPC: Book3S HV: Fix comments of H_RPT_INVALIDATE arguments
The type values H_RPTI_TYPE_PRT and H_RPTI_TYPE_PAT indicate
invalidating the caching of process and partition scoped entries
respectively.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-2-bharata@linux.ibm.com
2021-06-21 22:48:18 +10:00
Joel Stanley 4a21192e27 powerpc/boot: Add a boot wrapper for Microwatt
This allows Microwatt's kernel to be built with an embedded device tree.

Load arch/powerpc/boot/dtbImage.microwatt to 0x500000:

 mw_debug -b fpga stop load arch/powerpc/boot/dtbImage.microwatt 500000 start

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwX19wym3kQ7guu@thinks.paulus.ozlabs.org
2021-06-21 21:16:32 +10:00
Benjamin Herrenschmidt c93f80849b powerpc/boot: Fixup device-tree on little endian
This fixes the core devtree.c functions and the ns16550 UART backend.
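
The underlying issue is that flattened device tree data is always
big-endian, so a little-endian wrapper must byteswap each cell it
reads. A self-contained sketch of the conversion helper:

  #include <stdint.h>

  /* equivalent of the kernel's be32_to_cpu(): a byteswap on
   * little-endian builds, a no-op on big-endian ones */
  static uint32_t be32_to_cpu(uint32_t x)
  {
  #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
          return __builtin_bswap32(x);
  #else
          return x;
  #endif
  }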

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXrPT8nc4YUdJ9@thinks.paulus.ozlabs.org
2021-06-21 21:16:32 +10:00
Paul Mackerras 4a1511eb34 powerpc/microwatt: Add microwatt_defconfig
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXfL8hOpReIiiP@thinks.paulus.ozlabs.org
2021-06-21 21:16:32 +10:00
Paul Mackerras c25769fdda powerpc/microwatt: Add support for hardware random number generator
Microwatt's hardware RNG is accessed using the DARN instruction.
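
A sketch of the access (the kernel emits the instruction through a
macro for the benefit of old assemblers; L=1 requests a conditioned
64-bit value):

  static int microwatt_get_random_darn(unsigned long *v)
  {
          unsigned long val;

          asm volatile("darn %0, 1" : "=r" (val));
          if (val == ~0UL)
                  return 0;       /* DARN signals failure as all-ones */

          *v = val;
          return 1;
  }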

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXPHlV/ZleiQUY@thinks.paulus.ozlabs.org
2021-06-21 21:16:32 +10:00
Benjamin Herrenschmidt 48b545b801 powerpc/microwatt: Use standard 16550 UART for console
This adds support to the Microwatt platform to use the standard
16550-style UART which is available in the standalone Microwatt FPGA.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwXGCTzedpQje7r@thinks.paulus.ozlabs.org
2021-06-21 21:16:31 +10:00
Benjamin Herrenschmidt aa9c5adf2f powerpc/xics: Add a native ICS backend for microwatt
This is a simple native ICS backend that matches the layout of
the Microwatt implementation of ICS.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
[mpe: Add empty ics_native_init() to unbreak non-microwatt builds]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwW8cxrwB2W5EUN@thinks.paulus.ozlabs.org
2021-06-21 21:15:58 +10:00
Benjamin Herrenschmidt 0d0f9e5f2f powerpc/microwatt: Populate platform bus from device-tree
Just like any other embedded platform.

Add an empty soc node.
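
A sketch of the glue, using the generic OF helper:

  static int __init microwatt_populate(void)
  {
          /* create platform devices for the nodes under the root,
           * including the children of the empty soc node */
          return of_platform_default_populate(NULL, NULL, NULL);
  }
  machine_arch_initcall(microwatt, microwatt_populate);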

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWx98+PMibZq/G@thinks.paulus.ozlabs.org
2021-06-21 21:15:26 +10:00
Paul Mackerras 151b88e848 powerpc: Add Microwatt device tree
Microwatt currently runs with MSR[HV] = 0, hence the usable-privilege
properties don't have bit 2 (for HV support) set, and we need the
/chosen/ibm,architecture-vec-5 property.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWkPcXlGDSQ9Q3@thinks.paulus.ozlabs.org
2021-06-21 21:15:26 +10:00
Paul Mackerras 53d143fe08 powerpc: Add Microwatt platform
Microwatt is an FPGA-based implementation of the Power ISA.  It
currently only implements little-endian 64-bit mode, and does
not (yet) support SMP, VMX, VSX or transactional memory.  It has an
optional FPU, and an optional MMU (required for running Linux,
obviously) which implements a configurable radix tree but not
hypervisor mode or nested radix translation.

This adds a new machine type to support FPGA-based SoCs with a
Microwatt core.  CONFIG_MATH_EMULATION can be selected for Microwatt
SoCs which don't have the FPU.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/YMwWbZVREsVug9R0@thinks.paulus.ozlabs.org
2021-06-21 21:15:26 +10:00
Christophe Leroy c988cfd38e powerpc/32: use set_memory_attr()
Use set_memory_attr() instead of the PPC32-specific change_page_attr().

change_page_attr() was checking that the address was not mapped by
blocks and was handling highmem, but that's unneeded because the
affected pages can't be in highmem and block mapping verification
is already done by the callers.
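
A sketch of a typical caller after the conversion (the DEBUG_PAGEALLOC
hook):

  void __kernel_map_pages(struct page *page, int numpages, int enable)
  {
          unsigned long addr = (unsigned long)page_address(page);

          if (enable)
                  set_memory_attr(addr, numpages, PAGE_KERNEL);
          else
                  set_memory_attr(addr, numpages, __pgprot(0));
  }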

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[ruscur: rebase on powerpc/merge with Christophe's new patches]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-10-jniethe5@gmail.com
2021-06-21 21:13:21 +10:00
Christophe Leroy 4d1755b6a7 powerpc/mm: implement set_memory_attr()
In addition to the set_memory_xx() functions, which allow changing the
memory attributes of not (yet) used memory regions, implement a
set_memory_attr() function to:
- set the final memory protection after init on currently used
kernel regions.
- enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC.

Unlike set_memory_xx(), which can act in three steps as the regions
are unused, this function must modify pages 'on the fly' as the kernel
is executing from them. At the moment only PPC32 will use it, and
changing page attributes on the fly is not an issue there.
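
A sketch of how such a function can be built on
apply_to_existing_page_range(), rewriting live PTEs one page at a time:

  static int set_page_attr(pte_t *ptep, unsigned long addr, void *data)
  {
          pgprot_t prot = __pgprot((unsigned long)data);

          spin_lock(&init_mm.page_table_lock);
          /* rewrite the PTE in place, then flush that single page */
          pte_update(&init_mm, addr, ptep, ~0UL, pgprot_val(prot), 0);
          flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
          spin_unlock(&init_mm.page_table_lock);
          return 0;
  }

  int set_memory_attr(unsigned long addr, int numpages, pgprot_t prot)
  {
          unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
          unsigned long size = numpages << PAGE_SHIFT;

          return apply_to_existing_page_range(&init_mm, start, size,
                          set_page_attr, (void *)pgprot_val(prot));
  }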

Reported-by: kbuild test robot <lkp@intel.com>
[ruscur: cast "data" to unsigned long instead of int]
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-9-jniethe5@gmail.com
2021-06-21 21:13:21 +10:00
Russell Currey c35717c71e powerpc: Set ARCH_HAS_STRICT_MODULE_RWX
To enable strict module RWX on powerpc, set:

    CONFIG_STRICT_MODULE_RWX=y

You should also have CONFIG_STRICT_KERNEL_RWX=y set to have any real
security benefit.

ARCH_HAS_STRICT_MODULE_RWX is set to require ARCH_HAS_STRICT_KERNEL_RWX.
This is due to a quirk in arch/Kconfig and arch/powerpc/Kconfig that
makes STRICT_MODULE_RWX *on by default* in configurations where
STRICT_KERNEL_RWX is *unavailable*.

Since this doesn't make much sense, and module RWX without kernel RWX
doesn't make much sense, having the same dependencies as kernel RWX
works around this problem.

Book3s/32 603 and 604 core processors are not able to write protect
kernel pages so do not set ARCH_HAS_STRICT_MODULE_RWX for Book3s/32.

[jpn: - predicate on !PPC_BOOK3S_604
      - make module_alloc() use PAGE_KERNEL protection]

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-8-jniethe5@gmail.com
2021-06-21 21:13:21 +10:00