The fallback RFI flush is used when firmware does not provide a way
to flush the cache. It's a "displacement flush" that evicts useful
data by displacing it with an uninteresting buffer.
The flush has to take care to work with implementation specific cache
replacment policies, so the recipe has been in flux. The initial
slow but conservative approach is to touch all lines of a congruence
class, with dependencies between each load. It has since been
determined that a linear pattern of loads without dependencies is
sufficient, and is significantly faster.
Measuring the speed of a null syscall with RFI fallback flush enabled
gives the relative improvement:
P8 - 1.83x
P9 - 1.75x
The flush also becomes simpler and more adaptable to different cache
geometries.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Merge our fixes branch from the 4.15 cycle.
Unusually the fixes branch saw some significant features merged,
notably the RFI flush patches, so we want the code in next to be
tested against that, to avoid any surprises when the two are merged.
There's also some other work on the panic handling that was reverted
in fixes and we now want to do properly in next, which would conflict.
And we also fix a few other minor merge conflicts.
Rename the paca->soft_enabled to paca->irq_soft_mask as it is no
longer used as a flag for interrupt state, but a mask.
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Remember when the biggest problem we had to worry about was hashed
pointers, those were the days.
These were missed in my earlier patch because they don't match "%p",
but the macro is hiding a "%p", so these all end up being hashed,
which is not what we want in xmon. Convert them to "%px".
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Since commit ad67b74d24 ("printk: hash addresses printed with %p")
pointers printed with %p are hashed, ie. you don't see the actual
pointer value but rather a cryptographic hash of its value.
In xmon we want to see the actual pointer values, because xmon is a
debugger, so replace %p with %px which prints the actual pointer
value.
We justify doing this in xmon because 1) xmon is a kernel crash
debugger, it's only accessible via the console 2) xmon doesn't print
to dmesg, so the pointers it prints are not able to be leaked that
way.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It would be nice to be able to dump page tables in a particular
context.
eg: dumping vmalloc space:
0:mon> dv 0xd00037fffff00000
pgd @ 0xc0000000017c0000
pgdp @ 0xc0000000017c00d8 = 0x00000000f10b1000
pudp @ 0xc0000000f10b13f8 = 0x00000000f10d0000
pmdp @ 0xc0000000f10d1ff8 = 0x00000000f1102000
ptep @ 0xc0000000f1102780 = 0xc0000000f1ba018e
Maps physical address = 0x00000000f1ba0000
Flags = Accessed Dirty Read Write
This patch does not replicate the complex code of dump_pagetable and
has no support for bolted linear mapping, thats why I've it's called
dump virtual page table support. The format of the PTE can be expanded
even further to add more useful information about the flags in the PTE
if required.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Bike shed the output format, show the pgdir, fix build failures]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
found in "server" processors, from IBM mainly.
Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
annoying to have two symbols that always have the same value, it's not
quite annoying enough to bother removing one.
However with the arrival of Power9, we now have the situation where
CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
Radix MMU - *not* the "standard" MMU. So it is now actively confusing
to use it, because it implies that code is disabled or inactive when
the Radix MMU is in use, however that is not necessarily true.
So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
formatting updates of some of the affected lines.
This will be a pain for backports, but c'est la vie.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When dumping the paca in xmon we currently show kstack. Although it's
not hard it's a bit fiddly to work out what the bounds of the kernel
stack should be based on the kstack value.
To make life easier and "kstack_base" which is the base (lowest
address) of the kernel stack, eg:
kstack = 0xc0000000f1a7be30 (0x258)
kstack_base = 0xc0000000f1a78000
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently xmon could call XIVE functions from OPAL even if the XIVE is
disabled or does not exist in the system, as in POWER8 machines. This
causes the following exception:
1:mon> dx
cpu 0x1: Vector: 700 (Program Check) at [c000000423c93450]
pc: c00000000009cfa4: opal_xive_dump+0x50/0x68
lr: c0000000000997b8: opal_return+0x0/0x50
This patch simply checks if XIVE is enabled before calling XIVE
functions.
Fixes: 243e25112d ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Suggested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
It might be useful to quickly get the uptime of a running system on
xmon, without needing to grab data from memory and doing math on
struct addresses.
For example, it'd be useful to check for how long after a crash a
system is on xmon shell or if some test was started after the first
test crashed (and this 2nd test crashed too into xmon).
This small patch adds the 'U' command, to accomplish this.
Suggested-by: Murilo Fossa Vicentini <muvic@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
[mpe: Display units (seconds), add sync()/__delay() sequence]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
causes xmon headaches because it's assuming interrupts hard disabled
means no watchdog troubles. Try to improve that by calling
touch_nmi_watchdog() in obvious places where secondaries are spinning.
Also annotate these spin loops with spin_begin/end calls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR
in ISA 3.0 hypervisor mode.
SPRN_PSSCR_PR is the privileged mode access and is used when we are
not in hypervisor mode.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and
IAMR SPRs based on their supported ISA revisions.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.0 defines hypervisor decrementer to be 64 bits in length.
This patch extends the print format for to be 64 bits.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Convert 0.16x to 0.16lx. Otherwise we lose the top 8 nibbles and
effectively print only the last 32 bits.
Fixes: 1846193b17 ("powerpc/xmon: Dump ISA 2.06 SPRs")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If tracing is enabled and you get into xmon, the tracing buffer
continues to be updated, causing possible loss of data and unnecessary
tracing information coming from xmon functions.
This patch simple disables tracing when entering xmon, and re-enables it
if the kernel is resumed (with 'x').
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Current xmon 'dt' command dumps the tracing buffer for all the CPUs,
which makes it very hard to read due to the fact that most of
powerpc machines currently have many CPUs. Other than that, the CPU
lines are interleaved in the ftrace log.
This new option just dumps the ftrace buffer for the current CPU.
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Move from mwrite() to patch_instruction() for xmon for
breakpoint addition and removal.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rearrange the code so that mode and badaddr are only defined when
they're used.
Also unsplit the string for easier grepping, and switch from CONFIG_8xx
which is deprecated to CONFIG_PPC_8xx.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently if we take an oops caused by an 0x380 or 0x480 exception, we get a
print which assumes SLB problems. With radix, these vectors have different
meanings.
This patch updates the oops message to reflect these different meanings.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
An externally triggered system reset (e.g., via QEMU nmi command, or pseries
reset button) can cause system reset interrupts on all CPUs. In case this causes
xmon to be entered, it is undesirable for the primary (first) CPU into xmon to
trigger an NMI IPI to others, because this may cause a nested system reset
interrupt.
So spin for a time waiting for secondaries to join xmon before performing the
NMI IPI, similarly to what the crash dump code does.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Only do it when we come in from system reset, not via sysrq etc.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The system reset interrupt is used for crash/debug situations, so it is
desirable to have as little impact on the normal state of the system as
possible.
Currently it uses the current kernel stack to process the exception.
This stores into the stack which may be involved with the crash. The
stack pointer may be corrupted, or it may have overflowed.
Avoid or minimise these problems by creating a dedicated NMI stack for
the system reset interrupt to use.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In preparation for using a dedicated stack for system reset interrupts,
prevent a nested system reset from recovering, in order to simplify
code that is called in crash/debug path. This allows a system reset
interrupt to just use the base stack pointer.
Keep an in_nmi nesting counter similarly to the in_mce counter. Consider
the interrrupt non-recoverable if it is taken inside another system
reset.
Interrupt nesting could be allowed similarly to MCE, but system reset
is a special case that's not for normal operation, so simplicity wins
until there is requirement for nested system reset interrupts.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the code that dumps SLB entries uses a double-nested if. This
means the actual dumping logic is a bit squashed. Deindent it by using
continue.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This merges the arch part of the XIVE support, leaving the final commit
with the KVM specific pieces dangling on the branch for Paul to merge
via the kvm-ppc tree.
powerpc_debugfs_root is the dentry representing the root of the
"powerpc" directory tree in debugfs.
Currently it sits in asm/debug.h, a long with some other things that
have "debug" in the name, but are otherwise unrelated.
Pull it out into a separate header, which also includes linux/debugfs.h,
and convert all the users to include debugfs.h instead of debug.h.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The XIVE interrupt controller is the new interrupt controller
found in POWER9. It supports advanced virtualization capabilities
among other things.
Currently we use a set of firmware calls that simulate the old
"XICS" interrupt controller but this is fairly inefficient.
This adds the framework for using XIVE along with a native
backend which OPAL for configuration. Later, a backend allowing
the use in a KVM or PowerVM guest will also be provided.
This disables some fast path for interrupts in KVM when XIVE is
enabled as these rely on the firmware emulation code which is no
longer available when the XIVE is used natively by Linux.
A latter patch will make KVM also directly exploit the XIVE, thus
recovering the lost performance (and more).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Fixup pr_xxx("XIVE:"...), don't split pr_xxx() strings,
tweak Kconfig so XIVE_NATIVE selects XIVE and depends on POWERNV,
fix build errors when SMP=n, fold in fixes from Ben:
Don't call cpu_online() on an invalid CPU number
Fix irq target selection returning out of bounds cpu#
Extra sanity checks on cpu numbers
]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the xmon debugger is set only via kernel boot command-line.
It's disabled by default, and can be enabled with "xmon=on" on the
command-line. Also, xmon may be accessed via sysrq mechanism.
But we cannot enable/disable xmon in runtime, it needs kernel reload.
This patch introduces a debugfs entry for xmon, allowing user to query
its current state and change it if desired. Basically, the "xmon" file
to read from/write to is under the debugfs mount point, on powerpc
directory. It's a simple attribute, value 0 meaning xmon is disabled
and value 1 the opposite. Writing these states to the file will take
immediate effect in the debugger.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The xmon parameter nobt was added long time ago, by commit 26c8af5f01
("[POWERPC] print backtrace when entering xmon"). The problem that time
was that during a crash in a machine with USB keyboard, xmon wouldn't
respond to commands from the keyboard, so printing the backtrace wouldn't
be possible.
Idea then was to show automatically the backtrace on xmon crash for the
first time it's invoked (if it recovers, next time xmon won't show
backtrace automatically). The nobt parameter was added _only_ to prevent
this automatic trace show. Seems long time ago USB keyboards didn't work
that well!
We don't need this parameter anymore, the feature of auto showing the
backtrace is interesting (imagine a case of auto-reboot script),
so this patch extends the functionality, by always showing the backtrace
automatically when xmon is invoked; it removes the nobt parameter too.
Also, this patch fixes __initdata placement on xmon_early and replaces
__initcall() with modern device_initcall() on sysrq handler.
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Once xmon is triggered by sysrq-x, it is enabled always afterwards even
if it is disabled during boot. This will cause a system reset interrupt
fail to dump. So keep xmon in its original state after exit.
We have several ways to set xmon on or off.
1) by a build config CONFIG_XMON_DEFAULT.
2) by a boot cmdline with xmon or xmon=early or xmon=on to enable xmon
and xmon=off to disable xmon. This value will override that in step 1.
3) by a debugfs interface, as proposed in this patchset.
And this value can override those in step 1 and 2.
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Highlights include:
- An update of the disassembly code used by xmon to the latest versions in
binutils. We've received permission from all the authors of the relevant
binutils changes to relicense their changes to the relevant files from GPLv3
to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg
work to get permission from everyone.
- Addition of the "architected" Power9 CPU table entry, allowing us to boot
in Power9 architected mode under a hypervisor.
- Updates to the Power9 PMU code.
- Implementation of clear_bit_unlock_is_negative_byte() to optimise
unlock_page().
- Freescale updates from Scott: "Highlights include 8xx breakpoints and perf,
t1042rdb display support, and board updates."
Thanks to:
Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller,
Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney,
Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYthsKAAoJEFHr6jzI4aWAaWMQAJ7mAwX98ncoYschPgRmmIun
f6DtE4IonrxiZ22gp1ct4+c9OFtA+B5FXMcEhOKpfh93lg38PTDjHs9e5kfauD7+
oTQ2Bg1eXaL48FKdmC5Vs4Kt+/J8e9guGafUC1OVIpTyyRPoZeUDH0lx+kSPV5bd
PkL+wY/k3W0Njo8WgD1P9u3W15+BxISo/k8c7ajzKTHGBZlAvj5h2gO6XUBNMLyy
YClB/qIymjZriSB+AeWYD79k8gPbBZPsmZG0ZF1hY060894LgqLB9mPOJdffx/DY
H7/uP6jcsRDOXTOmyueW1SEmPoQbtysiMd1lNrCXKtC/Okr5uhn2cUhi88AsgWvd
1QFly2lobcDAKPah/yB7YQGMAcmYvGGNuqrWaosaV2T7r0KprzUYYgCOqzvC3WSJ
QtVatBzMIqRTMYq+3U4G1aHeCXlRazVQHDuvPby8RdR5b2gIexiqMab2eS7tSMIH
mCOIunRIvT14g/7wxUV7tahN+ifncNxzAk4DvPO+Wc4FQ4sy7wArv2YipSaWRWtE
u7tNdBkEwlDkKhJgRU5T0Op2PyMbHwCP8pWuz7PQIhKIcgwmP9wb07BIWG/GGIqn
07TxJYX2ItabyEMZMsYhzILZqjLyiAaCARANB7ScbQbdP8wdcGZcwismhwnfROIU
NuxsZg63BUDMoxk7Sauu
=rspd
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
"Highlights include:
- an update of the disassembly code used by xmon to the latest
versions in binutils. We've received permission from all the
authors of the relevant binutils changes to relicense their changes
to the relevant files from GPLv3 to GPLv2, for inclusion in Linux.
Thanks to Peter Bergner for doing the leg work to get permission
from everyone.
- addition of the "architected" Power9 CPU table entry, allowing us
to boot in Power9 architected mode under a hypervisor.
- updates to the Power9 PMU code.
- implementation of clear_bit_unlock_is_negative_byte() to optimise
unlock_page().
- Freescale updates from Scott: "Highlights include 8xx breakpoints
and perf, t1042rdb display support, and board updates."
Thanks to:
Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas
Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan,
Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter
Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil
Mehta, Stewart Smith"
* tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
powerpc: Remove leftover cputime_to_nsecs call causing build error
powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU
powerpc/optprobes: Fix TOC handling in optprobes trampoline
powerpc/pseries: Advertise Hot Plug Event support to firmware
cxl: fix nested locking hang during EEH hotplug
powerpc/xmon: Dump memory in CPU endian format
powerpc/pseries: Revert 'Auto-online hotplugged memory'
powerpc/powernv: Make PCI non-optional
powerpc/64: Implement clear_bit_unlock_is_negative_byte()
powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable()
powerpc/kernel: Remove error message in pcibios_setup_phb_resources()
powerpc/mm: Fix typo in set_pte_at()
pci/hotplug/pnv-php: Disable MSI and PCI device properly
pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts
pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot()
powerpc: Add POWER9 architected mode to cputable
powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags()
powerpc/perf: Avoid FAB_*_MATCH checks for power9
powerpc/perf: Add restrictions to PMC5 in power9 DD1
powerpc/perf: Use Instruction Counter value
...
show_mem() allows to filter out node specific data which is irrelevant
to the allocation request via SHOW_MEM_FILTER_NODES. The filtering is
done in skip_free_areas_node which skips all nodes which are not in the
mems_allowed of the current process. This works most of the time as
expected because the nodemask shouldn't be outside of the allocating
task but there are some exceptions. E.g. memory hotplug might want to
request allocations from outside of the allowed nodes (see
new_node_page).
Get rid of this hardcoded behavior and push the allocation mask down the
show_mem path and use it instead of cpuset_current_mems_allowed. NULL
nodemask is interpreted as cpuset_current_mems_allowed.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170117091543.25850-5-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights include:
- Support for direct mapped LPC on POWER9, giving Linux direct access to
devices that may be on there such as a UART.
- Memory hotplug support for the Power9 Radix MMU.
- Add new AUX vectors describing the processor's cache geometry, to be used by
glibc.
- The ability for a guest to ask the hypervisor to resize the guest's hash
table, and in addition support for doing so automatically when memory is
hotplugged into/out-of the guest. This allows the hash table to be sized
based on the current memory usage of the guest, rather than the maximum
possible memory usage.
- Implementation of optprobes (kprobe optimisation) for powerpc.
In addition there's the topic branch shared with the KVM tree, which includes
support for guests to use the Radix MMU on Power9.
Thanks to:
Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton Blanchard,
Benjamin Herrenschmidt, Chris Packham, Daniel Axtens, Daniel Borkmann, David
Gibson, Finn Thain, Gautham R. Shenoy, Gavin Shan, Greg Kurz, Joel Stanley,
John Allen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi
Bangoria, Reza Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYrWj0AAoJEFHr6jzI4aWAsn4P/08Kz3TtOvDuuPGVNoO7fWOn
ag5/zVt8R4FuCALqWpAZbVqMuUU4wLxG0RuWlmBNYYhrjMC6JxHpOSjQXxM2D7YT
CdGTJxG414r6HMOeToL9i/z33o0m+KT07tscer+QMKlXVKCR2z0fEJch+zoPBHYA
5eVBFpLLTtbNiX9UnrcM/vYz61d56kT4YJey9/8qbAkTAc1rMPa8ucU5UiKYJ7yX
mF8cd7WE+7aqif9V8yN59G2rcbz+h3pbMw/gzImiYsYrUj4fLjU+VTKL5PPT+UFy
WWBXD3MAMm1dksMMZi3hgoo2BZhDn3RkymeYi6Jo4kDknNMPZzkMxGyvaJ8Eq5H/
bIYXdS1AbTtvaaEEuWqDFjpnChOEvj/8IeqitU0jjlql8BVjNKg/ESaaKucZr+pO
pk2Mfvw0Tb/lxJT5qj27yq4aRsxJwdFOoPYCN7MquPp/wV2Tg5M6h4nVQ4T6Wl0S
tMFQeCqXflhDWh0Xgr2UXpF66/YTj3Du5LasOTkgGeU30Z8TcNGFEmDWShKP3cEm
e0oQE+OHhIGN4WSBXBAto/Gw8/0v3pXlMs+VEfeHqenwPss2sWtSwXWe8khmiy9e
DJ48sTzj75/Zx1fiqRldw9YEnrL+NK0eOOpvzxeyKpfvdUytc+chFfEqUmO6kl3Z
DW2UmlZxmW+b0SfexCHL
=Icle
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Support for direct mapped LPC on POWER9, giving Linux direct access
to devices that may be on there such as a UART.
- Memory hotplug support for the Power9 Radix MMU.
- Add new AUX vectors describing the processor's cache geometry, to
be used by glibc.
- The ability for a guest to ask the hypervisor to resize the guest's
hash table, and in addition support for doing so automatically when
memory is hotplugged into/out-of the guest. This allows the hash
table to be sized based on the current memory usage of the guest,
rather than the maximum possible memory usage.
- Implementation of optprobes (kprobe optimisation) for powerpc.
In addition there's the topic branch shared with the KVM tree, which
includes support for guests to use the Radix MMU on Power9.
Thanks to:
Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"
* tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
powerpc/mm/radix: Skip ptesync in pte update helpers
powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
powerpc/mm/radix: Update pte update sequence for pte clear case
powerpc/mm: Update PROTFAULT handling in the page fault path
powerpc/xmon: Fix data-breakpoint
powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
powerpc/pseries: Fix typo in parameter description
powerpc/kprobes: Remove kprobe_exceptions_notify()
kprobes: Introduce weak variant of kprobe_exceptions_notify()
powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
powerpc/powernv: Fix opal_exit tracepoint opcode
powerpc: Add a prototype for mcount() so it can be versioned
powerpc: Drop GPL from of_node_to_nid() export to match other arches
powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
powerpc/kprobes: Implement Optprobes
powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
powerpc: Add helper to check if offset is within relative branch range
powerpc/bpf: Introduce __PPC_SH64()
...
Extend the dump command to allow display of 2, 4, and 8 byte words in
CPU endian format. Also adds dump command for "1 byte values" for the
sake of symmetry. New commands are:
d1 dump 1 byte values
d2 dump 2 byte values
d4 dump 4 byte values
d8 dump 8 byte values
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
That in order to gather all cputime accumulation to the same place.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-7-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y to delay
cputime accounting to the tick, provide finegrained accumulators to
powerpc in order to store the cputime until flushing.
While at it, normalize the name of several fields according to common
cputime naming.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a nice interface for asking ftrace to dump all its tracing
buffers. The only down side for use in xmon is that it uses printk.
Depending on circumstances printk may not work when in xmon, but it also
may, so add a 'dt' command which dumps the ftrace buffers, and add a
note to the help to mentiont that it uses printk.
Calling this routine also disables tracing, which is problematic if you
return from xmon and expect the system to keep operating normally. So
after we do the dump turn tracing back on.
Both functions already have nop versions defined for when ftrace is not
enabled, so we don't need any extra #ifdefs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch provides VIRT_CPU_ACCOUTING to PPC32 architecture.
PPC32 doesn't have the PACA structure, so we use the task_info
structure to store the accounting data.
In order to reuse on PPC32 the PPC64 functions, all u64 data has
been replaced by 'unsigned long' so that it is u32 on PPC32 and
u64 on PPC64
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
xmon has commands for reading and writing SPRs, but they don't work
currently for several reasons. They attempt to synthesize a small
function containing an mfspr or mtspr instruction and call it. However,
the instructions are on the stack, which is usually not executable.
Also, for 64-bit we set up a procedure descriptor, which is fine for the
big-endian ABIv1, but not correct for ABIv2. Finally, the code uses the
infrastructure for catching memory errors, but that only catches data
storage interrupts and machine check interrupts, but a failed
mfspr/mtspr can generate a program interrupt or a hypervisor emulation
assist interrupt, or be a no-op.
Instead of trying to synthesize a function on the fly, this adds two new
functions, xmon_mfspr() and xmon_mtspr(), which take an SPR number as an
argument and read or write the SPR. Because there is no Power ISA
instruction which takes an SPR number in a register, we have to generate
one of each possible mfspr and mtspr instruction, for all 1024 possible
SPRs. Thus we get just over 8k bytes of code for each of xmon_mfspr()
and xmon_mtspr(). However, this 16kB of code pales in comparison to the
> 130kB of PPC opcode tables used by the xmon disassembler.
To catch interrupts caused by the mfspr/mtspr instructions, we add a new
'catch_spr_faults' flag. If an interrupt occurs while it is set, we come
back into xmon() via program_check_interrupt(), _exception() and die(),
see that catch_spr_faults is set and do a longjmp to bus_error_jmp, back
into read_spr() or write_spr().
This adds a couple of other nice features: first, a "Sa" command that
attempts to read and print out the value of all 1024 SPRs. If any mfspr
instruction acts as a no-op, then the SPR is not implemented and not
printed.
Secondly, the Sr and Sw commands detect when an SPR is not
implemented (i.e. mfspr is a no-op) and print a message to that effect
rather than printing a bogus value.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We also use MMU_FTR_RADIX to branch out from code path specific to
hash.
No functionality change.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add 'P' command with optional task_struct address to dump all/one task's
information: task pointer, kernel stack pointer, PID, PPID, state
(interpreted), CPU where (last) running, and command.
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>