Add a callback for MIPS KVM implementations to handle the VZ guest
exit exception. Currently the trap & emulate implementation contains a
stub which reports an internal error, but the callback will be used
properly by the VZ implementation.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add an implementation callback for the kvm_arch_hardware_enable() and
kvm_arch_hardware_disable() architecture functions, with simple stubs
for trap & emulate. This is in preparation for VZ which will make use of
them.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add an implementation callback for checking presence of KVM extensions.
This allows implementation specific extensions to be provided without
ifdefs in mips.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Currently the software emulated timer is initialised to a frequency of
100MHz by kvm_mips_init_count(), but this isn't suitable for VZ where
the frequency of the guest timer matches that of the host.
Add a count_hz argument so the caller can specify the default frequency,
and move the call from kvm_arch_vcpu_create() to the implementation
specific vcpu_setup() callback, so that VZ can specify a different
frequency.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Extend MIPS KVM stats counters and kvm_transition trace event codes to
cover hypervisor exceptions, which have their own GExcCode field in
CP0_GuestCtl0 with up to 32 hypervisor exception cause codes.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Emulate the HYPCALL instruction added in the VZ ASE and used by the MIPS
paravirtualised guest support that is already merged. The new hypcall.c
handles arguments and the return value. No actual hypercalls are yet
supported, but this still allows us to safely step over hypercalls and
set an error code in the return value for forward compatibility.
Non-zero HYPCALL codes are not handled.
We also document the hypercall ABI which asm/kvm_para.h uses.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Add a distinct UNIQUE_GUEST_ENTRYHI() macro for invalidation of guest
TLB entries by KVM, using addresses in KSeg1 rather than KSeg0. This
avoids conflicts with guest invalidation routines when there is no EHINV
bit to mark the whole entry as invalid, avoiding guest machine check
exceptions on Cavium Octeon III.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add some missing guest accessors and register field definitions for KVM
for MIPS VZ to make use of.
Guest CP0_LLAddr register accessors and definitions for the LLB field
allow KVM to clear the guest LLB to cancel in-progress LL/SC atomics on
restore, and to emulate accesses by the guest to the CP0_LLAddr
register.
Bitwise modifiers and definitions for the guest CP0_Wired and
CP0_Config1 registers allow KVM to modify fields within the CP0_Wired
and CP0_Config1 registers.
Finally a definition for the CP0_Config5.SBRI bit allows KVM to
initialise and allow modification of the guest version of the SBRI bit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Probe for availablility of M{T,F}HC0 instructions used with e.g. XPA in
the VZ guest context, and make it available via cpu_guest_has_mvh. This
will be helpful in properly emulating the MAAR registers in KVM for MIPS
VZ.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Probe for presence of guest CP0_UserLocal register and expose via
cpu_guest_has_userlocal. This register is optional pre-r6, so this will
allow KVM to only save/restore/expose the guest CP0_UserLocal register
if it exists.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
The MAAR V bit has been renamed VL since another bit called VH is added
at the top of the register when it is extended to 64-bits on a 32-bit
processor with XPA. Rename the V definition, fix the various users, and
add definitions for the VH bit. Also add a definition for the MAARI
Index field.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add definitions and probing of the UFR bit in Config5. This bit allows
user mode control of the FR bit (floating point register mode). It is
present if the UFRP bit is set in the floating point implementation
register.
This is a capability KVM may want to expose to guest kernels, even
though Linux is unlikely to ever use it due to the implications for
multi-threaded programs.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
This socket option returns the NAPI ID associated with the queue on which
the last frame is received. This information can be used by the apps to
split the incoming flows among the threads based on the Rx queue on which
they are received.
If the NAPI ID actually represents a sender_cpu then the value is ignored
and 0 is returned.
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allows reading of SK_MEMINFO_VARS via socket option. This way an
application can get all meminfo related information in single socket
option call instead of multiple calls.
Adds helper function, sk_get_meminfo(), and uses that for both
getsockopt and sock_diag_put_meminfo().
Suggested by Eric Dumazet.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the separate IRQ stack was introduced, stack unwinding only
proceeded as far as the top of the IRQ stack, leading to kernel
backtraces being less useful, lacking the trace of what was interrupted.
Fix this by providing a means for the kernel to unwind the IRQ stack
onto the interrupted task stack. The processor state is saved to the
kernel task stack on interrupt. The IRQ_STACK_START macro reserves an
unsigned long at the top of the IRQ stack where the interrupted task
stack pointer can be saved. After the active stack is switched to the
IRQ stack, save the interrupted tasks stack pointer to the reserved
location.
Fix the stack unwinding code to look for the frame being the top of the
IRQ stack and if so get the next frame from the saved location. The
existing test does not work with the separate stack since the ra is no
longer pointed at ret_from_{irq,exception}.
The test to stop unwinding the stack 32 bytes from the top of a stack
must be modified to allow unwinding to continue up to the location of
the saved task stack pointer when on the IRQ stack. The low / high marks
of the stack are set depending on whether the sp is on an irq stack or
not.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15788/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
If an architecture uses 4level-fixup.h we don't need to do anything as
it includes 5level-fixup.h.
If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK
before inclusion of the header. It makes asm-generic code to use
5level-fixup.h.
If an architecture has 4-level paging or folds levels on its own,
include 5level-fixup.h directly.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wire up the statx system call for MIPS, which was introduced in commit
a528d35e8b ("statx: Add a system call to make enhanced file info
available").
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15387/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Use of the task_pt_regs() based macros in MIPS' asm/processor.h for
accessing the user context on the kernel stack need the definition of
struct pt_regs from asm/ptrace.h. __own_fpu() in asm/fpu.h uses these
macros but implicitly depended on linux/sched.h to include asm/ptrace.h.
Since commit f780d89a0e ("sched/headers: Remove <asm/ptrace.h> from
<linux/sched.h>") however linux/sched.h no longer includes asm/ptrace.h,
so include it explicitly from asm/fpu.h where it is needed instead.
This fixes build errors such as:
./arch/mips/include/asm/fpu.h: In function '__own_fpu':
./arch/mips/include/asm/processor.h:385:31: error: invalid application of 'sizeof' to incomplete type 'struct pt_regs'
THREAD_SIZE - 32 - sizeof(struct pt_regs))
^
Fixes: f780d89a0e ("sched/headers: Remove <asm/ptrace.h> from <linux/sched.h>")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15386/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When building for microMIPS we need to ensure that the assembler always
knows that there is code at the target of a branch or jump. Recent
toolchains will fail to link a microMIPS kernel when this isn't the case
due to what it thinks is a branch to non-microMIPS code.
mips-mti-linux-gnu-ld kernel/built-in.o: .spinlock.text+0x2fc: Unsupported branch between ISA modes.
mips-mti-linux-gnu-ld final link failed: Bad value
This is due to inline assembly labels in spinlock.h not being followed
by an instruction mnemonic, either due to a .subsection pseudo-op or the
end of the inline asm block.
Fix this with a .insn direction after such labels.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/15325/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
After the split of linux/sched.h, several platforms in arch/mips stopped building.
Add the respective additional #include statements to fix the problem I first
tried adding these into asm/processor.h, but ran into circular header
dependencies with that which I could not figure out.
The commit I listed as causing the problem is the branch merge, as there is
likely a combination of multiple patches in that branch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: ralf@linux-mips.org
Fixes: 1827adb11a ("Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip")
Link: http://lkml.kernel.org/r/20170308072931.3836696-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update code that relied on sched.h including various MM types for them.
This will allow us to remove the <linux/mm_types.h> include from <linux/sched.h>.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce dummy header and add dependencies to places that will depend on it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Often all is needed is these small helpers, instead of compiler.h or a
full kprobes.h. This is important for asm helpers, in fact even some
asm/kprobes.h make use of these helpers... instead just keep a generic
asm file with helpers useful for asm code with the least amount of
clutter as possible.
Likewise we need now to also address what to do about this file for both
when architectures have CONFIG_HAVE_KPROBES, and when they do not. Then
for when architectures have CONFIG_HAVE_KPROBES but have disabled
CONFIG_KPROBES.
Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES,
this means most architecture code cannot include asm/kprobes.h safely.
Correct this and add guards for architectures missing them.
Additionally provide architectures that not have kprobes support with
the default asm-generic solution. This lets us force asm/kprobes.h on
the header include/linux/kprobes.h always, but most importantly we can
now safely include just asm/kprobes.h on architecture code without
bringing the full kitchen sink of header files.
Two architectures already provided a guard against CONFIG_KPROBES on its
kprobes.h: sh, arch. The rest of the architectures needed gaurds added.
We avoid including any not-needed headers on asm/kprobes.h unless
kprobes have been enabled.
In a subsequent atomic change we can try now to remove compiler.h from
include/linux/kprobes.h.
During this sweep I've also identified a few architectures defining a
common macro needed for both kprobes and ftrace, that of the definition
of the breakput instruction up. Some refer to this as
BREAKPOINT_INSTRUCTION. This must be kept outside of the #ifdef
CONFIG_KPROBES guard.
[mcgrof@kernel.org: fix arm64 build]
Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com
[sfr@canb.auug.org.au: fixup for kprobes declarations moving]
Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bart Van Assche noted that the ib DMA mapping code was significantly
similar enough to the core DMA mapping code that with a few changes
it was possible to remove the IB DMA mapping code entirely and
switch the RDMA stack to use the core DMA mapping code. This resulted
in a nice set of cleanups, but touched the entire tree. This branch
will be submitted separately to Linus at the end of the merge window
as per normal practice for tree wide changes like this.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYo06oAAoJELgmozMOVy/d9Z8QALedWHdu98St1L0u2c8sxnR9
2zo/4sF5Vb9u7FpmdIX32L4SQ9s9KhPE8Qp8NtZLf9v10zlDebIRJDpXknXtKooV
CAXxX4sxBXV27/UrhbZEfXiPrmm6ccJFyIfRnMU6NlMqh2AtAsRa5AC2/RMp8oUD
Med97PFiF0o6TD22/UH1VFbRpX1zjaKyqm7a3as5sJfzNA+UGIZAQ7Euz8000DKZ
xCgVLTEwS0FmOujtBkCst7xa9TjuqR1HLOB4DdGvAhP6BHdz2yamM7Qmh9NN+NEX
0BtjsuXomtn6j6AszGC+bpipCZh3NUigcwoFAARXCYFHibBvo4DPdFeGsraFgXdy
1+KyR8CCeQG3Aly5Vwr264RFPGkGpwMj8PsBlXgQVtrlg4rriaCzOJNmIIbfdADw
ftqhxBOzReZw77aH2s+9p2ILRfcAmPqhynLvFGFo9LBvsik8LVso7YgZN0xGxwcI
IjI/XGC8UskPVsIZBIYA6sl2bYzgOjtBIHiXjRrPlW3uhduIXLrvKFfLPP/5XLAG
ehLXK+J0bfsyY9ClmlNS8oH/WdLhXAyy/KNmnj5bRRm9qg6BRJR3bsOBhZJODuoC
XgEXFfF6/7roNESWxowff7pK0rTkRg/m/Pa4VQpeO+6NWHE7kgZhL6kyIp5nKcwS
3e7mgpcwC+3XfA/6vU3F
=e0Si
-----END PGP SIGNATURE-----
Merge tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma DMA mapping updates from Doug Ledford:
"Drop IB DMA mapping code and use core DMA code instead.
Bart Van Assche noted that the ib DMA mapping code was significantly
similar enough to the core DMA mapping code that with a few changes it
was possible to remove the IB DMA mapping code entirely and switch the
RDMA stack to use the core DMA mapping code.
This resulted in a nice set of cleanups, but touched the entire tree
and has been kept separate for that reason."
* tag 'for-next-dma_ops' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (37 commits)
IB/rxe, IB/rdmavt: Use dma_virt_ops instead of duplicating it
IB/core: Remove ib_device.dma_device
nvme-rdma: Switch from dma_device to dev.parent
RDS: net: Switch from dma_device to dev.parent
IB/srpt: Modify a debug statement
IB/srp: Switch from dma_device to dev.parent
IB/iser: Switch from dma_device to dev.parent
IB/IPoIB: Switch from dma_device to dev.parent
IB/rxe: Switch from dma_device to dev.parent
IB/vmw_pvrdma: Switch from dma_device to dev.parent
IB/usnic: Switch from dma_device to dev.parent
IB/qib: Switch from dma_device to dev.parent
IB/qedr: Switch from dma_device to dev.parent
IB/ocrdma: Switch from dma_device to dev.parent
IB/nes: Remove a superfluous assignment statement
IB/mthca: Switch from dma_device to dev.parent
IB/mlx5: Switch from dma_device to dev.parent
IB/mlx4: Switch from dma_device to dev.parent
IB/i40iw: Remove a superfluous assignment statement
IB/hns: Switch from dma_device to dev.parent
...
200 commits and noteworthy changes for most architectures.
* ARM:
- GICv3 save/restore
- cache flushing fixes
- working MSI injection for GICv3 ITS
- physical timer emulation
* MIPS:
- various improvements under the hood
- support for SMP guests
- a large rewrite of MMU emulation. KVM MIPS can now use MMU notifiers
to support copy-on-write, KSM, idle page tracking, swapping, ballooning
and everything else. KVM_CAP_READONLY_MEM is also supported, so that
writes to some memory regions can be treated as MMIO. The new MMU also
paves the way for hardware virtualization support.
* PPC:
- support for POWER9 using the radix-tree MMU for host and guest
- resizable hashed page table
- bugfixes.
* s390: expose more features to the guest
- more SIMD extensions
- instruction execution protection
- ESOP2
* x86:
- improved hashing in the MMU
- faster PageLRU tracking for Intel CPUs without EPT A/D bits
- some refactoring of nested VMX entry/exit code, preparing for live
migration support of nested hypervisors
- expose yet another AVX512 CPUID bit
- host-to-guest PTP support
- refactoring of interrupt injection, with some optimizations thrown in
and some duct tape removed.
- remove lazy FPU handling
- optimizations of user-mode exits
- optimizations of vcpu_is_preempted() for KVM guests
* generic:
- alternative signaling mechanism that doesn't pound on tsk->sighand->siglock
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJYral1AAoJEL/70l94x66DbNgH/Rx8YXuidFq2fe3RWOvld3RK
85OM/D5g38cTLpBE0/sJpcvX34iYN8U/l5foCZwpxB+83GHEk2Cr57JyfTogdaAJ
x8dBhHKQCA/HxSQUQLN6nFqRV+yT8WUR92Fhqx82+80BSen5Yzcfee/TDoW6T1IW
g8CYgX9FrRaGOX066ImAuUfdAdUVjyssfs9VttDTX+HiusPeuBPx/wsRe1ZEEPlH
vnltIJQb1ETV2GOZLUojKjzH6aZkjIl29XxjkYii9JTUornClG0DfW+5QT3uLrB5
gJ+G+Zmpsq8ZBx9jNDtAi7sFsoPY1Mzf+JPNCGXBra2sP2GrBAuXcxmgznRYltQ=
=8IIp
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"4.11 is going to be a relatively large release for KVM, with a little
over 200 commits and noteworthy changes for most architectures.
ARM:
- GICv3 save/restore
- cache flushing fixes
- working MSI injection for GICv3 ITS
- physical timer emulation
MIPS:
- various improvements under the hood
- support for SMP guests
- a large rewrite of MMU emulation. KVM MIPS can now use MMU
notifiers to support copy-on-write, KSM, idle page tracking,
swapping, ballooning and everything else. KVM_CAP_READONLY_MEM is
also supported, so that writes to some memory regions can be
treated as MMIO. The new MMU also paves the way for hardware
virtualization support.
PPC:
- support for POWER9 using the radix-tree MMU for host and guest
- resizable hashed page table
- bugfixes.
s390:
- expose more features to the guest
- more SIMD extensions
- instruction execution protection
- ESOP2
x86:
- improved hashing in the MMU
- faster PageLRU tracking for Intel CPUs without EPT A/D bits
- some refactoring of nested VMX entry/exit code, preparing for live
migration support of nested hypervisors
- expose yet another AVX512 CPUID bit
- host-to-guest PTP support
- refactoring of interrupt injection, with some optimizations thrown
in and some duct tape removed.
- remove lazy FPU handling
- optimizations of user-mode exits
- optimizations of vcpu_is_preempted() for KVM guests
generic:
- alternative signaling mechanism that doesn't pound on
tsk->sighand->siglock"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (195 commits)
x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
x86/paravirt: Change vcp_is_preempted() arg type to long
KVM: VMX: use correct vmcs_read/write for guest segment selector/base
x86/kvm/vmx: Defer TR reload after VM exit
x86/asm/64: Drop __cacheline_aligned from struct x86_hw_tss
x86/kvm/vmx: Simplify segment_base()
x86/kvm/vmx: Get rid of segment_base() on 64-bit kernels
x86/kvm/vmx: Don't fetch the TSS base from the GDT
x86/asm: Define the kernel TSS limit in a macro
kvm: fix page struct leak in handle_vmon
KVM: PPC: Book3S HV: Disable HPT resizing on POWER9 for now
KVM: Return an error code only as a constant in kvm_get_dirty_log()
KVM: Return an error code only as a constant in kvm_get_dirty_log_protect()
KVM: Return directly after a failed copy_from_user() in kvm_vm_compat_ioctl()
KVM: x86: remove code for lazy FPU handling
KVM: race-free exit from KVM_RUN without POSIX signals
KVM: PPC: Book3S HV: Turn "KVM guest htab" message into a debug message
KVM: PPC: Book3S PR: Ratelimit copy data failure error messages
KVM: Support vCPU-based gfn->hva cache
KVM: use separate generations for each address space
...
Miscellaneous:
- Add IRQ stacks
- Add cacheinfo support
- Add "uzImage.bin" zboot target
- Unify performance counter definitions
- Export various (mainly assembly) symbols alongside their
definitions
- Audit and remove unnecessary uses of module.h
kexec & kdump:
- Lots of improvements and fixes
- Add correct copy_regs implementations
- Add debug logging of new kernel information
Security:
- Use Makefile.postlink to insert relocations into vmlinux
- Provide plat_post_relocation hook (used for Octeon KASLR)
- Add support for tuning mmap randomisation
- Relocate DTB
microMIPS:
- A load of unwind fixes
- Add some missing .insn to fix link errors
MIPSr6:
- Fix MULTU/MADDU/MSUBU sign extension in r2 emulation
- Remove r2_emul_return and use ERETNC unconditionally on MIPSr6
- Allow pre-r6 emulation on SMP MIPSr6 kernels
Cache management:
- Treat physically indexed dcache as non-aliasing
- Add return errors to protected cache ops for KVM
- CM3: Ensure L1 & L2 cache ECC checking matches
- CM3: Indicate inclusive caches
- I6400: Treat dcache as physically indexed
Memory management:
- Ensure bootmem doesn't corrupt reserved memory
- Export some TLB exception generation functions for KVM
OF
- NULL check initial_boot_params before use in of_scan_flat_dt()
- Fix unaligned access in of_alias_scan()
SMP:
- CPS: Don't BUG if a CPU fails to start
Other fixes
- Fix longstanding 64-bit IP checksum carry bug
- Fix KERN_CONT fallout in cpu-bugs64.c and sync-r4k.c
- Update defconfigs for NF_CT_PROTO_DCCP, DPLITE,
CPU_FREQ_STAT,SCSI_DH changes
- Disable certain builtin compiler options, stack-check (whole
kernel), asynchronous-unwind-tables (VDSO).
- A bunch of build fixes from kernelci.org testing
- Various other minor cleanups & corrections
BMIPS:
- Migrate interrupts during bmips_cpu_disable
- BCM47xx: Add Luxul devices
- BCM47xx: Fix Asus WL-500W button inversion
- BCM7xxx: Add SPI device nodes
Generic (multiplatform):
- Add kexec DTB passing
- Fix big endian
- Add cpp_its_S in ksym_dep_filter to silence build warning
IP22:
- Reformat inline assembler code to modern standards
- Fix binutils 2.25 build error
IP27:
- Fix duplicate CAC_BASE definition build error
- Disable qlge driver to workaround broken compiler
Lantiq:
- Refresh defconfig and activate more drivers
- Lock DMA register access
- Fix cascading IRQ setup
- Fix build of VPE loader
- xway: Fix ethernet packet header corruption over reboot
Loongson1
- Add watchdog support
- 1B: Reduce DEFAULT_MEMSIZE to 64MB
- 1B: Change OSC clock name to match rest of kernel
- 1C: Remove ARCH_WANT_OPTIONAL_GPIOLIB
Octeon:
- Add KASLR support
- Support Octeon III USB controller
- Fix large copy_from_user corner case
- Enable devtmpfs in defconfig
Netlogic:
- Fix non-default XLR build error due to netlogic,xlp-pic code
- Fix assembler warning from smpboot.S
pic32mzda:
- Fix linker error when early printk is disabled
Pistachio:
- Add base device tree
- Add Ci40 "Marduk" device tree
Ralink:
- Support raw appended DTB
- Add missing I2C & I2S clocks
- Add missing pinmux and fix pinmux function name typo
- Add missing clk_round_rate()
- Clean up prom_init()
- MT7621: Set SoC type
- MT7621: Support highmem
TXx9:
- Modernize printing of kernel messages and resolve KERN_CONT fallout
- 7segled: use permission-specific DEVICE_ATTR variants
XilFPGA:
- Add IRQ controller and UART IRQ
- Add AXI I2C and emaclite to DT & defconfig
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYp6EBAAoJEGwLaZPeOHZ6HdMQAJTaWMET5SFuERSQKi6xsgRp
+ZQ8eHWG6AlERXw56EOuo69qEKvvTWfYR8cBZmePxbrP0cBREcT7Thz8oX4q4IGs
ZT3jCLmHb4dSu4Pp5nAVVPb7XaJ6InJr/5V88jGECO+zvdIlsoOj8e13gypl0hsw
gxEiYHp3kSHVeBTmiCo5KasWtY36vkpl0CVfFIHVXQQulHPWdYGSamdhyCVHWvJc
l1fGJPi9f5KEJO6ys6bZUCPQGnpj09d3muw2FR0SA7i1M4TWw5t6HU6yt81mgxQD
TJH/lRKmDkfbv+2sOlz2NSHyilyuApY3dlN8BQX7eG4UFsvbZnZhAvGQTfZzelku
6pJXYHdeZOlYFDHFfZRKyT6cOnIZNqWlcoouds1GoHQZaUUNlGfVRVzJ1KGJoHa4
vBCIXOUb4KCCc1ylykzWeCOOalPlNVKxDn9vIBrhtld1CbgRaOzUFFlU8YJkYACB
k5W0XqFA2tth7cttLIc7H/VJUHwHfaVUwNbbNUWDKiUOrLtpIGtfNh2EPrFPHPA8
N7gsQ5gDDiASVHcreEnMqOQLdFpciFLTo0kXC9zXT4a9Q5fD0ILC02xzJCqCXRlG
KgJBBOfIavug9ZQO3vAaz8AyMNe+XCeMdJf+ZtEMnuuk4CqxRBixxu37fXEPv/hX
QUNLUk3kqV2yJPrkhFqJ
=maZx
-----END PGP SIGNATURE-----
Merge tag 'mips_4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips
Pull MIPS updates from James Hogan:
"Here's the main MIPS pull request for 4.11.
It contains a few new features such as IRQ stacks, cacheinfo support,
and KASLR for Octeon CPUs, and a variety of smaller improvements and
fixes including devicetree additions, kexec cleanups, microMIPS stack
unwinding fixes, and a bunch of build fixes to clean up continuous
integration builds.
Its all been in linux-next for at least a couple of days, most of it
far longer.
Miscellaneous:
- Add IRQ stacks
- Add cacheinfo support
- Add "uzImage.bin" zboot target
- Unify performance counter definitions
- Export various (mainly assembly) symbols alongside their
definitions
- Audit and remove unnecessary uses of module.h
kexec & kdump:
- Lots of improvements and fixes
- Add correct copy_regs implementations
- Add debug logging of new kernel information
Security:
- Use Makefile.postlink to insert relocations into vmlinux
- Provide plat_post_relocation hook (used for Octeon KASLR)
- Add support for tuning mmap randomisation
- Relocate DTB
microMIPS:
- A load of unwind fixes
- Add some missing .insn to fix link errors
MIPSr6:
- Fix MULTU/MADDU/MSUBU sign extension in r2 emulation
- Remove r2_emul_return and use ERETNC unconditionally on MIPSr6
- Allow pre-r6 emulation on SMP MIPSr6 kernels
Cache management:
- Treat physically indexed dcache as non-aliasing
- Add return errors to protected cache ops for KVM
- CM3: Ensure L1 & L2 cache ECC checking matches
- CM3: Indicate inclusive caches
- I6400: Treat dcache as physically indexed
Memory management:
- Ensure bootmem doesn't corrupt reserved memory
- Export some TLB exception generation functions for KVM
OF:
- NULL check initial_boot_params before use in of_scan_flat_dt()
- Fix unaligned access in of_alias_scan()
SMP:
- CPS: Don't BUG if a CPU fails to start
Other fixes:
- Fix longstanding 64-bit IP checksum carry bug
- Fix KERN_CONT fallout in cpu-bugs64.c and sync-r4k.c
- Update defconfigs for NF_CT_PROTO_DCCP, DPLITE,
CPU_FREQ_STAT,SCSI_DH changes
- Disable certain builtin compiler options, stack-check (whole
kernel), asynchronous-unwind-tables (VDSO).
- A bunch of build fixes from kernelci.org testing
- Various other minor cleanups & corrections
BMIPS:
- Migrate interrupts during bmips_cpu_disable
- BCM47xx: Add Luxul devices
- BCM47xx: Fix Asus WL-500W button inversion
- BCM7xxx: Add SPI device nodes
Generic (multiplatform):
- Add kexec DTB passing
- Fix big endian
- Add cpp_its_S in ksym_dep_filter to silence build warning
IP22:
- Reformat inline assembler code to modern standards
- Fix binutils 2.25 build error
IP27:
- Fix duplicate CAC_BASE definition build error
- Disable qlge driver to workaround broken compiler
Lantiq:
- Refresh defconfig and activate more drivers
- Lock DMA register access
- Fix cascading IRQ setup
- Fix build of VPE loader
- xway: Fix ethernet packet header corruption over reboot
Loongson1
- Add watchdog support
- 1B: Reduce DEFAULT_MEMSIZE to 64MB
- 1B: Change OSC clock name to match rest of kernel
- 1C: Remove ARCH_WANT_OPTIONAL_GPIOLIB
Octeon:
- Add KASLR support
- Support Octeon III USB controller
- Fix large copy_from_user corner case
- Enable devtmpfs in defconfig
Netlogic:
- Fix non-default XLR build error due to netlogic,xlp-pic code
- Fix assembler warning from smpboot.S
pic32mzda:
- Fix linker error when early printk is disabled
Pistachio:
- Add base device tree
- Add Ci40 "Marduk" device tree
Ralink:
- Support raw appended DTB
- Add missing I2C & I2S clocks
- Add missing pinmux and fix pinmux function name typo
- Add missing clk_round_rate()
- Clean up prom_init()
- MT7621: Set SoC type
- MT7621: Support highmem
TXx9:
- Modernize printing of kernel messages and resolve KERN_CONT fallout
- 7segled: use permission-specific DEVICE_ATTR variants
XilFPGA:
- Add IRQ controller and UART IRQ
- Add AXI I2C and emaclite to DT & defconfig"
* tag 'mips_4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: (148 commits)
MIPS: VDSO: Explicitly use -fno-asynchronous-unwind-tables
MIPS: BCM47XX: Fix button inversion for Asus WL-500W
MIPS: DTS: Add img directory to Makefile
MIPS: ip27: Disable qlge driver in defconfig
MIPS: pic32mzda: Fix linker error for pic32_get_pbclk()
MIPS: Lantiq: Keep ethernet enabled during boot
MIPS: OCTEON: Fix copy_from_user fault handling for large buffers
MIPS: Fix special case in 64 bit IP checksumming.
MIPS: OCTEON: Enable DEVTMPFS
MIPS: lantiq: Set physical_memsize
MIPS: sysmips: Remove duplicated include from syscall.c
Kbuild: Add cpp_its_S in ksym_dep_filter
MIPS: Audit and remove any unnecessary uses of module.h
MIPS: Unify perf counter register definitions
MIPS: Disable stack checks on MIPS kernels
MIPS: OCTEON: Platform support for OCTEON III USB controller
MIPS: Lantiq: Fix cascaded IRQ setup
MIPS: sync-r4k: Fix KERN_CONT fallout
MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch
MIPS: Fix distclean with Makefile.postlink
...
For certain arguments such as saddr = 0xc0a8fd60, daddr = 0xc0a8fda1,
len = 80, proto = 17, sum = 0x7eae049d there will be a carry when
folding the intermediate 64 bit checksum to 32 bit but the code doesn't
add the carry back to the one's complement sum, thus an incorrect result
will be generated.
Reported-by: Mark Zhang <bomb.zhang@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Unify definitions for MIPS performance counter register fields in
mipsregs.h rather than duplicating them in perf_events and oprofile.
This will allow future patches to use them to expose performance
counters to KVM guests.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Robert Richter <rric@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: oprofile-list@lists.sf.net
Patchwork: https://patchwork.linux-mips.org/patch/15212/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Add all the necessary platform code to initialize the dwc3
USB host controller. This code initializes the clocks and
performs a reset on the USB core and PHYs. The driver code
in 'drivers/usb/dwc3' is where the real driver lives.
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15108/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
When building for microMIPS we need to ensure that the assembler always
knows that there is code at the target of a branch or jump. Commit
7170bdc777 ("MIPS: Add return errors to protected cache ops")
introduced a fixup path to protected_cache(e)_op() which does not meet
this requirement. The fixup path jumps to the "2" label but the .section
pseudo-op immediately following it causes the label to be marked as
data. Linking then fails with:
mips-img-linux-gnu-ld: arch/mips/mm/c-r4k.o: .fixup+0x0: Unsupported
jump between ISA modes; consider recompiling with interlinking
enabled.
Fix this by declaring that "2" labels code using the .insn directive.
Fixes: 7170bdc777 ("MIPS: Add return errors to protected cache ops")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/15274/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
MIPS dependencies for KVM
Miscellaneous MIPS architecture changes depended on by the MIPS KVM
changes in the KVM tree.
- Move pgd_alloc() out of header.
- Exports so KVM can access page table management and TLBEX functions.
- Add return errors to protected cache ops.
Increase the maximum number of MIPS KVM VCPUs to 8, and implement the
KVM_CAP_NR_VCPUS and KVM_CAP_MAX_CPUS capabilities which expose the
recommended and maximum number of VCPUs to userland. The previous
maximum of 1 didn't allow for any form of SMP guests.
We calculate the values similarly to ARM, recommending as many VCPUs as
there are CPUs online in the system. This will allow userland to know
how many VCPUs it is possible to create.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Expose the CP0_IntCtl register through the KVM register access API,
which is a required register since MIPS32r2. It is currently read-only
since the VS field isn't implemented due to lack of Config3.VInt or
Config3.VEIC.
It is implemented in trap_emul.c so that a VZ implementation can allow
writes.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Expose the CP0_EntryLo0 and CP0_EntryLo1 registers through the KVM
register access API. This is fairly straightforward for trap & emulate
since we don't support the RI and XI bits. For the sake of future
proofing (particularly for VZ) it is explicitly specified that the API
always exposes the 64-bit version of these registers (i.e. with the RI
and XI bits in bit positions 63 and 62 respectively), and they are
implemented in trap_emul.c rather than mips.c to allow them to be
implemented differently for VZ.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
The CP0_EBase register is a standard feature of MIPS32r2, so we should
always have been implementing it properly. However the register value
was ignored and wasn't exposed to userland.
Fix the emulation of exceptions and interrupts to use the value stored
in guest CP0_EBase, and fix the masks so that the top 3 bits (rather
than the standard 2) are fixed, so that it is always in the guest KSeg0
segment.
Also add CP0_EBASE to the KVM one_reg interface so it can be accessed by
userland, also allowing the CPU number field to be written (which isn't
permitted by the guest).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Access to various CP0 registers via the KVM register access API needs to
be implementation specific to allow restrictions to be made on changes,
for example when VZ guest registers aren't present, so move them all
into trap_emul.c in preparation for VZ.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that load/store faults due to read only memory regions are treated
as MMIO accesses it is safe to claim support for read only memory
regions (KVM_CAP_READONLY_MEM).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Implement the SYNC_MMU capability for KVM MIPS, allowing changes in the
underlying user host virtual address (HVA) mappings to be promptly
reflected in the corresponding guest physical address (GPA) mappings.
This allows for several features to work with guest RAM which require
mappings to be altered or protected, such as copy-on-write, KSM (Kernel
Samepage Merging), idle page tracking, memory swapping, and guest memory
ballooning.
There are two main aspects of this change, described below.
The KVM MMU notifier architecture callbacks are implemented so we can be
notified of changes in the HVA mappings. These arrange for the guest
physical address (GPA) page tables to be modified and possibly for
derived mappings (GVA page tables and TLBs) to be flushed.
- kvm_unmap_hva[_range]() - These deal with HVA mappings being removed,
for example before a copy-on-write takes place, which requires the
corresponding GPA page table mappings to be removed too.
- kvm_set_spte_hva() - These update a GPA page table entry to match the
new HVA entry, but must be careful to respect KVM specific
configuration such as not dirtying a clean guest page which is dirty
to the host, and write protecting writable pages in read only
memslots (which will soon be supported).
- kvm[_test]_age_hva() - These update GPA page table entries to be old
(invalid) so that access can be tracked, making them young again.
The GPA page fault handling (kvm_mips_map_page) is updated to use
gfn_to_pfn_prot() (which may provide read-only pages), to handle
asynchronous page table invalidation from MMU notifier callbacks, and to
handle more cases in the fast path.
- mmu_notifier_seq is used to detect asynchronous page table
invalidations while we're holding a pfn from gfn_to_pfn_prot()
outside of kvm->mmu_lock, retrying if invalidations have taken place,
e.g. a COW or a KSM page merge.
- The fast path (_kvm_mips_map_page_fast) now handles marking old pages
as young / accessed, and disallowing dirtying of clean pages that
aren't actually writable (e.g. shared pages that should COW, and
read-only memory regions when they are enabled in a future patch).
- Due to the use of MMU notifications we no longer need to keep the
page references after we've updated the GPA page tables.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add a helper function to make a range of guest physical address (GPA)
mappings in the GPA page table clean so that writes can be caught. This
will be used in a few places to manage dirty page logging.
Note that until the dirty bit is transferred from GPA page table entries
to GVA page table entries in an upcoming patch this won't trigger a TLB
modified exception on write.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Rewrite TLB modified exception handling to handle read only GPA memory
regions, instead of unconditionally passing the exception to the guest.
If the guest TLB is not the cause of the exception we call into the
normal TLB fault handling depending on the memory segment, which will
soon attempt to remap the physical page to be writable (handling dirty
page tracking or copy on write in the process).
Failing that we fall back to treating it as MMIO, due to a read only
memory region. Once the capability is enabled, this will allow read only
memory regions (such as the Malta boot flash as emulated by QEMU) to
have writes treated as MMIO, while still allowing reads to run
untrapped.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
kvm_mips_map_page() will need to know whether the fault was due to a
read or a write in order to support dirty page tracking,
KVM_CAP_SYNC_MMU, and read only memory regions, so get that information
passed down to it via new bool write_fault arguments to various
functions.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Implement the kvm_arch_flush_shadow_all() and
kvm_arch_flush_shadow_memslot() KVM functions for MIPS to allow guest
physical mappings to be safely changed.
The general MIPS KVM code takes care of flushing of GPA page table
entries. kvm_arch_flush_shadow_all() flushes the whole GPA page table,
and is always called on the cleanup path so there is no need to acquire
the kvm->mmu_lock. kvm_arch_flush_shadow_memslot() flushes only the
range of mappings in the GPA page table corresponding to the slot being
flushed, and happens when memory regions are moved or deleted.
MIPS KVM implementation callbacks are added for handling the
implementation specific flushing of mappings derived from the GPA page
tables. These are implemented for trap_emul.c using
kvm_flush_remote_tlbs() which should now be functional, and will flush
the per-VCPU GVA page tables and ASIDS synchronously (before next
entering guest mode or directly accessing GVA space).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Use the lockless GVA helpers to implement the reading of guest
instructions for emulation. This will allow it to handle asynchronous
TLB flushes when they are implemented.
This is a little more complicated than the other two cases (get_inst()
and dynamic translation) due to the need to emulate the appropriate
guest TLB exception when the address isn't present or isn't valid in the
guest TLB.
Since there are several protected cache ops that may need to be
performed safely, this is abstracted by kvm_mips_guest_cache_op() which
is passed a protected cache op function pointer and takes care of the
lockless operation and fault handling / retry if the op should fail,
taking advantage of the new errors which the protected cache ops can now
return. This allows the existing advance fault handling which relied on
host TLB lookups to be removed, along with the now unused
kvm_mips_host_tlb_lookup(),
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add helpers to allow for lockless direct access to the GVA space, by
changing the VCPU mode to READING_SHADOW_PAGE_TABLES for the duration of
the access. This allows asynchronous TLB flush requests in future
patches to safely trigger either a TLB flush before the direct GVA space
access, or a delay until the in-progress lockless direct access is
complete.
The kvm_trap_emul_gva_lockless_begin() and
kvm_trap_emul_gva_lockless_end() helpers take care of guarding the
direct GVA accesses, and kvm_trap_emul_gva_fault() tries to handle a
uaccess fault resulting from a flush having taken place.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Current guest physical memory is mapped to host physical addresses using
a single linear array (guest_pmap of length guest_pmap_npages). This was
only really meant to be temporary, and isn't sparse, so its wasteful of
memory. A small amount of RAM at GPA 0 and a small boot exception vector
at GPA 0x1fc00000 cannot be represented without a full 128KiB guest_pmap
allocation (MIPS32 with 16KiB pages), which is one reason why QEMU
currently runs its boot code at the top of RAM instead of the usual boot
exception vector address.
Instead use the existing infrastructure for host virtual page table
management to allocate a page table for guest physical memory too. This
should be sufficient for now, assuming the size of physical memory
doesn't exceed the size of virtual memory. It may need extending in
future to handle XPA (eXtended Physical Addressing) in 32-bit guests, as
supported by VZ guests on P5600.
Some of this code is based loosely on Cavium's VZ KVM implementation.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
When exiting from the guest, store the values of the CP0_BadInstr and
CP0_BadInstrP registers if they exist, which contain the encodings of
the instructions which caused the last synchronous exception.
When the instruction is needed for emulation, kvm_get_badinstr() and
kvm_get_badinstrp() are used instead of calling kvm_get_inst() directly,
to decide whether to read the saved CP0_BadInstr/CP0_BadInstrP registers
(if they exist), or read the instruction from memory (if not).
The use of these registers should be more robust than using
kvm_get_inst(), as it actually gives the instruction encoding seen by
the hardware rather than relying on user accessors after the fact, which
can be fooled by incoherent icache or a racing code modification. It
will also work with VZ, where the guest virtual memory isn't directly
accessible by the host with user accessors.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Currently kvm_get_inst() returns KVM_INVALID_INST in the event of a
fault reading the guest instruction. This has the rather arbitrary magic
value 0xdeadbeef. This API isn't very robust, and in fact 0xdeadbeef is
a valid MIPS64 instruction encoding, namely "ld t1,-16657(s5)".
Therefore change the kvm_get_inst() API to return 0 or -EFAULT, and to
return the instruction via a u32 *out argument. We can then drop the
KVM_INVALID_INST definition entirely.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
In order to make use of the CP0_BadInstr & CP0_BadInstrP registers we
need to be a bit more careful not to treat code fetch faults as MMIO,
lest we hit an UNPREDICTABLE register value when we try to emulate the
MMIO load instruction but there was no valid instruction word available
to the hardware.
Add a kvm_is_ifetch_fault() helper to try to figure out whether a load
fault was due to a code fetch, and prevent MMIO instruction emulation in
that case.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
MIPS KVM uses its own variation of get_new_mmu_context() which takes an
extra vcpu pointer (unused) and does exactly the same thing.
Switch to just using get_new_mmu_context() directly and drop KVM's
version of it as it doesn't really serve any purpose.
The nearby declarations of kvm_mips_alloc_new_mmu_context(),
kvm_mips_vcpu_load() and kvm_mips_vcpu_put() are also removed from
kvm_host.h, as no definitions or users exist.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
When exceptions are injected into the MIPS KVM guest, the whole host TLB
is flushed (except any entries in the guest KSeg0 range). This is
certainly not mandated by the architecture when exceptions are taken
(userland can't directly change TLB mappings anyway), and is a pretty
heavyweight operation:
- There may be hundreds of TLB entries especially when a 512 entry FTLB
is present. These are walked and read and conditionally invalidated,
so the TLBINV feature can't be used either.
- It'll indiscriminately wipe out entries belonging to other memory
spaces. A simple ASID regeneration would be much faster to perform,
although it'd wipe out the guest KSeg0 mappings too.
My suspicion is that this was simply to plaster over the fact that
kvm_mips_host_tlb_inv() incorrectly only invalidated TLB entries in the
ASID for guest usermode, and not the ASID for guest kernelmode.
Now that the recent commit "KVM: MIPS/TLB: Flush host TLB entry in
kernel ASID" fixes kvm_mips_host_tlb_inv() to flush TLB entries in the
kernelmode ASID when the guest TLB changes, lets drop these calls and
the otherwise unused kvm_mips_flush_host_tlb().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that KVM no longer uses wired entries we can safely use
local_flush_tlb_all() when we need to flush the entire TLB (on the start
of a new ASID cycle). This doesn't flush wired entries, which allows
other code to use them without KVM clobbering them all the time. It also
is more up to date, knowing about the tlbinv architectural feature,
flushing of micro TLB on cores where that is necessary (Loongson I
believe), and knows to stop the HTW while doing so.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that we have GVA page tables, use standard user accesses with page
faults disabled to read & modify guest instructions. This should be more
robust (than the rather dodgy method of accessing guest mapped segments
by just directly addressing them) and will also work with Enhanced
Virtual Addressing (EVA) host kernel configurations where dedicated
instructions are needed for accessing user mode memory.
For simplicity and speed we do this regardless of the guest segment the
address resides in, rather than handling guest KSeg0 specially with
kmap_atomic() as before.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that the commpage doesn't use wired TLB entries, the per-CPU
vm_init() callback is the only work done by kvm_mips_init_vm_percpu().
The trap & emulate implementation doesn't actually need to do anything
from vm_init(), and the future VZ implementation would be better served
by a kvm_arch_hardware_enable callback anyway.
Therefore drop the vm_init() callback entirely, allowing the
kvm_mips_init_vm_percpu() function to also be dropped, along with the
kvm_mips_instance atomic counter.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that we have GVA page tables and an optimised TLB refill handler in
place, convert the handling of commpage faults from the guest kernel to
fill the GVA page table and invalidate the TLB entry, rather than
filling the wired TLB entry directly.
For simplicity we no longer use a wired entry for the commpage (refill
should be much cheaper with the fast-path handler anyway). Since we
don't need to manipulate the TLB directly any longer, move the function
from tlb.c to mmu.c. This puts it closer to the similar functions
handling KSeg0 and TLB mapped page faults from the guest.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Now that we have GVA page tables and an optimised TLB refill handler in
place, convert the handling of page faults in TLB mapped segment from
the guest to fill a single GVA page table entry and invalidate the TLB
entry, rather than filling a TLB entry pair directly.
Also remove the now unused kvm_mips_get_{kernel,user}_asid() functions
in mmu.c and kvm_mips_host_tlb_write() in tlb.c.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Implement invalidation of specific pairs of GVA page table entries in
one or both of the GVA page tables. This is used when existing mappings
are replaced in the guest TLB by emulated TLBWI/TLBWR instructions. Due
to the sharing of page tables in the host kernel range, we should be
careful not to allow host pages to be invalidated.
Add a helper kvm_mips_walk_pgd() which can be used when walking of
either GPA (future patches) or GVA page tables is needed, optionally
with allocation of page tables along the way when they don't exist.
GPA page table walking will need to be protected by the kvm->mmu_lock,
so we also add a small MMU page cache in each KVM VCPU, like that found
for other architectures but smaller. This allows enough pages to be
pre-allocated to handle a single fault without holding the lock,
allowing the helper to run with the lock held without having to handle
allocation failures.
Using the same mechanism for GVA allows the same code to be used, and
allows it to use the same cache of allocated pages if the GPA walk
didn't need to allocate any new tables.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Implement invalidation of large ranges of virtual addresses from GVA
page tables in response to a guest ASID change (immediately for guest
kernel page table, lazily for guest user page table).
We iterate through a range of page tables invalidating entries and
freeing fully invalidated tables. To minimise overhead the exact ranges
invalidated depends on the flags argument to kvm_mips_flush_gva_pt(),
which also allows it to be used in future KVM_CAP_SYNC_MMU patches in
response to GPA changes, which unlike guest TLB mapping changes affects
guest KSeg0 mappings.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Refactor kvm_mips_host_tlb_inv() to also be able to invalidate any
matching TLB entry in the kernel ASID rather than assuming only the TLB
entries in the user ASID can change. Two new bool user/kernel arguments
allow the caller to indicate whether the mapping should affect each of
the ASIDs for guest user/kernel mode.
- kvm_mips_invalidate_guest_tlb() (used by TLBWI/TLBWR emulation) can
now invalidate any corresponding TLB entry in both the kernel ASID
(guest kernel may have accessed any guest mapping), and the user ASID
if the entry being replaced is in guest USeg (where guest user may
also have accessed it).
- The tlbmod fault handler (and the KSeg0 / TLB mapped / commpage fault
handlers in later patches) can now invalidate the corresponding TLB
entry in whichever ASID is currently active, since only a single page
table will have been updated anyway.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Use functions from the general MIPS TLB exception vector generation code
(tlbex.c) to construct a fast path TLB refill handler similar to the
general one, but cut down and capable of preserving K0 and K1.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Activate the GVA page tables when in guest context. This will allow the
normal Linux TLB refill handler to fill from it when guest memory is
read, as well as preventing accidental reading from user memory.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Wire up a vcpu uninit implementation callback. This will be used for the
clean up of GVA->HPA page tables.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Set init_mm as the active_mm and update mm_cpumask(current->mm) to
reflect that it isn't active when in guest context. This prevents cache
management code from attempting cache flushes on host virtual addresses
while in guest context, for example due to a cache management IPIs or
later when writing of dynamically translated code hits copy on write.
We do this using helpers in static kernel code to avoid having to export
init_mm to modules.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add implementation callbacks for entering the guest (vcpu_run()) and
reentering the guest (vcpu_reenter()), allowing implementation specific
operations to be performed before entering the guest or after returning
to the host without cluttering kvm_arch_vcpu_ioctl_run().
This allows the T&E specific lazy user GVA flush to be moved into
trap_emul.c, along with disabling of the HTW. We also move
kvm_mips_deliver_interrupts() as VZ will need to restore the guest timer
state prior to delivering interrupts.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
The kvm_vcpu_arch structure contains both mm_structs for allocating MMU
contexts (primarily the ASID) but it also copies the resulting ASIDs
into guest_{user,kernel}_asid[] arrays which are referenced from uasm
generated code.
This duplication doesn't seem to serve any purpose, and it gets in the
way of generalising the ASID handling across guest kernel/user modes, so
lets just extract the ASID straight out of the mm_struct on demand, and
in fact there are convenient cpu_context() and cpu_asid() macros for
doing so.
To reduce the verbosity of this code we do also add kern_mm and user_mm
local variables where the kernel and user mm_structs are used.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Convert the get_regs() and set_regs() callbacks to vcpu_load() and
vcpu_put(), which provide a cpu argument and more closely match the
kvm_arch_vcpu_load() / kvm_arch_vcpu_put() that they are called by.
This is in preparation for moving ASID management into the
implementations.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
KVM T&E uses an ASID for guest kernel mode and an ASID for guest user
mode. The current ASID is saved when the guest is scheduled out, and
restored when scheduling back in, with checks for whether the ASID needs
to be regenerated.
This isn't really necessary as the ASID can be easily determined by the
current guest mode, so lets simplify it to just read the required ASID
from guest_kernel_asid or guest_user_asid even if the ASID hasn't been
regenerated.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
The protected cache ops contain no out of line fixup code to return an
error code in the event of a fault, with the cache op being skipped in
that case. For KVM however we'd like to detect this case as page
faulting will be disabled so it could happen during normal operation if
the GVA page tables were flushed, and need to be handled by the caller.
Add the out-of-line fixup code to load the error value -EFAULT into the
return variable, and adapt the protected cache line functions to pass
the error back to the caller.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Export to TLB exception code generating functions so that KVM can
construct a fast TLB refill handler for guest context without
reinventing the wheel quite so much.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Add include guards in asm/uasm.h to allow it to be safely used by a new
header asm/tlbex.h in the next patch to expose TLB exception building
functions for KVM to use.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
pgd_alloc() references init_mm which is not exported to modules. In
order for KVM to be able to use pgd_alloc() to allocate GVA page tables,
move pgd_alloc() into a new pgtable.c file and export it to modules.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
cputime_t is now only used by two architectures:
* powerpc (when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y)
* s390
And since the core doesn't use it anymore, we don't need any arch support
from the others. So we can remove their stub implementations.
A final cleanup would be to provide an efficient pure arch
implementation of cputime_to_nsec() for s390 and powerpc and finally
remove include/linux/cputime.h .
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-36-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So far only Luxul XWR-1750 router was supported. This adds a set of
other Luxul devices based on BCM47XX. It's a standard support for LEDs
and buttons.
Signed-off-by: Dan Haab <dhaab@luxul.com>
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Rafał Miłecki <zajec5@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15106/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
vdso.h includes <spaces.h> implicitly after defining CONFIG_32BITS.
This defeats the override in mach-ip27/spaces.h, leading to
a build error that shows up in kernelci.org:
In file included from arch/mips/include/asm/mach-ip27/spaces.h:29:0,
from arch/mips/include/asm/page.h:12,
from arch/mips/vdso/vdso.h:26,
from arch/mips/vdso/gettimeofday.c:11:
arch/mips/include/asm/mach-generic/spaces.h:28:0: error: "CAC_BASE" redefined [-Werror]
#define CAC_BASE _AC(0x80000000, UL)
An earlier patch tried to make the second definition conditional,
but that patch had the #ifdef in the wrong place, and would lead
to another warning:
arch/mips/include/asm/io.h: In function 'phys_to_virt':
arch/mips/include/asm/io.h:138:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
For all I can tell, there is no other reason than vdso32 to ever
include this file with CONFIG_32BITS set, and the vdso itself should
never refer to the base addresses as it is running in user space,
so adding an #ifdef here is safe.
Link: https://patchwork.kernel.org/patch/9418187/
Fixes: 3ffc17d876 ("MIPS: Adjust MIPS64 CAC_BASE to reflect Config.K0")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
kernelci.org reports tons of build warnings for linux-next:
35 WARNING: "memcpy" [fs/fat/msdos.ko] has no CRC!
35 WARNING: "__copy_user" [fs/fat/fat.ko] has no CRC!
32 WARNING: EXPORT symbol "memset" [vmlinux] version generation failed, symbol will not be versioned.
32 WARNING: EXPORT symbol "copy_page" [vmlinux] version generation failed, symbol will not be versioned.
32 WARNING: EXPORT symbol "clear_page" [vmlinux] version generation failed, symbol will not be versioned.
32 WARNING: EXPORT symbol "__strncpy_from_user_nocheck_asm" [vmlinux] version generation failed, symbol will not be versioned.
The problem here is mainly the missing asm/asm-prototypes.h header file
that is supposed to include the prototypes for each symbol that is exported
from an assembler file.
A second problem is that the asm/uaccess.h header contains some but not
all the necessary declarations for the user access helpers.
Finally, the vdso build is broken once we add asm/asm-prototypes.h, so
we have to fix this at the same time by changing the vdso header. My
approach here is to just not look for exported symbols in the VDSO
assembler files, as the symbols cannot be exported anyway.
Fixes: 576a2f0c5c ("MIPS: Export memcpy & memset functions alongside their definitions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15038/
Patchwork: https://patchwork.linux-mips.org/patch/15069/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Introduce a new architecture-specific get_arch_dma_ops() function
that takes a struct bus_type * argument. Add get_dma_ops() in
<linux/dma-mapping.h>.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some but not all architectures provide set_dma_ops(). Move dma_ops
from struct dev_archdata into struct device such that it becomes
possible on all architectures to configure dma_ops per device.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
When building a kernel targeting a microMIPS ISA, recent GNU linkers
will fail the link if they cannot determine that the target of a branch
or jump is microMIPS code, with errors such as the following:
mips-img-linux-gnu-ld: arch/mips/built-in.o: .text+0x542c:
Unsupported jump between ISA modes; consider recompiling with
interlinking enabled.
mips-img-linux-gnu-ld: final link failed: Bad value
or:
./arch/mips/include/asm/uaccess.h:1017: warning: JALX to a
non-word-aligned address
Placing anything other than an instruction at the start of a function
written in assembly appears to trigger such errors. In order to prepare
for allowing us to follow function prologue macros with an EXPORT_SYMBOL
invocation, end the prologue macros (LEAD, NESTED & FEXPORT) with a
.insn directive. This ensures that the start of the function is marked
as code, which always makes sense for functions & safely prevents us
from hitting the link errors described above.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14508/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Include export.h in the list of generic headers used by the MIPS
architecture for use by later patches.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14506/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The mt7620 has a pin that can be used to generate an external reference
clock. The pinmux setup was missing the definition of said pin. This patch
adds it.
Signed-off-by: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14898/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add plat_fdt_relocated(void*) API to allow the kernel relocation code to
update platform's information about the DTB location if the DTB had to
be moved due to being placed in a location used by the relocated kernel.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14611/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS does not currently define ELF_CORE_COPY_REGS macros and as a result
the generic implementation is used. The generic version attempts to do
directly map (struct pt_regs) into (elf_gregset_t), which isn't correct
for MIPS platforms and also triggers a BUG() at runtime in
include/linux/elfcore.h:16 (BUG_ON(sizeof(*elfregs) != sizeof(*regs)))
[ralf@linux-mips.org: Add semicolons to the macro definitions as I do not
apply https://patchwork.linux-mips.org/patch/14588/ for now.]
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14586/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Current register dump methods for MIPS are implemented inside ptrace
methods, but there will be other uses in the kernel for them, so keep
them separately in process.c and use those definitions for ptrace
instead.
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14587/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
SMP_DUMP has been added as a new IPI signal when kexec support was added
for Cavium Octeon CPUs ('commit 7aa1c8f47e ("MIPS: kdump: Add support")'.
However, the new signal doesn't appear to ever have a proper handler
added (octeon_message_functions[] array has an empty handler for it),
and generic IPI handlers now trigger a BUG() on unhandled signal.
As the method is unused remove it completely and replace its only
invocation with a smp_call_function().
[ralf@linux-mips.org: Renumber SMP_ASK_C0COUNT to avoid numbering gaps.]
Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14630/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 7c151d3d5d ("MIPS: Make use of the ERETNC instruction on MIPS
R6") began clearing LLBit during context switches, but did so on all
systems where it is writable for unclear reasons & did so from a macro
with "software_ll_bit" in its name, which is intended to operate on the
ll_bit variable used by ll/sc emulation for old CPUs.
We do now need to clear LLBit on MIPSr6 systems where we'll use eretnc
to return to userland, but we don't need to do so on MIPSr5 systems with
a writable LLBit.
Move the clear to its own appropriately named macro, do it only for
MIPSr6 systems & comment about why.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14409/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The r2_emul_return field in struct thread_info was used in order to take
an alternate codepath when returning to userland, which (besides not
implementing certain features) effectively used the eretnc instruction
in place of eret. The difference is that eretnc doesn't clear LLBit, and
therefore doesn't cause a linked load & store sequence to fail due to
emulation like eret would.
The reason eret would usually be used to clear LLBit is so that after
context switching we ensure that a load performed by one task doesn't
influence another task. However commit 7c151d3d5d ("MIPS: Make use of
the ERETNC instruction on MIPS R6") which introduced the r2_emul_return
field and conditional use of eretnc also for some reason began
explicitly clearing LLBit during context switches - despite retaining
the use of eret for everything but returns from the pre-r6 instruction
emulation code.
As LLBit is cleared upon context switches anyway, simplify this by using
eretnc unconditionally for MIPSr6 kernels. This allows us to remove the
4 byte r2_emul_return boolean from struct thread_info, simplify the
return to user code in entry.S and avoid the overhead of tracking &
checking state which we don't need.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14408/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On systems with CM3, we must ensure that the L1 & L2 ECC enables are set
to the same value. This is presumed by the hardware & cache corruption
can occur when it is not the case. Support enabling & disabling the L2
ECC checking on CM3 systems where this is controlled via a GCR, and
ensure that it matches the state of L1 ECC checking. Remove I6400 from
the switch statement it will no longer hit, and which was incorrect
since the L2 ECC enable bit isn't in the CP0 ErrCtl register.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14413/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The previous commit made cpu_callin_map redundant, since it is no longer
used to signal secondary CPUs starting, or going offline. Remove it now.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Yang Shi <yang.shi@windriver.com>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14503/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We have a HIGHMEM_DEBUG macro defined in asm/highmem.h with a comment
stating that it should be removed for production, and no users... Kill
it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14523/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The MIPS-specific asm/unaligned.h provides nothing that the generic
version doesn't - it simply uses MIPS-specific endianness macros in
place of generic ones & lacks support for
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS. Remove it & switch to using the
generic version to remove duplication.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14412/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch enables KASLR for Octeon systems. The SMP startup code is
such that the secondaries monitor the volatile variable
'octeon_processor_relocated_kernel_entry' for any non-zero value.
The 'plat_post_relocation hook' is used to set that value to the
kernel entry point of the relocated kernel. The secondary CPUs will
then jusmp to the new kernel, perform their initialization again
and begin waiting for the boot CPU to start them via the relocated
loop 'octeon_spin_wait_boot'. Inspired by Steven's code from Cavium.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14669/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The SAVE_SOME macro is used to save the execution context on all
exceptions.
If an exception occurs while executing user code, the stack is switched
to the kernel's stack for the current task, and register $28 is switched
to point to the current_thread_info, which is at the bottom of the stack
region.
If the exception occurs while executing kernel code, the stack is left,
and this change ensures that register $28 is not updated. This is the
correct behaviour when the kernel can be executing on the separate irq
stack, because the thread_info will not be at the base of it.
With this change, register $28 is only switched to it's kernel
conventional usage of the currrent thread info pointer at the point at
which execution enters kernel space. Doing it on every exception was
redundant, but OK without an IRQ stack, but will be erroneous once that
is introduced.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14742/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>