Spurious NMIs will be observed with the following command:
while :; do
perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp"
-e "cpu/umask=0x03,event=0x0/"
-e "cpu/umask=0x02,event=0x0/"
-e cycles,branches,cache-misses
-e cache-references -- sleep 10
done
The bug was introduced by commit:
8077eca079 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
That commit clears the status bits for the counters used for PEBS
events, by masking the whole 64 bits pebs_enabled. However, only the
low 32 bits of both status and pebs_enabled are reserved for PEBS-able
counters.
For status bits 32-34 are fixed counter overflow bits. For
pebs_enabled bits 32-34 are for PEBS Load Latency.
In the test case, the PEBS Load Latency event and fixed counter event
could overflow at the same time. The fixed counter overflow bit will
be cleared by mistake. Once it is cleared, the fixed counter overflow
never be processed, which finally trigger spurious NMI.
Correct the PEBS enabled mask by ignoring the non-PEBS bits.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 8077eca079 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Consolidate x86 instruction decoder users on the path of
copying original code for kprobes.
Kprobes decodes the same instruction a maximum of 3 times when
preparing the instruction buffer:
- The first time for getting the length of the instruction,
- the 2nd for adjusting displacement,
- and the 3rd for checking whether the instruction is boostable or not.
For each time, the actual decoding target address is slightly
different (1st is original address or recovered instruction buffer,
2nd and 3rd are pointing to the copied buffer), but all have
the same instruction.
Thus, this patch also changes the target address to the copied
buffer at first and reuses the decoded "insn" for displacement
adjusting and checking boostability.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076389643.22469.13151892839998777373.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use probe_kernel_read() for avoiding unexpected faults while
copying kernel text in __recover_probed_insn(),
__recover_optprobed_insn() and __copy_instruction().
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076382624.22469.10091613887942958518.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Set the pages which is used for kprobes' singlestep buffer
and optprobe's trampoline instruction buffer to readonly.
This can prevent unexpected (or unintended) instruction
modification.
This also passes rodata_test as below.
Without this patch, rodata_test shows a warning:
WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x7a9/0xa20
x86/mm: Found insecure W+X mapping at address ffffffffa0000000/0xffffffffa0000000
With this fix, no W+X pages are found:
x86/mm: Checked W+X mappings: passed, no W+X pages found.
rodata_test: all tests were successful
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076375592.22469.14174394514338612247.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Make arch_specific_insn.boostable to boolean, since it has
only 2 states, boostable or not. So it is better to use
boolean from the viewpoint of code readability.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076368566.22469.6322906866458231844.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Do not modify singlestep execution buffer (kprobe.ainsn.insn)
while resuming from single-stepping, instead, modifies
the buffer to add a jump back instruction at preparing
buffer.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076361560.22469.1610155860343077495.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use x86 instruction decoder for checking whether the probed
instruction is able to boost or not, instead of hand-written
code.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076354563.22469.13379472209338986858.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix the description comment of __copy_instruction() function
since it has already been changed to return the length of the
copied instruction.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076347582.22469.3775133607244923462.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix the kprobe-booster not to boost far call instruction,
because a call may store the address in the single-step
execution buffer to the stack, which should be modified
after single stepping.
Currently, this instruction will be filtered as not
boostable in resume_execution(), so this is not a
critical issue.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ye Xiaolong <xiaolong.ye@intel.com>
Link: http://lkml.kernel.org/r/149076340615.22469.14066273186134229909.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fam16h is the same as the default one, remove it. Turn the switch-case
into a simple if-else.
No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20170410122047.3026-3-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull ARM fixes from Russell King:
"A number of ARM fixes:
- prevent oopses caused by dma_get_sgtable() and declared DMA
coherent memory
- fix boot failure on nommu caused by ID_PFR1 access
- a number of kprobes fixes from Jon Medhurst and Masami Hiramatsu"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8665/1: nommu: access ID_PFR1 only if CPUID scheme
ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory
arm: kprobes: Align stack to 8-bytes in test code
arm: kprobes: Fix the return address of multiple kretprobes
arm: kprobes: Skip single-stepping in recursing path if possible
arm: kprobes: Allow to handle reentered kprobe on single-stepping
Pull VFS fixes from Al Viro:
"statx followup fixes and a fix for stack-smashing on alpha"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
alpha: fix stack smashing in old_adjtimex(2)
statx: Include a mask for stx_attributes in struct statx
statx: Reserve the top bit of the mask for future struct expansion
xfs: report crtime and attribute flags to statx
ext4: Add statx support
statx: optimize copy of struct statx to userspace
statx: remove incorrect part of vfs_statx() comment
statx: reject unknown flags when using NULL path
Documentation/filesystems: fix documentation for ->getattr()
Headed to stable:
- disable HFSCR[TM] if TM is not supported, fixes a potential host kernel crash
triggered by a hostile guest, but only in configurations that no one uses
- don't try to fix up misaligned load-with-reservation instructions
- fix flush_(d|i)cache_range() called from modules on little endian kernels
- add missing global TLB invalidate if cxl is active
- fix missing preempt_disable() in crc32c-vpmsum
And a fix for selftests build changes that went in this release:
- selftests/powerpc: Fix standalone powerpc build
Thanks to:
Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran, Paul Mackerras.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJY6LIKAAoJEFHr6jzI4aWAhfcQAKORHx/tJf9w8KqcfSfKfeEL
O8cZEl5/N3ArNXVM5J5QK5KnMVHnoWWR3FWYwntOjt3RJywjJYJ02YvhOVvt4q+M
YinRS34KzAhnT1f526zx97v0BGqi//UJamrcFBUBTd4rLuHGbol7fdtWHVrsMYa0
KWQ+ooPLEpGDk4I3sDz37yeJBQXVpyhC/UF8vzHpvHGPvIQ8Dw8rfWwOZ0HooJuZ
ewKdkeIsYF8SrM461c1GhOI0VXB0q+CMn9mzIaEKMuZMhHDKyiaM5rm8mWXapzcT
HsCQKlF9X9YHAbhbSbz9DGvNCEYaW7T4vnudSNHjQaAJlA4HsmeRwWXy4+zqZuPc
rIbRIFZAyV3wYowN7j3P6Se3lLBDMmlHZvVkygJnwoaR4rmoujePGwdAv8ZH4Udn
hrbieC41HKVxcm5t3whIDOcHmxaAo1MDqmrVhyxJSjgnkdBtN/gnZXvHDb0VeOJV
9wFGGE8WvMXnTKEcjM2l+a14CuOrV/wRbHQ1B1O0Kfk613cPrukMYab6eLPqyJzF
lmkCm1o46bib5oBOmvlqK+5oVuwNyfHmJSzvL+VOylhLVbJPmFJUhHQFssCvsTUf
k36ZAUxH4fbz1TzAPipXl+wrkE/yzthGmA9FTC9hLkYE/rzvrZt9IKowFw1mq5n/
2zFabXQBl5JBQ4hdL54f
=bTuf
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.11:
Headed to stable:
- disable HFSCR[TM] if TM is not supported, fixes a potential host
kernel crash triggered by a hostile guest, but only in
configurations that no one uses
- don't try to fix up misaligned load-with-reservation instructions
- fix flush_(d|i)cache_range() called from modules on little endian
kernels
- add missing global TLB invalidate if cxl is active
- fix missing preempt_disable() in crc32c-vpmsum
And a fix for selftests build changes that went in this release:
- selftests/powerpc: Fix standalone powerpc build
Thanks to: Benjamin Herrenschmidt, Frederic Barrat, Oliver O'Halloran,
Paul Mackerras"
* tag 'powerpc-4.11-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/crypto/crc32c-vpmsum: Fix missing preempt_disable()
powerpc/mm: Add missing global TLB invalidate if cxl is active
powerpc/64: Fix flush_(d|i)cache_range() called from modules
powerpc: Don't try to fix up misaligned load-with-reservation instructions
powerpc: Disable HFSCR[TM] if TM is not supported
selftests/powerpc: Fix standalone powerpc build
Pull sparc fixes from David Miller:
"Several fixes here, mostly having to due with either build errors or
memory corruptions depending upon whether you have THP enabled or not"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: remove unused wp_works_ok macro
sparc32: Export vac_cache_size to fix build error
sparc64: Fix memory corruption when THP is enabled
sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write()
arch/sparc: Avoid DCTI Couples
sparc64: kern_addr_valid regression
sparc64: Add support for 2G hugepages
sparc64: Fix size check in huge_pte_alloc
ARM:
- Fix a problem with GICv3 userspace save/restore
- Clarify GICv2 userspace save/restore ABI
- Be more careful in clearing GIC LRs
- Add missing synchronization primitive to our MMU handling code
PPC:
- Check for a NULL return from kzalloc
s390:
- Prevent translation exception errors on valid page tables for the
instruction-exection-protection support
x86:
- Fix Page-Modification Logging when running a nested guest
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJY5/X8AAoJEED/6hsPKofo8hQH/As3CbihZMysaK6JJTx5oMZw
b3W8p8xVXVu4dKM8WnXa6m5xBDFmOa7eBB+CtT3gP68XnFvMpr/vPmDv6v6i9p8q
7VyALDqqk2fxDmgHEwuETw9XZyuhdyCz/GaINCdnAJs25wTFOA7r0WEW5W8qRJpA
9nQirapdJcknymIch1JqeWlYYmbIaFzT8jItfA9QQ7F9mG4pxC8D1k2D56lNYwTf
FJIgXgkMPe7CPDXmgc/KqT5+iVsc/+SgzP/WdH6bX/007TV71sksxxfz6fIrao0X
RtcL2WIZTXBdSNrvXflHhCfYgogPgCnYp8AsYTIa+IEijcfteJx7UiET47Ne0Ow=
=/SPG
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- Fix a problem with GICv3 userspace save/restore
- Clarify GICv2 userspace save/restore ABI
- Be more careful in clearing GIC LRs
- Add missing synchronization primitive to our MMU handling code
PPC:
- Check for a NULL return from kzalloc
s390:
- Prevent translation exception errors on valid page tables for the
instruction-exection-protection support
x86:
- Fix Page-Modification Logging when running a nested guest"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl
KVM: nVMX: initialize PML fields in vmcs02
KVM: nVMX: do not leak PML full vmexit to L1
KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI
KVM: arm64: Ensure LRs are clear when they should be
kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd
KVM: s390: remove change-recording override support
arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region
arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
- Restore previous SIGBUS behaviour for unhandled unaligned user accesses
- Revert broken support for the contiguous bit in hugetlb (again...)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJY53foAAoJELescNyEwWM0ILEH/An3v6VSnbABDRxvXWkrTvKZ
y4KoHVgSDqehrH8MysrrI7SlB5J5AEGjQI2SzI2InVS4j4Dd/kfqZMeZlo2Z2Idv
KlC4FXb6QhRjJrrLCVIWCZxQL8gqP9KEI+DwB76a46WSYHHWP4ihtfYTxpTSAZbj
mDHOmZ2udc/GjEpPzzPNOhXs0+1dEAHkQa+gW8T5HotQK+VVBwFTJKPXGNjm/YQa
A1lLzYW/R9xRzAeEaJIGa6/jy6jJQ09vkXUdriibRi9qu7+A/xecgq3nb6puwT3j
0BQqvVQ3eAEejlXA5L4xtdwNb3fhe8hK4pq9OgNnhSytntAtbSiqvGTHea03XKY=
=d5YQ
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"We've got a regression fix for the signal raised when userspace makes
an unsupported unaligned access and a revert of the contiguous
(hugepte) support for hugetlb, which has once again been found to be
broken. One day, maybe, we'll get it right.
Summary:
- restore previous SIGBUS behaviour for unhandled unaligned user
accesses
- revert broken support for the contiguous bit in hugetlb (again...)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Revert "Revert "arm64: hugetlb: partial revert of 66b3923a1a0f""
arm64: mm: unaligned access by user-land should be received as SIGBUS
These patches fix a bunch of longstanding (some over a decade old) metag
user copy fault handling bugs. Thanks go to Al Viro for spotting some of
the questionable code in the first place.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJY5g8zAAoJEGwLaZPeOHZ6m8sP/3M5e2VPrtqK7u22QVrjIkOx
XwtIRCbEFUf9XJhDATKnwraRcKlfespu3ibc2BrQ7e2FCa/Nx6nSipUIMW+zUmGX
nu2DHnxh6rEEzHc5pBzmiUH8+AsoK5Q12jeQRu8PviCkn7QOOMFt6ZvOHltE0AOh
OKycaZAbZnvgKQkYmAqxcakesALc0gSRmXLvxlIba7fnhR8fYhSCow3Fxf+DLBbh
Hq7/7cRyZi/GNnYd0NNLVyifbZk2Xt3nfN9TadysCMc4InsYSz4uJycZZy2p/WW+
feHJy0EUvDCRBggU7vgSpAd+7By7+tVSjGTH+dwDVwcP3ukFx6qZcu8dvrbyUoMK
E2QBLb3DSOqPtRmjIq4AYrQUOnzCNwxDG7f02GQGFmV8VudNHUf6Y7W3xIZrT5Ke
0Y+mcYN4ZN/g5rzBTgj5+zOMsQr1kRlBSGwt2LZq6G5oFk4ZvFXyO4pA7ZHLzP30
cRuky9uRYvTBGzwa/vxkecjJ4w7xGZAgWHHtc9yPuetX81bJYTY2bQVeqlMvfCrb
NzOsObBjqokkYqYe4ywaKhyxFo/Ks4X1LkvCepcj5CfcLvfFW6BuZFKw2WvPGGoB
RSEhvWhtuok5eafb+6QDd6G2ZLo4wSnt5qbCjjWjW9sHLxWG88cOsCctq4mBT1tl
Vnd2140H9FoAMY0H2xhc
=MwhE
-----END PGP SIGNATURE-----
Merge tag 'metag-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
Pull metag usercopy fixes from James Hogan:
"Metag usercopy fault handling fixes
These patches fix a bunch of longstanding (some over a decade old)
metag user copy fault handling bugs. Thanks go to Al Viro for spotting
some of the questionable code in the first place"
* tag 'metag-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
metag/usercopy: Add missing fixups
metag/usercopy: Fix src fixup in from user rapf loops
metag/usercopy: Set flags before ADDZ
metag/usercopy: Zero rest of buffer from copy_from_user
metag/usercopy: Add early abort to copy_to_user
metag/usercopy: Fix alignment error checking
metag/usercopy: Drop unused macros
The use of the contiguous bit by our hugetlb implementation violates
the break-before-make requirements of the architecture and can lead to
silent data corruption or TLB conflict aborts. Once again, disable these
hugetlb sizes whilst it gets worked out.
This reverts commit ab2e1b8923.
Conflicts:
arch/arm64/mm/hugetlbpage.c
Signed-off-by: Will Deacon <will.deacon@arm.com>
In crc32c_vpmsum() we call enable_kernel_altivec() without first
disabling preemption, which is not allowed:
WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac7563944 #381
...
NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
Call Trace:
0xc138fd09 (unreliable)
crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
crypto_shash_update+0x88/0x1c0
crc32c+0x64/0x90 [libcrc32c]
dm_bm_checksum+0x48/0x80 [dm_persistent_data]
sb_check+0x84/0x120 [dm_thin_pool]
dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
__create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
pool_ctr+0x4cc/0xb60 [dm_thin_pool]
dm_table_add_target+0x16c/0x3c0
table_load+0x184/0x400
ctl_ioctl+0x2f0/0x560
dm_ctl_ioctl+0x38/0x50
do_vfs_ioctl+0xd8/0x920
SyS_ioctl+0x68/0xc0
system_call+0x38/0xfc
It used to be sufficient just to call pagefault_disable(), because that
also disabled preemption. But the two were decoupled in commit 8222dbe21e
("sched/preempt, mm/fault: Decouple preemption from the page fault
logic") in mid 2015.
So add the missing preempt_disable/enable(). We should also call
disable_kernel_fp(), although it does nothing by default, there is a
debug switch to make it active and all enables should be paired with
disables.
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull MIPS fixes from Ralf Baechle:
"Lantiq:
- Fix adding xbar resoures causing a panic
Loongson3:
- Some Loongson 3A don't identify themselves as having an FTLB so
hardwire that knowledge into CPU probing.
- Handle Loongson 3 TLB peculiarities in the fast path of the RDHWR
emulation.
- Fix invalid FTLB entries with huge page on VTLB+FTLB platforms
- Add missing calculation of S-cache and V-cache cache-way size
Ralink:
- Fix typos in rt3883 pinctrl data
Generic:
- Force o32 fp64 support on 32bit MIPS64r6 kernels
- Yet another build fix after the linux/sched.h changes
- Wire up statx system call
- Fix stack unwinding after introduction of IRQ stack
- Fix spinlock code to build even for microMIPS with recent binutils
SMP-CPS:
- Fix retrieval of VPE mask on big endian CPUs"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: IRQ Stack: Unwind IRQ stack onto task stack
MIPS: c-r4k: Fix Loongson-3's vcache/scache waysize calculation
MIPS: Flush wrong invalid FTLB entry for huge page
MIPS: Check TLB before handle_ri_rdhwr() for Loongson-3
MIPS: Add MIPS_CPU_FTLB for Loongson-3A R2
MIPS: Lantiq: fix missing xbar kernel panic
MIPS: smp-cps: Fix retrieval of VPE mask on big endian CPUs
MIPS: Wire up statx system call
MIPS: Include asm/ptrace.h now linux/sched.h doesn't
MIPS: ralink: Fix typos in rt3883 pinctrl
MIPS: End spinlocks with .insn
MIPS: Force o32 fp64 support on 32bit MIPS64r6 kernels
It's unused for ages, used to be required for ksyms.c back in the v1.1
times.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc32:allmodconfig fails to build with the following error.
ERROR: "vac_cache_size" [drivers/infiniband/sw/rxe/rdma_rxe.ko] undefined!
Fixes: cb88645596 ("infiniband: Fix alignment of mmap cookies ...")
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Doug Ledford <dledford@redhat.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The memory corruption was happening due to incorrect
TLB/TSB flushing of hugepages.
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kzalloc() won't actually fail because sizeof(*resize) is small, but
static checkers complain.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Fixes include:
- Fix a problem with GICv3 userspace save/restore
- Clarify GICv2 userspace save/restore ABI
- Be more careful in clearing GIC LRs
- Add missing synchronization primitive to our MMU handling code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJY5MItAAoJEEtpOizt6ddy4mUH/1Z2rt2mUAYFQpWD/vy9WMxf
zJKMtcLlZZGjeU78zFfWuOxEo1bbDO+tOTV1docNnY8xjyszCZ5XKOqMeo2a7Vfh
1QYHxJTOmgxcRmMsOnJpqUXhhYm9hDxrbU88U/wvoNllLjWBea01ZXiJbWFPBssT
jrdtcCVstDGp3x3D91RgYNNzj9jNw80RBekACZZwYokDRpBZyUb8DYKfUgABFEKT
UPiHrxb8UOVqvbCuXMBNzhUZcuMoAh3oY02R9sV7u1QOXAJYfRV4fOV12fIcYbHf
tnyU8cCxEkSI1pHrpVG6SStcMt8yznQ+UPo0okQNBJXim2yI8+QKHtQlvx7Tjo8=
=tPDd
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
From: Christoffer Dall <cdall@linaro.org>
KVM/ARM Fixes for v4.11-rc6
Fixes include:
- Fix a problem with GICv3 userspace save/restore
- Clarify GICv2 userspace save/restore ABI
- Be more careful in clearing GIC LRs
- Add missing synchronization primitive to our MMU handling code
The rapf copy loops in the Meta usercopy code is missing some extable
entries for HTP cores with unaligned access checking enabled, where
faults occur on the instruction immediately after the faulting access.
Add the fixup labels and extable entries for these cases so that corner
case user copy failures don't cause kernel crashes.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
The fixup code to rewind the source pointer in
__asm_copy_from_user_{32,64}bit_rapf_loop() always rewound the source by
a single unit (4 or 8 bytes), however this is insufficient if the fault
didn't occur on the first load in the loop, as the source pointer will
have been incremented but nothing will have been stored until all 4
register [pairs] are loaded.
Read the LSM_STEP field of TXSTATUS (which is already loaded into a
register), a bit like the copy_to_user versions, to determine how many
iterations of MGET[DL] have taken place, all of which need rewinding.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
The fixup code for the copy_to_user rapf loops reads TXStatus.LSM_STEP
to decide how far to rewind the source pointer. There is a special case
for the last execution of an MGETL/MGETD, since it leaves LSM_STEP=0
even though the number of MGETLs/MGETDs attempted was 4. This uses ADDZ
which is conditional upon the Z condition flag, but the AND instruction
which masked the TXStatus.LSM_STEP field didn't set the condition flags
based on the result.
Fix that now by using ANDS which does set the flags, and also marking
the condition codes as clobbered by the inline assembly.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
Currently we try to zero the destination for a failed read from userland
in fixup code in the usercopy.c macros. The rest of the destination
buffer is then zeroed from __copy_user_zeroing(), which is used for both
copy_from_user() and __copy_from_user().
Unfortunately we fail to zero in the fixup code as D1Ar1 is set to 0
before the fixup code entry labels, and __copy_from_user() shouldn't even
be zeroing the rest of the buffer.
Move the zeroing out into copy_from_user() and rename
__copy_user_zeroing() to raw_copy_from_user() since it no longer does
any zeroing. This also conveniently matches the name needed for
RAW_COPY_USER support in a later patch.
Fixes: 373cd784d0 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
When copying to userland on Meta, if any faults are encountered
immediately abort the copy instead of continuing on and repeatedly
faulting, and worse potentially copying further bytes successfully to
subsequent valid pages.
Fixes: 373cd784d0 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
Fix the error checking of the alignment adjustment code in
raw_copy_from_user(), which mistakenly considers it safe to skip the
error check when aligning the source buffer on a 2 or 4 byte boundary.
If the destination buffer was unaligned it may have started to copy
using byte or word accesses, which could well be at the start of a new
(valid) source page. This would result in it appearing to have copied 1
or 2 bytes at the end of the first (invalid) page rather than none at
all.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
Metag's lib/usercopy.c has a bunch of copy_from_user macros for larger
copies between 5 and 16 bytes which are completely unused. Before fixing
zeroing lets drop these macros so there is less to fix.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-metag@vger.kernel.org
Cc: stable@vger.kernel.org
Commit 4c6d9acce1 ("powerpc/mm: Add hooks for cxl") converted local
TLB invalidates to global if the cxl driver is active. This is necessary
because the CAPP snoops invalidations to forward them to the PSL on the
cxl adapter. However one path was forgotten. native_flush_hash_range()
still does local TLB invalidates, as found out the hard way recently.
This patch fixes it by following the same logic as previously: if the
cxl driver is active, the local TLB invalidates are 'upgraded' to
global.
Fixes: 4c6d9acce1 ("powerpc/mm: Add hooks for cxl")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does
not include a global entry point. A function's global entry point is
used when the function is called from a different TOC context and in the
kernel this typically means a call from a module into the vmlinux (or
vice-versa).
There are a few exported asm functions declared with _GLOBAL() and
calling them from a module will likely crash the kernel since any TOC
relative load will yield garbage.
flush_icache_range() and flush_dcache_range() are both exported to
modules, and use the TOC, so must use _GLOBAL_TOC().
Fixes: 721aeaa9fd ("powerpc: Build little endian ppc64 kernel with ABIv2")
Cc: stable@vger.kernel.org # v3.16+
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
L2 was running with uninitialized PML fields which led to incomplete
dirty bitmap logging. This manifested as all kinds of subtle erratic
behavior of the nested guest.
Fixes: 843e433057 ("KVM: VMX: Add PML support in VMX")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The PML feature is not exposed to guests so we should not be forwarding
the vmexit either.
This commit fixes BSOD 0x20001 (HYPERVISOR_ERROR) when running Hyper-V
enabled Windows Server 2016 in L1 on hardware that supports PML.
Fixes: 843e433057 ("KVM: VMX: Add PML support in VMX")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
In the past, there was only one load-with-reservation instruction,
lwarx, and if a program attempted a lwarx on a misaligned address, it
would take an alignment interrupt and the kernel handler would emulate
it as though it was lwzx, which was not really correct, but benign since
it is loading the right amount of data, and the lwarx should be paired
with a stwcx. to the same address, which would also cause an alignment
interrupt which would result in a SIGBUS being delivered to the process.
We now have 5 different sizes of load-with-reservation instruction. Of
those, lharx and ldarx cause an immediate SIGBUS by luck since their
entries in aligninfo[] overlap instructions which were not fixed up, but
lqarx overlaps with lhz and will be emulated as such. lbarx can never
generate an alignment interrupt since it only operates on 1 byte.
To straighten this out and fix the lqarx case, this adds code to detect
the l[hwdq]arx instructions and return without fixing them up, resulting
in a SIGBUS being delivered to the process.
Cc: stable@vger.kernel.org
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This is a fix that prevents translation exception errors
on valid page tables for the instruction-exection-protection
support. This feature was added during the 4.11 merge window.
We have to remove an old check that would trigger if the
change-recording override is not available (e.g. edat1 disabled
via cpu model).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJY4ktxAAoJEBF7vIC1phx8lroP/3miwUoUReBrQ4s4dFmMdRs9
YJ7y9uqHFNlGHiyLANEjQN5BQzyCZ7SeTvzQqvnBojhVVWV1ylFRIQltopJ4Hd3+
DIApcUXxrSLfSv3rXih/g1/RS7ZbN+HK8jb+uHj47drs96Y6g9/qomea060eIZiV
6WU7MYcf9hf+xVM/r8Hjo40iz+LFKk5qIqEGOGjR8bRvIdjJWyFYjEShyuX+O7F9
F83XryU386xJjLcDgpKIRmJpQ9h0a32e0TLxKSzh5YElIirmeUCNDrHkuBcljHBm
tUp3P2ST6XRA1KtybLedNSS00HP+2pZsq2pZPmm/ACsKQtSItm1VUHjO13SUI1sB
RHlJvsObMtSPbmlbctmTFtJ1L87BTygGTepYTwZIp9LzlmX0ddC7pCJHO+9OJlL1
N8kPmK1P8evVEntO0ej6R7JRPw/kLBb7+gXHBsMifqQQT7akVLMZKolv9z2yYms2
s1rc2XtbS/9QrexqsGu74dWrnjoTfd+SFslMNJK1GDOhgKljbZmU4Ca6wnsopTds
1JGf4Rc++pFANpxKOhkhavo4m/uDd7T6mCD4yPuhJA0lnGMBYUb26ycpL3gbeA1S
PWo1quee1fmGcBjpDHJElyvc5DcgcaAm5b58nBwCusiRMUSmmMt+6sBEiH+n873h
zwDoIwU9K/D0ogyEhvJD
=G97j
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux
From: Christian Borntraeger <borntraeger@de.ibm.com>
KVM: s390: Fix instruction-execution-protection/change-recording override
This is a fix that prevents translation exception errors
on valid page tables for the instruction-exection-protection
support. This feature was added during the 4.11 merge window.
We have to remove an old check that would trigger if the
change-recording override is not available (e.g. edat1 disabled
via cpu model).
We currently have some code to clear the list registers on GICv3, but we
never call this code, because the caller got nuked when removing the old
vgic. We also used to have a similar GICv2 part, but that got lost in
the process too.
Let's reintroduce the logic for GICv2 and call the logic when we
initialize the use of hypervisors on the CPU, for example when first
loading KVM or when exiting a low power state.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g, munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire Guest memory range
holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.
Fixes: commit d5d8184d35 ("KVM: ARM: Memory virtualization setup")
Cc: stable@vger.kernel.org # v3.10+
Cc: Paolo Bonzini <pbonzin@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
After 52d7523 (arm64: mm: allow the kernel to handle alignment faults on
user accesses) commit user-land accesses that produce unaligned exceptions
like in case of aarch32 ldm/stm/ldrd/strd instructions operating on
unaligned memory received by user-land as SIGSEGV. It is wrong, it should
be reported as SIGBUS as it was before 52d7523 commit.
Changed do_bad_area function to take signal and code parameters out of esr
value using fault_info table, so in case of do_alignment_fault fault
user-land will receive SIGBUS. Wrapped access to fault_info table into
esr_to_fault_info function.
Cc: <stable@vger.kernel.org>
Fixes: 52d7523 (arm64: mm: allow the kernel to handle alignment faults on user accesses)
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>