linux-sg2042/arch
Daniel Borkmann 2c4ea6e28d x86/tlb: Fix tlb flushing when lguest clears PGE
Fengguang reported random corruptions from various locations on x86-32
after commits d2852a2240 ("arch: add ARCH_HAS_SET_MEMORY config") and
9d876e79df ("bpf: fix unlocking of jited image when module ronx not set")
that uses the former. While x86-32 doesn't have a JIT like x86_64, the
bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to
ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module
support built in and therefore never had the DEBUG_SET_MODULE_RONX setting
enabled.

After investigating the crashes further, it turned out that using
set_memory_ro() and set_memory_rw() didn't have the desired effect, for
example, setting the pages as read-only on x86-32 would still let
probe_kernel_write() succeed without error. This behavior would manifest
itself in situations where the vmalloc'ed buffer was accessed prior to
set_memory_*() such as in case of bpf_prog_alloc(). In cases where it
wasn't, the page attribute changes seemed to have taken effect, leading to
the conclusion that a TLB invalidate didn't happen. Moreover, it turned out
that this issue reproduced with qemu in "-cpu kvm64" mode, but not for
"-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger
a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(),
though.

There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends
on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush
(depends on X86_FEATURE_PGE), and cr3 based flush.  For "-cpu host" case in
my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu
kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually
worked fine, and further investigating the cr4 one turned out that
X86_CR4_PGE bit was not set in cr4 register, meaning the
__native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same
value instead of clearing X86_CR4_PGE in the first write to trigger the
flush.

It turned out that X86_CR4_PGE was cleared from cr4 during init from
lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also
cleared from there due to concerns of using PGE in guest kernel that can
lead to hard to trace bugs (see bff672e630 ("lguest: documentation V:
Host") in init()). The CPU feature bits are cleared in dynamic
boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses
static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB
flushing to use, meaning they still used the old setting of the host
kernel.

Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate
to static_cpu_has() checks is too late at this point as sections have been
patched already, so for now, it seems reasonable to switch back to
boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf9599
("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via
cr3 as originally intended, properly makes the new page attributes visible
and thus fixes the crashes seen by Fengguang.

Fixes: c109bf9599 ("x86/cpufeature: Remove cpu_has_pge")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: bp@suse.de
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: lkp@01.org
Cc: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernrl.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com
Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-03-12 11:19:29 +01:00
..
alpha sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
arc sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
arm Merge branch 'stable/for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2017-03-07 10:23:17 -08:00
arm64 ARM: SoC: late DT updates for v4.11 2017-03-03 16:15:48 -08:00
avr32 avr32: Fix build error caused by include file reshuffling 2017-03-07 08:35:48 +01:00
blackfin sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
c6x sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
cris sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> 2017-03-02 08:42:37 +01:00
frv sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
h8300 h8300: Fix build breakage caused by header file changes 2017-03-07 08:35:49 +01:00
hexagon sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
ia64 sched/core: Remove unused prefetch_stack() 2017-03-03 01:45:38 +01:00
m32r sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
m68k sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> 2017-03-02 08:42:37 +01:00
metag sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
microblaze sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> 2017-03-02 08:42:37 +01:00
mips MIPS: Add missing include files 2017-03-08 10:38:06 +01:00
mn10300 sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> 2017-03-02 08:42:37 +01:00
nios2 sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
openrisc sched/headers: Prepare to move kstack_end() from <linux/sched.h> to <linux/sched/task_stack.h> 2017-03-02 08:42:39 +01:00
parisc Merge branch 'parisc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2017-03-03 16:20:06 -08:00
powerpc kexec, x86/purgatory: Unbreak it and clean it up 2017-03-10 20:55:09 +01:00
s390 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-07 14:38:16 -08:00
score sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
sh sched/headers: Remove <asm/ptrace.h> from <linux/sched.h> 2017-03-03 01:45:33 +01:00
sparc sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
tile sched/headers: Move task->mm handling methods to <linux/sched/mm.h> 2017-03-03 01:43:28 +01:00
um sched/headers: Prepare to move kstack_end() from <linux/sched.h> to <linux/sched/task_stack.h> 2017-03-02 08:42:39 +01:00
unicore32 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
x86 x86/tlb: Fix tlb flushing when lguest clears PGE 2017-03-12 11:19:29 +01:00
xtensa Xtensa improvements for v4.11: 2017-03-03 16:17:55 -08:00
.gitignore
Kconfig scripts/spelling.txt: add "an user" pattern and fix typo instances 2017-02-27 18:43:46 -08:00