OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Helge Deller	203c110b39	parisc: Fix indenting in puts() Static analysis tools complain that we intended to have curly braces around this indent block. In this case this assumption is wrong, so fix the indenting. Fixes: `2f3c7b8137` ("parisc: Add core code for self-extracting kernel") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.14+	2017-12-17 21:06:25 +01:00
Thomas Gleixner	6cbd2171e8	x86/cpufeatures: Make CPU bugs sticky There is currently no way to force CPU bug bits like CPU feature bits. That makes it impossible to set a bug bit once at boot and have it stick for all upcoming CPUs. Extend the force set/clear arrays to handle bug bits as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.992156574@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:53 +01:00
Thomas Gleixner	79cc741552	x86/paravirt: Provide a way to check for hypervisors There is no generic way to test whether a kernel is running on a specific hypervisor. But that's required to prevent the upcoming user address space separation feature in certain guest modes. Make the hypervisor type enum unconditionally available and provide a helper function which allows to test for a specific type. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.912938129@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:52 +01:00
Thomas Gleixner	a035795499	x86/paravirt: Dont patch flush_tlb_single native_flush_tlb_single() will be changed with the upcoming PAGE_TABLE_ISOLATION feature. This requires to have more code in there than INVLPG. Remove the paravirt patching for it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: linux-mm@kvack.org Cc: michael.schwarz@iaik.tugraz.at Cc: moritz.lipp@iaik.tugraz.at Cc: richard.fellner@student.tugraz.at Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:52 +01:00
Andy Lutomirski	c482feefe1	x86/entry/64: Make cpu_entry_area.tss read-only The TSS is a fairly juicy target for exploits, and, now that the TSS is in the cpu_entry_area, it's no longer protected by kASLR. Make it read-only on x86_64. On x86_32, it can't be RO because it's written by the CPU during task switches, and we use a task gate for double faults. I'd also be nervous about errata if we tried to make it RO even on configurations without double fault handling. [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO. So it's probably safe to assume that it's a non issue, though Intel might have been creative in that area. Still waiting for confirmation. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:52 +01:00
Andy Lutomirski	0f9a48100f	x86/entry: Clean up the SYSENTER_stack code The existing code was a mess, mainly because C arrays are nasty. Turn SYSENTER_stack into a struct, add a helper to find it, and do all the obvious cleanups this enables. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:51 +01:00
Andy Lutomirski	7fbbd5cbeb	x86/entry/64: Remove the SYSENTER stack canary Now that the SYSENTER stack has a guard page, there's no need for a canary to detect overflow after the fact. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.572577316@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:51 +01:00
Andy Lutomirski	40e7f949e0	x86/entry/64: Move the IST stacks into struct cpu_entry_area The IST stacks are needed when an IST exception occurs and are accessed before any kernel code at all runs. Move them into struct cpu_entry_area. The IST stacks are unlike the rest of cpu_entry_area: they're used even for entries from kernel mode. This means that they should be set up before we load the final IDT. Move cpu_entry_area setup to trap_init() for the boot CPU and set it up for all possible CPUs at once in native_smp_prepare_cpus(). Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.480598743@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:51 +01:00
Andy Lutomirski	3386bc8aed	x86/entry/64: Create a per-CPU SYSCALL entry trampoline Handling SYSCALL is tricky: the SYSCALL handler is entered with every single register (except FLAGS), including RSP, live. It somehow needs to set RSP to point to a valid stack, which means it needs to save the user RSP somewhere and find its own stack pointer. The canonical way to do this is with SWAPGS, which lets us access percpu data using the %gs prefix. With PAGE_TABLE_ISOLATION-like pagetable switching, this is problematic. Without a scratch register, switching CR3 is impossible, so %gs-based percpu memory would need to be mapped in the user pagetables. Doing that without information leaks is difficult or impossible. Instead, use a different sneaky trick. Map a copy of the first part of the SYSCALL asm at a different address for each CPU. Now RIP varies depending on the CPU, so we can use RIP-relative memory access to access percpu memory. By putting the relevant information (one scratch slot and the stack address) at a constant offset relative to RIP, we can make SYSCALL work without relying on %gs. A nice thing about this approach is that we can easily switch it on and off if we want pagetable switching to be configurable. The compat variant of SYSCALL doesn't have this problem in the first place -- there are plenty of scratch registers, since we don't care about preserving r8-r15. This patch therefore doesn't touch SYSCALL32 at all. This patch actually seems to be a small speedup. With this patch, SYSCALL touches an extra cache line and an extra virtual page, but the pipeline no longer stalls waiting for SWAPGS. It seems that, at least in a tight loop, the latter outweights the former. Thanks to David Laight for an optimization tip. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.403607157@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:50 +01:00
Andy Lutomirski	3e3b9293d3	x86/entry/64: Return to userspace from the trampoline stack By itself, this is useless. It gives us the ability to run some final code before exit that cannnot run on the kernel stack. This could include a CR3 switch a la PAGE_TABLE_ISOLATION or some kernel stack erasing, for example. (Or even weird things like changing which kernel stack gets used as an ASLR-strengthening mechanism.) The SYSRET32 path is not covered yet. It could be in the future or we could just ignore it and force the slow path if needed. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.306546484@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:50 +01:00
Andy Lutomirski	7f2590a110	x86/entry/64: Use a per-CPU trampoline stack for IDT entries Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack. The asm isn't exactly beautiful, but I think that fully refactoring it can wait. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 14:27:38 +01:00
Andy Lutomirski	6d9256f0a8	x86/espfix/64: Stop assuming that pt_regs is on the entry stack When we start using an entry trampoline, a #GP from userspace will be delivered on the entry stack, not on the task stack. Fix the espfix64 #DF fixup to set up #GP according to TSS.SP0, rather than assuming that pt_regs + 1 == SP0. This won't change anything without an entry stack, but it will make the code continue to work when an entry stack is added. While we're at it, improve the comments to explain what's actually going on. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.130778051@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:57 +01:00
Andy Lutomirski	9aaefe7b59	x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 On 64-bit kernels, we used to assume that TSS.sp0 was the current top of stack. With the addition of an entry trampoline, this will no longer be the case. Store the current top of stack in TSS.sp1, which is otherwise unused but shares the same cacheline. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.050864668@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:56 +01:00
Andy Lutomirski	72f5e08dbb	x86/entry: Remap the TSS into the CPU entry area This has a secondary purpose: it puts the entry stack into a region with a well-controlled layout. A subsequent patch will take advantage of this to streamline the SYSCALL entry code to be able to find it more easily. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:56 +01:00
Andy Lutomirski	1a935bc3d4	x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct SYSENTER_stack should have reliable overflow detection, which means that it needs to be at the bottom of a page, not the top. Move it to the beginning of struct tss_struct and page-align it. Also add an assertion to make sure that the fixed hardware TSS doesn't cross a page boundary. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.881827433@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:56 +01:00
Andy Lutomirski	6e60e58342	x86/dumpstack: Handle stack overflow on all stacks We currently special-case stack overflow on the task stack. We're going to start putting special stacks in the fixmap with a custom layout, so they'll have guard pages, too. Teach the unwinder to be able to unwind an overflow of any of the stacks. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.802057305@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:55 +01:00
Andy Lutomirski	7fb983b4dd	x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss A future patch will move SYSENTER_stack to the beginning of cpu_tss to help detect overflow. Before this can happen, fix several code paths that hardcode assumptions about the old layout. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.722425540@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:55 +01:00
Andy Lutomirski	21506525fb	x86/kasan/64: Teach KASAN about the cpu_entry_area The cpu_entry_area will contain stacks. Make sure that KASAN has appropriate shadow mappings for them. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Potapenko <glider@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: kasan-dev@googlegroups.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.642806442@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:55 +01:00
Andy Lutomirski	ef8813ab28	x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area Currently, the GDT is an ad-hoc array of pages, one per CPU, in the fixmap. Generalize it to be an array of a new 'struct cpu_entry_area' so that we can cleanly add new things to it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.563271721@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:54 +01:00
Andy Lutomirski	aaeed3aeb3	x86/entry/gdt: Put per-CPU GDT remaps in ascending order We currently have CPU 0's GDT at the top of the GDT range and higher-numbered CPUs at lower addresses. This happens because the fixmap is upside down (index 0 is the top of the fixmap). Flip it so that GDTs are in ascending order by virtual address. This will simplify a future patch that will generalize the GDT remap to contain multiple pages. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.471561421@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:54 +01:00
Andy Lutomirski	33a2f1a6c4	x86/dumpstack: Add get_stack_info() support for the SYSENTER stack get_stack_info() doesn't currently know about the SYSENTER stack, so unwinding will fail if we entered the kernel on the SYSENTER stack and haven't fully switched off. Teach get_stack_info() about the SYSENTER stack. With future patches applied that run part of the entry code on the SYSENTER stack and introduce an intentional BUG(), I would get: PANIC: double fault, error_code: 0x0 ... RIP: 0010:do_error_trap+0x33/0x1c0 ... Call Trace: Code: ... With this patch, I get: PANIC: double fault, error_code: 0x0 ... Call Trace: <SYSENTER> ? async_page_fault+0x36/0x60 ? invalid_op+0x22/0x40 ? async_page_fault+0x36/0x60 ? sync_regs+0x3c/0x40 ? sync_regs+0x2e/0x40 ? error_entry+0x6c/0xd0 ? async_page_fault+0x36/0x60 </SYSENTER> Code: ... which is a lot more informative. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.392711508@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:54 +01:00
Andy Lutomirski	1a79797b58	x86/entry/64: Allocate and enable the SYSENTER stack This will simplify future changes that want scratch variables early in the SYSENTER handler -- they'll be able to spill registers to the stack. It also lets us get rid of a SWAPGS_UNSAFE_STACK user. This does not depend on CONFIG_IA32_EMULATION=y because we'll want the stack space even without IA32 emulation. As far as I can tell, the reason that this wasn't done from day 1 is that we use IST for #DB and #BP, which is IMO rather nasty and causes a lot more problems than it solves. But, since #DB uses IST, we don't actually need a real stack for SYSENTER (because SYSENTER with TF set will invoke #DB on the IST stack rather than the SYSENTER stack). I want to remove IST usage from these vectors some day, and this patch is a prerequisite for that as well. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:53 +01:00
Andy Lutomirski	4f3789e792	x86/irq/64: Print the offending IP in the stack overflow warning In case something goes wrong with unwind (not unlikely in case of overflow), print the offending IP where we detected the overflow. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.231677119@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:53 +01:00
Andy Lutomirski	6669a69260	x86/irq: Remove an old outdated comment about context tracking races That race has been fixed and code cleaned up for a while now. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.150551639@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:53 +01:00
Josh Poimboeuf	b02fcf9ba1	x86/unwinder: Handle stack overflows more gracefully There are at least two unwinder bugs hindering the debugging of stack-overflow crashes: - It doesn't deal gracefully with the case where the stack overflows and the stack pointer itself isn't on a valid stack but the to-be-dereferenced data is. - The ORC oops dump code doesn't know how to print partial pt_regs, for the case where if we get an interrupt/exception in early entry code before the full pt_regs have been saved. Fix both issues. http://lkml.kernel.org/r/20171126024031.uxi4numpbjm5rlbr@treble Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.071425003@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:52 +01:00
Andy Lutomirski	d3a0910401	x86/unwinder/orc: Dont bail on stack overflow If the stack overflows into a guard page and the ORC unwinder should work well: by construction, there can't be any meaningful data in the guard page because no writes to the guard page will have succeeded. But there is a bug that prevents unwinding from working correctly: if the starting register state has RSP pointing into a stack guard page, the ORC unwinder bails out immediately. Instead of bailing out immediately check whether the next page up is a valid check page and if so analyze that. As a result the ORC unwinder will start the unwind. Tested by intentionally overflowing the task stack. The result is an accurate call trace instead of a trace consisting purely of '?' entries. There are a few other bugs that are triggered if the unwinder encounters a stack overflow after the first step, but they are outside the scope of this fix. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150604.991389777@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:52 +01:00
Boris Ostrovsky	e17f823453	x86/entry/64/paravirt: Use paravirt-safe macro to access eflags Commit `1d3e53e862` ("x86/entry/64: Refactor IRQ stacks and make them NMI-safe") added DEBUG_ENTRY_ASSERT_IRQS_OFF macro that acceses eflags using 'pushfq' instruction when testing for IF bit. On PV Xen guests looking at IF flag directly will always see it set, resulting in 'ud2'. Introduce SAVE_FLAGS() macro that will use appropriate save_fl pv op when running paravirt. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:59:52 +01:00
Andrey Ryabinin	2aeb07365b	x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow [ Note, this is a Git cherry-pick of the following commit: d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow") ... for easier x86 PTI code testing and back-porting. ] The KASAN shadow is currently mapped using vmemmap_populate() since that provides a semi-convenient way to map pages into init_top_pgt. However, since that no longer zeroes the mapped pages, it is not suitable for KASAN, which requires zeroed shadow memory. Add kasan_populate_shadow() interface and use it instead of vmemmap_populate(). Besides, this allows us to take advantage of gigantic pages and use them to populate the shadow, which should save us some memory wasted on page tables and reduce TLB pressure. Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatashin@oracle.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Alexander Potapenko <glider@google.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:57:26 +01:00
Will Deacon	3382290ed2	locking/barriers: Convert users of lockless_dereference() to READ_ONCE() [ Note, this is a Git cherry-pick of the following commit: `506458efaf` ("locking/barriers: Convert users of lockless_dereference() to READ_ONCE()") ... for easier x86 PTI code testing and back-porting. ] READ_ONCE() now has an implicit smp_read_barrier_depends() call, so it can be used instead of lockless_dereference() without any change in semantics. Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1508840570-22169-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:57:15 +01:00
Daniel Borkmann	ab95477e7c	bpf: fix build issues on um due to mising bpf_perf_event.h [ Note, this is a Git cherry-pick of the following commit: `a23f06f06d` ("bpf: fix build issues on um due to mising bpf_perf_event.h") ... for easier x86 PTI code testing and back-porting. ] Since `c895f6f703` ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build on i386 or x86_64: [...] CC init/main.o In file included from ../include/linux/perf_event.h:18:0, from ../include/linux/trace_events.h:10, from ../include/trace/syscall.h:7, from ../include/linux/syscalls.h:82, from ../init/main.c:20: ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory #include <asm/bpf_perf_event.h> [...] Lets add missing bpf_perf_event.h also to um arch. This seems to be the only one still missing. Fixes: `c895f6f703` ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Richard Weinberger <richard@sigma-star.at> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Richard Weinberger <richard@sigma-star.at> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:56:34 +01:00
Andi Kleen	2fe1bc1f50	perf/x86: Enable free running PEBS for REGS_USER/INTR [ Note, this is a Git cherry-pick of the following commit: `a47ba4d77e` ("perf/x86: Enable free running PEBS for REGS_USER/INTR") ... for easier x86 PTI code testing and back-porting. ] Currently free running PEBS is disabled when user or interrupt registers are requested. Most of the registers are actually available in the PEBS record and can be supported. So we just need to check for the supported registers and then allow it: it is all except for the segment register. For user registers this only works when the counter is limited to ring 3 only, so this also needs to be checked. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:55:17 +01:00
Rudolf Marek	f2dbad36c5	x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD [ Note, this is a Git cherry-pick of the following commit: 2b67799bdf25 ("x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD") ... for easier x86 PTI code testing and back-porting. ] The latest AMD AMD64 Architecture Programmer's Manual adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]). If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers, thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs. Signed-Off-By: Rudolf Marek <r.marek@assembler.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:55:02 +01:00
Ricardo Neri	a8b4db562e	x86/cpufeature: Add User-Mode Instruction Prevention definitions [ Note, this is a Git cherry-pick of the following commit: (limited to the cpufeatures.h file) `3522c2a6a4` ("x86/cpufeature: Add User-Mode Instruction Prevention definitions") ... for easier x86 PTI code testing and back-porting. ] User-Mode Instruction Prevention is a security feature present in new Intel processors that, when set, prevents the execution of a subset of instructions if such instructions are executed in user mode (CPL > 0). Attempting to execute such instructions causes a general protection exception. The subset of instructions comprises: * SGDT - Store Global Descriptor Table * SIDT - Store Interrupt Descriptor Table * SLDT - Store Local Descriptor Table * SMSW - Store Machine Status Word * STR - Store Task Register This feature is also added to the list of disabled-features to allow a cleaner handling of build-time configuration. Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi V. Shankar <ravi.v.shankar@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: ricardo.neri@intel.com Link: http://lkml.kernel.org/r/1509935277-22138-7-git-send-email-ricardo.neri-calderon@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:54:34 +01:00
Ingo Molnar	e5d77a73f3	Merge commit 'upstream-x86-virt' into WIP.x86/mm Merge a minimal set of virt cleanups, for a base for the MM isolation patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:50:01 +01:00
Ingo Molnar	2ec077c186	Merge branch 'upstream-acpi-fixes' into WIP.x86/pti.base Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:09:31 +01:00
Ingo Molnar	650400b2cc	Merge branch 'upstream-x86-selftests' into WIP.x86/pti.base Conflicts: arch/x86/kernel/cpu/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 13:04:28 +01:00
Ingo Molnar	0fd2e9c53d	Merge commit 'upstream-x86-entry' into WIP.x86/mm Pull in a minimal set of v4.15 entry code changes, for a base for the MM isolation patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-17 12:58:53 +01:00
Matthew Wilcox	5f0e3fe6b1	x86/build: Make isoimage work on Debian Debian does not ship a 'mkisofs' symlink to genisoimage. All modern distros ship genisoimage, so just use that directly. That requires renaming the 'genisoimage' function. Also neaten up the 'for' loop while I'm in here. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Changbin Du <changbin.du@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-16 16:23:31 +01:00
Linus Torvalds	f6f3732162	Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" This reverts commits `5c9d2d5c26`, `c7da82b894`, and `e7fe7b5cae`. We'll probably need to revisit this, but basically we should not complicate the get_user_pages_fast() case, and checking the actual page table protection key bits will require more care anyway, since the protection keys depend on the exact state of the VM in question. Particularly when doing a "remote" page lookup (ie in somebody elses VM, not your own), you need to be much more careful than this was. Dave Hansen says: "So, the underlying bug here is that we now a get_user_pages_remote() and then go ahead and do the p_access_permitted() checks against the current PKRU. This was introduced recently with the addition of the new p??_access_permitted() calls. We have checks in the VMA path for the "remote" gups and we avoid consulting PKRU for them. This got missed in the pkeys selftests because I did a ptrace read, but not a write*. I also didn't explicitly test it against something where a COW needed to be done" It's also not entirely clear that it makes sense to check the protection key bits at this level at all. But one possible eventual solution is to make the get_user_pages_fast() case just abort if it sees protection key bits set, which makes us fall back to the regular get_user_pages() case, which then has a vma and can do the check there if we want to. We'll see. Somewhat related to this all: what we _do_ want to do some day is to check the PAGE_USER bit - it should obviously always be set for user pages, but it would be a good check to have back. Because we have no generic way to test for it, we lost it as part of moving over from the architecture-specific x86 GUP implementation to the generic one in commit `e585513b76` ("x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"). Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-15 18:53:22 -08:00
Linus Torvalds	7a3c296ae0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Clamp timeouts to INT_MAX in conntrack, from Jay Elliot. 2) Fix broken UAPI for BPF_PROG_TYPE_PERF_EVENT, from Hendrik Brueckner. 3) Fix locking in ieee80211_sta_tear_down_BA_sessions, from Johannes Berg. 4) Add missing barriers to ptr_ring, from Michael S. Tsirkin. 5) Don't advertise gigabit in sh_eth when not available, from Thomas Petazzoni. 6) Check network namespace when delivering to netlink taps, from Kevin Cernekee. 7) Kill a race in raw_sendmsg(), from Mohamed Ghannam. 8) Use correct address in TCP md5 lookups when replying to an incoming segment, from Christoph Paasch. 9) Add schedule points to BPF map alloc/free, from Eric Dumazet. 10) Don't allow silly mtu values to be used in ipv4/ipv6 multicast, also from Eric Dumazet. 11) Fix SKB leak in tipc, from Jon Maloy. 12) Disable MAC learning on OVS ports of mlxsw, from Yuval Mintz. 13) SKB leak fix in skB_complete_tx_timestamp(), from Willem de Bruijn. 14) Add some new qmi_wwan device IDs, from Daniele Palmas. 15) Fix static key imbalance in ingress qdisc, from Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits) net: qcom/emac: Reduce timeout for mdio read/write net: sched: fix static key imbalance in case of ingress/clsact_init error net: sched: fix clsact init error path ip_gre: fix wrong return value of erspan_rcv net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support pkt_sched: Remove TC_RED_OFFLOADED from uapi net: sched: Move to new offload indication in RED net: sched: Add TCA_HW_OFFLOAD net: aquantia: Increment driver version net: aquantia: Fix typo in ethtool statistics names net: aquantia: Update hw counters on hw init net: aquantia: Improve link state and statistics check interval callback net: aquantia: Fill in multicast counter in ndev stats from hardware net: aquantia: Fill ndev stat couters from hardware net: aquantia: Extend stat counters to 64bit values net: aquantia: Fix hardware DMA stream overload on large MRRS net: aquantia: Fix actual speed capabilities reporting sock: free skb in skb_complete_tx_timestamp on error s390/qeth: update takeover IPs after configuration change s390/qeth: lock IP table while applying takeover changes ...	2017-12-15 13:08:37 -08:00
Linus Torvalds	06f976ecc7	arm64 fixes: - Fix FPSIMD context switch regression introduced in -rc2 - Fix ABI break with SVE CPUID register reporting - Fix use of uninitialised variable - Fixes to hardware access/dirty management and sanity checking - CPU erratum workaround for Falkor CPUs - Fix reporting of writeable+executable mappings - Fix signal reporting for RAS errors -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJaNAZcAAoJELescNyEwWM0brIH/i69foOwEb5CFE8B6Bwh1yMR WtiNMiuLeaOoOmAzTLz5ZMi0W087cth+ycgiXuvnMQtzLIAFXK0gWCZ+CLBHgsiz Q6ba7Li0JbFuSqOyxjxcLMeDRFsD6eVZuoVhBeVi+bjz6Ni44nXF4+TXhep82+Ws xMfK5S8qjhAwFqFuOlL6Goq1zg5lGVJQjpBHkipiWRpmU8AdY16dsajsvMvbZl0A 4LhIntEo5qx1+6un+8w3xoMt5uzb0BeVseTKCEghDgZ2wE351pwQEEQUam0KVhv4 6b803ccpiBbl3oo4yAbgvXigTW6HBjyKA9e/Xy9SG9gpSFZdUNhBcGoCOnaIF/A= =kjAU -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "There are some significant fixes in here for FP state corruption, hardware access/dirty PTE corruption and an erratum workaround for the Falkor CPU. I'm hoping that things finally settle down now, but never say never... Summary: - Fix FPSIMD context switch regression introduced in -rc2 - Fix ABI break with SVE CPUID register reporting - Fix use of uninitialised variable - Fixes to hardware access/dirty management and sanity checking - CPU erratum workaround for Falkor CPUs - Fix reporting of writeable+executable mappings - Fix signal reporting for RAS errors" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fpsimd: Fix copying of FP state from signal frame into task struct arm64/sve: Report SVE to userspace via CPUID only if supported arm64: fix CONFIG_DEBUG_WX address reporting arm64: fault: avoid send SIGBUS two times arm64: hw_breakpoint: Use linux/uaccess.h instead of asm/uaccess.h arm64: Add software workaround for Falkor erratum 1041 arm64: Define cputype macros for Falkor CPU arm64: mm: Fix false positives in set_pte_at access/dirty race detection arm64: mm: Fix pte_mkclean, pte_mkdirty semantics arm64: Initialise high_memory global variable earlier	2017-12-15 12:44:49 -08:00
Linus Torvalds	e53000b1ed	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: - fix the s2ram regression related to confusion around segment register restoration, plus related cleanups that make the code more robust - a guess-unwinder Kconfig dependency fix - an isoimage build target fix for certain tool chain combinations - instruction decoder opcode map fixes+updates, and the syncing of the kernel decoder headers to the objtool headers - a kmmio tracing fix - two 5-level paging related fixes - a topology enumeration fix on certain SMP systems" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version x86/decoder: Fix and update the opcodes map x86/power: Make restore_processor_context() sane x86/power/32: Move SYSENTER MSR restoration to fix_processor_context() x86/power/64: Use struct desc_ptr for the IDT in struct saved_context x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with CONFIG_STACKDEPOT=y x86/build: Don't verify mtools configuration file for isoimage x86/mm/kmmio: Fix mmiotrace for page unaligned addresses x86/boot/compressed/64: Print error if 5-level paging is not supported x86/boot/compressed/64: Detect and handle 5-level paging at boot-time x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation	2017-12-15 12:14:33 -08:00
Linus Torvalds	bde6b37e49	xen: fixes for 4.15-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJaM17CAAoJELDendYovxMvZs0H/3YIAN3wXEqcaZuAoMKdTU3C gSipBUJKwcwo/1qJCtE5CzQnl7U6qC4BPnwNfed1AJUULFBYtpEMw6KFMhK2KELn ADWoJzVbBzouZnj0GHuky4dQnsC5lPL3RjlWqKkzlEA3KebrxLsO7inw5Pru9zPZ y1Isl/nLxV04f8SYITjgThCCdo9pnQOdx275YAcbQprqsOK45j+4d1Mn0fYb1Rjk 7Fk4i9UAZlY/VYycpgyC/m3PKFdawl0uqHXwu4yTKXQJGZ3BwED7ifTvHknaSOhA DVxktnEOTg5IrrqDrr4EdZIgXQcqQF19oIOtEWCKCLvW8j/sBlSAQYkgtn5OpxY= =gqWy -----END PGP SIGNATURE----- Merge tag 'for-linus-4.15-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: "Two minor fixes for running as Xen dom0: - when built as 32 bit kernel on large machines the Xen LAPIC emulation should report a rather modern LAPIC in order to support enough APIC-Ids - The Xen LAPIC emulation is needed for dom0 only, so build it only for kernels supporting to run as Xen dom0" * tag 'for-linus-4.15-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: XEN_ACPI_PROCESSOR is Dom0-only x86/Xen: don't report ancient LAPIC version	2017-12-15 11:32:09 -08:00
Daniel Borkmann	07aee94394	bpf, sparc: fix usage of wrong reg for load_skb_regs after call When LD_ABS/IND is used in the program, and we have a BPF helper call that changes packet data (bpf_helper_changes_pkt_data() returns true), then in case of sparc JIT, we try to reload cached skb data from bpf2sparc[BPF_REG_6]. However, there is no such guarantee or assumption that skb sits in R6 at this point, all helpers changing skb data only have a guarantee that skb sits in R1. Therefore, store BPF R1 in L7 temporarily and after procedure call use L7 to reload cached skb data. skb sitting in R6 is only true at the time when LD_ABS/IND is executed. Fixes: `7a12b5031c` ("sparc64: Add eBPF JIT.") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2017-12-15 09:19:35 -08:00
Daniel Borkmann	87338c8e2c	bpf, ppc64: do not reload skb pointers in non-skb context The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB. Fixes: `156d0e290e` ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2017-12-15 09:19:35 -08:00
Daniel Borkmann	6d59b7dbf7	bpf, s390x: do not reload skb pointers in non-skb context The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the BPF helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB only. Tested on s390x. Fixes: `9db7f2b818` ("s390/bpf: recache skb->data/hlen for skb_vlan_push/pop") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2017-12-15 09:19:35 -08:00
Will Deacon	a454483137	arm64: fpsimd: Fix copying of FP state from signal frame into task struct Commit `9de52a755c` ("arm64: fpsimd: Fix failure to restore FPSIMD state after signals") fixed an issue reported in our FPSIMD signal restore code but inadvertently introduced another issue which tends to manifest as random SEGVs in userspace. The problem is that when we copy the struct fpsimd_state from the kernel stack (populated from the signal frame) into the struct held in the current thread_struct, we blindly copy uninitialised stack into the "cpu" field, which means that context-switching of the FP registers is no longer reliable. This patch fixes the problem by copying only the user_fpsimd member of struct fpsimd_state. We should really rework the function prototypes to take struct user_fpsimd_state * instead, but let's just get this fixed for now. Cc: Dave Martin <Dave.Martin@arm.com> Fixes: `9de52a755c` ("arm64: fpsimd: Fix failure to restore FPSIMD state after signals") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-15 16:12:35 +00:00
Andy Lutomirski	c739f930be	x86/espfix/64: Fix espfix double-fault handling on 5-level systems Using PGDIR_SHIFT to identify espfix64 addresses on 5-level systems was wrong, and it resulted in panics due to unhandled double faults. Use P4D_SHIFT instead, which is correct on 4-level and 5-level machines. This fixes a panic when running x86 selftests on 5-level machines. Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: `1d33b21956` ("x86/espfix: Add support for 5-level paging") Link: http://lkml.kernel.org/r/24c898b4f44fdf8c22d93703850fb384ef87cfdc.1513035461.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-15 16:58:53 +01:00
Randy Dunlap	f5b5fab178	x86/decoder: Fix and update the opcodes map Update x86-opcode-map.txt based on the October 2017 Intel SDM publication. Fix INVPID to INVVPID. Add UD0 and UD1 instruction opcodes. Also sync the objtool and perf tooling copies of this file. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aac062d7-c0f6-96e3-5c92-ed299e2bd3da@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-15 13:45:20 +01:00
Andy Lutomirski	7ee18d6779	x86/power: Make restore_processor_context() sane My previous attempt to fix a couple of bugs in __restore_processor_context(): `5b06bbcfc2` ("x86/power: Fix some ordering bugs in __restore_processor_context()") ... introduced yet another bug, breaking suspend-resume. Rather than trying to come up with a minimal fix, let's try to clean it up for real. This patch fixes quite a few things: - The old code saved a nonsensical subset of segment registers. The only registers that need to be saved are those that contain userspace state or those that can't be trivially restored without percpu access working. (On x86_32, we can restore percpu access by writing __KERNEL_PERCPU to %fs. On x86_64, it's easier to save and restore the kernel's GSBASE.) With this patch, we restore hardcoded values to the kernel state where applicable and explicitly restore the user state after fixing all the descriptor tables. - We used to use an unholy mix of inline asm and C helpers for segment register access. Let's get rid of the inline asm. This fixes the reported s2ram hangs and make the code all around more logical. Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reported-by: Pavel Machek <pavel@ucw.cz> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Fixes: `5b06bbcfc2` ("x86/power: Fix some ordering bugs in __restore_processor_context()") Link: http://lkml.kernel.org/r/398ee68e5c0f766425a7b746becfc810840770ff.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-15 12:21:38 +01:00
Andy Lutomirski	896c80bef4	x86/power/32: Move SYSENTER MSR restoration to fix_processor_context() x86_64 restores system call MSRs in fix_processor_context(), and x86_32 restored them along with segment registers. The 64-bit variant makes more sense, so move the 32-bit code to match the 64-bit code. No side effects are expected to runtime behavior. Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Link: http://lkml.kernel.org/r/65158f8d7ee64dd6bbc6c1c83b3b34aaa854e3ae.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-15 12:18:29 +01:00
Andy Lutomirski	090edbe23f	x86/power/64: Use struct desc_ptr for the IDT in struct saved_context x86_64's saved_context nonsensically used separate idt_limit and idt_base fields and then cast &idt_limit to struct desc_ptr *. This was correct (with -fno-strict-aliasing), but it's confusing, served no purpose, and required #ifdeffery. Simplify this by using struct desc_ptr directly. No change in functionality. Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Zhang Rui <rui.zhang@intel.com> Link: http://lkml.kernel.org/r/967909ce38d341b01d45eff53e278e2728a3a93a.1513286253.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-15 12:18:29 +01:00
Lan Tianyu	f298103359	KVM/x86: Check input paging mode when cs.l is set Reported by syzkaller: WARNING: CPU: 0 PID: 27962 at arch/x86/kvm/emulate.c:5631 x86_emulate_insn+0x557/0x15f0 [kvm] Modules linked in: kvm_intel kvm [last unloaded: kvm] CPU: 0 PID: 27962 Comm: syz-executor Tainted: G B W 4.15.0-rc2-next-20171208+ #32 Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.01.03.0006.040720161253 04/07/2016 RIP: 0010:x86_emulate_insn+0x557/0x15f0 [kvm] RSP: 0018:ffff8807234476d0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88072d0237a0 RCX: ffffffffa0065c4d RDX: 1ffff100e5a046f9 RSI: 0000000000000003 RDI: ffff88072d0237c8 RBP: ffff880723447728 R08: ffff88072d020000 R09: ffffffffa008d240 R10: 0000000000000002 R11: ffffed00e7d87db3 R12: ffff88072d0237c8 R13: ffff88072d023870 R14: ffff88072d0238c2 R15: ffffffffa008d080 FS: 00007f8a68666700(0000) GS:ffff880802200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002009506c CR3: 000000071fec4005 CR4: 00000000003626f0 Call Trace: x86_emulate_instruction+0x3bc/0xb70 [kvm] ? reexecute_instruction.part.162+0x130/0x130 [kvm] vmx_handle_exit+0x46d/0x14f0 [kvm_intel] ? trace_event_raw_event_kvm_entry+0xe7/0x150 [kvm] ? handle_vmfunc+0x2f0/0x2f0 [kvm_intel] ? wait_lapic_expire+0x25/0x270 [kvm] vcpu_enter_guest+0x720/0x1ef0 [kvm] ... When CS.L is set, vcpu should run in the 64 bit paging mode. Current kvm set_sregs function doesn't have such check when userspace inputs sreg values. This will lead unexpected behavior. This patch is to add checks for CS.L, EFER.LME, EFER.LMA and CR4.PAE when get SREG inputs from userspace in order to avoid unexpected behavior. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Tianyu Lan <tianyu.lan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-15 10:01:46 +01:00
Linus Torvalds	c4f988ee51	pci-v4.15-fixes-1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJaMwhcAAoJEFmIoMA60/r88ygP/3fuGrYrolx1yOM/XKjiG1Za B+FF5IK0xcV4Xrdg4K9TrqCkVBlSmTkF2wvD1raFQhvPvzNnTMI0JvI5iwUogSlA EI8rKvKRT0QRtT27URhGWCZOrPMgylE7Vtq7ZGLin2o19A4JQw3TC8mz9QirSbii C5H+WZ6iAQ3ISjOuiFUREqQDCE8FzbYQC69uh/xMRshZFG1WPsX96ZRSvVfPQUeC XwvKqiO4s9I7zyxhS/WSRnnEHO7LNSRRiq8kzP0VKKMAfTo8gLYPoeIPakvXSsQZ j7N1ycWub9DSn7wjh9Zd922wp1EYfxkCW6jilVOd+BlmOxLjyPPGgahbNIrIojwa qiTY5SNdty3h27IT3Hep4yDBoqhXXPORXgh7QRlzizu8Te2SjHY7n9qi1S8Uobn+ s98PoNYU+oelivi9MtKLKcqCaNL0RE8aCQfuA3TzCU/dzR4Trx8obV50f67vsL6e eK67S/KpOVUjHQmGec8iqnrJfHPZkxgaRTOVVYCXKidtJxTf/LyjDKYh8A2Q27A5 wXdRaSxLw25r0alAWzUEIVj/iUPd4nYKY9RTeRPWiXU+wsEUcBO+js6HE1xZAAfk ULDqZKnkKsgW9aWlQU6DzzLsCjxPTRcC8SfK79K47Otua2laABzL47mzo2sSKKFs mNq3i4OiNFq8XCsN18lo =o3Rn -----END PGP SIGNATURE----- Merge tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - add a pci_get_domain_bus_and_slot() stub for the CONFIG_PCI=n case to avoid build breakage in the v4.16 merge window if a pci_get_bus_and_slot() -> pci_get_domain_bus_and_slot() patch gets merged before the PCI tree (Randy Dunlap) - fix an AMD boot regression in the 64bit BAR support added in v4.15 (Christian König) - fix an R-Car use-after-free that causes a crash if no PCIe card is present (Geert Uytterhoeven) * tag 'pci-v4.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Fix use-after-free in probe error path x86/PCI: Only enable a 64bit BAR on single-socket AMD Family 15h x86/PCI: Fix infinite loop in search for 64bit BAR placement PCI: Add pci_get_domain_bus_and_slot() stub	2017-12-14 17:02:39 -08:00
Thiago Rafael Becker	bdcf0a423e	kernel: make groups_sort calling a responsibility group_info allocators In testing, we found that nfsd threads may call set_groups in parallel for the same entry cached in auth.unix.gid, racing in the call of groups_sort, corrupting the groups for that entry and leading to permission denials for the client. This patch: - Make groups_sort globally visible. - Move the call to groups_sort to the modifiers of group_info - Remove the call to groups_sort from set_groups Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com Signed-off-by: Thiago Rafael Becker <thiago.becker@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: NeilBrown <neilb@suse.com> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-14 16:00:49 -08:00
Dave Martin	3fab39997a	arm64/sve: Report SVE to userspace via CPUID only if supported Currently, the SVE field in ID_AA64PFR0_EL1 is visible unconditionally to userspace via the CPU ID register emulation, irrespective of the kernel config. This means that if a kernel configured with CONFIG_ARM64_SVE=n is run on SVE-capable hardware, userspace will see SVE reported as present in the ID regs even though the kernel forbids execution of SVE instructions. This patch makes the exposure of the SVE field in ID_AA64PFR0_EL1 conditional on CONFIG_ARM64_SVE=y. Since future architecture features are likely to encounter a similar requirement, this patch adds a suitable helper macros for use when declaring config-conditional ID register fields. Fixes: `43994d824e` ("arm64/sve: Detect SVE and activate runtime support") Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-14 15:14:30 +00:00
Mark Rutland	1d08a044cf	arm64: fix CONFIG_DEBUG_WX address reporting In ptdump_check_wx(), we pass walk_pgd() a start address of 0 (rather than VA_START) for the init_mm. This means that any reported W&X addresses are offset by VA_START, which is clearly wrong and can make them appear like userspace addresses. Fix this by telling the ptdump code that we're walking init_mm starting at VA_START. We don't need to update the addr_markers, since these are still valid bounds regardless. Cc: <stable@vger.kernel.org> Fixes: `1404d6f13e` ("arm64: dump: Add checking for writable and exectuable pages") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <labbott@redhat.com> Reported-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-14 10:18:23 +00:00
Peter Xu	5663d8f9bb	kvm: x86: fix WARN due to uninitialized guest FPU state ------------[ cut here ]------------ Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers. WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90 CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10 RIP: 0010:ex_handler_fprestore+0x88/0x90 Call Trace: fixup_exception+0x4e/0x60 do_general_protection+0xff/0x270 general_protection+0x22/0x30 RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm] RSP: 0018:ffff8803d5627810 EFLAGS: 00010246 kvm_vcpu_reset+0x3b4/0x3c0 [kvm] kvm_apic_accept_events+0x1c0/0x240 [kvm] kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 do_syscall_64+0x15f/0x600 where kvm_put_guest_fpu is called without a prior kvm_load_guest_fpu. To fix it, move kvm_load_guest_fpu to the very beginning of kvm_arch_vcpu_ioctl_run. Cc: stable@vger.kernel.org Fixes: `f775b13eed` Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-14 09:24:35 +01:00
Wanpeng Li	d73235d17b	KVM: X86: Fix load RFLAGS w/o the fixed bit * Guest State * CR0: actual=0x0000000000000030, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002050, shadow=0x0000000000000000, gh_mask=ffffffffffffe871 CR3 = 0x00000000fffbc000 RSP = 0x0000000000000000 RIP = 0x0000000000000000 RFLAGS=0x00000000 DR7 = 0x0000000000000400 ^^^^^^^^^^ The failed vmentry is triggered by the following testcase when ept=Y: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_regs regs = { .rflags = 0, }; ioctl(r[4], KVM_SET_REGS, &regs); ioctl(r[4], KVM_RUN, 0); } X86 RFLAGS bit 1 is fixed set, userspace can simply clearing bit 1 of RFLAGS with KVM_SET_REGS ioctl which results in vmentry fails. This patch fixes it by oring X86_EFLAGS_FIXED during ioctl. Cc: stable@vger.kernel.org Suggested-by: Jim Mattson <jmattson@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Quan Xu <quan.xu0@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-14 09:24:26 +01:00
Wanpeng Li	ed52870f46	KVM: MMU: Fix infinite loop when there is no available mmu page The below test case can cause infinite loop in kvm when ept=0. #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); ioctl(r[4], KVM_RUN, 0); } It doesn't setup the memory regions, mmu_alloc_shadow/direct_roots() in kvm return 1 when kvm fails to allocate root page table which can result in beblow infinite loop: vcpu_run() { for (;;) { r = vcpu_enter_guest()::kvm_mmu_reload() returns 1 if (r <= 0) break; if (need_resched()) cond_resched(); } } This patch fixes it by returning -ENOSPC when there is no available kvm mmu page for root page table. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Fixes: `26eeb53cf0` (KVM: MMU: Bail out immediately if there is no available mmu page) Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-14 09:24:14 +01:00
Linus Torvalds	4e746cf4f7	RISC-V Fixes for 4.15-rc4 This pull request contains three small fixes that I'm hoping to get into 4.15-rc4: * A fix to a typo in sys_riscv_flush_icache. This only effects error handling, but I think it's a small and obvious enough change that it's sane outside the merge window. * The addition of smp_mb__after_spinlock(), which was recently removed due to an incorrect comment. This is largly a comment change (as there's a big one now), and while it's necessary for complience with the RISC-V memory model the lack of this fence shouldn't manifest as a bug on current implementations. Nonetheless, it still seems saner to have the fence in 4.15. * The removal of some of the HVC_RISCV_SBI driver that snuck into the arch port. This is compile-time dead code in 4.15 (as the driver isn't in yet), and during the review process we found a better way to implement early printk on RISC-V. While this change doesn't do anything, it will make staging our HVC driver easier: without this change the HVC driver we hope to upstream won't build on 4.15 (because the 4.15 arch code would reference a function that no longer exists). Additionally, I'm instituting a bit of a process change so I don't submit things too quickly again: * All the patches I submit during an RC will be from the week before, so everyone has gotten a chance to see them and they've made it through our autobuilders and integration trees. * I'll cherry-pick (single patches) or merge (if it's a patch set) on top of the new RC on Monday morning. * I'll sign the tag on Monday morning, to let the autobuilders pick up exactly what I'm submitting. * Assuming nothing goes wrong, I'll mail the pull request out on Wednesday. If something goes wrong, I'll wait at least a day after re-spinning the tag to let the autobuilders pick things up. Hopefully this will avoid any headaches in the future, barring any emergency fixes. I don't think this is the last patch set we'll want for 4.15: I think I'll want to remove some of the first-level irqchip driver that snuck in as well, which will look a lot like the HVC patch here. This is pending some asm-generic cleanup I'm doing that I haven't quite gotten clean enough to send out yet, though, but hopefully it'll be ready by next week (and still OK for that late). -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlou0sQTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQebsD/4gPSr0adXxl3hXhrHWQIcigWM1QNgp EjNnsHiT1zLs9SNK+CBLQXEjXp/GwqfYDiE6yzaJEhoBm5rHB9c72IThCJkt6fRR 5fuybYXxBXaXVxq0CbaItNzTdqm99iqSCD2p3mh7Dmw/X23EglkYdPqQlUIxWTtw BmlCaH8tn8PwGIgVATwJe8fWEe8tep8uPUW6NAMznLVCKKaOd56fkP+FVA1+M7u4 7kaoG6B8tg3mEbHdsfvo4VSII7HcHsaTa4tOaD7RPjEDKdxFQbzY8HUP38BunqOu v9bsV0BXYvG5dHF0eWtQAfB+y5V7n1fipcfKtZl2HagN9hbKruwQMPZTO+xPT7Am F/VyYxoirzfY5quA0ancS+DV54LfcHC5MH9Xvhacvg7yY3zYTvhsZfEaND0GAMLT sqO3Mi24+yw5Cs24uSJEj6qkxxuHMZUePpzWJWET2vduXe+pqOYF8jQn13SqHyxs QoIaw18RqcBPtX6vABFoMKP6WHPk8No3iki5DrSVSQ0jN9r3HSBxvYCIdG89P6A+ E0a65H91GhY/kVNJAE4YC8MF3e6D8/+KX9+h/q1wyueJV2tTE0cd0ZiQY7hN36HY /QIf8zH4tmYhHAJFNLQnK1sg1Ipino6ABHNv76MhZMxguYD4FAE0nAHsfv1AAa97 Webx5zolwHEz8A== =FRZX -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux Pull RISC-V fixes from Palmer Dabbelt: "This contains three small fixes: - A fix to a typo in sys_riscv_flush_icache. This only effects error handling, but I think it's a small and obvious enough change that it's sane outside the merge window. - The addition of smp_mb__after_spinlock(), which was recently removed due to an incorrect comment. This is largly a comment change (as there's a big one now), and while it's necessary for complience with the RISC-V memory model the lack of this fence shouldn't manifest as a bug on current implementations. Nonetheless, it still seems saner to have the fence in 4.15. - The removal of some of the HVC_RISCV_SBI driver that snuck into the arch port. This is compile-time dead code in 4.15 (as the driver isn't in yet), and during the review process we found a better way to implement early printk on RISC-V. While this change doesn't do anything, it will make staging our HVC driver easier: without this change the HVC driver we hope to upstream won't build on 4.15 (because the 4.15 arch code would reference a function that no longer exists). I don't think this is the last patch set we'll want for 4.15: I think I'll want to remove some of the first-level irqchip driver that snuck in as well, which will look a lot like the HVC patch here. This is pending some asm-generic cleanup I'm doing that I haven't quite gotten clean enough to send out yet, though, but hopefully it'll be ready by next week (and still OK for that late)" * tag 'riscv-for-linus-4.15-rc4-riscv_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: RISC-V: Remove unused CONFIG_HVC_RISCV_SBI code RISC-V: Resurrect smp_mb__after_spinlock() RISC-V: Logical vs Bitwise typo	2017-12-13 20:13:05 -08:00
David S. Miller	8c8f67a46f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2017-12-13 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Addition of explicit scheduling points to map alloc/free in order to avoid having to hold the CPU for too long, from Eric. 2) Fixing of a corruption in overlapping perf_event_output calls from different BPF prog types on the same CPU out of different contexts, from Daniel. 3) Fallout fixes for recent correction of broken uapi for BPF_PROG_TYPE_PERF_EVENT. um had a missing asm header that needed to be pulled in from asm-generic and for BPF selftests the asm-generic include did not work, so similar asm include scheme was adapted for that problematic header that perf is having with other header files under tools, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-13 17:30:04 -05:00
Russell King	cd8165c3d5	ARM: dts: vf610-zii-dev: use XAUI for DSA link ports Use XAUI rather than XGMII for DSA link ports, as this is the interface mode that the switches actually use. XAUI is the 4 lane bus with clock per direction, whereas XGMII is a 32 bit bus with clock. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-13 14:59:15 -05:00
Dongjiu Geng	faa75e147b	arm64: fault: avoid send SIGBUS two times do_sea() calls arm64_notify_die() which will always signal user-space. It also returns whether APEI claimed the external abort as a RAS notification. If it returns failure do_mem_abort() will signal user-space too. do_mem_abort() wants to know if we handled the error, we always call arm64_notify_die() so can always return success. Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com> Reviewed-by: James Morse <james.morse@arm.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-13 09:58:13 +00:00
Anju T Sudhakar	110df8bd3e	powerpc/perf: Fix kfree memory allocated for nest pmus imc_common_cpuhp_mem_free() is the common function for all IMC (In-memory Collection counters) domains to unregister cpuhotplug callback and free memory. Since kfree of memory allocated for nest-imc (per_nest_pmu_arr) is in the common code, all domains (core/nest/thread) can do the kfree in the failure case. This could potentially create a call trace as shown below, where core(/thread/nest) imc pmu initialization fails and in the failure path imc_common_cpuhp_mem_free() free the memory(per_nest_pmu_arr), which is allocated by successfully registered nest units. The call trace is generated in a scenario where core-imc initialization is made to fail and a cpuhotplug is performed in a p9 system. During cpuhotplug ppc_nest_imc_cpu_offline() tries to access per_nest_pmu_arr, which is already freed by core-imc. NIP [c000000000cb6a94] mutex_lock+0x34/0x90 LR [c000000000cb6a88] mutex_lock+0x28/0x90 Call Trace: mutex_lock+0x28/0x90 (unreliable) perf_pmu_migrate_context+0x90/0x3a0 ppc_nest_imc_cpu_offline+0x190/0x1f0 cpuhp_invoke_callback+0x160/0x820 cpuhp_thread_fun+0x1bc/0x270 smpboot_thread_fn+0x250/0x290 kthread+0x1a8/0x1b0 ret_from_kernel_thread+0x5c/0x74 To address this scenario do the kfree(per_nest_pmu_arr) only in case of nest-imc initialization failure, and when there is no other nest units registered. Fixes: `73ce9aec65` ("powerpc/perf: Fix IMC_MAX_PMU macro") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-13 20:51:22 +11:00
Anju T Sudhakar	ad2b6e0102	powerpc/perf/imc: Fix nest-imc cpuhotplug callback failure Oops is observed during boot: Faulting instruction address: 0xc000000000248340 cpu 0x0: Vector: 380 (Data Access Out of Range) at [c000000ff66fb850] pc: c000000000248340: event_function_call+0x50/0x1f0 lr: c00000000024878c: perf_remove_from_context+0x3c/0x100 sp: c000000ff66fbad0 msr: 9000000000009033 dar: 7d20e2a6f92d03c0 pid = 14, comm = cpuhp/0 While registering the cpuhotplug callbacks for nest-imc, if we fail in the cpuhotplug online path for any random node in a multi node system (because the opal call to stop nest-imc counters fails for that node), ppc_nest_imc_cpu_offline() will get invoked for other nodes who successfully returned from cpuhotplug online path. This call trace is generated since in the ppc_nest_imc_cpu_offline() path we are trying to migrate the event context, when nest-imc counters are not even initialized. Patch to add a check to ensure that nest-imc is registered before migrating the event context. Fixes: `885dcd709b` ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-13 20:36:53 +11:00
Ravi Bangoria	f41d84dddc	powerpc/perf: Dereference BHRB entries safely It's theoretically possible that branch instructions recorded in BHRB (Branch History Rolling Buffer) entries have already been unmapped before they are processed by the kernel. Hence, trying to dereference such memory location will result in a crash. eg: Unable to handle kernel paging request for data at address 0xd000000019c41764 Faulting instruction address: 0xc000000000084a14 NIP [c000000000084a14] branch_target+0x4/0x70 LR [c0000000000eb828] record_and_restart+0x568/0x5c0 Call Trace: [c0000000000eb3b4] record_and_restart+0xf4/0x5c0 (unreliable) [c0000000000ec378] perf_event_interrupt+0x298/0x460 [c000000000027964] performance_monitor_exception+0x54/0x70 [c000000000009ba4] performance_monitor_common+0x114/0x120 Fix it by deferefencing the addresses safely. Fixes: `691231846c` ("powerpc/perf: Fix setting of "to" addresses for BHRB") Cc: stable@vger.kernel.org # v3.10+ Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Use probe_kernel_read() which is clearer, tweak change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-13 20:29:20 +11:00
Daniel Borkmann	a23f06f06d	bpf: fix build issues on um due to mising bpf_perf_event.h Since `c895f6f703` ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build on i386 or x86_64: [...] CC init/main.o In file included from ../include/linux/perf_event.h:18:0, from ../include/linux/trace_events.h:10, from ../include/trace/syscall.h:7, from ../include/linux/syscalls.h:82, from ../init/main.c:20: ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error: asm/bpf_perf_event.h: No such file or directory #include <asm/bpf_perf_event.h> [...] Lets add missing bpf_perf_event.h also to um arch. This seems to be the only one still missing. Fixes: `c895f6f703` ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type") Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Richard Weinberger <richard@sigma-star.at> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Randy Dunlap <rdunlap@infradead.org> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Richard Weinberger <richard@sigma-star.at> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2017-12-12 09:51:12 -08:00
Jan Beulich	0f3922a9b9	x86/Xen: don't report ancient LAPIC version Unconditionally reporting a value seen on the P4 or older invokes functionality like io_apic_get_unique_id() on 32-bit builds, resulting in a panic() with sufficiently many CPUs and/or IO-APICs. Doing what that function does would be the hypervisor's responsibility anyway, so makes no sense to be used when running on Xen. Uniformly report a more modern version; this shouldn't matter much as both LAPIC and IO-APIC are being managed entirely / mostly by the hypervisor. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2017-12-12 09:39:17 -05:00
Will Deacon	0e17cada2a	arm64: hw_breakpoint: Use linux/uaccess.h instead of asm/uaccess.h The only inclusion of asm/uaccess.h should be by linux/uaccess.h. All other headers should use the latter. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-12 11:53:26 +00:00
Shanker Donthineni	932b50c7c1	arm64: Add software workaround for Falkor erratum 1041 The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter 4K and next 4K. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occur: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disabling its stage-1 translation if its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-12 11:45:19 +00:00
Shanker Donthineni	c622cc013c	arm64: Define cputype macros for Falkor CPU Add cputype definition macros for Qualcomm Datacenter Technologies Falkor CPU in cputype.h. It's unfortunate that the first revision of the Falkor CPU used the wrong part number 0x800, got fixed in v2 chip with part number 0xC00, and would be used the same value for future revisions. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-12 11:45:19 +00:00
Will Deacon	86c9e8126e	arm64: mm: Fix false positives in set_pte_at access/dirty race detection Jiankang reports that our race detection in set_pte_at is firing when copying the page tables in dup_mmap as a result of a fork(). In this situation, the page table isn't actually live and so there is no way that we can race with a concurrent update from the hardware page table walker. This patch reworks the race detection so that we require either the mm to match the current active_mm (i.e. currently installed in our TTBR0) or the mm_users count to be greater than 1, implying that the page table could be live in another CPU. The mm_users check might still be racy, but we'll avoid false positives and it's not realistic to validate that all the necessary locks are held as part of this assertion. Cc: Yisheng Xie <xieyisheng1@huawei.com> Reported-by: Jiankang Chen <chenjiankang1@huawei.com> Tested-by: Jiankang Chen <chenjiankang1@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-12 11:42:24 +00:00
Linus Torvalds	916b20e02e	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This push fixes the following issues: - buffer overread in RSA - potential use after free in algif_aead. - error path null pointer dereference in af_alg - forbid combinations such as hmac(hmac(sha3)) which may crash - crash in salsa20 due to incorrect API usage" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: salsa20 - fix blkcipher_walk API usage crypto: hmac - require that the underlying hash algorithm is unkeyed crypto: af_alg - fix NULL pointer dereference in crypto: algif_aead - fix reference counting of null skcipher crypto: rsa - fix buffer overread when stripping leading zeroes	2017-12-11 16:32:45 -08:00
Andrey Ryabinin	0a373d4fc2	x86/unwinder/guess: Prevent using CONFIG_UNWINDER_GUESS=y with CONFIG_STACKDEPOT=y Stackdepot doesn't work well with CONFIG_UNWINDER_GUESS=y. The 'guess' unwinder generate awfully large and inaccurate stacktraces, thus stackdepot can't deduplicate stacktraces because they all look like unique. Eventually stackdepot reaches its capacity limit: WARNING: CPU: 0 PID: 545 at lib/stackdepot.c:119 depot_save_stack+0x28e/0x550 Call Trace: ? kasan_kmalloc+0x144/0x160 ? depot_save_stack+0x1f5/0x550 ? do_raw_spin_unlock+0xda/0xf0 ? preempt_count_sub+0x13/0xc0 <...90 lines...> ? do_raw_spin_unlock+0xda/0xf0 Add a STACKDEPOT=n dependency to UNWINDER_GUESS to avoid the problem. Reported-by: kernel test robot <xiaolong.ye@intel.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171130123554.4330-1-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-11 19:07:07 +01:00
Changbin Du	f79ce87fa4	x86/build: Don't verify mtools configuration file for isoimage If mtools.conf is not generated before, 'make isoimage' could complain: Kernel: arch/x86/boot/bzImage is ready (#597) GENIMAGE arch/x86/boot/image.iso *** Missing file: arch/x86/boot/mtools.conf arch/x86/boot/Makefile:144: recipe for target 'isoimage' failed mtools.conf is not used for isoimage generation, so do not check it. Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `4366d57af1` ("x86/build: Factor out fdimage/isoimage generation commands to standalone script") Link: http://lkml.kernel.org/r/1512053480-8083-1-git-send-email-changbin.du@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-11 18:55:38 +01:00
Steve Capper	8781bcbc5e	arm64: mm: Fix pte_mkclean, pte_mkdirty semantics On systems with hardware dirty bit management, the ltp madvise09 unit test fails due to dirty bit information being lost and pages being incorrectly freed. This was bisected to: arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect() Reverting this commit leads to a separate problem, that the unit test retains pages that should have been dropped due to the function madvise_free_pte_range(.) not cleaning pte's properly. Currently pte_mkclean only clears the software dirty bit, thus the following code sequence can appear: pte = pte_mkclean(pte); if (pte_dirty(pte)) // this condition can return true with HW DBM! This patch also adjusts pte_mkclean to set PTE_RDONLY thus effectively clearing both the SW and HW dirty information. In order for this to function on systems without HW DBM, we need to also adjust pte_mkdirty to remove the read only bit from writable pte's to avoid infinite fault loops. Cc: <stable@vger.kernel.org> Fixes: `64c26841b3` ("arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()") Reported-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Tested-by: Bhupinder Thakur <bhupinder.thakur@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-11 16:13:10 +00:00
Steve Capper	f24e5834a2	arm64: Initialise high_memory global variable earlier The high_memory global variable is used by cma_declare_contiguous(.) before it is defined. We don't notice this as we compute __pa(high_memory - 1), and it looks like we're processing a VA from the direct linear map. This problem becomes apparent when we flip the kernel virtual address space and the linear map is moved to the bottom of the kernel VA space. This patch moves the initialisation of high_memory before it used. Cc: <stable@vger.kernel.org> Fixes: `f7426b983a` ("mm: cma: adjust address limit to avoid hitting low/high memory boundary") Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-11 16:13:10 +00:00
Palmer Dabbelt	27b0174525	RISC-V: Remove unused CONFIG_HVC_RISCV_SBI code This is code that probably should never have made it into the kernel in the first place: it depends on a driver that hadn't been reviewed yet. During the HVC_SBI_RISCV review process a better way of doing this was suggested, but that means this code is defunct. It's compile-time disabled in 4.15 because the driver isn't in, so I think it's safe to just remove this for now. CC: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2017-12-11 07:51:09 -08:00
Palmer Dabbelt	3cfa500808	RISC-V: Resurrect smp_mb__after_spinlock() I removed this last week because of an incorrect comment: smp_mb__after_spinlock() is actually still used, and is necessary on RISC-V. It's been resurrected, with a comment that describes what it actually does this time. Thanks to Andrea for finding the bug! Fixes: `3343eb6806` ("RISC-V: Remove smb_mb__{before,after}_spinlock()") CC: Andrea Parri <parri.andrea@gmail.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2017-12-11 07:51:07 -08:00
Dan Carpenter	86ad5c97ce	RISC-V: Logical vs Bitwise typo In the current code, there is a ! logical NOT where a bitwise ~ NOT was intended. It means that we never return -EINVAL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>	2017-12-11 07:51:06 -08:00
Karol Herbst	6d60ce384d	x86/mm/kmmio: Fix mmiotrace for page unaligned addresses If something calls ioremap() with an address not aligned to PAGE_SIZE, the returned address might be not aligned as well. This led to a probe registered on exactly the returned address, but the entire page was armed for mmiotracing. On calling iounmap() the address passed to unregister_kmmio_probe() was PAGE_SIZE aligned by the caller leading to a complete freeze of the machine. We should always page align addresses while (un)registerung mappings, because the mmiotracer works on top of pages, not mappings. We still keep track of the probes based on their real addresses and lengths though, because the mmiotrace still needs to know what are mapped memory regions. Also move the call to mmiotrace_iounmap() prior page aligning the address, so that all probes are unregistered properly, otherwise the kernel ends up failing memory allocations randomly after disabling the mmiotracer. Tested-by: Lyude <lyude@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Pekka Paalanen <ppaalanen@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: nouveau@lists.freedesktop.org Link: http://lkml.kernel.org/r/20171127075139.4928-1-kherbst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-11 15:35:18 +01:00
Linus Torvalds	9c02e0601b	ARM: SoC fixes for 4.15-rc ARM SoC fixes for this merge window: - A revert of all SCPI changes from the 4.15 merge window. They had regressions on the Amlogic platforms, and the submaintainer isn't around to fix these bugs due to vacation, etc. So we agreed to revert and revisit in next release cycle. - A series fixing a number of bugs for ARM CCN interconnect, around module unload, smp_processor_id() in preemptable context, and fixing some memory allocation failure checks. - A handful of devicetree fixes for different platforms, fixing warnings and errors that were previously ignored by the compiler. - The usual set of mostly minor fixes for different platforms. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAloswBYPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3bFoP/R6fAth5zVuUveykTbHYSs1Oz5XsLw4X77mA vg9Y2bBiz82/R7/weISxn28iXd29FQk0inbohI3bGsqJ+OInKiBNJ4UDrvuvrG39 F2ifH7bjg6x9aooXx0VRXdECckO6klVhe8kjRovRZv712bWUA5uRG/6YUxnyC5LX 6smrhFUAhj5StUqaY8rFlF8Pob9ftee0aM8/C3LAggQBhwqU5RXY8HgL0jLb919q ewOuDgO3FtfrigtOlDWkrQLRe+sY2b5D97Z4amIe4rojqvD6i7grRFBfkfDb4gR2 7Vc0FZmk7OaaOLsLChv8H8atCxpDJ6KPpg5NAfXk4KM2kbWLbinkLPMbXxSXB4bC Q26PPZhKP1OupsSl/9fewHbZeaMZSY0kQXHxBIXCtyMV110TsLVmdFTksbV0gVKk x0lBAttX2RZiRU0bFxOHACEXnRTS4/uDCxBvxh5othQZg5EqAEqcFoium2psMvsJ XzFPNetyqPE2KhOUMfwUbOtWs2js5E4HaxwyF9xduw32g10NxtRTKGUMeDrxdtW3 dHQZMr33yLkSkk08S4LMdwUrlONieyJZYeVZ2Jxd+Lv4bn4ZX8mDbLU24BCtLwUI wR+8zS2YSVfer/t5JC+dIJzKAiufeOFovNurlsoFJsE2yNLfekZcZ/x63ZdVQnve 8f0mibwa =DIHP -----END PGP SIGNATURE----- Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: - A revert of all SCPI changes from the 4.15 merge window. They had regressions on the Amlogic platforms, and the submaintainer isn't around to fix these bugs due to vacation, etc. So we agreed to revert and revisit in next release cycle. - A series fixing a number of bugs for ARM CCN interconnect, around module unload, smp_processor_id() in preemptable context, and fixing some memory allocation failure checks. - A handful of devicetree fixes for different platforms, fixing warnings and errors that were previously ignored by the compiler. - The usual set of mostly minor fixes for different platforms. * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (42 commits) ARM64: dts: meson-gx: fix UART pclk clock name ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv ARM: dts: Fix dm814x missing phy-cells property ARM: dts: Fix elm interrupt compiler warning bus: arm-ccn: fix module unloading Error: Removing state 147 which has instances left. bus: arm-cci: Fix use of smp_processor_id() in preemptible context bus: arm-ccn: Fix use of smp_processor_id() in preemptible context bus: arm-ccn: Simplify code bus: arm-ccn: Check memory allocation failure bus: arm-ccn: constify attribute_group structures. firmware: arm_scpi: Revert updates made during v4.15 merge window arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv arm64: dts: sort vendor subdirectories in Makefile alphabetically meson-gx-socinfo: Fix package id parsing ARM: meson: fix spelling mistake: "Couln't" -> "Couldn't" ARM: dts: meson: fix the memory region of the GPIO interrupt controller ARM: dts: meson: correct the sort order for the the gpio_intc node MAINTAINERS: exclude other Socionext SoC DT files from ARM/UNIPHIER entry arm64: dts: uniphier: remove unnecessary interrupt-parent ...	2017-12-10 08:26:59 -08:00
Linus Torvalds	c465fc11e5	KVM fixes for v4.15-rc3 ARM: * A number of issues in the vgic discovered using SMATCH * A bit one-off calculation in out stage base address mask (32-bit and 64-bit) * Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts * Printing unavailable hyp mode as error * Potential spinlock deadlock in the vgic * Avoid calling vgic vcpu free more than once * Broken bit calculation for big endian systems s390: * SPDX tags * Fence storage key accesses from problem state * Make sure that irq_state.flags is not used in the future x86: * Intercept port 0x80 accesses to prevent host instability (CVE) * Use userspace FPU context for guest FPU (mainly an optimization that fixes a double use of kernel FPU) * Do not leak one page per module load * Flush APIC page address cache from MMU invalidation notifiers -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJaLA93AAoJEED/6hsPKofo9msH/2DrqT2FOKfLuxNR2FeUGWr3 lqFoBRUXrVDMINGStnWrV36h/xYzlgJl9jtSDS8dr3VxLqtrNLlDg9NmGeogoZ+k /xewr/jFYoSRfffsvrbkzORUfvu6zqvJwufiwBEJwAfcswiLqPizdFXcxtUL4eZE 9s9sIweo5zp2Xjg5yLOEkyanePKMEht/81zPkHyM+g0ZMoaPam3qZHA0lLzdyRgd G9LpSyiMFHguYYgbwipaVue3zgMY1EdmKQ8C2hEPmZd8nVau26YDwRnAwwLrmVkW sFhGO1Xi18TzQPokzALC25c9v0fqgxL5+fNyFNgWwTc2n9PSwO+IHcy699UH+3A= =Qcqd -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Radim Krčmář: "ARM: - A number of issues in the vgic discovered using SMATCH - A bit one-off calculation in out stage base address mask (32-bit and 64-bit) - Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts - Printing unavailable hyp mode as error - Potential spinlock deadlock in the vgic - Avoid calling vgic vcpu free more than once - Broken bit calculation for big endian systems s390: - SPDX tags - Fence storage key accesses from problem state - Make sure that irq_state.flags is not used in the future x86: - Intercept port 0x80 accesses to prevent host instability (CVE) - Use userspace FPU context for guest FPU (mainly an optimization that fixes a double use of kernel FPU) - Do not leak one page per module load - Flush APIC page address cache from MMU invalidation notifiers" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits) KVM: x86: fix APIC page invalidation KVM: s390: Fix skey emulation permission check KVM: s390: mark irq_state.flags as non-usable KVM: s390: Remove redundant license text KVM: s390: add SPDX identifiers to the remaining files KVM: VMX: fix page leak in hardware_setup() KVM: VMX: remove I/O port 0x80 bypass on Intel hosts x86,kvm: remove KVM emulator get_fpu / put_fpu x86,kvm: move qemu/guest FPU switching out to vcpu_run KVM: arm/arm64: Fix broken GICH_ELRSR big endian conversion KVM: arm/arm64: kvm_arch_destroy_vm cleanups KVM: arm/arm64: Fix spinlock acquisition in vgic_set_owner kvm: arm: don't treat unavailable HYP mode as an error KVM: arm/arm64: Avoid attempting to load timer vgic state without a vgic kvm: arm64: handle single-step of hyp emulated mmio instructions kvm: arm64: handle single-step during SError exceptions kvm: arm64: handle single-step of userspace mmio instructions kvm: arm64: handle single-stepping trapped instructions KVM: arm/arm64: debug: Introduce helper for single-step arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one ...	2017-12-10 08:24:16 -08:00
Olof Johansson	ce39882eb1	Amlogic fixes for v4.15-rc - GPIO interrupt fixes - socinfo fix for GX series - fix typo -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEe4dGDhaSf6n1v/EMWTcYmtP7xmUFAloq33MACgkQWTcYmtP7 xmVqdxAAjP5E/B52UpRqDDxANPrhUPDHLVI+JkJqtmAEn7G9+eyf/WOOJXLsdJTY 1JEqBd2cg2Gnj5VhhfVZDFVVebGboYB6OJPgPlvL+EjYShPf2P0Xug2xFxuxFMOA ed48amV2cz+/3AXGuKmDcUKdN6LAcF1ax7kXCUSQKUbK/s4EmuYXjwHRJrvw1PrH 4Rpt5Zw6u1t5na26KDBit9OkRL5g0o8SX829MhbHqPScDvAi2wYy95i+Ay1PNFin uUojff5V1OBbeYZMamUIjl3e2/BCecPdaBwDFsnz+SlFfYQIy6g0zQEjRa9BGnMi i/LQOr09l/Dl1Wlexl0rwq3vEXr5hT9JfA9IpLBOJRccG6RUl/TarjpwSqtPPFmf eMdIGZd9S+06+vRtJxaFEW+AJ9qFg3EtQCU4EWuzsE1UABa3q5KnIRr5fh5X85N2 HIJmDbnVsVLDv3gAkQykaHkO9YLjFS8Cq58ILWBEFCMKZUL2JanZAB/s5+3xTf+w Y+jQQA8LQ5NgvhcH+iBt+7brkBFcxbWrvgMxnWEsIxR+vFbTF/Dq52foJsjnK7kQ xXa/hyE52EsRiUPXLGJ5Gj+Z89VYfkjmqvLCBVIdpkyfhaOvRse82UcoPDnV0+Dt RiLZ0CS7z432CMLfJAy+YnbdSK/6/Kxv8BeK5DAeY2CvtXl+SYE= =xHta -----END PGP SIGNATURE----- Merge tag 'amlogic-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic into fixes Amlogic fixes for v4.15-rc - GPIO interrupt fixes - socinfo fix for GX series - fix typo * tag 'amlogic-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic: ARM64: dts: meson-gx: fix UART pclk clock name meson-gx-socinfo: Fix package id parsing ARM: meson: fix spelling mistake: "Couln't" -> "Couldn't" ARM: dts: meson: fix the memory region of the GPIO interrupt controller ARM: dts: meson: correct the sort order for the the gpio_intc node Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-09 20:23:29 -08:00
Olof Johansson	69b8df5d6a	Two fixes for dts compiler warnings These recently started showing up with better dtc checks being introduced. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlomypwRHHRvbnlAYXRv bWlkZS5jb20ACgkQG9Q+yVyrpXNVDA//XzuwM4eH7SPQGI2gTiZ21WUU+n1zDzjD bCaHMgtz8rmifFVVBjP9cWIEKyhlUt3owuY3gfaSvzOcL9a4WfVDwS6hAb4TtjVa 0o6oEralG7D/L7QyIGJM3abGdDvKXs3HGyUuqpkt61TnmstJQXWjz6vRFroIP6bl ogn++GFiigBdF+HC555QCMNvWMotN8C7qDbqugmkhE5JBF+1HkQOAFuYb2YT8w2A DxWRgMrR0sS+kVLtJVSFOCluaALmxHICkjbPApS0xXBrvbjBnwx74tdT3Tikd1uu tBniisQU/AMN/HmhsyPP1G8oZYcwhHxEgVZQXCeBK5H2RsCZQTzCP5iywuh3gvzG MUctaPIHkyhx1FvYlSQ9G7SygZDBxSlhz1gIZ4+3a9ZEaH+hZmONJ0tyTPBD78CU yVugCSD2o9Tz0qNGGVcEUI79uGTFiKFt5M5V6WWVgjpFjfMC/Nir+LAUTH8/KiRZ CKkm1Tr6zJwkI50h3vlJLF3nYTDBEZSsQhICGu6wN/0sWqXruLWNty0lKv2HLpEi 8rpNBlz/cmTVFa8X3IlwI9tYt+ZkABWd52FUCQHoXFLDRR6W87VFPLHhWJl7sVnL qvsoo0dvlJVdTHISJgjP5VJdK4Ed2mq3dCHBqFLAcNPndO4Ifjncm8fIWjCJW2T4 rKk/6enGmFU= =BHn4 -----END PGP SIGNATURE----- Merge tag 'omap-for-v4.15/fixes-dt-warnings' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Two fixes for dts compiler warnings These recently started showing up with better dtc checks being introduced. * tag 'omap-for-v4.15/fixes-dt-warnings' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: Fix dm814x missing phy-cells property ARM: dts: Fix elm interrupt compiler warning Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-09 20:22:01 -08:00
Michal Hocko	f335195adf	kmemcheck: rip it out for real Commit `4675ff05de` ("kmemcheck: rip it out") has removed the code but for some reason SPDX header stayed in place. This looks like a rebase mistake in the mmotm tree or the merge mistake. Let's drop those leftovers as well. Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-08 13:40:17 -08:00
Linus Torvalds	e9ef1fe312	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) CAN fixes from Martin Kelly (cancel URBs properly in all the CAN usb drivers). 2) Revert returning -EEXIST from __dev_alloc_name() as this propagates to userspace and broke some apps. From Johannes Berg. 3) Fix conn memory leaks and crashes in TIPC, from Jon Malloc and Cong Wang. 4) Gianfar MAC can't do EEE so don't advertise it by default, from Claudiu Manoil. 5) Relax strict netlink attribute validation, but emit a warning. From David Ahern. 6) Fix regression in checksum offload of thunderx driver, from Florian Westphal. 7) Fix UAPI bpf issues on s390, from Hendrik Brueckner. 8) New card support in iwlwifi, from Ihab Zhaika. 9) BBR congestion control bug fixes from Neal Cardwell. 10) Fix port stats in nfp driver, from Pieter Jansen van Vuuren. 11) Fix leaks in qualcomm rmnet, from Subash Abhinov Kasiviswanathan. 12) Fix DMA API handling in sh_eth driver, from Thomas Petazzoni. 13) Fix spurious netpoll warnings in bnxt_en, from Calvin Owens. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits) net: mvpp2: fix the RSS table entry offset tcp: evaluate packet losses upon RTT change tcp: fix off-by-one bug in RACK tcp: always evaluate losses in RACK upon undo tcp: correctly test congestion state in RACK bnxt_en: Fix sources of spurious netpoll warnings tcp_bbr: reset long-term bandwidth sampling on loss recovery undo tcp_bbr: reset full pipe detection on loss recovery undo tcp_bbr: record "full bw reached" decision in new full_bw_reached bit sfc: pass valid pointers from efx_enqueue_unwind gianfar: Disable EEE autoneg by default tcp: invalidate rate samples during SACK reneging can: peak/pcie_fd: fix potential bug in restarting tx queue can: usb_8dev: cancel urb on -EPIPE and -EPROTO can: kvaser_usb: cancel urb on -EPIPE and -EPROTO can: esd_usb2: cancel urb on -EPIPE and -EPROTO can: ems_usb: cancel urb on -EPIPE and -EPROTO can: mcba_usb: cancel urb on -EPROTO usbnet: fix alignment for frames with no ethernet header tcp: use current time in tcp_rcv_space_adjust() ...	2017-12-08 13:32:44 -08:00
Linus Torvalds	d90696ed61	powerpc fixes for 4.15 #4 One notable fix for kexec on Power9, where we were not clearing MMU PID properly which sometimes leads to hangs. Finally debugged to a root cause by Nick. A revert of a patch which tried to rework our panic handling to get more output on the console, but inadvertently broke reporting the panic to the hypervisor, which apparently people care about. Then a fix for an oops in the PMU code, and finally some s/%p/%px/ in xmon. Thanks to: David Gibson, Nicholas Piggin, Ravi Bangoria. -----BEGIN PGP SIGNATURE----- iQIwBAABCAAaBQJaKoWXExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAp Rw/+KRvwt1jt3vFKrWlcXQ4Mx4UTseSaBO7FsGwyANqNGUNvkIEIAZYu6M9x0LLh tfVowZdJ2vQrgdZy4Rd5zhIjzVaybyENMAMZFmGCxQUidORdibP2qT+3612FmQl0 rczQB4Ra1Jymw+42iwe4WQfyta9cvVgfk7D+1KVWaCXQ0lx8DynZ75yK+U0fensz FPQNdtkfC2D37IFrqtgGBS5YLkeQpfftm8C/eBG0n2tv8PO1KM5xwVU8Ovf5LoIm 8NbWL//H+zUOoU2jCGHDMfg1qLv9owScTMRtquQSmrE1i21mE2lLOSDSM105+AP2 7CVRMMkth8V9w/nauPq0a5OGyzJWtClI9qj2ZPWS2wPF331g58GUNJsEy7OQAJgO QZoqcCkpT5qarmxkcKJlYZGF6AZ/4mIBL9mucfQc/afEgRUqksaUKck0qD5SW28j fm3pPjlMyf2vMRKGgaE9/+by5N/Bmxy2VCoFSuhm1ZrQsIpZXtp/Mfylqz0msdhU VCt4T229S7rdCQTn2TyMNW+iVmjlgvR4OUXvba/eBz67gzGk4huLNB4EnEwHA/SK qhkTJqYbP9B/MBD9GrNLFzG5yZTTv+3OA/aehL0PEGouV7cgMEqyGtYw2afwggRC sf+veK/2apPMnlA5WItEa7JPWaTLsxljZ65acskb7S/4W8g= =wU60 -----END PGP SIGNATURE----- Merge tag 'powerpc-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One notable fix for kexec on Power9, where we were not clearing MMU PID properly which sometimes leads to hangs. Finally debugged to a root cause by Nick. A revert of a patch which tried to rework our panic handling to get more output on the console, but inadvertently broke reporting the panic to the hypervisor, which apparently people care about. Then a fix for an oops in the PMU code, and finally some s/%p/%px/ in xmon. Thanks to: David Gibson, Nicholas Piggin, Ravi Bangoria" * tag 'powerpc-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/xmon: Don't print hashed pointers in xmon powerpc/64s: Initialize ISAv3 MMU registers before setting partition table Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" powerpc/perf: Fix oops when grouping different pmu events	2017-12-08 12:52:09 -08:00
Neil Armstrong	39005e562a	ARM64: dts: meson-gx: fix UART pclk clock name The clock-names for pclk was wrongly set to "core", but the bindings specifies "pclk". This was not cathed until the legacy non-documented bindings were removed. Reported-by: Andreas Färber <afaerber@suse.de> Fixes: `f72d6f6037` ("ARM64: dts: meson-gx: use stable UART bindings with correct gate clock") Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Kevin Hilman <khilman@baylibre.com>	2017-12-08 10:39:11 -08:00
Linus Torvalds	c6b3e9693f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: - three more patches in regard to the SPDX license tags. The missing tags for the files in arch/s390/kvm will be merged via the KVM tree. With that all s390 related files should have their SPDX tags. - a patch to get rid of 'struct timespec' in the DASD driver. - bug fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: fix compat system call table s390/mm: fix off-by-one bug in 5-level page table handling s390: Remove redudant license text s390: add a few more SPDX identifiers s390/dasd: prevent prefix I/O error s390: always save and restore all registers on context switch s390/dasd: remove 'struct timespec' usage s390/qdio: restrict target-full handling to IQDIO s390/qdio: consider ERROR buffers for inbound-full condition s390/virtio: add BSD license to virtio-ccw	2017-12-08 10:10:17 -08:00
Linus Torvalds	6e7e7f4ddc	arm64 fixes: - Fix SW PAN pgd shadowing for kernel threads, EFI and exiting user tasks - Fix FP register leak when a task_struct is re-allocated - Fix potential use-after-free in FP state tracking used by KVM -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJaKUT9AAoJELescNyEwWM0AdIH/RmM1E3LIbOga+9DyqeQ4i8/ +8AVV0wFEyIYZY3APrFEnasrjuunWvCUZMOT5HwkfcWpBUmxKHmgG/Jy0dgvZ9at xC5WAgYZuN2Z1U9smrCWVBiUlojrxbaEPS/RR4QqB0ViHg2xBL8TW6Dolm8Rt4ei UMXyhLAzUPgTIzN+xiW10dg5VqwLv2y1HvbbaF3bUhidrccMeyz+7bpwYfk26n+c 2N7XJqc9t7DxqBpr1ZSwUzAz89wVDI7cCll+9nTS0/UBDeYXSHwTqO8MbmoRVPo9 Mlf0NdFwTqKE0YS/4q2QBfMog4fzJQmQrcrsoHUy/ZA2IBuUxKsBVn4SCG0Y9Q8= =Esv9 -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Fix some more FP register fallout from the SVE patches and also some problems with the PGD tracking in our software PAN emulation code, after we received a crash report from a 3.18 kernel running a backport. Summary: - fix SW PAN pgd shadowing for kernel threads, EFI and exiting user tasks - fix FP register leak when a task_struct is re-allocated - fix potential use-after-free in FP state tracking used by KVM" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/sve: Avoid dereference of dead task_struct in KVM guest entry arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlb arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mm arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu. arm64: fpsimd: Prevent registers leaking from dead tasks	2017-12-08 10:08:23 -08:00
Arnd Bergmann	863204cfda	ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds In configurations without CONFIG_OMAP3 but with secure RAM support, we now run into a link failure: arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram': omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context' The omap3_save_secure_ram() function is only called from the OMAP34xx power management code, so we can simply hide that function in the appropriate #ifdef. Fixes: `d09220a887` ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context") Acked-by: Tony Lindgren <tony@atomide.com> Tested-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2017-12-07 15:52:21 +01:00
Rob Herring	c5bbf358b7	arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv "usb-nop-xceiv" is using the phy binding, but is missing #phy-cells property. This is probably because the binding was the precursor to the phy binding. Fixes the following warning in nspire dts files: Warning (phys_property): Missing property '#phy-cells' in node ... Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2017-12-07 15:52:20 +01:00
Kirill A. Shutemov	6d7e0ba2d2	x86/boot/compressed/64: Print error if 5-level paging is not supported If the machine does not support the paging mode for which the kernel was compiled, the boot process cannot continue. It's not possible to let the kernel detect the mismatch as it does not even reach the point where cpu features can be evaluted due to a triple fault in the KASLR setup. Instead of instantaneous silent reboot, emit an error message which gives the user the information why the boot fails. Fixes: `77ef56e4f0` ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") Reported-by: Borislav Petkov <bp@suse.de> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-mm@kvack.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20171204124059.63515-3-kirill.shutemov@linux.intel.com	2017-12-07 10:36:26 +01:00
Kirill A. Shutemov	08529078d8	x86/boot/compressed/64: Detect and handle 5-level paging at boot-time Prerequisite for fixing the current problem of instantaneous reboots when a 5-level paging kernel is booted on 4-level paging hardware. At the same time this change prepares the decompression code to boot-time switching between 4- and 5-level paging. [ tglx: Folded the GCC < 5 fix. ] Fixes: `77ef56e4f0` ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-mm@kvack.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20171204124059.63515-2-kirill.shutemov@linux.intel.com	2017-12-07 10:34:39 +01:00
Prarit Bhargava	947134d9b0	x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation Documentation/x86/topology.txt defines smp_num_siblings as "The number of threads in a core". Since commit `bbb65d2d36` ("x86: use cpuid vector 0xb when available for detecting cpu topology") smp_num_siblings is the maximum number of threads in a core. If Simultaneous MultiThreading (SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as expected. Use topology_max_smt_threads(), which contains the active numer of threads, in the __max_logical_packages calculation. On a single socket, single core, single thread system __max_smt_threads has not been updated when the __max_logical_packages calculation happens, so its zero which makes the package estimate fail. Initialize it to one, which is the minimum number of threads on a core. [ tglx: Folded the __max_smt_threads fix in ] Fixes: `b4c0a7326f` ("x86/smpboot: Fix __max_logical_packages estimate") Reported-by: Jakub Kicinski <kubakici@wp.pl> Signed-off-by: Prarit Bhargava <prarit@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jakub Kicinski <kubakici@wp.pl> Cc: netdev@vger.kernel.org Cc: "netdev@vger.kernel.org" Cc: Clark Williams <williams@redhat.com> Link: https://lkml.kernel.org/r/20171204164521.17870-1-prarit@redhat.com	2017-12-07 10:28:22 +01:00
Heiko Carstens	e779498df5	s390: fix compat system call table When wiring up the socket system calls the compat entries were incorrectly set. Not all of them point to the corresponding compat wrapper functions, which clear the upper 33 bits of user space pointers, like it is required. Fixes: `977108f89c` ("s390: wire up separate socketcalls system calls") Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-07 07:49:46 +01:00
Linus Torvalds	10f837e52b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fixes from Greg Ungerer: "There are two fixes here. One to add a missing linker section to the m68k architecture linker scripts, the other to fix a defconfig build problem" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k/defconfig: fix stmark2 broken local compilation m68k: add missing SOFTIRQENTRY_TEXT linker section	2017-12-06 18:16:20 -08:00
Linus Torvalds	dd53a4214d	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - make CR4 handling irq-safe, which bug vmware guests ran into - don't crash on early IRQs in Xen guests - don't crash secondary CPU bringup if #UD assisted WARN()ings are triggered - make X86_BUG_FXSAVE_LEAK optional on newer AMD CPUs that have the fix - fix AMD Fam17h microcode loading - fix broadcom_postcore_init() if ACPI is disabled - fix resume regression in __restore_processor_context() - fix Sparse warnings - fix a GCC-8 warning * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Change time() prototype to match __vdso_time() x86: Fix Sparse warnings about non-static functions x86/power: Fix some ordering bugs in __restore_processor_context() x86/PCI: Make broadcom_postcore_init() check acpi_disabled x86/microcode/AMD: Add support for fam17h microcode loading x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD x86/idt: Load idt early in start_secondary x86/xen: Support early interrupts in xen pv guests x86/tlb: Disable interrupts when changing CR4 x86/tlb: Refactor CR4 setting and shadow write	2017-12-06 17:47:29 -08:00
Christian König	a19e269613	x86/PCI: Only enable a 64bit BAR on single-socket AMD Family 15h When we have a multi-socket system, each CPU core needs the same setup. Since this is tricky to do in the fixup code, don't enable a 64bit BAR on multi-socket systems for now. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-12-06 14:57:20 -06:00
Christian König	470195f82e	x86/PCI: Fix infinite loop in search for 64bit BAR placement Break the loop if we can't find some address space for a 64bit BAR. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-12-06 14:57:19 -06:00
Arnd Bergmann	88edb57d1e	x86/vdso: Change time() prototype to match __vdso_time() gcc-8 warns that time() is an alias for __vdso_time() but the two have different prototypes: arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t )' {aka 'int(long int )'} and 'time_t(time_t )' {aka 'long int(long int )'} [-Werror=attribute-alias] int time(time_t *t) ^~~~ arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here I could not figure out whether this is intentional, but I see that changing it to return time_t avoids the warning. Returning 'int' from time() is also a bit questionable, as it causes an overflow in y2038 even on 64-bit architectures that use a 64-bit time_t type. On 32-bit architecture with 64-bit time_t, time() should always be implement by the C library by calling a (to be added) clock_gettime() variant that takes a sufficiently wide argument. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 21:31:46 +01:00
Dave Martin	cb968afc78	arm64/sve: Avoid dereference of dead task_struct in KVM guest entry When deciding whether to invalidate FPSIMD state cached in the cpu, the backend function sve_flush_cpu_state() attempts to dereference __this_cpu_read(fpsimd_last_state). However, this is not safe: there is no guarantee that this task_struct pointer is still valid, because the task could have exited in the meantime. This means that we need another means to get the appropriate value of TIF_SVE for the associated task. This patch solves this issue by adding a cached copy of the TIF_SVE flag in fpsimd_last_state, which we can check without dereferencing the task pointer. In particular, although this patch is not a KVM fix per se, this means that this check is now done safely in the KVM world switch path (which is currently the only user of this code). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-06 19:08:05 +00:00
Colin Ian King	d553d03f70	x86: Fix Sparse warnings about non-static functions Functions x86_vector_debug_show(), uv_handle_nmi() and uv_nmi_setup_common() are local to the source and do not need to be in global scope, so make them static. Fixes up various sparse warnings. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russ Anderson <russ.anderson@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Cc: travis@sgi.com Link: http://lkml.kernel.org/r/20171206173358.24388-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 19:32:58 +01:00
Will Deacon	d96cc49bff	arm64: SW PAN: Update saved ttbr0 value on enter_lazy_tlb enter_lazy_tlb is called when a kernel thread rides on the back of another mm, due to a context switch or an explicit call to unuse_mm where a call to switch_mm is elided. In these cases, it's important to keep the saved ttbr value up to date with the active mm, otherwise we can end up with a stale value which points to a potentially freed page table. This patch implements enter_lazy_tlb for arm64, so that the saved ttbr0 is kept up-to-date with the active mm for kernel threads. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: `39bc88e5e3` ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-06 18:28:10 +00:00
Will Deacon	0adbdfde8c	arm64: SW PAN: Point saved ttbr0 at the zero page when switching to init_mm update_saved_ttbr0 mandates that mm->pgd is not swapper, since swapper contains kernel mappings and should never be installed into ttbr0. However, this means that callers must avoid passing the init_mm to update_saved_ttbr0 which in turn can cause the saved ttbr0 value to be out-of-date in the context of the idle thread. For example, EFI runtime services may leave the saved ttbr0 pointing at the EFI page table, and kernel threads may end up with stale references to freed page tables. This patch changes update_saved_ttbr0 so that the init_mm points the saved ttbr0 value to the empty zero page, which always exists and never contains valid translations. EFI and switch can then call into update_saved_ttbr0 unconditionally. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Vinayak Menon <vinmenon@codeaurora.org> Cc: <stable@vger.kernel.org> Fixes: `39bc88e5e3` ("arm64: Disable TTBR0_EL1 during normal kernel execution") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-06 18:28:10 +00:00
Dave Martin	8884b7bd7e	arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu. There is currently some duplicate logic to associate current's FPSIMD context with the cpu when loading FPSIMD state into the cpu regs. Subsequent patches will update that logic, so in order to ensure it only needs to be done in one place, this patch factors the relevant code out into a new function fpsimd_bind_to_cpu(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-06 18:28:10 +00:00
Dave Martin	071b6d4a5d	arm64: fpsimd: Prevent registers leaking from dead tasks Currently, loading of a task's fpsimd state into the CPU registers is skipped if that task's state is already present in the registers of that CPU. However, the code relies on the struct fpsimd_state * (and by extension struct task_struct *) to unambiguously identify a task. There is a particular case in which this doesn't work reliably: when a task exits, its task_struct may be recycled to describe a new task. Consider the following scenario: 1) Task P loads its fpsimd state onto cpu C. per_cpu(fpsimd_last_state, C) := P; P->thread.fpsimd_state.cpu := C; 2) Task X is scheduled onto C and loads its fpsimd state on C. per_cpu(fpsimd_last_state, C) := X; X->thread.fpsimd_state.cpu := C; 3) X exits, causing X's task_struct to be freed. 4) P forks a new child T, which obtains X's recycled task_struct. T == X. T->thread.fpsimd_state.cpu == C (inherited from P). 5) T is scheduled on C. T's fpsimd state is not loaded, because per_cpu(fpsimd_last_state, C) == T (== X) && T->thread.fpsimd_state.cpu == C. (This is the check performed by fpsimd_thread_switch().) So, T gets X's registers because the last registers loaded onto C were those of X, in (2). This patch fixes the problem by ensuring that the sched-in check fails in (5): fpsimd_flush_task_state(T) is called when T is forked, so that T->thread.fpsimd_state.cpu == C cannot be true. This relies on the fact that T is not schedulable until after copy_thread() completes. Once T's fpsimd state has been loaded on some CPU C there may still be other cpus D for which per_cpu(fpsimd_last_state, D) == &X->thread.fpsimd_state. But D is necessarily != C in this case, and the check in (5) must fail. An alternative fix would be to do refcounting on task_struct. This would result in each CPU holding a reference to the last task whose fpsimd state was loaded there. It's not clear whether this is preferable, and it involves higher overhead than the fix proposed in this patch. It would also move all the task_struct freeing work into the context switch critical section, or otherwise some deferred cleanup mechanism would need to be introduced, neither of which seems obviously justified. Cc: <stable@vger.kernel.org> Fixes: `005f78cd88` ("arm64: defer reloading a task's FPSIMD state to userland resume") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [will: word-smithed the comment so it makes more sense] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-12-06 18:02:21 +00:00
Radim Krčmář	b1394e745b	KVM: x86: fix APIC page invalidation Implementation of the unpinned APIC page didn't update the VMCS address cache when invalidation was done through range mmu notifiers. This became a problem when the page notifier was removed. Re-introduce the arch-specific helper and call it from ...range_start. Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Fixes: `38b9917350` ("kvm: vmx: Implement set_apic_access_page_addr") Fixes: `369ea8242c` ("mm/rmap: update to new mmu_notifier semantic v2") Cc: <stable@vger.kernel.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Wanpeng Li <wanpeng.li@hotmail.com> Tested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2017-12-06 16:10:34 +01:00
Radim Krčmář	d29899a30f	KVM: s390: Fixes for 4.15 - SPDX tags - Fence storage key accesses from problem state - Make sure that irq_state.flags is not used in the future -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJaJ6rwAAoJEBF7vIC1phx8QDIP/jUgH9OpLpg+bhrEUvB7e83X sAQuKv7jVTTEpZ6mTkqjPihdiRC72x1spzz5ACw5XaZ9NoxMiOtABFBtiaTfoYz+ l6eu+qOqk6lIn7n/WR8VQFSuizHa1VjnQiuco/GEUE+3FOZQPVE/u8gpNsjvWwfV gB+45oTCF24LZEgPAotPglMWOtbxjauMmqHkUh3jDgsk0bFCGWe+MR+T3ljIZ45M /6JBDchEibCsfkg4/ck0HjnQ3p9J4gfictAKJWeYgNh/4oB2krId9FNYbxXkOFNX 1+zeurttmRuFjFwVdCD6SoxE0PQTYXnL/hisITxRfX5otoXQ/x5PffiBrXIZucWK fZZvPX0MBNNzIx1UvCaJ8bKEmtzXdGuy5mpzX84kJNqCIkqft/bFrOPAf/p/Nrrv 4RoF00FH6ZZdxPD3rLkBYSs//P6lTEivkrMHGFndHrJc844pVTEN45lTg0ngOcmF aOBbpZQl6etRwobWJdye76OuVszadoECYrnLPP+fFgWjFqp0F3b9Ki1WkSPsZ4E1 isXp/tYRA+/0tZPBQT297tuUXv7c0ID2SROIUvgQt2yC2EdrizWvXsl2QCGsfbxL 8jT5AsPg0U3qUeBAUZP6gtdIAIuN5lj75uOM83CEcPTo4fOkGuvCnHI0LSHz6Ooc oz2Z8aZAENKd+3193FnO =uodY -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: Fixes for 4.15 - SPDX tags - Fence storage key accesses from problem state - Make sure that irq_state.flags is not used in the future	2017-12-06 15:55:44 +01:00
Michael Ellerman	d810418208	powerpc/xmon: Don't print hashed pointers in xmon Since commit `ad67b74d24` ("printk: hash addresses printed with %p") pointers printed with %p are hashed, ie. you don't see the actual pointer value but rather a cryptographic hash of its value. In xmon we want to see the actual pointer values, because xmon is a debugger, so replace %p with %px which prints the actual pointer value. We justify doing this in xmon because 1) xmon is a kernel crash debugger, it's only accessible via the console 2) xmon doesn't print to dmesg, so the pointers it prints are not able to be leaked that way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-07 00:27:01 +11:00
Nicholas Piggin	371b80447f	powerpc/64s: Initialize ISAv3 MMU registers before setting partition table kexec can leave MMU registers set when booting into a new kernel, the PIDR (Process Identification Register) in particular. The boot sequence does not zero PIDR, so it only gets set when CPUs first switch to a userspace processes (until then it's running a kernel thread with effective PID = 0). This leaves a window where a process table entry and page tables are set up due to user processes running on other CPUs, that happen to match with a stale PID. The CPU with that PID may cause speculative accesses that address quadrant 0 (aka userspace addresses), which will result in cached translations and PWC (Page Walk Cache) for that process, on a CPU which is not in the mm_cpumask and so they will not be invalidated properly. The most common result is the kernel hanging in infinite page fault loops soon after kexec (usually in schedule_tail, which is usually the first non-speculative quadrant 0 access to a new PID) due to a stale PWC. However being a stale translation error, it could result in anything up to security and data corruption problems. Fix this by zeroing out PIDR at boot and kexec. Fixes: `7e381c0ff6` ("powerpc/mm/radix: Add mmu context handling callback for radix") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-06 23:32:43 +11:00
Andy Lutomirski	5b06bbcfc2	x86/power: Fix some ordering bugs in __restore_processor_context() __restore_processor_context() had a couple of ordering bugs. It restored GSBASE after calling load_gs_index(), and the latter can call into tracing code. It also tried to restore segment registers before restoring the LDT, which is straight-up wrong. Reorder the code so that we restore GSBASE, then the descriptor tables, then the segments. This fixes two bugs. First, it fixes a regression that broke resume under certain configurations due to irqflag tracing in native_load_gs_index(). Second, it fixes resume when the userspace process that initiated suspect had funny segments. The latter can be reproduced by compiling this: // SPDX-License-Identifier: GPL-2.0 /* * ldt_echo.c - Echo argv[1] while using an LDT segment / int main(int argc, char argv) { int ret; size_t len; char buf; const struct user_desc desc = { .entry_number = 0, .base_addr = 0, .limit = 0xfffff, .seg_32bit = 1, .contents = 0, /* Data, grow-up */ .read_exec_only = 0, .limit_in_pages = 1, .seg_not_present = 0, .useable = 0 }; if (argc != 2) errx(1, "Usage: %s STRING", argv[0]); len = asprintf(&buf, "%s\n", argv[1]); if (len < 0) errx(1, "Out of memory"); ret = syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)); if (ret < -1) errno = -ret; if (ret) err(1, "modify_ldt"); asm volatile ("movw %0, %%es" :: "rm" ((unsigned short)7)); write(1, buf, len); return 0; } and running ldt_echo >/sys/power/mem Without the fix, the latter causes a triple fault on resume. Fixes: `ca37e57bbe` ("x86/entry/64: Add missing irqflags tracing to native_load_gs_index()") Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/6b31721ea92f51ea839e79bd97ade4a75b1eeea2.1512057304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 12:29:12 +01:00
Rafael J. Wysocki	ddec3bdee0	x86/PCI: Make broadcom_postcore_init() check acpi_disabled acpi_os_get_root_pointer() may return a valid address even if acpi_disabled is set, but the host bridge information from the ACPI tables is not going to be used in that case and the Broadcom host bridge initialization should not be skipped then, So make broadcom_postcore_init() check acpi_disabled too to avoid this issue. Fixes: `6361d72b04` (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan) Reported-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Linux PCI <linux-pci@vger.kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lan Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 12:27:47 +01:00
Tom Lendacky	f4e9b7af0c	x86/microcode/AMD: Add support for fam17h microcode loading The size for the Microcode Patch Block (MPB) for an AMD family 17h processor is 3200 bytes. Add a #define for fam17h so that it does not default to 2048 bytes and fail a microcode load/update. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 12:27:24 +01:00
Rudolf Marek	e3811a3f74	x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD The latest AMD AMD64 Architecture Programmer's Manual adds a CPUID feature XSaveErPtr (CPUID_Fn80000008_EBX[2]). If this feature is set, the FXSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES / FXRSTOR, XRSTOR, XRSTORS always save/restore error pointers, thus making the X86_BUG_FXSAVE_LEAK workaround obsolete on such CPUs. Signed-off-by: Rudolf Marek <r.marek@assembler.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Link: https://lkml.kernel.org/r/bdcebe90-62c5-1f05-083c-eba7f08b2540@assembler.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-12-06 12:27:13 +01:00
Janosch Frank	ca76ec9ca8	KVM: s390: Fix skey emulation permission check All skey functions call skey_check_enable at their start, which checks if we are in the PSTATE and injects a privileged operation exception if we are. Unfortunately they continue processing afterwards and perform the operation anyhow as skey_check_enable does not deliver an error if the exception injection was successful. Let's move the PSTATE check into the skey functions and exit them on such an occasion, also we now do not enable skey handling anymore in such a case. Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: `a7e19ab` ("KVM: s390: handle missing storage-key facility") Cc: <stable@vger.kernel.org> # v4.8+ Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-12-06 09:18:43 +01:00
Christian Borntraeger	bb64da9aba	KVM: s390: mark irq_state.flags as non-usable Old kernels did not check for zero in the irq_state.flags field and old QEMUs did not zero the flag/reserved fields when calling KVM_S390_*_IRQ_STATE. Let's add comments to prevent future uses of these fields. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-12-06 09:18:43 +01:00
Greg Kroah-Hartman	940f89a5a3	KVM: s390: Remove redundant license text Now that the SPDX tag is in all arch/s390/kvm/ files, that identifies the license in a specific and legally-defined manner. So the extra GPL text wording can be removed as it is no longer needed at all. This is done on a quest to remove the 700+ different ways that files in the kernel describe the GPL license text. And there's unneeded stuff like the address (sometimes incorrect) for the FSF which is never needed. No copyright headers or other non-license-description text was removed. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20171124140043.10062-9-gregkh@linuxfoundation.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-12-06 09:18:42 +01:00
Greg Kroah-Hartman	d809aa2387	KVM: s390: add SPDX identifiers to the remaining files It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the arch/s390/kvm/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20171124140043.10062-3-gregkh@linuxfoundation.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-12-06 09:18:39 +01:00
Linus Torvalds	328b4ed93b	x86: don't hash faulting address in oops printout Things like this will probably keep showing up for other architectures and other special cases. I actually thought we already used %lx for this, and that is indeed _historically_ the case, but we moved to %p when merging the 32-bit and 64-bit cases as a convenient way to get the formatting right (ie automatically picking "%08lx" vs "%016lx" based on register size). So just turn this %p into %px. Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-05 17:59:29 -08:00
Kees Cook	b562c171cf	locking/refcounts: Do not force refcount_t usage as GPL-only export The refcount_t protection on x86 was not intended to use the stricter GPL export. This adjusts the linkage again to avoid a regression in the availability of the refcount API. Reported-by: Dave Airlie <airlied@gmail.com> Fixes: `7a46ec0e2f` ("locking/refcounts, x86/asm: Implement fast refcount overflow protection") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-12-05 17:14:31 -08:00
Jim Mattson	2895db67b0	KVM: VMX: fix page leak in hardware_setup() vmx_io_bitmap_b should not be allocated twice. Fixes: `2361133293` ("KVM: VMX: refactor setup of global page-sized bitmaps") Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2017-12-05 22:34:49 +01:00
Andrew Honig	d59d51f088	KVM: VMX: remove I/O port 0x80 bypass on Intel hosts This fixes CVE-2017-1000407. KVM allows guests to directly access I/O port 0x80 on Intel hosts. If the guest floods this port with writes it generates exceptions and instability in the host kernel, leading to a crash. With this change guest writes to port 0x80 on Intel will behave the same as they currently behave on AMD systems. Prevent the flooding by removing the code that sets port 0x80 as a passthrough port. This is essentially the same as upstream patch `99f85a28a7`, except that patch was for AMD chipsets and this patch is for Intel. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Fixes: `fdef3ad1b3` ("KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs") Cc: <stable@vger.kernel.org> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2017-12-05 22:32:51 +01:00
Rik van Riel	6ab0b9feb8	x86,kvm: remove KVM emulator get_fpu / put_fpu Now that get_fpu and put_fpu do nothing, because the scheduler will automatically load and restore the guest FPU context for us while we are in this code (deep inside the vcpu_run main loop), we can get rid of the get_fpu and put_fpu hooks. Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-12-05 21:20:24 +01:00
Rik van Riel	f775b13eed	x86,kvm: move qemu/guest FPU switching out to vcpu_run Currently, every time a VCPU is scheduled out, the host kernel will first save the guest FPU/xstate context, then load the qemu userspace FPU context, only to then immediately save the qemu userspace FPU context back to memory. When scheduling in a VCPU, the same extraneous FPU loads and saves are done. This could be avoided by moving from a model where the guest FPU is loaded and stored with preemption disabled, to a model where the qemu userspace FPU is swapped out for the guest FPU context for the duration of the KVM_RUN ioctl. This is done under the VCPU mutex, which is also taken when other tasks inspect the VCPU FPU context, so the code should already be safe for this change. That should come as no surprise, given that s390 already has this optimization. This can fix a bug where KVM calls get_user_pages while owning the FPU, and the file system ends up requesting the FPU again: [258270.527947] __warn+0xcb/0xf0 [258270.527948] warn_slowpath_null+0x1d/0x20 [258270.527951] kernel_fpu_disable+0x3f/0x50 [258270.527953] __kernel_fpu_begin+0x49/0x100 [258270.527955] kernel_fpu_begin+0xe/0x10 [258270.527958] crc32c_pcl_intel_update+0x84/0xb0 [258270.527961] crypto_shash_update+0x3f/0x110 [258270.527968] crc32c+0x63/0x8a [libcrc32c] [258270.527975] dm_bm_checksum+0x1b/0x20 [dm_persistent_data] [258270.527978] node_prepare_for_write+0x44/0x70 [dm_persistent_data] [258270.527985] dm_block_manager_write_callback+0x41/0x50 [dm_persistent_data] [258270.527988] submit_io+0x170/0x1b0 [dm_bufio] [258270.527992] __write_dirty_buffer+0x89/0x90 [dm_bufio] [258270.527994] __make_buffer_clean+0x4f/0x80 [dm_bufio] [258270.527996] __try_evict_buffer+0x42/0x60 [dm_bufio] [258270.527998] dm_bufio_shrink_scan+0xc0/0x130 [dm_bufio] [258270.528002] shrink_slab.part.40+0x1f5/0x420 [258270.528004] shrink_node+0x22c/0x320 [258270.528006] do_try_to_free_pages+0xf5/0x330 [258270.528008] try_to_free_pages+0xe9/0x190 [258270.528009] __alloc_pages_slowpath+0x40f/0xba0 [258270.528011] __alloc_pages_nodemask+0x209/0x260 [258270.528014] alloc_pages_vma+0x1f1/0x250 [258270.528017] do_huge_pmd_anonymous_page+0x123/0x660 [258270.528021] handle_mm_fault+0xfd3/0x1330 [258270.528025] __get_user_pages+0x113/0x640 [258270.528027] get_user_pages+0x4f/0x60 [258270.528063] __gfn_to_pfn_memslot+0x120/0x3f0 [kvm] [258270.528108] try_async_pf+0x66/0x230 [kvm] [258270.528135] tdp_page_fault+0x130/0x280 [kvm] [258270.528149] kvm_mmu_page_fault+0x60/0x120 [kvm] [258270.528158] handle_ept_violation+0x91/0x170 [kvm_intel] [258270.528162] vmx_handle_exit+0x1ca/0x1400 [kvm_intel] No performance changes were detected in quick ping-pong tests on my 4 socket system, which is expected since an FPU+xstate load is on the order of 0.1us, while ping-ponging between CPUs is on the order of 20us, and somewhat noisy. Cc: stable@vger.kernel.org Signed-off-by: Rik van Riel <riel@redhat.com> Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Fixed a bug where reset_vcpu called put_fpu without preceding load_fpu, which happened inside from KVM_CREATE_VCPU ioctl. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>	2017-12-05 21:16:43 +01:00
Linus Torvalds	84dda2965d	TTY/Serial driver fixes for 4.15-rc3 Here are some small serdev and serial fixes for 4.15-rc3. They resolve some reported problems: - a number of serdev fixes to resolve crashes - MIPS build fixes for their serial port - a new 8250 device id All of these have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWia9GQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynSEgCfZ/zINl1ItdXcMUr1cnznwNgFGhAAoIoIpmre +4qtH6PjV/+kq+2j2lmG =bxjP -----END PGP SIGNATURE----- Merge tag 'tty-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small serdev and serial fixes for 4.15-rc3. They resolve some reported problems: - a number of serdev fixes to resolve crashes - MIPS build fixes for their serial port - a new 8250 device id All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: MIPS: Add custom serial.h with BASE_BAUD override for generic kernel serdev: ttyport: fix tty locking in close serdev: ttyport: fix NULL-deref on hangup serdev: fix receive_buf return value when no callback serdev: ttyport: add missing receive_buf sanity checks serial: 8250_early: Only set divisor if valid clk & baud serial: 8250_pci: Add Amazon PCI serial device ID	2017-12-05 09:05:16 -08:00
Radim Krčmář	609b700270	KVM/ARM Fixes for v4.15. Fixes: - A number of issues in the vgic discovered using SMATCH - A bit one-off calculation in out stage base address mask (32-bit and 64-bit) - Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts - Printing unavailable hyp mode as error - Potential spinlock deadlock in the vgic - Avoid calling vgic vcpu free more than once - Broken bit calculation for big endian systems -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJaJU3VAAoJEEtpOizt6ddyvmAH/jw+UAzN8lrcbfsYkyyulVDW yTe+7PYMYEODQlY31R/IAlVQB23aR2KkGyMlKjb9IM6mcB13A7pUVTrFfFGMGzln V75X20EV8CKcUBgdy8NRr9gsFwtDRHei0RIuQi8bkF0cV39QSiBgf36DW0oMCPFW aqUP5UiFMlMr4UqpWmS+8W4E0OBqcqaJAJXIsvHoB0Wqv4j9AUTvLqoDEpQSVgOr LzsaUc+K+zB2VOtEXEVSLZn6N4CRMmMUh3xfspC2Qv/zSLhHgl9QlOBjIpX6UYDp a+md5qa1hMajKswGGZB7DF/yLUb76PWFcepWOn5F3DXSv9YLaxzY5DvrNSkl46w= =tTiS -----END PGP SIGNATURE----- Merge tag 'kvm-arm-fixes-for-v4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm KVM/ARM Fixes for v4.15. Fixes: - A number of issues in the vgic discovered using SMATCH - A bit one-off calculation in out stage base address mask (32-bit and 64-bit) - Fixes to single-step debugging instructions that trap for other reasons such as MMMIO aborts - Printing unavailable hyp mode as error - Potential spinlock deadlock in the vgic - Avoid calling vgic vcpu free more than once - Broken bit calculation for big endian systems	2017-12-05 18:02:03 +01:00
Tony Lindgren	c22fe69615	ARM: dts: Fix dm814x missing phy-cells property We have phy-cells for usb_phy0, but it's missing for usb_phy1 and we get: Warning (phys_property): Missing property '#phy-cells' in node /ocp/l4ls@48000000/control@140000/usb-phy@1b00 or bad phandle (referred from /ocp/usb@47400000/usb@47401800:phys[0]) Signed-off-by: Tony Lindgren <tony@atomide.com>	2017-12-05 08:29:04 -08:00
Tony Lindgren	d364b038bc	ARM: dts: Fix elm interrupt compiler warning Looks like the interrupt property is missing the controller and level information causing: Warning (interrupts_property): interrupts size is (4), expected multiple of 12 in /ocp/elm@48078000 Signed-off-by: Tony Lindgren <tony@atomide.com>	2017-12-05 08:29:01 -08:00
Hendrik Brueckner	62e1dfa3e1	s390/uapi: correct whitespace & coding style in asm/ptrace.h Correct whitespace and coding style issues in the s390 asm/ptrace.h uapi header file. This is preparatory work to copy it to the tools/ directory for inclusion by selftests and perf. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-05 15:02:41 +01:00
Hendrik Brueckner	a39cada702	arm64/bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type Correct the broken uapi for the BPF_PROG_TYPE_PERF_EVENT program type by exporting the user_pt_regs structure instead of the pt_regs structure that is in-kernel only. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-05 15:02:41 +01:00
Hendrik Brueckner	466698e654	s390/bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type To mitigate and correct the broken uapi for the BPF_PROG_TYPE_PERF_EVENT program type, introduce a user_pt_regs structure (similar to arm64) that exports parts from the beginnig of the pt_regs structure. The export must start with the beginning of the pt_regs structure because to correctly calculate BPF prologues for perf (regs_query_register_offset()). For BPF_PROG_TYPE_PERF_EVENT program types, the BPF program is then passed a user_pt_regs structure. Note: Depending on future changes to the s390 pt_regs structure, consider the user_pt_regs structure to be stable for a particular kernel version only. (Of course, s390 tries to ensure keep it stable as much as possible.) Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-05 15:02:41 +01:00
Hendrik Brueckner	c895f6f703	bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type Commit `0515e5999a` ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type") introduced the bpf_perf_event_data structure which exports the pt_regs structure. This is OK for multiple architectures but fail for s390 and arm64 which do not export pt_regs. Programs using them, for example, the bpf selftest fail to compile on these architectures. For s390, exporting the pt_regs is not an option because s390 wants to allow changes to it. For arm64, there is a user_pt_regs structure that covers parts of the pt_regs structure for use by user space. To solve the broken uapi for s390 and arm64, introduce an abstract type for pt_regs and add an asm/bpf_perf_event.h file that concretes the type. An asm-generic header file covers the architectures that export pt_regs today. The arch-specific enablement for s390 and arm64 follows in separate commits. Reported-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Fixes: `0515e5999a` ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type") Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-05 15:02:40 +01:00
David Gibson	ab9dbf771f	Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier" This reverts commit `a3b2cb30f2`. That commit tried to fix problems with panic on powerpc in certain circumstances, where some output from the generic panic code was being dropped. Unfortunately, it breaks things worse in other circumstances. In particular when running a PAPR guest, it will now attempt to reboot instead of informing the hypervisor (KVM or PowerVM) that the guest has crashed. The crash notification is important to some virtualization management layers. Revert it for now until we can come up with a better solution. Fixes: `a3b2cb30f2` ("powerpc: Do not call ppc_md.panic in fadump panic notifier") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Tweak change log a bit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-05 23:21:46 +11:00
Heiko Carstens	8d306f53b6	s390/mm: fix off-by-one bug in 5-level page table handling Martin Cermak reported that setting a uprobe doesn't work. Reason for this is that the common uprobes code tries to get an unmapped area at the last possible page within an address space. This broke with commit `1aea9b3f92` ("s390/mm: implement 5 level pages tables") which introduced an off-by-one bug which prevents to map anything at the last possible page within an address space. The check with the off-by-one bug however can be removed since with commit `8ab867cb08` ("s390/mm: fix BUG_ON in crst_table_upgrade") the necessary check is done at both call sites. Reported-by: Martin Cermak <mcermak@redhat.com> Bisected-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Fixes: `1aea9b3f92` ("s390/mm: implement 5 level pages tables") Cc: <stable@vger.kernel.org> # v4.13+ Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-05 07:51:09 +01:00
Martin Schwidefsky	987b154983	s390: Remove redudant license text More files under arch/s390 have been tagged with the SPDX identifier, a few of those files have a GPL license text. Remove the GPL text as it is no longer needed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-05 07:51:09 +01:00
Martin Schwidefsky	9fa1db4c75	s390: add a few more SPDX identifiers Add the correct SPDX license to a few more files under arch/s390 and drivers/s390 which have been missed to far. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-05 07:51:09 +01:00
Heiko Carstens	fbbd7f1a51	s390: always save and restore all registers on context switch The switch_to() macro has an optimization to avoid saving and restoring register contents that aren't needed for kernel threads. There is however the possibility that a kernel thread execve's a user space program. In such a case the execve'd process can partially see the contents of the previous process, which shouldn't be allowed. To avoid this, simply always save and restore register contents on context switch. Cc: <stable@vger.kernel.org> # v2.6.37+ Fixes: `fdb6d070ef` ("switch_to: dont restore/save access & fpu regs for kernel threads") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-05 07:51:08 +01:00
Michael S. Tsirkin	edfb8d8fcb	s390/virtio: add BSD license to virtio-ccw The original intent of the virtio header relicensing from 2008 was to make sure anyone can implement compatible devices/drivers. The virtio-ccw was omitted by mistake. We have an ack from the only contributor as well as the maintainer from IBM, so it's not too late to fix that. Make it dual-licensed with GPLv2, as the whole kernel is GPL2. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-12-05 07:51:07 +01:00
Angelo Dureghello	65323ee1ab	m68k/defconfig: fix stmark2 broken local compilation Signed-off-by: Angelo Dureghello <angelo@sysam.it> Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>	2017-12-04 22:36:43 +10:00
Ravi Bangoria	5aa04b3eb6	powerpc/perf: Fix oops when grouping different pmu events When user tries to group imc (In-Memory Collections) event with normal event, (sometime) kernel crashes with following log: Faulting instruction address: 0x00000000 [link register ] c00000000010ce88 power_check_constraints+0x128/0x980 ... c00000000010e238 power_pmu_event_init+0x268/0x6f0 c0000000002dc60c perf_try_init_event+0xdc/0x1a0 c0000000002dce88 perf_event_alloc+0x7b8/0xac0 c0000000002e92e0 SyS_perf_event_open+0x530/0xda0 c00000000000b004 system_call+0x38/0xe0 'event_base' field of 'struct hw_perf_event' is used as flags for normal hw events and used as memory address for imc events. While grouping these two types of events, collect_events() tries to interpret imc 'event_base' as a flag, which causes a corruption resulting in a crash. Consider only those events which belongs to 'perf_hw_context' in collect_events(). Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-By: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-12-04 16:03:19 +11:00
Olof Johansson	0d55f2ab0d	Merge tag 'omap-for-v4.15/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Fixes for omaps for v4.15-rc cycle with two fixes for hangs with the rest being compiler warning fixes and fixes for power states and devices on various boards: - Fix smatch issue introduced by recent omap device changes for legacy resources - Fix SRAM virt to phys related boot hang affecting n900 and other omap3 hs devices found by pending CMA changes. While it seems that we have not hit this in other use cases, let's fix it to avoid a nasty and hard to find suprise as right now there is just luck keeping the SRAM virtual address to physical address translation working with the 0xffff high_mask. - Fix am335x reading of domain state registers that only exist for the PM_CEFUSE domain and produce wrong results for other domains - Fix missing setting for error code for omap device if allocation fails - Fix missing modules_offs for omap3 MMC3 affecting n9/n950 - Fix cm_split_idlest() reading reserved registers showing wrong idlestatus - Fixes to correct #phy-cells property for compiler warnings that recently started happening - Add a missing OHCI remote-wakeup-connected property that I was supposed to merge after the ohci-omap3 to ohci-platform changes but somehow managed to drop. I only noticed this was missing while debugging the OHCI/EHCI GPS and modem hang - Fix a system hang with GPS or modem connected to the OHCI/EHCI bus that typically happened within 20 - 40 minutes on an idle system. This turned out to be an issue caused by using the parent interrupt controller directly with the WUGEN + GIC stacked interrupt controller domains - Fixes for logicpd-somlv GPMC for Ethernet and NAND that clearly have been broken since we changed GPMC to use the interrupt controller binding for some pins. And fix the wrong pin muxing for WLAN while at it - Fixes for am437x interrupt and dma properties to fix compiler warnings that recently started happening * tag 'omap-for-v4.15/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am437x-cm-t43: Correct the dmas property of spi0 ARM: dts: am4372: Correct the interrupts_properties of McASP ARM: dts: logicpd-somlv: Fix wl127x pinmux ARM: dts: logicpd-som-lv: Fix gpmc addresses for NAND and enet ARM: dts: Fix omap4 hang with GPS connected to USB by using wakeupgen ARM: OMAP2+: Missing error code in omap_device_build() ARM: AM33xx: PRM: Remove am33xx_pwrdm_read_prev_pwrst function ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context ARM: dts: Add remote-wakeup-connected for omap OHCI ARM: dts: am33xx: Add missing #phy-cells to ti,am335x-usb-phy ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv ARM: OMAP2+: Fix smatch found issue for omap_device ARM: OMAP2/3: CM: fix cm_split_idlest functionality ARM: OMAP3: hwmod_data: add missing module_offs for MMC3 Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-03 17:09:04 -08:00
Greg Ungerer	969de0988b	m68k: add missing SOFTIRQENTRY_TEXT linker section Commit `be7635e728` ("arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections") added a new linker section, SOFTIRQENTRY_TEXT, to the linker scripts for most architectures. It didn't add it to any of the linker scripts for the m68k architecture. This was not really a problem because it is only defined if either of CONFIG_FUNCTION_GRAPH_TRACER or CONFIG_KASAN are enabled - which can never be true for m68k. However commit `229a718605` ("irq: Make the irqentry text section unconditional") means that SOFTIRQENTRY_TEXT is now always defined. So on m68k we now end up with a separate ELF section for .softirqentry.text instead of it being part of the .text section. On some m68k targets in some configurations this can also cause a fatal link error: LD vmlinux /usr/local/bin/../m68k-uclinux/bin/ld.real: section .softirqentry.text loaded at [0000000010de10c0,0000000010de12dd] overlaps section .rodata loaded at [0000000010de10c0,0000000010e0fd67] To fix add in the missing SOFTIRQENTRY_TEXT section into the m68k linker scripts. I noticed that m68k is also missing the IRQENTRY_TEXT section, so this patch also adds an entry for that too. Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>	2017-12-04 10:15:18 +10:00
Linus Torvalds	87fc5c686e	Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM fix from Russell King: "Just one fix this time around, for the late commit in the merge window that triggered a problem with qemu. Qemu is apparently also going to receive a fix for the discovered issue" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: avoid faulting on qemu	2017-12-03 10:51:08 -05:00
Olof Johansson	8d810a4029	i.MX fixes for 4.15: - A fix for vf610-zii-dev-rev-c board which correct the unit-address of I2C EEPROM node to match the 'reg' property. - We thought the RTC block on i.MX53 is compatible with the one found on i.MX25, and added the device for i.MX53 device tree. But it turns out that's not the case, and we have to revert the change. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJaH174AAoJEFBXWFqHsHzOL+kIALWexxxRlwyOl75pkkZKTX2m AsvVTq3jgjvNTJn8mjuhzYstgUr0/GqX1InG7eBeJwQPQj+mn1roElkd1wz09xXY lKECXbTwQ0+Jc8Diomms5VvBD3y3OwaQVZKAPbTfI6A4nuUCaQdUVDMplP6dCoTI FUcxbOvAppAFEeC3X04Gc2OdUpBRSiio8DE5nvuIGzIRvA9auc5YTbJ9sOtduj4o lZvVgAHqccTmFgaFfPdW6T7h7uyZ1Jz5ItDx3BTk+QtWxFq+Cb9hOEJg/I+Z6QY+ 2twJZxyXgpaWmdQkP18tSOEA8dGZDwIwhRG1ZEeeY0JNPw6dy5w9j1W1zijdCIk= =RRYB -----END PGP SIGNATURE----- Merge tag 'imx-fixes-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes i.MX fixes for 4.15: - A fix for vf610-zii-dev-rev-c board which correct the unit-address of I2C EEPROM node to match the 'reg' property. - We thought the RTC block on i.MX53 is compatible with the one found on i.MX25, and added the device for i.MX53 device tree. But it turns out that's not the case, and we have to revert the change. * tag 'imx-fixes-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: Revert "ARM: dts: imx53: add srtc node" ARM: dts: vf610-zii-dev-rev-c: Fix the I2C EEPROM address Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-02 17:01:40 -08:00
Olof Johansson	8bd30f2298	UniPhier ARM SoC fixes for v4.15 - Fix IRQ number of PXs3 SoC - Remove redundant interrupt-parent properties - Fix arm64 DT path in MAINTAINERS -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJaHrtJAAoJED2LAQed4NsGFv4P/1MkOEG6nt12PzMafnFZATKT tHAysRO2Xv08mxaAb8hi+hR/dlZIUSizo5H3OYURs/UfPLnUIpALCrs4nKiVRCo4 LxTPVBSAm5fcQWq51vxL89D7ZtXqaMpYTnHS/u4LKRMN+2u0G3Ojbyl+CxbMf1pW PBHclfWLsJlO3Q6P1L/QRcN7kCOvWkG3JfXZWNRASvZJCglu0xvMwjHZf5R5Kiiv Y0kBiFsSfywIc+ON9q+VwKcxCvahSDdQl1Bpwuv0LmaweMxrUyflWUcSFvS/qhfa H9s/yWYvP0I+TVL+133+KpPXn71O+B848sbG0N9/jkNNaZv84a4qYSmmkF3c/8+/ Fr1tltfgqVeaDFXHecczdLWQ85MbecUGPVWpa1fpK8P7BixYPyjQprakP5IjOqHJ GDqsfGKtJA6l92jzcyYaQhkbtvK8H5zkj7US37eKjNPhgxhMBNTjcGSFnHby+GZw 5kZAtAQkKXtBqlVRk3lsinBSisJ4wgOAW1qWBHYXk1WAlIKBRxUN1BMkKVC60XS3 z9hRy79DbMlCHjPcOIaBCRqjRxwRtjb9as425BUuq4H+Ed1nPT0qWpKRW+5/vyHM eIUkijjEtPW5oYIV9vEbqkS64+vFVPy9XeDFp5PZoTLfbJAWrRkadzJ0bCksokj+ Ho/3nRWWoJplGcm27bwc =pame -----END PGP SIGNATURE----- Merge tag 'uniphier-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier into fixes UniPhier ARM SoC fixes for v4.15 - Fix IRQ number of PXs3 SoC - Remove redundant interrupt-parent properties - Fix arm64 DT path in MAINTAINERS * tag 'uniphier-fixes-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-uniphier: MAINTAINERS: exclude other Socionext SoC DT files from ARM/UNIPHIER entry arm64: dts: uniphier: remove unnecessary interrupt-parent arm64: dts: uniphier: correct on-board device IRQ number for PXs3 Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-02 17:00:35 -08:00
Olof Johansson	1864f527ae	This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 4.15, please pull the following: - Stefan provides a fix for the BCM2835 (Raspberry Pi) to fix warnings about missing "#phy-cells" properties - Florian provides two fixes for Nortsthar Plus, one that uses the correct interrupt specifiers for the timer/watchdog and one that disables SATA on BCM9582HR boards since that leads to unidentified hangs right now -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJaHGgQAAoJEIfQlpxEBwcEtvgP/AtZo295Pg5Q+ckCarHNpDfC hiPrfRPJ47+70RCllMwzb/BkxXa/SooDKGRh68PcoI+qtlWvv4IttIQbO1dCksMx 96jYHZesDjeHuL4IN6TTwZZYCcamI5FfZ9ELjEFP20Duv42RYPO1Yt+zTI7v/PmU FBCfHx6/ESAF539xr2qjimkf245MjWMpxJknNOUDNeXJUYmUiBb0nUnovkJbcbsa WgHoHcHQq4hoFfSIlLiJvmwaM/MGa7gvDdxVjuF+PBKeJ7cuNt7iAqAsc5ttx7ps Wt35oIL4lzxhuY7Kx/ygQ25h+TJkdv3PgHj5ulLGkdlNd85q9SUIQNJrCYqEl99Y Vgb15mamOEF1plFmqtyD9BuBA1GBHelzDrxmW91tk28oTZKVpQGBWMZwERRhN4Re mjjXbb6yqA8MQDF+BM6GKDmxx0zrTHLoRs4QBZoHQAuTtgAxfsQU6Z4ywGuRmECq QzK9qXweYcx5SIjKj2O/fniJz5WJNW8LPlwrhdQUR3PefygQauhuXrpvTiiOnPYU J02ACa6VjcRBzMNtMVOuVf2YNdqb0iDzjqwaYNMOrh+ysYuVtEWd8I91U0ThsZju Ry9IXfh+dfxbxriGRqUZkjotYKpqj+llZ4wHPz2ewx3j0hfrSIO0Tl5TGSsxzQTF ja0YkEHzzdquYJHiZZ2K =A6Ju -----END PGP SIGNATURE----- Merge tag 'arm-soc/for-4.15/devicetree-fixes-1' of http://github.com/Broadcom/stblinux into fixes This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 4.15, please pull the following: - Stefan provides a fix for the BCM2835 (Raspberry Pi) to fix warnings about missing "#phy-cells" properties - Florian provides two fixes for Nortsthar Plus, one that uses the correct interrupt specifiers for the timer/watchdog and one that disables SATA on BCM9582HR boards since that leads to unidentified hangs right now * tag 'arm-soc/for-4.15/devicetree-fixes-1' of http://github.com/Broadcom/stblinux: ARM: dts: NSP: Fix PPI interrupt types ARM: dts: NSP: Disable AHCI controller for HR NSP boards ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-02 16:58:37 -08:00
Olof Johansson	615af08e04	Merge tag 'renesas-dt-fixes-for-v4.15' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes Renesas ARM Based SoC DT Fixes for v4.15 Add missing '#reset-cells' property to cpg nodes. This flagged by recent dtc. * tag 'renesas-dt-fixes-for-v4.15' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: dts: r8a779x: Add '#reset-cells' in cpg-mssr Signed-off-by: Olof Johansson <olof@lixom.net>	2017-12-02 16:57:49 -08:00

1 2 3 4 5 ...

141433 Commits