OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Andy Lutomirski	55920d31f1	x86/mm/cpa: Add missing comment in populate_pdg() In commit: 21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs") I intended to add this comment, but I failed at using git. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/242baf8612394f4e31216f96d13c4d2e9b90d1b7.1469293159.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-23 21:17:10 +02:00
Andy Lutomirski	530dd8d4b9	x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs Valdis Kletnieks bisected a boot failure back to this recent commit: `360cb4d155` ("x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated") I broke the case where a PUD table got allocated -- populate_pud() would wander off a pgd_none entry and get lost. I'm not sure how this survived my testing. Fix the original issue in a much simpler way. The problem was that, if we allocated a PUD table, failed to populate it, and freed it, another CPU could potentially keep using the PGD entry we installed (either by copying it via vmalloc_fault or by speculatively caching it). There's a straightforward fix: simply leave the top-level entry in place if this happens. This can't waste any significant amount of memory -- there are at most 256 entries like this systemwide and, as a practical matter, if we hit this failure path repeatedly, we're likely to reuse the same page anyway. For context, this is a reversion with this hunk added in: if (ret < 0) { + /* + * Leave the PUD page in place in case some other CPU or thread + * already found it, but remove any useless entries we just + * added to it. + */ - unmap_pgd_range(cpa->pgd, addr, + unmap_pud_range(pgd_entry, addr, addr + (cpa->numpages << PAGE_SHIFT)); return ret; } This effectively open-codes what the now-deleted unmap_pgd_range() function used to do except that unmap_pgd_range() used to try to free the page as well. Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mike Krinkin <krinkin.m.u@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/21cbc2822aa18aa812c0215f4231dbf5f65afa7f.1469249789.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-23 21:13:25 +02:00
H.J. Lu	3ebfd81f7f	x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 Don't use the same syscall numbers for 2 different syscalls: 534 x32 preadv compat_sys_preadv64 535 x32 pwritev compat_sys_pwritev64 534 x32 preadv2 compat_sys_preadv2 535 x32 pwritev2 compat_sys_pwritev2 Add compat_sys_preadv64v2() and compat_sys_pwritev64v2() so that 64-bit offset is passed in one 64-bit register on x32, similar to compat_sys_preadv64() and compat_sys_pwritev64(). Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/CAMe9rOovCMf-RQfx_n1U_Tu_DX1BYkjtFr%3DQ4-_PFVSj9BCzUA@mail.gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:30:26 +02:00
Andy Lutomirski	eb43e8f85f	x86/smp: Remove unnecessary initialization of thread_info::cpu It's statically initialized to zero -- no need to dynamically initialize it to zero as well. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/6cf6314dce3051371a913ee19d1b88e29c68c560.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:31 +02:00
Andy Lutomirski	fb59831b49	x86/smp: Remove stack_smp_processor_id() It serves no purpose -- raw_smp_processor_id() works fine. This change will be needed to move thread_info off the stack. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a2bf4f07fbc30fb32f9f7f3f8f94ad3580823847.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:30 +02:00
Andy Lutomirski	13d4ea097d	x86/uaccess: Move thread_info::addr_limit to thread_struct struct thread_info is a legacy mess. To prepare for its partial removal, move thread_info::addr_limit out. As an added benefit, this way is simpler. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/15bee834d09402b47ac86f2feccdf6529f9bc5b0.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:30 +02:00
Ingo Molnar	2a53ccbc0d	x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_err Rename it to match the thread_struct::uaccess_err pattern and also because it was too long. Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:29 +02:00
Andy Lutomirski	dfa9a942fd	x86/uaccess: Move thread_info::uaccess_err and thread_info::sig_on_uaccess_err to thread_struct struct thread_info is a legacy mess. To prepare for its partial removal, move the uaccess control fields out -- they're straightforward. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/d0ac4d01c8e4d4d756264604e47445d5acc7900e.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:28 +02:00
Andy Lutomirski	2deb4be280	x86/dumpstack: When OOPSing, rewind the stack before do_exit() If we call do_exit() with a clean stack, we greatly reduce the risk of recursive oopses due to stack overflow in do_exit, and we allow do_exit to work even if we OOPS from an IST stack. The latter gives us a much better chance of surviving long enough after we detect a stack overflow to write out our logs. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/32f73ceb372ec61889598da5e5b145889b9f2e19.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:28 +02:00
Andy Lutomirski	46aea38734	x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm If we get a vmalloc fault while current->active_mm->pgd doesn't match CR3, we'll crash without this change. I've seen this failure mode on heavily instrumented kernels with virtually mapped stacks. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4650d7674185f165ed8fdf9ac4c5c35c5c179ba8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:27 +02:00
Andy Lutomirski	98f30b1207	x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPS If we overflow the stack into a guard page, we'll recursively fault when trying to dump the contents of the guard page. Use probe_kernel_address() so we can recover if this happens. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/e626d47a55d7b04dcb1b4d33faa95e8505b217c8.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:27 +02:00
Andy Lutomirski	9a2e9da3e0	x86/dumpstack: Try harder to get a call trace on stack overflow If we overflow the stack, print_context_stack() will abort. Detect this case and rewind back into the valid part of the stack so that we can trace it. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ee1690eb2715ccc5dc187fde94effa4ca0ccbbcd.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:26 +02:00
Andy Lutomirski	d92fc69cca	x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables() kernel_unmap_pages_in_pgd() is dangerous: if a PGD entry in init_mm.pgd were to be cleared, callers would need to ensure that the pgd entry hadn't been propagated to any other pgd. Its only caller was efi_cleanup_page_tables(), and that, in turn, was unused, so just delete both functions. This leaves a couple of other helpers unused, so delete them, too. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Acked-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/77ff20fdde3b75cd393be5559ad8218870520248.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:26 +02:00
Andy Lutomirski	360cb4d155	x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated This avoids pointless races in which another CPU or task might see a partially populated global PGD entry. These races should normally be harmless, but, if another CPU propagates the entry via vmalloc_fault() and then populate_pgd() fails (due to memory allocation failure, for example), this prevents a use-after-free of the PGD entry. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/bf99df27eac6835f687005364bd1fbd89130946c.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:25 +02:00
Ingo Molnar	af2cf278ef	x86/mm/hotplug: Don't remove PGD entries in remove_pagetable() So when memory hotplug removes a piece of physical memory from pagetable mappings, it also frees the underlying PGD entry. This complicates PGD management, so don't do this. We can keep the PGD mapped and the PUD table all clear - it's only a single 4K page per 512 GB of memory hotplugged. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <Waiman.Long@hp.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/064ff6c7275734537f969e876f6cd0baa954d2cc.1468527351.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:24 +02:00
Ingo Molnar	38452af242	Merge branch 'x86/asm' into x86/mm, to resolve conflicts Conflicts: tools/testing/selftests/x86/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-15 10:26:04 +02:00
Dave Hansen	dcb32d9913	x86/mm: Use pte_none() to test for empty PTE The page table manipulation code seems to have grown a couple of sites that are looking for empty PTEs. Just in case one of these entries got a stray bit set, use pte_none() instead of checking for a zero pte_val(). The use pte_same() makes me a bit nervous. If we were doing a pte_same() check against two cleared entries and one of them had a stray bit set, it might fail the pte_same() check. But, I don't think we ever _do_ pte_same() for cleared entries. It is almost entirely used for checking for races in fault-in paths. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001915.813703D9@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-13 09:43:25 +02:00
Dave Hansen	e4a84be6f0	x86/mm: Disallow running with 32-bit PTEs to work around erratum The Intel(R) Xeon Phi(TM) Processor x200 Family (codename: Knights Landing) has an erratum where a processor thread setting the Accessed or Dirty bits may not do so atomically against its checks for the Present bit. This may cause a thread (which is about to page fault) to set A and/or D, even though the Present bit had already been atomically cleared. These bits are truly "stray". In the case of the Dirty bit, the thread associated with the stray set was not allowed to write to the page. This means that we do not have to launder the bit(s); we can simply ignore them. If the PTE is used for storing a swap index or a NUMA migration index, the A bit could be misinterpreted as part of the swap type. The stray bits being set cause a software-cleared PTE to be interpreted as a swap entry. In some cases (like when the swap index ends up being for a non-existent swapfile), the kernel detects the stray value and WARN()s about it, but there is no guarantee that the kernel can always detect it. When we have 64-bit PTEs (64-bit mode or 32-bit PAE), we were able to move the swap PTE format around to avoid these troublesome bits. But, 32-bit non-PAE is tight on bits. So, disallow it from running on this hardware. I can't imagine anyone wanting to run 32-bit non-highmem kernels on this hardware, but disallowing them from running entirely is surely the safe thing to do. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001914.D0B50110@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-13 09:43:25 +02:00
Dave Hansen	97e3c602cc	x86/mm: Ignore A/D bits in pte/pmd/pud_none() The erratum we are fixing here can lead to stray setting of the A and D bits. That means that a pte that we cleared might suddenly have A/D set. So, stop considering those bits when determining if a pte is pte_none(). The same goes for the other pmd_none() and pud_none(). pgd_none() can be skipped because it is not affected; we do not use PGD entries for anything other than pagetables on affected configurations. This adds a tiny amount of overhead to all pte_none() checks. I doubt we'll be able to measure it anywhere. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001912.5216F89C@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-13 09:43:25 +02:00
Dave Hansen	00839ee3b2	x86/mm: Move swap offset/type up in PTE to work around erratum This erratum can result in Accessed/Dirty getting set by the hardware when we do not expect them to be (on !Present PTEs). Instead of trying to fix them up after this happens, we just allow the bits to get set and try to ignore them. We do this by shifting the layout of the bits we use for swap offset/type in our 64-bit PTEs. It looks like this: bitnrs: \| ... \| 11\| 10\| 9\|8\|7\|6\|5\| 4\| 3\|2\|1\|0\| names: \| ... \|SW3\|SW2\|SW1\|G\|L\|D\|A\|CD\|WT\|U\|W\|P\| before: \| OFFSET (9-63) \|0\|X\|X\| TYPE(1-5) \|0\| after: \| OFFSET (14-63) \| TYPE (9-13) \|0\|X\|X\|X\| X\| X\|X\|X\|0\| Note that D was already a don't care (X) even before. We just move TYPE up and turn its old spot (which could be hit by the A bit) into all don't cares. We take 5 bits away from the offset, but that still leaves us with 50 bits which lets us index into a 62-bit swapfile (4 EiB). I think that's probably fine for the moment. We could theoretically reclaim 5 of the bits (1, 2, 3, 4, 7) but it doesn't gain us anything. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: dave.hansen@intel.com Cc: linux-mm@kvack.org Cc: mhocko@suse.com Link: http://lkml.kernel.org/r/20160708001911.9A3FD2B6@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-13 09:43:25 +02:00
Paolo Bonzini	be8a18e2e9	x86/entry: Inline enter_from_user_mode() This matches what is already done for prepare_exit_to_usermode(), and saves about 60 clock cycles (4% speedup) with the benchmark in the previous commit message. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/1466434712-31440-3-git-send-email-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-10 13:33:02 +02:00
Paolo Bonzini	2e9d1e150a	x86/entry: Avoid interrupt flag save and restore Thanks to all the work that was done by Andy Lutomirski and others, enter_from_user_mode() and prepare_exit_to_usermode() are now called only with interrupts disabled. Let's provide them a version of user_enter()/user_exit() that skips saving and restoring the interrupt flag. On an AMD-based machine I tested this patch on, with force-enabled context tracking, the speed-up in system calls was 90 clock cycles or 6%, measured with the following simple benchmark: #include <sys/signal.h> #include <time.h> #include <unistd.h> #include <stdio.h> unsigned long rdtsc() { unsigned long result; asm volatile("rdtsc; shl $32, %%rdx; mov %%eax, %%eax\n" "or %%rdx, %%rax" : "=a" (result) : : "rdx"); return result; } int main() { unsigned long tsc1, tsc2; int pid = getpid(); int i; tsc1 = rdtsc(); for (i = 0; i < 100000000; i++) kill(pid, SIGWINCH); tsc2 = rdtsc(); printf("%ld\n", tsc2 - tsc1); } Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/1466434712-31440-2-git-send-email-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-10 13:33:02 +02:00
Ingo Molnar	52e31f89cc	Merge branch 'linus' into x86/asm, to pick up fixes before merging new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-09 10:43:49 +02:00
Linus Torvalds	ee40fb2948	SCSI fixes on 20160708 Three fixes. One is the qla24xx MSI regression, one is a theoretical problem over blacklist matching, which would bite USB badly if it ever triggered and one is a system hang with a particular type of IPR device. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJXgAriAAoJEAVr7HOZEZN40VcP/R65ZYezkobZDVR+xQfCwDnn cCaFSl4a+FtsGnzs2hb88ZZWA2hn4xWDAdjwoPxRwD7yq8KB1Oxb2wEOiJMjIgJH N/NU6zEIjMvVei8jDuAJJ13c1Xa1/y59ptD1nykm1dJlLGjwAE1jBiAuDs1OOm27 SnvIj+mOSRiesGqp2Srg6+w+QmOJ0biXiSf5EUhj7rFKayIaM3yLrQZ2E4a5oUfc EZYzCBagObQoYwABw/6Nd1HhFdpptRnOvIr8xk+ch5w2m+AEiw1xAf+/B9ws25fB EB4O7fQcNzIEbjRRIehGhtp83Q7ST77kq5mE7CHDLeu/v93ZzKBmEZXIM2Bdg2x5 TBEvFrkIFe1ECITHInegvG4/V/mNLFYcad9Ygdbdt6ndXWoJ5l1a8BRcNMCX0smQ g7jXqMz0vYKW+JSZvP1fHtqsJR6t6CHAFOKtwwb5xlhRvbZRMB311pr3UfbMqlTx qZOfGMqF/ta51RWkenNlRZHvg8WeeTxGioNexS8qU9j+9CUvvj5eIemEpBEBiblN 8BnbEAAnjSTSMGPFuOMQ8Njh3umC4ozzc8WcaXNjHnUMcIXaOY3PkcjUf65pi5S+ fPjV18350LAhiIlneHSCcGBO+Z+D5OjPJdQGywMb3fs9HICNfi41QsJrJVAkA3e5 vc2XZhSAfEQuwniuGJU3 =0CCg -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes. One is the qla24xx MSI regression, one is a theoretical problem over blacklist matching, which would bite USB badly if it ever triggered and one is a system hang with a particular type of IPR device" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: qla2xxx: Fix NULL pointer deref in QLA interrupt SCSI: fix new bug in scsi_dev_info_list string matching ipr: Clear interrupt on croc/crocodile when running with LSI	2016-07-08 18:59:46 -07:00
Linus Torvalds	b987c759d2	eCryptfs fixes for 4.7-rc7: - Provide a more concise fix for CVE-2016-1583 + Additionally fixes linux-stable regressions caused by the cherry-picking of the original fix - Some very minor changes that have queued up + Fix typos in code comments + Remove unnecessary check for NULL before destroying kmem_cache -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCgAGBQJXf8nnAAoJENaSAD2qAscKwXgP/0awhY1z40dL/igP6fPv2ack HbqrOjUVO2DzxinvKB3vRLNy93zwESxe8UpwPsl84IJ85zOQjkUkJ8PYk1oyBf0N dVWqO11g6AKNZ+VQFspconvMhZATwSrsv8z3BzvwNGLsPhPuUQ+JmbBe8xMdrsZ5 qVaWswsMtMlhM3p/zFh57vWO64fT1xiabpxSkKpG2LHJN6h6QAQxkfBfa2FuXCsN hZIw+ULcUJfdawXGq8lAfcYzbDmFpNt70fFquJgfJHrXFrOuensYfLcWUvhrSNbc HZ6imRK9LCG4IKjJTBNmCmBR8ho71yGzdKuup81Eap+2zx2kC7twokS1d5fha8iL Kzkx0NMDriY2N+tIfufHYk2IIenFzWG6Yuj0STswtJX4YhQGBc0H6VxcgrxE0PgW k1iKUV7jnJGxxN+d6lmV4+fX0vKGgBMsQq1Q76CkYLN1BAvdwz6GnWSfqP8hWz3o sNVyNtYh+/TXY8JMWKDBlps7Ib8W88qDW3K7YcAf2VPYAqIWm5Va1MR5m5s+UIeR QiCD32X/0PfDp13QRiKAHJ6C9CInyu0r+fF/g8ZMqLuWgLxoahxpr6ML/CnHoGl5 IcDydJO3/bLBq9If8WxYsOQvVKCa4e7N7o7ZHPKd8U7O39mCGNfbQx7/FlMjtvf6 +4HAxamUC1ogpLTkpWxI =Bt4P -----END PGP SIGNATURE----- Merge tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull eCryptfs fixes from Tyler Hicks: "Provide a more concise fix for CVE-2016-1583: - Additionally fixes linux-stable regressions caused by the cherry-picking of the original fix Some very minor changes that have queued up: - Fix typos in code comments - Remove unnecessary check for NULL before destroying kmem_cache" * tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: ecryptfs: don't allow mmap when the lower fs doesn't support it Revert "ecryptfs: forbid opening files without mmap handler" ecryptfs: fix spelling mistakes eCryptfs: fix typos in comment ecryptfs: drop null test before destroy functions	2016-07-08 09:48:28 -07:00
Linus Torvalds	b89c44bb23	IOMMU Fixes for Linux v4.7-rc6 Two Fixes: * Intel VT-d fix for a suspend/resume issue, introduced with the scalability improvements in this cycle. * AMD IOMMU fix for systems that have unity mappings defined. There was a race where translation got enabled before the unity mappings were in place. This issue was seen on some HP servers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJXf7bkAAoJECvwRC2XARrjzYoP/AvvkL955TW0pSdMWZqQ0o9j kL3jSl75Te1EwaSyJxR9nOK8azSsfYcG4NdzMxnFUCK6ydDA6Ogz0cMtcd6s2skd unQuCSpjOJiMQU426ssQrsW+KoTxi5utpapMY3Ayh94/BbEDuVtv/iUIBE1Ca9eE H/SQUl4l8c3Cwe//1Kwrzm6O0pslrLBoOABTpFhl4GZ5pWCcv7bwbSEjVfSY5X8x //trqhgsACdh+Qlp5dA56BeMVyEblOvnY5XO2ODeWOG2vlEbqH9MiUeuDBNoXxtM MqhLTSKijZ1wp2a1iaCuyBLKFlaW7zChU41YyQmYJ8B60u6CwVuFwGs+XltnkOha iYBK56xZ8npFKOs4iauBmsVZZnNrAY0mxCYg28KtNh2uUT4sQdqUCnT/SxA9Z0RC bRtudpoX5I3NZrlDlt6OzWqCjxFj8Zq57RwrhDuQm+ijqAAor4C7NhH7cT0QTC4k l1srxnLtSM/iiqWgs2hNIC2GzNbdhix8YU98kIxpT2iNB99qlqPXK9ve0bJIb03N oECjKBu+d0I7ApdfOlL6N/RHgZJ01O+6gK1neArtVnviBgN7tmTxgWvcioJd+JuL bmjytsExfP/Z2p7iy9yQCiIDicAo0+wpMOzZRG8jAXvHHCj8laZ8PhL74voWc2wH Mn+iaXAXUysLblaXzpEx =axsu -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Two Fixes: - Intel VT-d fix for a suspend/resume issue, introduced with the scalability improvements in this cycle. - AMD IOMMU fix for systems that have unity mappings defined. There was a race where translation got enabled before the unity mappings were in place. This issue was seen on some HP servers" * tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix unity mapping initialization race iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas	2016-07-08 09:35:23 -07:00
Linus Torvalds	cfae7e3eb1	xen: bug fixes for 4.7-rc6 - Fix two bugs in the handling of xenbus transactions. - Make the xen acpi driver compatible with Xen 4.7. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXf7GHAAoJEFxbo/MsZsTRYksH/0F+xZZQQiO3WPSzL0muu5Fn wLmmSyBv6Ak76vZ6z7+ku095OagA0LgS1eISnKlP86HTaRl8eQ6AyChjKux3cX9T +X1hHwBN39rfF6mZO4pXhu/7SKVmcOvVY7SHvKca8Lx31Y58eLB4+6ycnrGI+XQ7 oon7KrmqSAg/3r1/CLvwTE6/PPxj/T38g0QoegN6ua26O79OFY5GWmdc+ucfR76i NIOubaVX93s8dF0YcvVBL1HIs64AkUkk6i5DiyJ1r05kCTy2sYlZ3e6abCFhqMj+ jcf4aCTI4sCzbZRHID5mEMxfiGAHFo5MPuoRpo08orMbGZu/0+ytnkJ/hYb+H7c= =YMOM -----END PGP SIGNATURE----- Merge tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Fix two bugs in the handling of xenbus transactions. - Make the xen acpi driver compatible with Xen 4.7. * tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 xenbus: simplify xenbus_dev_request_and_reply() xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus: don't BUG() on user mode induced condition	2016-07-08 09:12:41 -07:00
Linus Torvalds	267ba96492	arm64 fixes: - Enforce USER_DS on exception entry from EL1 - Apply workaround for Cavium errata #27456 on Thunderx-81xx parts -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJXf5pkAAoJELescNyEwWM0CrwH/RTFmTDzlvwJbcmVKLeabfSb 8AUphL7+D8gRLBRy1l+pdjqHli4EuxA34peaIHs91ziPl85wI+l37juTZ08MqYUM W3lLbKPmJGa39WKYq5rtKqaohCGHRA0SwLSq78kbRFb3GgWUvNbrUaC5oBoEOBkc x2vEpsVVhAWezly1CaX0zf8yfBuGp5O8rkw2yFqPuD7MKh3D0DLK4F8UCmZ9OqQM nI10nq9GBdbus8yA/2kIHSvtkGC9l0Cyiu8iJ/Gf4HQnSqVopPAzvP0FdNs5cj9o 5m/BOJUED/pEdps7+PZMlJHYrHpB+VTqrZ/HdFFI4M5EsIltw3OSKp/lA6cA/Xc= =iKFx -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A couple of late fixes here, but one that we've been sitting on for a few weeks while the details were worked out. Specifically, we now enforce USER_DS on taking exceptions whilst in the kernel, which avoids leaking kernel data to userspace through things like perf. The other patch is an update to a workaround for a hardware erratum on some Cavium SoCs. Summary: - Enforce USER_DS on exception entry from EL1 - Apply workaround for Cavium errata #27456 on Thunderx-81xx parts" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xx arm64: kernel: Save and restore UAO and addr_limit on exception entry	2016-07-08 09:08:27 -07:00
Linus Torvalds	a017f583ec	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Three fixes: - A boot crash fix with certain configs - a MAINTAINERS entry update - Documentation typo fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/Documentation: Fix various typos in Documentation/x86/ files x86/amd_nb: Fix boot crash on non-AMD systems MAINTAINERS: Update the Calgary IOMMU entry	2016-07-08 09:06:52 -07:00
Linus Torvalds	369da7fc6d	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Two load-balancing fixes for cgroups-intense workloads" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion sched/fair: Fix effective_load() to consistently use smoothed load	2016-07-08 09:04:34 -07:00
Linus Torvalds	612807fe28	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixes: - 32-bit callgraph bug fix - suboptimal event group scheduling bug fix - event constraint fixes for Broadwell/Skylake - RAPL module name collision fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix pmu::filter_match for SW-led groups x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl perf/x86: Fix 32-bit perf user callgraph collection perf/x86/intel: Update event constraints when HT is off	2016-07-08 09:02:16 -07:00
Linus Torvalds	977dcf0c47	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "Two MIPS-GIC irqchip driver fixes to unbreak certain MIPS boards" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips-gic: Match IPI IRQ domain by bus token only irqchip/mips-gic: Map to VPs using HW VPNum	2016-07-08 08:59:33 -07:00
Linus Torvalds	18b16676c3	Final (hopefully) GPIO fixes for v4.7: - Fix an oops on the Asus Eee PC 1201 - Revert a patch trying to split GPIO parsing and GPIO configuration - Revert a too liberal compile testing thing -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXf460AAoJEEEQszewGV1zelEQAKGzur1ClwQx4PzF5W+6hAkJ a7DDMOA2jBWgh+fnGRGq1yKNpeF1famLB4EVYvr4fzsTZDMFnFmHLAomc42wzfPM ZDgPBcpNN9/603vXrW32c84SbNKUt3liC0VGBiQns0ixRnl/z07HcFIcQgLWjSjc 02VtYQVXlBtEvgpCSovfcly2hvoSB8onDKhF7/gaI5d+k0Ngq4MtIMueMEYaJY/z 16NZl5Sapn58gAUhElUy+9gBeIA+xPx/VXhBlP+g6pSAHiRudaHNAMEFEDiAWEOQ Iedzy/vXtLdRV6w3ZZJ99OGSTYrGtBBMbIKOWA98IW9s2h98B3r7+267VzFMd5f2 Avcdg8emkQP6iN1hWCtT3YKKgK6L0Iwi7FIqgaHzFyg32tIc42Ej8EoD+HyzT4MU NzAXg/0E3b0vgtxzXcWUNEJ3K4P7xlgcFRiuwGAE1C2c1RxxKLSTyvDBANRSKSLt ADNYioWr4vAQEE+id3SZFkbxOOlTSD+nAjcE1NzXCJcGpwinpPWADfQxwn6TEvp/ KodH7eItpJsWs8z7mqnCFC6lox133phXqPvs2ECRMFi6p4OYaTRX+MSeoLv9t9w/ e/KVl9Wpnx9ImvFurr3TsPaAtjsAPlnlUxrNrY3mngAfKEm+YKoJBgen6KrHiGPI fewQGrZr4xhtDn0AOKNm =9+1I -----END PGP SIGNATURE----- Merge tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "I don't like to toss in last minute patches, but these are all for things that are broken, and have bitten people for real. Two of them go into stable. Maybe all of them if the compile test problem is a pain in the ass also for stable folks. Final (hopefully) GPIO fixes for v4.7: - Fix an oops on the Asus Eee PC 1201 - Revert a patch trying to split GPIO parsing and GPIO configuration - Revert a too liberal compile testing thing" * tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: Revert "gpio: gpiolib-of: Allow compile testing" Revert "gpiolib: Split GPIO flags parsing and GPIO configuration" gpio: sch: Fix Oops on module load on Asus Eee PC 1201	2016-07-08 08:57:03 -07:00
Linus Torvalds	1d110cf5d3	Merge tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "One nouveau fix, and a few AMD Polaris fixes and some Allwinner fixes. I've got some vmware fixes that I might send separate over the weekend, they fix some black screens, but I'm still debating them" * tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux: drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation. drm/amd/powerplay: fix bug that get wrong polaris evv voltage. drm/amd/powerplay: incorrectly use of the function return value drm/amd/powerplay: fix incorrect voltage table value for tonga drm/amd/powerplay: fix incorrect voltage table value for polaris10 drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle drm/sun4i: Send vblank event when the CRTC is disabled drm/sun4i: Report proper vblank	2016-07-08 08:55:27 -07:00
Jeff Mahoney	f0fe970df3	ecryptfs: don't allow mmap when the lower fs doesn't support it There are legitimate reasons to disallow mmap on certain files, notably in sysfs or procfs. We shouldn't emulate mmap support on file systems that don't offer support natively. CVE-2016-1583 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: stable@vger.kernel.org [tyhicks: clean up f_op check by using ecryptfs_file_to_lower()] Signed-off-by: Tyler Hicks <tyhicks@canonical.com>	2016-07-08 10:35:28 -05:00
Borislav Petkov	9a7e7b5718	x86/asm/entry: Make thunk's restore a local label No need to have it appear in objdump output. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160708141016.GH3808@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-08 16:47:01 +02:00
Jan Beulich	6f2d9d9921	xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on both Intel and AMD systems. Doing any kind of hardware capability checks in the driver as a prerequisite was wrong anyway: With the hypervisor being in charge, all such checking should be done by it. If ACPI data gets uploaded despite some missing capability, the hypervisor is free to ignore part or all of that data. Ditch the entire check_prereq() function, and do the only valid check (xen_initial_domain()) in the caller in its place. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 14:53:13 +01:00
Dmitry Safonov	f80fd3a5ff	selftests/x86: Add vDSO mremap() test Should print this on vDSO remapping success (on new kernels): [root@localhost ~]# ./test_mremap_vdso_32 AT_SYSINFO_EHDR is 0xf773f000 [NOTE] Moving vDSO: [f773f000, f7740000] -> [a000000, a001000] [OK] Or print that mremap() for vDSOs is unsupported: [root@localhost ~]# ./test_mremap_vdso_32 AT_SYSINFO_EHDR is 0xf773c000 [NOTE] Moving vDSO: [0xf773c000, 0xf773d000] -> [0xf7737000, 0xf7738000] [FAIL] mremap() of the vDSO does not work on this kernel! Suggested-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: 0x7f454c46@gmail.com Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kselftest@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160628113539.13606-3-dsafonov@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-08 14:17:51 +02:00
Dmitry Safonov	b059a453b1	x86/vdso: Add mremap hook to vm_special_mapping Add possibility for 32-bit user-space applications to move the vDSO mapping. Previously, when a user-space app called mremap() for the vDSO address, in the syscall return path it would land on the previous address of the vDSOpage, resulting in segmentation violation. Now it lands fine and returns to userspace with a remapped vDSO. This will also fix the context.vdso pointer for 64-bit, which does not affect the user of vDSO after mremap() currently, but this may change in the future. As suggested by Andy, return -EINVAL for mremap() that would split the vDSO image: that operation cannot possibly result in a working system so reject it. Renamed and moved the text_mapping structure declaration inside map_vdso(), as it used only there and now it complements the vvar_mapping variable. There is still a problem for remapping the vDSO in glibc applications: the linker relocates addresses for syscalls on the vDSO page, so you need to relink with the new addresses. Without that the next syscall through glibc may fail: Program received signal SIGSEGV, Segmentation fault. #0 0xf7fd9b80 in __kernel_vsyscall () #1 0xf7ec8238 in _exit () from /usr/lib32/libc.so.6 Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: 0x7f454c46@gmail.com Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160628113539.13606-2-dsafonov@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-08 14:17:51 +02:00
Jan Beulich	e5a79475a7	xenbus: simplify xenbus_dev_request_and_reply() No need to retain a local copy of the full request message, only the type is really needed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 11:50:29 +01:00
Jan Beulich	7469be95a4	xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus_dev_request_and_reply() needs to track whether a transaction is open. For XS_TRANSACTION_START messages it calls transaction_start() and for XS_TRANSACTION_END messages it calls transaction_end(). If sending an XS_TRANSACTION_START message fails or responds with an an error, the transaction is not open and transaction_end() must be called. If sending an XS_TRANSACTION_END message fails, the transaction is still open, but if an error response is returned the transaction is closed. Commit `027bd7e899` ("xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart") introduced a regression where failed XS_TRANSACTION_START messages were leaving the transaction open. This can cause problems with suspend (and migration) as all transactions must be closed before suspending. It appears that the problematic change was added accidentally, so just remove it. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 11:14:26 +01:00
Jiri Kosina	39380b80d7	x86/mm/pat, /dev/mem: Remove superfluous error message Currently it's possible for broken (or malicious) userspace to flood a kernel log indefinitely with messages a-la Program dmidecode tried to access /dev/mem between f0000->100000 because range_is_allowed() is case of CONFIG_STRICT_DEVMEM being turned on dumps this information each and every time devmem_is_allowed() fails. Reportedly userspace that is able to trigger contignuous flow of these messages exists. It would be possible to rate limit this message, but that'd have a questionable value; the administrator wouldn't get information about all the failing accessess, so then the information would be both superfluous and incomplete at the same time :) Returning EPERM (which is what is actually happening) is enough indication for userspace what has happened; no need to log this particular error as some sort of special condition. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1607081137020.24757@cbobk.fhfr.pm Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-08 11:52:58 +02:00
Ingo Molnar	946e0f6ffc	Linux 4.7-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXefulAAoJEHm+PkMAQRiG6nMH/2O1vcZeOtqmx2yCMUeXyKAT wG88XflXzf3rM7C7TiObEYVf/bbLleJ7saDLEeic7ButD5gyYacIuzylVnrcqfBc vinz4cOw5kvu9DrRkCKdOfiTAgwYtqQW+syJ8ZK4lPQuSxnPAs+F/FKSOpyUF5FN Dngr520KjYKBEtn27W9UDPChFRwQoWAlaOC534eusaArCJtHGHHiuq5TEDn2EIo8 pUw2vwx5JiquSHOY34WLU7r+QoilovCQlUSsBQdLlPjfMB1QFtclPYa+5yEMjkT4 wusOUOfS/zK0rV6KnEdc/SkpiVX5C9WpFiWUOdEeJ5mZ+KijVkaOa9K1EDx8jSM= =7Hwh -----END PGP SIGNATURE----- Merge tag 'v4.7-rc6' into x86/mm, to merge fixes before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-07-08 11:51:28 +02:00
Linus Torvalds	cc23c619f8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull apparmor fix from James Morris. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: apparmor: fix oops, validate buffer size in apparmor_setprocattr()	2016-07-07 20:56:09 -07:00
Linus Torvalds	7ed18e2d1b	ACPI fixes for v4.7-rc7 - Fix a lock ordering issue in ACPICA introduced by a recent commit that attempted to fix a deadlock in the dynamic table loading code which in turn appeared after changes related to the handling of module-level AML also made in this cycle (Lv Zheng). - Fix a recent regression in the ACPI IRQ management code that may cause PCI drivers to be unable to register an IRQ if that IRQ happens to be shared with a device on the ISA bus, like the parallel port, by reverting one commit entirely and restoring the previous behavior in two other places (Sinan Kaya). - Fix a recent regression in the ACPI AML debugger introduced by the commit that removed incorrect usage of IS_ERR_VALUE() from multiple places (Lv Zheng). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXfuIeAAoJEILEb/54YlRxWbIP/iLT8J15RhFOoRYX4HXyULym UxeqEHVySKaqQLYhkBRz0opclW6sJQ7GaKYpVR31V4eeoll1+iQeYBYWdyuezjwp em3LvtLXTiLWWfRVNHg9AUOv4wc4q41m98MZ5ZScgTsUcexv2R1tt0KzLE/HNy1T NU/7JyWBEF4AiFsfYBuqtknWudV7JF3/siJO+Q1o+HSA9DW5cdqR0K9oM+Sl9pTD 3yLpVY8DNJGLSA/7MyJlyLmZ6eJmTQbxcLO7oCQliqJ9jjBKkC4r3fgfQcOcoltT S6O7ODcJwvqCA1sv271VyOckkhJmyT7zJi/L/yJOgiGRU1//0Vfyg6iRVou9OZmE VcQT79W+6W6saWucoNDX1CLm6GzLIHP2e6leooL710nNmtTUgrSc6C4FT1KGGs0R UTNPYKYtiaK54krsa48XonSCdGFIszeK8sa3DIjf6C/5AYX/eGBuGFm5dZlU/qzs BNv+GyT/Vt/Cu3cNJUrsMugwDL5sp31u44mfM1c99LMXmnwzkPylRGR1aaO4TYOy jE5LDwMryi5gpAF8/Es0AiU4Xa8O3p+AlD7PjZIUSEyMd9zr3fdpvF6dZwaiHP26 FX6paRbXi10mkXYGnRCbTOBV8i3PuNiplaZUT+XotHlEXOvAyMhVCVSM2XCAo1Wm 3IWSGzc5gwq1Pl4fihWi =vKn9 -----END PGP SIGNATURE----- Merge tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "All of these fix recent regressions in ACPICA, in the ACPI PCI IRQ management code and in the ACPI AML debugger. Specifics: - Fix a lock ordering issue in ACPICA introduced by a recent commit that attempted to fix a deadlock in the dynamic table loading code which in turn appeared after changes related to the handling of module-level AML also made in this cycle (Lv Zheng). - Fix a recent regression in the ACPI IRQ management code that may cause PCI drivers to be unable to register an IRQ if that IRQ happens to be shared with a device on the ISA bus, like the parallel port, by reverting one commit entirely and restoring the previous behavior in two other places (Sinan Kaya). - Fix a recent regression in the ACPI AML debugger introduced by the commit that removed incorrect usage of IS_ERR_VALUE() from multiple places (Lv Zheng)" * tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal ACPICA: Namespace: Fix namespace/interpreter lock ordering ACPI,PCI,IRQ: separate ISA penalty calculation Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()" ACPI,PCI,IRQ: factor in PCI possible	2016-07-07 20:49:41 -07:00
Linus Torvalds	c09230f308	Power management fixes for v4.7-rc7 - Fix a recent performance regression on Power systems (powernv and pseries) introduced by a core cpuidle commit that decreased the precision of the last_residency conversion from nano- to microseconds, which should not matter in theory, but turned out to play not-so-well with the special "snooze" idle state on Power (Shreyas B Prabhu). - Fix a crash during resume from hibernation on x86-64 caused by possible corruption of the kernel text part of page tables in the last phase of image restoration exposed by a security-related change during the 4.3 development cycle (Rafael Wysocki). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXfuVmAAoJEILEb/54YlRx/TUQALIrIgVMtNpPhC0Mu0DQp+1y S2t/vWfLnMqfl1lzH/scDi4OC3cqEbEsYB5CUJPRysc1KB0fjJbW+BEuQuJyEhgf ac95mv1IU7DKiJ1Rp4qWvvBoovY3fjb1rhHMzvo9uh6HmKrSlWLL4PmnbGw/BaFL PgqEazsy/a9FgGT7VnvamVL6bGXgVMEf8VqoBeZFd0krH4qvBZHrS97OLqU685vT VW0IDZr0sI3u0pG9ropJ5b8ETtnyZ4lmTO5D4TzD9P8kZyjl2TBnSZ/ls/1YLh4N WdBZXbsfmFFt4jyDnOjilWOBA8R3IDteCDmG1l8XhK4Smbl7TRh2XVuRAHHEjG6b oMPHTeqIil3ExBS2gdzVEFNzYfHsyg3N9cXSochqnp3vboyZOiKaxIPCiVeBOZ+t bDPwtbovIly1aEnsJKUbHvBxJ4A9uPUqw99YbBtMK1VpQXZFg+s/0mBVNynkuu/k EUOuU6KcY6Br4JJPsqfb26NmARgLrrI+gTJbC3pFDHARtALnCbkkbmNZRRVRS3Zs 7CzGaYqgZ0ejMv/46jFPsHYKXU52AJ5HDhfXfELd8X3AvdhtmIszfqTkBgKx4gto fyTVzi2c6+RJJr5Jc07msYawaz86nEUjeXFsM59g6jC5dguSWnRxUuWV2zUM/ENb GDkzNdzif0t2DheJ4bKc =XL6s -----END PGP SIGNATURE----- Merge tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "One fix for a recent cpuidle core change that, against all odds, introduced a functional regression on Power systems and the fix for the crash during resume from hibernation on x86-64 that has been in the works for the last few weeks (it actually was ready last week, but I wanted to allow the reporters to test if for some more time). Specifics: - Fix a recent performance regression on Power systems (powernv and pseries) introduced by a core cpuidle commit that decreased the precision of the last_residency conversion from nano- to microseconds, which should not matter in theory, but turned out to play not-so-well with the special "snooze" idle state on Power (Shreyas B Prabhu). - Fix a crash during resume from hibernation on x86-64 caused by possible corruption of the kernel text part of page tables in the last phase of image restoration exposed by a security-related change during the 4.3 development cycle (Rafael Wysocki)" * tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpuidle: Fix last_residency division x86/power/64: Fix kernel text mapping corruption during image restoration	2016-07-07 20:46:48 -07:00
Dave Airlie	39c8859418	Allwinner DRM driver fixes for 4.7, take 2 A new set of fixes for the sun4i driver, mostly related to vblank handling, and a minor fix to release a reference on the device tree nodes we're parsing in the probe logic. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXfBEjAAoJEBx+YmzsjxAg614P/Ak3zQiSfzdr3W95cttFN3Vj uI5kSFJjW96KzmS3EaqJYSrVZ0gtLkofB5lS0ElsrulYKcKBmYnjCvw+f0a8gM2K OY+y3pu17s5o3TJKWle2Sf8bW+rhA4dLaKHKBolrT2J9wFiQGTL0sOvq5ZvsCKqC j5Vp2sEEeGopO6ZOp6lKcRT4FVB7y9A7/7UVnizsqn3ZrUqERa9ofWp9OS8p+P2H cftyVxDgfA5Ve6bS7C0iqjfSVsSqbc/NexKHvbp3ZOLiuLUehZoFQb9fvrzZYoPK uFTl29BEhxHEgqwAF7/iLjggBOmzzxKSaQP4/1567r+uYW+xF857yCl82JbilYiO /laUdl4GK2B+yeJ+617THaCi5V+zdRQID0FPFcGNbDzjpw9WFN8ok7MY9iuoPfF7 30hEDwZYcGlbiNoCTs+1TKO8NQO85gNPgQ6MxfcJ+9nxR6OugkRGjltBrKJdGbSP MafUG8FEDXnHbpNEQ7ruA+EebJZAHxjjMUvsCJlGKTIIUuw+vMfrKhexd0JKx2s2 el5Xmb2kKtl6+88xSTglPxugt0w6lM2VA1n52ZNEVkcgG4jdOPj/NVoGMmKL9/sJ oUFIlzua06MA3fjoYMiLGmpNhls008pzs9nDPTrF7cuot5xCNnW/P/CkI7ODTB1P khj8gD2QtIgNDGnQg/Mh =oFqB -----END PGP SIGNATURE----- Merge tag 'sunxi-drm-fixes-for-4.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-fixes Allwinner DRM driver fixes for 4.7, take 2 A new set of fixes for the sun4i driver, mostly related to vblank handling, and a minor fix to release a reference on the device tree nodes we're parsing in the probe logic. * tag 'sunxi-drm-fixes-for-4.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle drm/sun4i: Send vblank event when the CRTC is disabled drm/sun4i: Report proper vblank	2016-07-08 13:29:11 +10:00
Vegard Nossum	30a46a4647	apparmor: fix oops, validate buffer size in apparmor_setprocattr() When proc_pid_attr_write() was changed to use memdup_user apparmor's (interface violating) assumption that the setprocattr buffer was always a single page was violated. The size test is not strictly speaking needed as proc_pid_attr_write() will reject anything larger, but for the sake of robustness we can keep it in. SMACK and SELinux look safe to me, but somebody else should probably have a look just in case. Based on original patch from Vegard Nossum <vegard.nossum@oracle.com> modified for the case that apparmor provides null termination. Fixes: `bb646cdb12` Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: John Johansen <john.johansen@canonical.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@parisplace.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: stable@kernel.org Signed-off-by: John Johansen <john.johansen@canonical.com> Reviewed-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: James Morris <james.l.morris@oracle.com>	2016-07-08 10:26:25 +10:00
Jeff Mahoney	78c4e17241	Revert "ecryptfs: forbid opening files without mmap handler" This reverts commit `2f36db7100`. It fixed a local root exploit but also introduced a dependency on the lower file system implementing an mmap operation just to open a file, which is a bit of a heavy hammer. The right fix is to have mmap depend on the existence of the mmap handler instead. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Tyler Hicks <tyhicks@canonical.com>	2016-07-07 18:47:57 -05:00
Linus Torvalds	ac904ae6e6	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block IO fixes from Jens Axboe: "Three small fixes that have been queued up and tested for this series: - A bug fix for xen-blkfront from Bob Liu, fixing an issue with incomplete requests during migration. - A fix for an ancient issue in retrieving the IO priority of a different PID than self, preventing that task from going away while we access it. From Omar. - A writeback fix from Tahsin, fixing a case where we'd call ihold() with a zero ref count inode" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix use-after-free in sys_ioprio_get() writeback: inode cgroup wb switch should not call ihold() xen-blkfront: save uncompleted reqs in blkfront_resume()	2016-07-07 15:34:09 -07:00

1 2 3 4 5 ...

603372 Commits All Branches Search

603372 Commits

All Branches