OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Mark Rutland	289d07a2dc	arm64: mm: abort uaccess retries upon fatal signal When there's a fatal signal pending, arm64's do_page_fault() implementation returns 0. The intent is that we'll return to the faulting userspace instruction, delivering the signal on the way. However, if we take a fatal signal during fixing up a uaccess, this results in a return to the faulting kernel instruction, which will be instantly retried, resulting in the same fault being taken forever. As the task never reaches userspace, the signal is not delivered, and the task is left unkillable. While the task is stuck in this state, it can inhibit the forward progress of the system. To avoid this, we must ensure that when a fatal signal is pending, we apply any necessary fixup for a faulting kernel instruction. Thus we will return to an error path, and it is up to that code to make forward progress towards delivering the fatal signal. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Laura Abbott <labbott@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Steve Capper <steve.capper@arm.com> Tested-by: Steve Capper <steve.capper@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-22 18:15:42 +01:00
Catalin Marinas	6d332747fa	arm64: Fix potential race with hardware DBM in ptep_set_access_flags() In a system with DBM (dirty bit management) capable agents there is a possible race between a CPU executing ptep_set_access_flags() (maybe non-DBM capable) and a hardware update of the dirty state (clearing of PTE_RDONLY). The scenario: a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old (PTE_AF clear) b) ptep_set_access_flags() is called as a result of a read access and it needs to set the pte to writable, clean and young (PTE_AF set) c) a DBM-capable agent, as a result of a different write access, is marking the entry as young (setting PTE_AF) and dirty (clearing PTE_RDONLY) The current ptep_set_access_flags() implementation would set the PTE_RDONLY bit in the resulting value overriding the DBM update and losing the dirty state. This patch fixes such race by setting PTE_RDONLY to the most permissive (lowest value) of the current entry and the new one. Fixes: `66dbd6e61a` ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM") Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-08-04 13:26:11 +01:00
Linus Torvalds	3d9d7405c0	arm64 fixes: - Ensure we have a guard page after the kernel image in vmalloc - Fix incorrect prefetch stride in copy_page - Ensure irqs are disabled in die() - Fix for event group validation in QCOM L2 PMU driver - Fix requesting of PMU IRQs on AMD Seattle - Minor cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJZey1iAAoJELescNyEwWM0w/0H/1RaHFUSoFUIoL+qFD0eGXcp hORI0sIHrUlHRONTFYMTyNko7kxELz5aDm6pc87dzBUNoUq3gxhqeEa0zsmwOPsQ m4iDa7r9xXT+nBITe2auAg6miEMX7Ym448dDrIyKNcRK+2SyZoFqS0vr8UVqs1P/ NwdFGgpKHbV4r1Jeoosom+n7VnuyE0vYBKo8TlRks6NvQJoh2duiPkL+AsBgCfBq fznck7jIPL4z4kf4Fp/Yz1QsmMhkDSidPmGD/m97Bj4wvEbMwf0u8Dnv1tySK5wx NwKeN0Dn7JphtL5c5j+OGiri7gTcswjxHJ9f6d0Ez+2TwnjWFM6JNQ+xdVqFcxc= =EpS9 -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "I'd been collecting these whilst we debugged a CPU hotplug failure, but we ended up diagnosing that one to tglx, who has taken a fix via the -tip tree separately. We're seeing some NFS issues that we haven't gotten to the bottom of yet, and we've uncovered some issues with our backtracing too so there might be another fixes pull before we're done. Summary: - Ensure we have a guard page after the kernel image in vmalloc - Fix incorrect prefetch stride in copy_page - Ensure irqs are disabled in die() - Fix for event group validation in QCOM L2 PMU driver - Fix requesting of PMU IRQs on AMD Seattle - Minor cleanups and fixes" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mmu: Place guard page after mapping of kernel image drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPU arm64: sysreg: Fix unprotected macro argmuent in write_sysreg perf: qcom_l2: fix column exclusion check arm64/lib: copy_page: use consistent prefetch stride arm64/numa: Drop duplicate message perf: Convert to using %pOF instead of full_name arm64: Convert to using %pOF instead of full_name arm64: traps: disable irq in die() arm64: atomics: Remove '&' from '+&' asm constraint in lse atomics arm64: uaccess: Remove redundant __force from addr cast in __range_ok	2017-07-28 13:29:36 -07:00
Will Deacon	92bbd16e50	arm64: mmu: Place guard page after mapping of kernel image The vast majority of virtual allocations in the vmalloc region are followed by a guard page, which can help to avoid overruning on vma into another, which may map a read-sensitive device. This patch adds a guard page to the end of the kernel image mapping (i.e. following the data/bss segments). Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-28 10:32:14 +01:00
Punit Agrawal	ece4b206be	arm64/numa: Drop duplicate message When booting linux on a system without CONFIG_NUMA enabled, the following messages are printed during boot - NUMA: Faking a node at [mem 0x0000000000000000-0x00000083ffffffff] NUMA: Adding memblock [0x8000000000 - 0x8000e7ffff] on node 0 NUMA: Adding memblock [0x8000e80000 - 0x83f65cffff] on node 0 NUMA: Adding memblock [0x83f65d0000 - 0x83f665ffff] on node 0 NUMA: Adding memblock [0x83f6660000 - 0x83f676ffff] on node 0 NUMA: Adding memblock [0x83f6770000 - 0x83f678ffff] on node 0 NUMA: Adding memblock [0x83f6790000 - 0x83fb82ffff] on node 0 NUMA: Adding memblock [0x83fb830000 - 0x83fbc0ffff] on node 0 NUMA: Adding memblock [0x83fbc10000 - 0x83fbdfffff] on node 0 NUMA: Adding memblock [0x83fbe00000 - 0x83fbffffff] on node 0 NUMA: Adding memblock [0x83fc000000 - 0x83fffbffff] on node 0 NUMA: Adding memblock [0x83fffc0000 - 0x83fffdffff] on node 0 NUMA: Adding memblock [0x83fffe0000 - 0x83ffffffff] on node 0 NUMA: Initmem setup node 0 [mem 0x8000000000-0x83ffffffff] NUMA: NODE_DATA [mem 0x83fffec500-0x83fffedfff] The information is then duplicated by core kernel messages right after the above output. Early memory node ranges node 0: [mem 0x0000008000000000-0x0000008000e7ffff] node 0: [mem 0x0000008000e80000-0x00000083f65cffff] node 0: [mem 0x00000083f65d0000-0x00000083f665ffff] node 0: [mem 0x00000083f6660000-0x00000083f676ffff] node 0: [mem 0x00000083f6770000-0x00000083f678ffff] node 0: [mem 0x00000083f6790000-0x00000083fb82ffff] node 0: [mem 0x00000083fb830000-0x00000083fbc0ffff] node 0: [mem 0x00000083fbc10000-0x00000083fbdfffff] node 0: [mem 0x00000083fbe00000-0x00000083fbffffff] node 0: [mem 0x00000083fc000000-0x00000083fffbffff] node 0: [mem 0x00000083fffc0000-0x00000083fffdffff] node 0: [mem 0x00000083fffe0000-0x00000083ffffffff] Initmem setup node 0 [mem 0x0000008000000000-0x00000083ffffffff] Remove the duplication of memblock layout information printed during boot by dropping the messages from arm64 numa initialisation. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-07-20 17:03:53 +01:00
Vladimir Murzin	43fc509c3e	dma-coherent: introduce interface for default DMA pool Christoph noticed [1] that default DMA pool in current form overload the DMA coherent infrastructure. In reply, Robin suggested [2] to split the per-device vs. global pool interfaces, so allocation/release from default DMA pool is driven by dma ops implementation. This patch implements Robin's idea and provide interface to allocate/release/mmap the default (aka global) DMA pool. To make it clear that existing _from_coherent routines work on per-device pool rename them to _from_dev_coherent. [1] https://lkml.org/lkml/2017/7/7/370 [2] https://lkml.org/lkml/2017/7/7/431 Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Suggested-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-07-20 16:09:10 +02:00
Rik van Riel	cf92251dc5	arm64/mmap: properly account for stack randomization in mmap_base When RLIMIT_STACK is, for example, 256MB, the current code results in a gap between the top of the task and mmap_base of 256MB, failing to take into account the amount by which the stack address was randomized. In other words, the stack gets less than RLIMIT_STACK space. Ensure that the gap between the stack and mmap_base always takes stack randomization and the stack guard gap into account. Obtained from Daniel Micay's linux-hardened tree. Link: http://lkml.kernel.org/r/20170622200033.25714-3-riel@redhat.com Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Florian Weimer <fweimer@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Hugh Dickins <hughd@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-12 16:26:03 -07:00
Andrey Ryabinin	3f9ec80f7b	arm64/kasan: don't allocate extra shadow memory We used to read several bytes of the shadow memory in advance. Therefore additional shadow memory mapped to prevent crash if speculative load would happen near the end of the mapped shadow memory. Now we don't have such speculative loads, so we no longer need to map additional shadow memory. Link: http://lkml.kernel.org/r/20170601162338.23540-3-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-10 16:32:33 -07:00
Linus Torvalds	9f45efb928	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: - a few hotfixes - various misc updates - ocfs2 updates - most of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (108 commits) mm, memory_hotplug: move movable_node to the hotplug proper mm, memory_hotplug: drop CONFIG_MOVABLE_NODE mm, memory_hotplug: drop artificial restriction on online/offline mm: memcontrol: account slab stats per lruvec mm: memcontrol: per-lruvec stats infrastructure mm: memcontrol: use generic mod_memcg_page_state for kmem pages mm: memcontrol: use the node-native slab memory counters mm: vmstat: move slab statistics from zone to node counters mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() mm/zswap.c: improve a size determination in zswap_frontswap_init() mm/zswap.c: delete an error message for a failed memory allocation in zswap_pool_create() mm/swapfile.c: sort swap entries before free mm/oom_kill: count global and memory cgroup oom kills mm: per-cgroup memory reclaim stats mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects mm: kmemleak: factor object reference updating out of scan_block() mm: kmemleak: slightly reduce the size of some structures on 64-bit architectures mm, mempolicy: don't check cpuset seqlock where it doesn't matter mm, cpuset: always use seqlock when changing task's nodemask mm, mempolicy: simplify rebinding mempolicies when updating cpusets ...	2017-07-06 22:27:08 -07:00
Linus Torvalds	f72e24a124	This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlldmw0LHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOiKA/+Ln1mFLSf3nfTzIHa24Bbk8ZTGr0B8TD4Vmyyt8iG oO3AeaTLn3d6ugbH/uih/tPz8PuyXsdiTC1rI/ejDMiwMTSjW6phSiIHGcStSR9X VFNhmMFacp7QpUpvxceV0XZYKDViAoQgHeGdp3l+K5h/v4AYePV/v/5RjQPaEyOh YLbCzETO+24mRWdJxdAqtTW4ovYhzj6XsiJ+pAjlV0+SWU6m5L5E+VAPNi1vqv1H 1O2KeCFvVYEpcnfL3qnkw2timcjmfCfeFAd9mCUAc8mSRBfs3QgDTKw3XdHdtRml LU2WuA5cpMrOdBO4mVra2plo8E2szvpB1OZZXoKKdCpK3VGwVpVHcTvClK2Ks/3B GDLieroEQNu2ZIUIdWXf/g2x6le3BcC9MmpkAhnGPqCZ7skaIBO5Cjpxm0zTJAPl PPY3CMBBEktAvys6DcudOYGixNjKUuAm5lnfpcfTEklFdG0AjhdK/jZOplAFA6w4 LCiy0rGHM8ZbVAaFxbYoFCqgcjnv6EjSiqkJxVI4fu/Q7v9YXfdPnEmE0PJwCVo5 +i7aCLgrYshTdHr/F3e5EuofHN3TDHwXNJKGh/x97t+6tt326QMvDKX059Kxst7R rFukGbrYvG8Y7yXwrSDbusl443ta0Ht7T1oL4YUoJTZp0nScAyEluDTmrH1JVCsT R4o= =0Fso -----END PGP SIGNATURE----- Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...	2017-07-06 19:20:54 -07:00
Punit Agrawal	7868a2087e	mm/hugetlb: add size parameter to huge_pte_offset() A poisoned or migrated hugepage is stored as a swap entry in the page tables. On architectures that support hugepages consisting of contiguous page table entries (such as on arm64) this leads to ambiguity in determining the page table entry to return in huge_pte_offset() when a poisoned entry is encountered. Let's remove the ambiguity by adding a size parameter to convey additional information about the requested address. Also fixup the definition/usage of huge_pte_offset() throughout the tree. Link: http://lkml.kernel.org/r/20170522133604.11392-4-punit.agrawal@arm.com Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: James Hogan <james.hogan@imgtec.com> (odd fixer:METAG ARCHITECTURE) Cc: Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS) Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-06 16:24:34 -07:00
Steve Capper	f0b38d65c9	arm64: hugetlb: remove spurious calls to huge_ptep_offset() We don't need to call huge_ptep_offset as our accessors are already supplied with the pte_t *. This patch removes those spurious calls. [punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering] Link: http://lkml.kernel.org/r/20170524115409.31309-3-punit.agrawal@arm.com Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: David Woods <dwoods@mellanox.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-06 16:24:34 -07:00
Steve Capper	bb9dd3df8e	arm64: hugetlb: refactor find_num_contig() Patch series "Support for contiguous pte hugepages", v4. This patchset updates the hugetlb code to fix issues arising from contiguous pte hugepages (such as on arm64). Compared to v3, This version addresses a build failure on arm64 by including two cleanup patches. Other than the arm64 cleanups, the rest are generic code changes. The remaining arm64 support based on these patches will be posted separately. The patches are based on v4.12-rc2. Previous related postings can be found at [0], [1], [2], and [3]. The patches fall into three categories - * Patch 1-2 - arm64 cleanups required to greatly simplify changing huge_pte_offset() prototype in Patch 5. Catalin, Will - are you happy for these patches to go via mm? * Patches 3-4 address issues with gup * Patches 5-8 relate to passing a size argument to hugepage helpers to disambiguate the size of the referred page. These changes are required to enable arch code to properly handle swap entries for contiguous pte hugepages. The changes to huge_pte_offset() (patch 5) touch multiple architectures but I've managed to minimise these changes for the other affected functions - huge_pte_clear() and set_huge_pte_at(). These patches gate the enabling of contiguous hugepages support on arm64 which has been requested for systems using !4k page granule. The ARM64 architecture supports two flavours of hugepages - * Block mappings at the pud/pmd level These are regular hugepages where a pmd or a pud page table entry points to a block of memory. Depending on the PAGE_SIZE in use the following size of block mappings are supported - PMD PUD --- --- 4K: 2M 1G 16K: 32M 64K: 512M For certain applications/usecases such as HPC and large enterprise workloads, folks are using 64k page size but the minimum hugepage size of 512MB isn't very practical. To overcome this ... * Using the Contiguous bit The architecture provides a contiguous bit in the translation table entry which acts as a hint to the mmu to indicate that it is one of a contiguous set of entries that can be cached in a single TLB entry. We use the contiguous bit in Linux to increase the mapping size at the pmd and pte (last) level. The number of supported contiguous entries varies by page size and level of the page table. Using the contiguous bit allows additional hugepage sizes - CONT PTE PMD CONT PMD PUD -------- --- -------- --- 4K: 64K 2M 32M 1G 16K: 2M 32M 1G 64K: 2M 512M 16G Of these, 64K with 4K and 2M with 64K pages have been explicitly requested by a few different users. Entries with the contiguous bit set are required to be modified all together - which makes things like memory poisoning and migration impossible to do correctly without knowing the size of hugepage being dealt with - the reason for adding size parameter to a few of the hugepage helpers in this series. This patch (of 8): As we regularly check for contiguous pte's in the huge accessors, remove this extra check from find_num_contig. [punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering] Link: http://lkml.kernel.org/r/20170524115409.31309-2-punit.agrawal@arm.com Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: David Woods <dwoods@mellanox.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-07-06 16:24:34 -07:00
Will Deacon	3edb1dd13c	Merge branch 'aarch64/for-next/ras-apei' into aarch64/for-next/core Merge in arm64 ACPI RAS support (APEI/GHES) from Tyler Baicar.	2017-06-26 10:54:27 +01:00
Tyler Baicar	621f48e40e	arm/arm64: KVM: add guest SEA support Currently external aborts are unsupported by the guest abort handling. Add handling for SEAs so that the host kernel reports SEAs which occur in the guest kernel. When an SEA occurs in the guest kernel, the guest exits and is routed to kvm_handle_guest_abort(). Prior to this patch, a print message of an unsupported FSC would be printed and nothing else would happen. With this patch, the code gets routed to the APEI handling of SEAs in the host kernel to report the SEA information. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 18:22:05 +01:00
Tyler Baicar	7edda0886b	acpi: apei: handle SEA notification type for ARMv8 ARM APEI extension proposal added SEA (Synchronous External Abort) notification type for ARMv8. Add a new GHES error source handling function for SEA. If an error source's notification type is SEA, then this function can be registered into the SEA exception handler. That way GHES will parse and report SEA exceptions when they occur. An SEA can interrupt code that had interrupts masked and is treated as an NMI. To aid this the page of address space for mapping APEI buffers while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is changed to use the helper methods to find the prot_t to map with in the same way as ghes_ioremap_pfn_irq(). Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 18:22:03 +01:00
Tyler Baicar	32015c2356	arm64: exception: handle Synchronous External Abort SEA exceptions are often caused by an uncorrected hardware error, and are handled when data abort and instruction abort exception classes have specific values for their Fault Status Code. When SEA occurs, before killing the process, report the error in the kernel logs. Update fault_info[] with specific SEA faults so that the new SEA handler is used. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [will: use NULL instead of 0 when assigning si_addr] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-22 18:21:46 +01:00
Christoph Hellwig	e0d60ac10e	arm64: remove DMA_ERROR_CODE The dma alloc interface returns an error by return NULL, and the mapping interfaces rely on the mapping_error method, which the dummy ops already implement correctly. Thus remove the DMA_ERROR_CODE define. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>	2017-06-20 11:13:08 +02:00
Olav Haugan	577dfe16b8	arm64/dma-mapping: Remove extraneous null-pointer checks The current null-pointer check in __dma_alloc_coherent and __dma_free_coherent is not needed anymore since the __dma_alloc/__dma_free functions won't be called if !dev (dummy ops will be called instead). Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Olav Haugan <ohaugan@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-15 11:40:22 +01:00
Punit Agrawal	0e3a902639	arm64: mm: Update perf accounting to handle poison faults Re-organise the perf accounting for fault handling in preparation for enabling handling of hardware poison faults in subsequent commits. The change updates perf accounting to be inline with the behaviour on x86. With this update, the perf fault accounting - * Always report PERF_COUNT_SW_PAGE_FAULTS * Doesn't report anything else for VM_FAULT_ERROR (which includes hwpoison faults) * Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major fault (indicated by VM_FAULT_MAJOR) * Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 16:04:29 +01:00
Jonathan (Zhixiong) Zhang	e7c600f149	arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar to VM_FAULT_OOM, the only difference is that a different si_code (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is initialized. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> (fix new __do_user_fault call-site) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 16:04:29 +01:00
Punit Agrawal	f02ab08afb	arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries When memory failure is enabled, a poisoned hugepage pte is marked as a swap entry. huge_pte_offset() does not return the poisoned page table entries when it encounters PUD/PMD hugepages. This behaviour of huge_pte_offset() leads to error such as below when munmap is called on poisoned hugepages. [ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074. Fix huge_pte_offset() to return the poisoned pte which is then appropriately handled by the generic layer code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: David Woods <dwoods@mellanox.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 16:04:28 +01:00
Will Deacon	1eb34b6e51	arm64: fault: Print info about page table structure when dumping pte Whilst debugging a remote crash, I noticed that show_pte is unhelpful when it comes to describing the structure of the page table being walked. This is easily fixed by printing out the page table (swapper vs user), page size and virtual address size when displaying the PGD address. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 12:33:54 +01:00
Kristina Martsenko	83016b2042	arm64: mm: print file name of faulting vma Print out the name of the file associated with the vma that faulted. This is usually the executable or shared library name. We already print out the task name, but also printing the library name is useful for pinpointing bugs to libraries. Also print the base address and size of the vma, which together with the PC (printed by __show_regs) gives the offset into the library. Fault prints now look like: test[2361]: unhandled level 2 translation fault (11) at 0x00000012, esr 0x92000006, in libfoo.so[ffffa0145000+1000] This is already done on x86, for more details see commit `03252919b7` ("x86: print which shared library/executable faulted in segfault etc. messages v3"). Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 12:33:37 +01:00
Kristina Martsenko	bf396c09c2	arm64: mm: don't print out page table entries on EL0 faults When we take a fault from EL0 that can't be handled, we print out the page table entries associated with the faulting address. This allows userspace to print out any current page table entries, including kernel (TTBR1) entries. Exposing kernel mappings like this could pose a security risk, so don't print out page table information on EL0 faults. (But still print it out for EL1 faults.) This also follows the same behaviour as x86, printing out page table entries on kernel mode faults but not user mode faults. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 12:33:37 +01:00
Kristina Martsenko	67ce16ec15	arm64: mm: print out correct page table entries When we take a fault that can't be handled, we print out the page table entries associated with the faulting address. In some cases we currently print out the wrong entries. For a faulting TTBR1 address, we sometimes print out TTBR0 table entries instead, and for a faulting TTBR0 address we sometimes print out TTBR1 table entries. Fix this by choosing the tables based on the faulting address. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> [will: zero-extend addrs to 64-bit, don't walk swapper w/ TTBR0 addr] Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-12 12:33:37 +01:00
Ard Biesheuvel	1151f838cb	arm64: kernel: restrict /dev/mem read() calls to linear region When running lscpu on an AArch64 system that has SMBIOS version 2.0 tables, it will segfault in the following way: Unable to handle kernel paging request at virtual address ffff8000bfff0000 pgd = ffff8000f9615000 [ffff8000bfff0000] *pgd=0000000000000000 Internal error: Oops: 96000007 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 1284 Comm: lscpu Not tainted 4.11.0-rc3+ #103 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 task: ffff8000fa78e800 task.stack: ffff8000f9780000 PC is at __arch_copy_to_user+0x90/0x220 LR is at read_mem+0xcc/0x140 This is caused by the fact that lspci issues a read() on /dev/mem at the offset where it expects to find the SMBIOS structure array. However, this region is classified as EFI_RUNTIME_SERVICE_DATA (as per the UEFI spec), and so it is omitted from the linear mapping. So let's restrict /dev/mem read/write access to those areas that are covered by the linear region. Reported-by: Alexander Graf <agraf@suse.de> Fixes: `4dffbfc48d` ("arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-06-01 18:26:26 +01:00
Tobias Klauser	6efd8499d9	arm64: mm: explicity include linux/vmalloc.h arm64's mm/mmu.c uses vm_area_add_early, struct vm_area and other definitions but relies on implict inclusion of linux/vmalloc.h which means that changes in other headers could break the build. Thus, add an explicit include. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-05-30 11:07:42 +01:00
Kefeng Wang	c07ab957d9	arm64: Call __show_regs directly Generic code expects show_regs() to also dump the stack, but arm64's show_reg() does not do this. Some arm64 callers of show_regs() only want the registers dumped, without the stack. To enable generic code to work as expected, we need to make show_regs() dump the stack. Where we only want the registers dumped, we must use __show_regs(). This patch updates code to use __show_regs() where only registers are desired. A subsequent patch will modify show_regs(). Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-05-30 11:07:41 +01:00
Linus Torvalds	28b47809b2	IOMMU Updates for Linux v4.12 This includes: * Some code optimizations for the Intel VT-d driver * Code to switch off a previously enabled Intel IOMMU * Support for 'struct iommu_device' for OMAP, Rockchip and Mediatek IOMMUs * Some header optimizations for IOMMU core code headers and a few fixes that became necessary in other parts of the kernel because of that * ACPI/IORT updates and fixes * Some Exynos IOMMU optimizations * Code updates for the IOMMU dma-api code to bring it closer to use per-cpu iova caches * New command-line option to set default domain type allocated by the iommu core code * Another command line option to allow the Intel IOMMU switched off in a tboot environment * ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an IDENTITY domain in conjunction with DMA ops, Support for SMR masking, Support for 16-bit ASIDs (was previously broken) * Various other small fixes and improvements -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZEY4XAAoJECvwRC2XARrjth0QAKV56zjnFclv39aDo6eCq9CT 51+XT4bPY5VKQ2+Jx76TBNObHmGK+8KEMHfT9khpWJtFCDyy25SGckLry1nYqmZs tSTsbj4sOeCyKzOLITlRN9/OzKXkjKAxYuq+sQZZFDFYf3kCM/eag0dGAU6aVLNp tkIal3CSpGjCQ9M5JohrtQ1mwiGqCIkMIgvnBjRw+bfpLnQNG+VL6VU2G3RAkV2b 5Vbdoy+P7ZQnJSZr/bibYL2BaQs2diR4gOppT5YbsfniMq4QYSjheu1xBboGX8b7 sx8yuPi4370irSan0BDvlvdQdjBKIRiDjfGEKDhRwPhtvN6JREGakhEOC8MySQ37 mP96B72Lmd+a7DEl5udOL7tQILA0DcUCX0aOyF714khnZuFU5tVlCotb/36xeJ+T FPc3RbEVQ90m8dYU6MNJ+ahtb/ZapxGTRfisIigB6wlnZa0Evabp9EJSce6oJMkm whbBhDubeEU18n9XAaofMbu+P2LAzq8cxiRMlsDvT4mIy7jO86jjCmhpu1Tfn2GY 4wrEQZdWOMvhUsIhObXA0aC3BzC506uvnKPW3qy041RaxBuelWiBi29qzYbhxzkr DLDpWbUZNYPyFJjttpavyQb2/XRduBTJdVP1pQpkJNDsW5jLiBkpSqm9xNADapRY vLSYRX0JCIquaD+PAuxn =3aE8 -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - code optimizations for the Intel VT-d driver - ability to switch off a previously enabled Intel IOMMU - support for 'struct iommu_device' for OMAP, Rockchip and Mediatek IOMMUs - header optimizations for IOMMU core code headers and a few fixes that became necessary in other parts of the kernel because of that - ACPI/IORT updates and fixes - Exynos IOMMU optimizations - updates for the IOMMU dma-api code to bring it closer to use per-cpu iova caches - new command-line option to set default domain type allocated by the iommu core code - another command line option to allow the Intel IOMMU switched off in a tboot environment - ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an IDENTITY domain in conjunction with DMA ops, Support for SMR masking, Support for 16-bit ASIDs (was previously broken) - various other small fixes and improvements * tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits) soc/qbman: Move dma-mapping.h include to qman_priv.h soc/qbman: Fix implicit header dependency now causing build fails iommu: Remove trace-events include from iommu.h iommu: Remove pci.h include from trace/events/iommu.h arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops() ACPI/IORT: Fix CONFIG_IOMMU_API dependency iommu/vt-d: Don't print the failure message when booting non-kdump kernel iommu: Move report_iommu_fault() to iommu.c iommu: Include device.h in iommu.h x86, iommu/vt-d: Add an option to disable Intel IOMMU force on iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed iommu/arm-smmu: Correct sid to mask iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code omap3isp: Remove iommu_group related code iommu/omap: Add iommu-group support iommu/omap: Make use of 'struct iommu_device' iommu/omap: Store iommu_dev pointer in arch_data iommu/omap: Move data structures to omap-iommu.h iommu/omap: Drop legacy-style device support ...	2017-05-09 15:15:47 -07:00
Laura Abbott	d4bbc30bb0	arm64: use set_memory.h header The set_memory_* functions have moved to set_memory.h. Use that header explicitly. Link: http://lkml.kernel.org/r/1488920133-27229-4-git-send-email-labbott@redhat.com Signed-off-by: Laura Abbott <labbott@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-08 17:15:13 -07:00
Linus Torvalds	ab182e67ec	arm64 updates for 4.12: - kdump support, including two necessary memblock additions: memblock_clear_nomap() and memblock_cap_memory_range() - ARMv8.3 HWCAP bits for JavaScript conversion instructions, complex numbers and weaker release consistency - arm64 ACPI platform MSI support - arm perf updates: ACPI PMU support, L3 cache PMU in some Qualcomm SoCs, Cortex-A53 L2 cache events and DTLB refills, MAINTAINERS update for DT perf bindings - architected timer errata framework (the arch/arm64 changes only) - support for DMA_ATTR_FORCE_CONTIGUOUS in the arm64 iommu DMA API - arm64 KVM refactoring to use common system register definitions - remove support for ASID-tagged VIVT I-cache (no ARMv8 implementation using it and deprecated in the architecture) together with some I-cache handling clean-up - PE/COFF EFI header clean-up/hardening - define BUG() instruction without CONFIG_BUG -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJZDKMoAAoJEGvWsS0AyF7xR+YP/0EMEz5MDfCv0PVYj7/AIa0G Zphl7OhysIkeDAz7urXw9Jdl0NfORNIqmD1vZNVSc321IyNp56Od+kWd82lBrOWB ad3nNT67pEmu0pAW7CO48ju3rTesEnEl3ra45E1tULeLihmv93jc4ZlfXgumlKq3 /GE84XJ5ZFmluuhq1zgNefeUtyl1tbxTxHJ74+INF7dTd/5sJcphpqS4Dzpb+msT 20WYliccQCBF9zBFUYHc2KjcXXKRQGxLulGS3MuoN2DLkD+U9YyR/OmA7SoXh2J2 WXC5b0x856xTQJFCJ39pb7rw5xHjt3l5zfU3VLSvqEVL/+asBqCcgGNtNUgOW1Es dEHC6bc66Ley6mn7bbpFE3MK8D+K5q8HwMF6G5KDtIVB6DB/iQ6kzi5aXKoupxtb 1EuU4OW6cDhmOFQYjgIDofLgqbmVvJofdF6+NfxasfZmWrMgHzv0rYvaCDnAV/Tr t7bhH7hf9/KcP/wpk86O2AMKKpgoNTqe1Qy8cWVFFLnut567Pb6zs/L3ZXfleoLv t613yM8Zj2fE05ja8ylMDjaasidNpXGttb08/4kAn06Daaoueqla0jmduAhy4aaV dQ3OFP9lJ5MFaFnMMTPfU3vtvNLMHuo9MZsYCrv5zCaNNs3lpAPUiPNh588ZscKa sWx4PEiaCi+wcOsLsJvh =SDkm -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - kdump support, including two necessary memblock additions: memblock_clear_nomap() and memblock_cap_memory_range() - ARMv8.3 HWCAP bits for JavaScript conversion instructions, complex numbers and weaker release consistency - arm64 ACPI platform MSI support - arm perf updates: ACPI PMU support, L3 cache PMU in some Qualcomm SoCs, Cortex-A53 L2 cache events and DTLB refills, MAINTAINERS update for DT perf bindings - architected timer errata framework (the arch/arm64 changes only) - support for DMA_ATTR_FORCE_CONTIGUOUS in the arm64 iommu DMA API - arm64 KVM refactoring to use common system register definitions - remove support for ASID-tagged VIVT I-cache (no ARMv8 implementation using it and deprecated in the architecture) together with some I-cache handling clean-up - PE/COFF EFI header clean-up/hardening - define BUG() instruction without CONFIG_BUG * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits) arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS arm64: Print DT machine model in setup_machine_fdt() arm64: pmu: Wire-up Cortex A53 L2 cache events and DTLB refills arm64: module: split core and init PLT sections arm64: pmuv3: handle pmuv3+ arm64: Add CNTFRQ_EL0 trap handler arm64: Silence spurious kbuild warning on menuconfig arm64: pmuv3: use arm_pmu ACPI framework arm64: pmuv3: handle !PMUv3 when probing drivers/perf: arm_pmu: add ACPI framework arm64: add function to get a cpu's MADT GICC table drivers/perf: arm_pmu: split out platform device probe logic drivers/perf: arm_pmu: move irq request/free into probe drivers/perf: arm_pmu: split cpu-local irq request/free drivers/perf: arm_pmu: rename irq request/free functions drivers/perf: arm_pmu: handle no platform_device drivers/perf: arm_pmu: simplify cpu_pmu_request_irqs() drivers/perf: arm_pmu: factor out pmu registration drivers/perf: arm_pmu: fold init into alloc drivers/perf: arm_pmu: define armpmu_init_fn ...	2017-05-05 12:11:37 -07:00
Catalin Marinas	92f66f84d9	arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit 44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable() have been broken by passing a physically contiguous vm_struct with an invalid pages pointer through the common iommu API. Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA, this patch simply reuses the existing swiotlb logic for mmap and get_sgtable. Note that the current implementation of get_sgtable (both swiotlb and iommu) is broken if dma_declare_coherent_memory() is used since such memory does not have a corresponding struct page. To be addressed in a subsequent patch. Fixes: `44176bb38f` ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") Reported-by: Andrzej Hajda <a.hajda@samsung.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-05-05 11:41:35 +01:00
Joerg Roedel	2c0248d688	Merge branches 'arm/exynos', 'arm/omap', 'arm/rockchip', 'arm/mediatek', 'arm/smmu', 'arm/core', 'x86/vt-d', 'x86/amd' and 'core' into next	2017-05-04 18:06:17 +02:00
Stefano Stabellini	e058632670	xen/arm,arm64: fix xen_dma_ops after `815dd18` "Consolidate get_dma_ops..." The following commit: commit `815dd18788` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Fri Jan 20 13:04:04 2017 -0800 treewide: Consolidate get_dma_ops() implementations rearranges get_dma_ops in a way that xen_dma_ops are not returned when running on Xen anymore, dev->dma_ops is returned instead (see arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and include/linux/dma-mapping.h:get_dma_ops). Fix the problem by storing dev->dma_ops in dev_archdata, and setting dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from dev_archdata when needed. It also allows us to remove __generic_dma_ops from common headers. Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Julien Grall <julien.grall@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@vger.kernel.org> [4.11+] CC: linux@armlinux.org.uk CC: catalin.marinas@arm.com CC: will.deacon@arm.com CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: Julien Grall <julien.grall@arm.com>	2017-05-02 11:14:42 +02:00
Joerg Roedel	461a6946b1	iommu: Remove pci.h include from trace/events/iommu.h The include file does not need any PCI specifics, so remove that include. Also fix the places that relied on it. Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-04-29 00:20:49 +02:00
Sricharan R	b913efe78a	arm64: dma-mapping: Remove the notifier trick to handle early setting of dma_ops With arch_setup_dma_ops now being called late during device's probe after the device's iommu is probed, the notifier trick required to handle the early setup of dma_ops before the iommu group gets created is not required. So removing the notifier's here. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sricharan R <sricharan@codeaurora.org> [rm: clean up even more] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2017-04-20 16:31:08 +02:00
Will Deacon	6ae979ab39	Revert "Revert "arm64: hugetlb: partial revert of 66b3923a1a0f"" The use of the contiguous bit by our hugetlb implementation violates the break-before-make requirements of the architecture and can lead to silent data corruption or TLB conflict aborts. Once again, disable these hugetlb sizes whilst it gets worked out. This reverts commit `ab2e1b8923`. Conflicts: arch/arm64/mm/hugetlbpage.c Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-04-07 12:27:29 +01:00
Stephen Boyd	b824b93068	arm64: print a fault message when attempting to write RO memory If a page is marked read only we should print out that fact, instead of printing out that there was a page fault. Right now we get a cryptic error message that something went wrong with an unhandled fault, but we don't evaluate the esr to figure out that it was a read/write permission fault. Instead of seeing: Unable to handle kernel paging request at virtual address ffff000008e460d8 pgd = ffff800003504000 [ffff000008e460d8] pgd=0000000083473003, pud=0000000083503003, pmd=0000000000000000 Internal error: Oops: 9600004f [#1] PREEMPT SMP we'll see: Unable to handle kernel write to read-only memory at virtual address ffff000008e760d8 pgd = ffff80003d3de000 [ffff000008e760d8] pgd=0000000083472003, pud=0000000083435003, pmd=0000000000000000 Internal error: Oops: 9600004f [#1] PREEMPT SMP We also add a userspace address check into is_permission_fault() so that the function doesn't return true for ttbr0 PAN faults when it shouldn't. Reviewed-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-06 17:36:09 +01:00
AKASHI Takahiro	e62aaeac42	arm64: kdump: provide /proc/vmcore file Arch-specific functions are added to allow for implementing a crash dump file interface, /proc/vmcore, which can be viewed as a ELF file. A user space tool, like kexec-tools, is responsible for allocating a separate region for the core's ELF header within crash kdump kernel memory and filling it in when executing kexec_load(). Then, its location will be advertised to crash dump kernel via a new device-tree property, "linux,elfcorehdr", and crash dump kernel preserves the region for later use with reserve_elfcorehdr() at boot time. On crash dump kernel, /proc/vmcore will access the primary kernel's memory with copy_oldmem_page(), which feeds the data page-by-page by ioremap'ing it since it does not reside in linear mapping on crash dump kernel. Meanwhile, elfcorehdr_read() is simple as the region is always mapped. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:31:38 +01:00
AKASHI Takahiro	254a41c0ba	arm64: hibernate: preserve kdump image around hibernation Since arch_kexec_protect_crashkres() removes a mapping for crash dump kernel image, the loaded data won't be preserved around hibernation. In this patch, helper functions, crash_prepare_suspend()/ crash_post_resume(), are additionally called before/after hibernation so that the relevant memory segments will be mapped again and preserved just as the others are. In addition, to minimize the size of hibernation image, crash_is_nosave() is added to pfn_is_nosave() in order to recognize only the pages that hold loaded crash dump kernel image as saveable. Hibernation excludes any pages that are marked as Reserved and yet "nosave." Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:28:50 +01:00
Takahiro Akashi	98d2e1539b	arm64: kdump: protect crash dump kernel memory arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres() are meant to be called by kexec_load() in order to protect the memory allocated for crash dump kernel once the image is loaded. The protection is implemented by unmapping the relevant segments in crash dump kernel memory, rather than making it read-only as other archs do, to prevent coherency issues due to potential cache aliasing (with mismatched attributes). Page-level mappings are consistently used here so that we can change the attributes of segments in page granularity as well as shrink the region also in page granularity through /sys/kernel/kexec_crash_size, putting the freed memory back to buddy system. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:28:35 +01:00
AKASHI Takahiro	9b0aa14e31	arm64: mm: add set_memory_valid() This function validates and invalidates PTE entries, and will be utilized in kdump to protect loaded crash dump kernel image. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:27:53 +01:00
AKASHI Takahiro	764b51ead1	arm64: kdump: reserve memory for crash dump kernel "crashkernel=" kernel parameter specifies the size (and optionally the start address) of the system ram to be used by crash dump kernel. reserve_crashkernel() will allocate and reserve that memory at boot time of primary kernel. The memory range will be exposed to userspace as a resource named "Crash kernel" in /proc/iomem. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Pratyush Anand <panand@redhat.com> Reviewed-by: James Morse <james.morse@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:26:57 +01:00
AKASHI Takahiro	8f579b1c4e	arm64: limit memory regions based on DT property, usable-memory-range Crash dump kernel uses only a limited range of available memory as System RAM. On arm64 kdump, This memory range is advertised to crash dump kernel via a device-tree property under /chosen, linux,usable-memory-range = <BASE SIZE> Crash dump kernel reads this property at boot time and calls memblock_cap_memory_range() to limit usable memory which are listed either in UEFI memory map table or "memory" nodes of a device tree blob. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Geoff Levand <geoff@infradead.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-04-05 18:26:54 +01:00
Victor Kamensky	09a6adf53d	arm64: mm: unaligned access by user-land should be received as SIGBUS After `52d7523` (arm64: mm: allow the kernel to handle alignment faults on user accesses) commit user-land accesses that produce unaligned exceptions like in case of aarch32 ldm/stm/ldrd/strd instructions operating on unaligned memory received by user-land as SIGSEGV. It is wrong, it should be reported as SIGBUS as it was before `52d7523` commit. Changed do_bad_area function to take signal and code parameters out of esr value using fault_info table, so in case of do_alignment_fault fault user-land will receive SIGBUS. Wrapped access to fault_info table into esr_to_fault_info function. Cc: <stable@vger.kernel.org> Fixes: `52d7523` (arm64: mm: allow the kernel to handle alignment faults on user accesses) Signed-off-by: Victor Kamensky <kamensky@cisco.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-04-04 12:13:36 +01:00
Ard Biesheuvel	d27cfa1fc8	arm64: mm: set the contiguous bit for kernel mappings where appropriate This is the third attempt at enabling the use of contiguous hints for kernel mappings. The most recent attempt `0bfc445dec` was reverted after it turned out that updating permission attributes on live contiguous ranges may result in TLB conflicts. So this time, the contiguous hint is not set for .rodata or for the linear alias of .text/.rodata, both of which are mapped read-write initially, and remapped read-only at a later stage. (Note that the latter region could also be unmapped and remapped again with updated permission attributes, given that the region, while live, is only mapped for the convenience of the hibernation code, but that also means the TLB footprint is negligible anyway, so why bother) This enables the following contiguous range sizes for the virtual mapping of the kernel image, and for the linear mapping: granule size \| cont PTE \| cont PMD \| -------------+------------+------------+ 4 KB \| 64 KB \| 32 MB \| 16 KB \| 2 MB \| 1 GB* \| 64 KB \| 2 MB \| 16 GB* \| * Only when built for 3 or more levels of translation. This is due to the fact that a 2 level configuration only consists of PGDs and PTEs, and the added complexity of dealing with folded PMDs is not justified considering that 16 GB contiguous ranges are likely to be ignored by the hardware (and 16k/2 levels is a niche configuration) Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-03-23 14:09:23 +00:00
Ard Biesheuvel	5bd4669471	arm64/mm: remove pointless map/unmap sequences when creating page tables The routines __pud_populate and __pmd_populate only create a table entry at their respective level which refers to the next level page by its physical address, so there is no reason to map this page and then unmap it immediately after. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-03-23 14:09:12 +00:00
Ard Biesheuvel	c0951366d4	arm64/mmu: replace 'page_mappings_only' parameter with flags argument In preparation of extending the policy for manipulating kernel mappings with whether or not contiguous hints may be used in the page tables, replace the bool 'page_mappings_only' with a flags field and a flag NO_BLOCK_MAPPINGS. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-03-23 13:55:51 +00:00
Ard Biesheuvel	141d1497aa	arm64/mmu: add contiguous bit to sanity bug check A mapping with the contiguous bit cannot be safely manipulated while live, regardless of whether the bit changes between the old and new mapping. So take this into account when deciding whether the change is safe. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2017-03-23 13:55:35 +00:00

1 2 3 4 5 ...

530 Commits