Commit Graph

1319 Commits

Author SHA1 Message Date
Nicolas Saenz Julienne 8424ecdde7 arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
We recently introduced a 1 GB sized ZONE_DMA to cater for platforms
incorporating masters that can address less than 32 bits of DMA, in
particular the Raspberry Pi 4, which has 4 or 8 GB of DRAM, but has
peripherals that can only address up to 1 GB (and its PCIe host
bridge can only access the bottom 3 GB)

The DMA layer also needs to be able to allocate memory that is
guaranteed to meet those DMA constraints, for bounce buffering as well
as allocating the backing for consistent mappings. This is why the 1 GB
ZONE_DMA was introduced recently. Unfortunately, it turns out the having
a 1 GB ZONE_DMA as well as a ZONE_DMA32 causes problems with kdump, and
potentially in other places where allocations cannot cross zone
boundaries. Therefore, we should avoid having two separate DMA zones
when possible.

So, with the help of of_dma_get_max_cpu_address() get the topmost
physical address accessible to all DMA masters in system and use that
information to fine-tune ZONE_DMA's size. In the absence of addressing
limited masters ZONE_DMA will span the whole 32-bit address space,
otherwise, in the case of the Raspberry Pi 4 it'll only span the 30-bit
address space, and have ZONE_DMA32 cover the rest of the 32-bit address
space.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Link: https://lore.kernel.org/r/20201119175400.9995-6-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-20 09:34:13 +00:00
Nicolas Saenz Julienne 9804f8c69b arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
zone_dma_bits's initialization happens earlier that it's actually
needed, in arm64_memblock_init(). So move it into the more suitable
zone_sizes_init().

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20201119175400.9995-3-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-20 09:34:13 +00:00
Nicolas Saenz Julienne 0a30c53573 arm64: mm: Move reserve_crashkernel() into mem_init()
crashkernel might reserve memory located in ZONE_DMA. We plan to delay
ZONE_DMA's initialization after unflattening the devicetree and ACPI's
boot table initialization, so move it later in the boot process.
Specifically into bootmem_init() since request_standard_resources()
depends on it.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Link: https://lore.kernel.org/r/20201119175400.9995-2-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-20 09:34:13 +00:00
Catalin Marinas 2687275a58 arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
mem_init() currently relies on knowing the boundaries of the crashkernel
reservation to map such region with page granularity for later
unmapping via set_memory_valid(..., 0). If the crashkernel reservation
is deferred, such boundaries are not known when the linear mapping is
created. Simply parse the command line for "crashkernel" and, if found,
create the linear map with NO_BLOCK_MAPPINGS.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Acked-by: James Morse <james.morse@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Link: https://lore.kernel.org/r/20201119175556.18681-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-20 09:32:46 +00:00
Catalin Marinas 791ab8b2e3 arm64: Ignore any DMA offsets in the max_zone_phys() calculation
Currently, the kernel assumes that if RAM starts above 32-bit (or
zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM and
such constrained devices have a hardwired DMA offset. In practice, we
haven't noticed any such hardware so let's assume that we can expand
ZONE_DMA32 to the available memory if no RAM below 4GB. Similarly,
ZONE_DMA is expanded to the 4GB limit if no RAM addressable by
zone_bits.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201118185809.1078362-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-19 17:58:55 +00:00
Sudarshan Rajagopalan 9f84f39f55 arm64/mm: add fallback option to allocate virtually contiguous memory
When section mappings are enabled, we allocate vmemmap pages from
physically continuous memory of size PMD_SIZE using
vmemmap_alloc_block_buf(). Section mappings are good to reduce TLB
pressure. But when system is highly fragmented and memory blocks are
being hot-added at runtime, its possible that such physically continuous
memory allocations can fail. Rather than failing the memory hot-add
procedure, add a fallback option to allocate vmemmap pages from
discontinuous pages using vmemmap_populate_basepages().

Signed-off-by: Sudarshan Rajagopalan <sudaraja@codeaurora.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/d6c06f2ef39bbe6c715b2f6db76eb16155fdcee6.1602722808.git.sudaraja@codeaurora.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-17 18:21:01 +00:00
Ard Biesheuvel e2a073dde9 arm64: omit [_text, _stext) from permanent kernel mapping
In a previous patch, we increased the size of the EFI PE/COFF header
to 64 KB, which resulted in the _stext symbol to appear at a fixed
offset of 64 KB into the image.

Since 64 KB is also the largest page size we support, this completely
removes the need to map the first 64 KB of the kernel image, given that
it only contains the arm64 Image header and the EFI header, neither of
which we ever access again after booting the kernel. More importantly,
we should avoid an executable mapping of non-executable and not entirely
predictable data, to deal with the unlikely event that we inadvertently
emitted something that looks like an opcode that could be used as a
gadget for speculative execution.

So let's limit the kernel mapping of .text to the [_stext, _etext)
region, which matches the view of generic code (such as kallsyms) when
it reasons about the boundaries of the kernel's .text section.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201117124729.12642-2-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-17 16:14:20 +00:00
Anshuman Khandual 58284a901b arm64/mm: Validate hotplug range before creating linear mapping
During memory hotplug process, the linear mapping should not be created for
a given memory range if that would fall outside the maximum allowed linear
range. Else it might cause memory corruption in the kernel virtual space.

Maximum linear mapping region is [PAGE_OFFSET..(PAGE_END -1)] accommodating
both its ends but excluding PAGE_END. Max physical range that can be mapped
inside this linear mapping range, must also be derived from its end points.

This ensures that arch_add_memory() validates memory hot add range for its
potential linear mapping requirements, before creating it with
__create_pgd_mapping().

Fixes: 4ab2150615 ("arm64: Add memory hotplug support")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1605252614-761-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-11-13 09:45:08 +00:00
Ard Biesheuvel c1090bb10d arm64: mm: don't assume struct page is always 64 bytes
Commit 8c96400d6a simplified the page-to-virt and virt-to-page
conversions, based on the assumption that struct page is always 64
bytes in size, in which case we can use a single signed shift to
perform the conversion (provided that the vmemmap array is placed
appropriately in the kernel VA space)

Unfortunately, this assumption turns out not to hold, and so we need
to revert part of this commit, and go back to an affine transformation.
Given that all the quantities involved are compile time constants,
this should not make any practical difference.

Fixes: 8c96400d6a ("arm64: mm: make vmemmap region a projection of the linear region")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20201110180511.29083-1-ardb@kernel.org
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-12 08:32:25 +00:00
Anshuman Khandual fdd99a4103 arm64/mm/hotplug: Ensure early memory sections are all online
This adds a validation function that scans the entire boot memory and makes
sure that all early memory sections are online. This check is essential for
the memory notifier to work properly, as it cannot prevent any boot memory
from offlining, if all sections are not online to begin with. Although the
boot section scanning is selectively enabled with DEBUG_VM.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:12 +00:00
Anshuman Khandual 9fb3d4a303 arm64/mm/hotplug: Enable MEM_OFFLINE event handling
This enables MEM_OFFLINE memory event handling. It will help intercept any
possible error condition such as if boot memory some how still got offlined
even after an explicit notifier failure, potentially by a future change in
generic hot plug framework. This would help detect such scenarios and help
debug further. While here, also call out the first section being attempted
for offline or got offlined.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:12 +00:00
Anshuman Khandual cb45babe1b arm64/mm/hotplug: Register boot memory hot remove notifier earlier
This moves memory notifier registration earlier in the boot process from
device_initcall() to early_initcall() which will help in guarding against
potential early boot memory offline requests. Even though there should not
be any actual offlinig requests till memory block devices are initialized
with memory_dev_init() but then generic init sequence might just change in
future. Hence an early registration for the memory event notifier would be
helpful. While here, just skip the registration if CONFIG_MEMORY_HOTREMOVE
is not enabled and also call out when memory notifier registration fails.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1604896137-16644-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:44:11 +00:00
Ard Biesheuvel 97d6786e06 arm64: mm: account for hotplug memory when randomizing the linear region
As a hardening measure, we currently randomize the placement of
physical memory inside the linear region when KASLR is in effect.
Since the random offset at which to place the available physical
memory inside the linear region is chosen early at boot, it is
based on the memblock description of memory, which does not cover
hotplug memory. The consequence of this is that the randomization
offset may be chosen such that any hotplugged memory located above
memblock_end_of_DRAM() that appears later is pushed off the end of
the linear region, where it cannot be accessed.

So let's limit this randomization of the linear region to ensure
that this can no longer happen, by using the CPU's addressable PA
range instead. As it is guaranteed that no hotpluggable memory will
appear that falls outside of that range, we can safely put this PA
range sized window anywhere in the linear region.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20201014081857.3288-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 18:43:25 +00:00
Mark Rutland 833be850f1 arm64: consistently use reserved_pg_dir
Depending on configuration options and specific code paths, we either
use the empty_zero_page or the configuration-dependent reserved_ttbr0
as a reserved value for TTBR{0,1}_EL1.

To simplify this code, let's always allocate and use the same
reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated
(and hence pre-zeroed), and is also marked as read-only in the kernel
Image mapping.

Keeping this separate from the empty_zero_page potentially helps with
robustness as the empty_zero_page is used in a number of cases where a
failure to map it read-only could allow it to become corrupted.

The (presently unused) swapper_pg_end symbol is also removed, and
comments are added wherever we rely on the offsets between the
pre-allocated pg_dirs to keep these cases easily identifiable.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10 17:30:40 +00:00
Ard Biesheuvel 8c96400d6a arm64: mm: make vmemmap region a projection of the linear region
Now that we have reverted the introduction of the vmemmap struct page
pointer and the separate physvirt_offset, we can simplify things further,
and place the vmemmap region in the VA space in such a way that virtual
to page translations and vice versa can be implemented using a single
arithmetic shift.

One happy coincidence resulting from this is that the 48-bit/4k and
52-bit/64k configurations (which are assumed to be the two most
prevalent) end up with the same placement of the vmemmap region. In
a subsequent patch, we will take advantage of this, and unify the
memory maps even more.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-4-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-09 17:15:37 +00:00
Ard Biesheuvel f4693c2716 arm64: mm: extend linear region for 52-bit VA configurations
For historical reasons, the arm64 kernel VA space is configured as two
equally sized halves, i.e., on a 48-bit VA build, the VA space is split
into a 47-bit vmalloc region and a 47-bit linear region.

When support for 52-bit virtual addressing was added, this equal split
was kept, resulting in a substantial waste of virtual address space in
the linear region:

                           48-bit VA                     52-bit VA
  0xffff_ffff_ffff_ffff +-------------+               +-------------+
                        |   vmalloc   |               |   vmalloc   |
  0xffff_8000_0000_0000 +-------------+ _PAGE_END(48) +-------------+
                        |   linear    |               :             :
  0xffff_0000_0000_0000 +-------------+               :             :
                        :             :               :             :
                        :             :               :             :
                        :             :               :             :
                        :             :               :  currently  :
                        :  unusable   :               :             :
                        :             :               :   unused    :
                        :     by      :               :             :
                        :             :               :             :
                        :  hardware   :               :             :
                        :             :               :             :
  0xfff8_0000_0000_0000 :             : _PAGE_END(52) +-------------+
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :  unusable   :               |             |
                        :             :               |   linear    |
                        :     by      :               |             |
                        :             :               |   region    |
                        :  hardware   :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
                        :             :               |             |
  0xfff0_0000_0000_0000 +-------------+  PAGE_OFFSET  +-------------+

As illustrated above, the 52-bit VA kernel uses 47 bits for the vmalloc
space (as before), to ensure that a single 64k granule kernel image can
support any 64k granule capable system, regardless of whether it supports
the 52-bit virtual addressing extension. However, due to the fact that
the VA space is still split in equal halves, the linear region is only
2^51 bytes in size, wasting almost half of the 52-bit VA space.

Let's fix this, by abandoning the equal split, and simply assigning all
VA space outside of the vmalloc region to the linear region.

The KASAN shadow region is reconfigured so that it ends at the start of
the vmalloc region, and grows downwards. That way, the arrangement of
the vmalloc space (which contains kernel mappings, modules, BPF region,
the vmemmap array etc) is identical between non-KASAN and KASAN builds,
which aids debugging.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-3-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-09 17:15:37 +00:00
Rob Herring 96d389ca10 arm64: Add workaround for Arm Cortex-A77 erratum 1508412
On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.

The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.

A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
systems.

This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.

[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-29 12:56:01 +00:00
Joe Perches 33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Linus Torvalds 032c7ed958 More arm64 updates for 5.10
- Improve performance of Spectre-v2 mitigation on Falkor CPUs (if you're lucky
   enough to have one)
 
 - Select HAVE_MOVE_PMD. This has been shown to improve mremap() performance,
   which is used heavily by the Android runtime GC, and it seems we forgot to
   enable this upstream back in 2018.
 
 - Ensure linker flags are consistent between LLVM and BFD
 
 - Fix stale comment in Spectre mitigation rework
 
 - Fix broken copyright header
 
 - Fix KASLR randomisation of the linear map
 
 - Prevent arm64-specific prctl()s from compat tasks (return -EINVAL)
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+QEPAQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNE8jB/0YNYKO9mis/Xn5KcOCwlg4dbc2uVBknZXD
 f7otEJ6SOax2HcWz8qJlrJ+qbGFawPIqFBUAM0vU1VmoyctIoKRFTA8ACfWfWtnK
 QBfHrcxtJCh/GGq+E1IyuqWzCjppeY/7gYVdgi1xDEZRSaLz53MC1GVBwKBtu5cf
 X2Bfm8d9+PSSnmKfpO65wSCTvN3PQX1SNEHwwTWFZQx0p7GcQK1DdwoobM6dRnVy
 +e984ske+2a+nTrkhLSyQIgsfHuLB4pD6XdM/UOThnfdNxdQ0dUGn375sXP+b4dW
 7MTH9HP/dXIymTcuErMXOHJXLk/zUiUBaOxkmOxdvrhQd0uFNFIc
 =e9p9
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 updates from Will Deacon:
 "A small selection of further arm64 fixes and updates. Most of these
  are fixes that came in during the merge window, with the exception of
  the HAVE_MOVE_PMD mremap() speed-up which we discussed back in 2018
  and somehow forgot to enable upstream.

   - Improve performance of Spectre-v2 mitigation on Falkor CPUs (if
     you're lucky enough to have one)

   - Select HAVE_MOVE_PMD. This has been shown to improve mremap()
     performance, which is used heavily by the Android runtime GC, and
     it seems we forgot to enable this upstream back in 2018.

   - Ensure linker flags are consistent between LLVM and BFD

   - Fix stale comment in Spectre mitigation rework

   - Fix broken copyright header

   - Fix KASLR randomisation of the linear map

   - Prevent arm64-specific prctl()s from compat tasks (return -EINVAL)"

Link: https://lore.kernel.org/kvmarm/20181108181201.88826-3-joelaf@google.com/

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: proton-pack: Update comment to reflect new function name
  arm64: spectre-v2: Favour CPU-specific mitigation at EL2
  arm64: link with -z norelro regardless of CONFIG_RELOCATABLE
  arm64: Fix a broken copyright header in gen_vdso_offsets.sh
  arm64: mremap speedup - Enable HAVE_MOVE_PMD
  arm64: mm: use single quantity to represent the PA to VA translation
  arm64: reject prctl(PR_PAC_RESET_KEYS) on compat tasks
2020-10-23 09:46:16 -07:00
Linus Torvalds 5a32c3413d dma-mapping updates for 5.10
- rework the non-coherent DMA allocator
  - move private definitions out of <linux/dma-mapping.h>
  - lower CMA_ALIGNMENT (Paul Cercueil)
  - remove the omap1 dma address translation in favor of the common
    code
  - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
  - support per-node DMA CMA areas (Barry Song)
  - increase the default seg boundary limit (Nicolin Chen)
  - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
  - various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw
 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD
 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey
 FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA
 gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76
 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF
 vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI
 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI
 D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B
 gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi
 JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg
 7fM=
 =Bkf/
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of <linux/dma-mapping.h>

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
2020-10-15 14:43:29 -07:00
Ard Biesheuvel 7bc1a0f9e1 arm64: mm: use single quantity to represent the PA to VA translation
On arm64, the global variable memstart_addr represents the physical
address of PAGE_OFFSET, and so physical to virtual translations or
vice versa used to come down to simple additions or subtractions
involving the values of PAGE_OFFSET and memstart_addr.

When support for 52-bit virtual addressing was introduced, we had to
deal with PAGE_OFFSET potentially being outside of the region that
can be covered by the virtual range (as the 52-bit VA capable build
needs to be able to run on systems that are only 48-bit VA capable),
and for this reason, another translation was introduced, and recorded
in the global variable physvirt_offset.

However, if we go back to the original definition of memstart_addr,
i.e., the physical address of PAGE_OFFSET, it turns out that there is
no need for two separate translations: instead, we can simply subtract
the size of the unaddressable VA space from memstart_addr to make the
available physical memory appear in the 48-bit addressable VA region.

This simplifies things, but also fixes a bug on KASLR builds, which
may update memstart_addr later on in arm64_memblock_init(), but fails
to update vmemmap and physvirt_offset accordingly.

Fixes: 5383cc6efe ("arm64: mm: Introduce vabits_actual")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Link: https://lore.kernel.org/r/20201008153602.9467-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-15 11:14:57 +01:00
Mike Rapoport cc6de16805 memblock: use separate iterators for memory and reserved regions
for_each_memblock() is used to iterate over memblock.memory in a few
places that use data from memblock_region rather than the memory ranges.

Introduce separate for_each_mem_region() and
for_each_reserved_mem_region() to improve encapsulation of memblock
internals from its users.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>			[x86]
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	[MIPS]
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-18-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport b10d6bca87 arch, drivers: replace for_each_membock() with for_each_mem_range()
There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start = __pfn_to_phys(memblock_region_memory_base_pfn(reg);
		end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

		/* do something with start and end */
	}

Using for_each_mem_range() iterator is more appropriate in such cases and
allows simpler and cleaner code.

[akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
[rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
  Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport c9118e6c37 arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
There are several occurrences of the following pattern:

	for_each_memblock(memory, reg) {
		start_pfn = memblock_region_memory_base_pfn(reg);
		end_pfn = memblock_region_memory_end_pfn(reg);

		/* do something with start_pfn and end_pfn */
	}

Rather than iterate over all memblock.memory regions and each time query
for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
simpler and clearer code.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Mike Rapoport ab8f21aa8b arm64: numa: simplify dummy_numa_init()
dummy_numa_init() loops over memblock.memory and passes nid=0 to
numa_add_memblk() which essentially wraps memblock_set_node().  However,
memblock_set_node() can cope with entire memory span itself, so the loop
over memblock.memory regions is redundant.

Using a single call to memblock_set_node() rather than a loop also fixes
an issue with a buggy ACPI firmware in which the SRAT table covers some
but not all of the memory in the EFI memory map.

Jonathan Cameron says:

  This issue can be easily triggered by having an SRAT table which fails
  to cover all elements of the EFI memory map.

  This firmware error is detected and a warning printed. e.g.
  "NUMA: Warning: invalid memblk node 64 [mem 0x240000000-0x27fffffff]"
  At that point we fall back to dummy_numa_init().

  However, the failed ACPI init has left us with our memblocks all broken
  up as we split them when trying to assign them to NUMA nodes.

  We then iterate over the memblocks and add them to node 0.

  numa_add_memblk() calls memblock_set_node() which merges regions that
  were previously split up during the earlier attempt to add them to
  different nodes during parsing of SRAT.

  This means elements are moved in the memblock array and we can end up
  in a different memblock after the call to numa_add_memblk().
  Result is:

  Unable to handle kernel paging request at virtual address 0000000000003a40
  Mem abort info:
    ESR = 0x96000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  [0000000000003a40] user address but active_mm is swapper
  Internal error: Oops: 96000004 [#1] PREEMPT SMP

  ...

  Call trace:
    sparse_init_nid+0x5c/0x2b0
    sparse_init+0x138/0x170
    bootmem_init+0x80/0xe0
    setup_arch+0x2a0/0x5fc
    start_kernel+0x8c/0x648

Replace the loop with a single call to memblock_set_node() to the entire
memory.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-5-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-13 18:38:35 -07:00
Linus Torvalds 34eb62d868 Orphan link sections were a long-standing source of obscure bugs,
because the heuristics that various linkers & compilers use to handle them
 (include these bits into the output image vs discarding them silently)
 are both highly idiosyncratic and also version dependent.
 
 Instead of this historically problematic mess, this tree by Kees Cook (et al)
 adds build time asserts and build time warnings if there's any orphan section
 in the kernel or if a section is not sized as expected.
 
 And because we relied on so many silent assumptions in this area, fix a metric
 ton of dependencies and some outright bugs related to this, before we can
 finally enable the checks on the x86, ARM and ARM64 platforms.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Edv4RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hiKBAApdJEOaK7hMc3013DYNctklIxEPJL2mFJ
 11YJRIh4pUJTF0TE+EHT/D+rSIuRsyuoSmOQBQ61/wVSnyG067GjjVJRqh/eYaJ1
 fDhJi2FuHOjXl+CiN0KxzBjjp+V4NhF7jHT59tpQSvfZeg7FjteoxfztxaCp5ek3
 S3wHB3CC4c4jE3lfjHem1E9/PwT4kwPYx1c3gAUdEqJdjkihjX9fWusfjLeqW6/d
 Y5VkApi6bL9XiZUZj5l0dEIweLJJ86+PkKJqpo3spxxEak1LSn1MEix+lcJ8e1Kg
 sb/bEEivDcmFlFWOJnn0QLquCR0Cx5bz1pwsL0tuf0yAd4+sXX5IMuGUysZlEdKM
 BHL9h5HbevGF4BScwZwZH7lyEg7q67s5KnRu4hxy0Swfcj7y0oT/9lXqpbpZ2DqO
 Hd+bRRQKIbqnTMp0hcit9LfpLp93vj0dBlaV5ocAJJlu62u9VnwGG5HQuZ5giLUr
 kA1SLw63Y1wopFRxgFyER8les7eLsu0zxHeK44rRVlVnfI99OMTOgVNicmDFy3Fm
 AfcnfJG0BqBEJGQz5es34uQQKKBwFPtC9NztopI62KiwOspYYZyrO1BNxdOc6DlS
 mIHrmO89HMXuid5eolvLaFqUWirHoWO8TlycgZxUWVHc2txVPjAEU/axouU/dSSU
 w/6GpzAa+7g=
 =fXAw
 -----END PGP SIGNATURE-----

Merge tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull orphan section checking from Ingo Molnar:
 "Orphan link sections were a long-standing source of obscure bugs,
  because the heuristics that various linkers & compilers use to handle
  them (include these bits into the output image vs discarding them
  silently) are both highly idiosyncratic and also version dependent.

  Instead of this historically problematic mess, this tree by Kees Cook
  (et al) adds build time asserts and build time warnings if there's any
  orphan section in the kernel or if a section is not sized as expected.

  And because we relied on so many silent assumptions in this area, fix
  a metric ton of dependencies and some outright bugs related to this,
  before we can finally enable the checks on the x86, ARM and ARM64
  platforms"

* tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  x86/boot/compressed: Warn on orphan section placement
  x86/build: Warn on orphan section placement
  arm/boot: Warn on orphan section placement
  arm/build: Warn on orphan section placement
  arm64/build: Warn on orphan section placement
  x86/boot/compressed: Add missing debugging sections to output
  x86/boot/compressed: Remove, discard, or assert for unwanted sections
  x86/boot/compressed: Reorganize zero-size section asserts
  x86/build: Add asserts for unwanted sections
  x86/build: Enforce an empty .got.plt section
  x86/asm: Avoid generating unused kprobe sections
  arm/boot: Handle all sections explicitly
  arm/build: Assert for unwanted sections
  arm/build: Add missing sections
  arm/build: Explicitly keep .ARM.attributes sections
  arm/build: Refactor linker script headers
  arm64/build: Assert for unwanted sections
  arm64/build: Add missing DWARF sections
  arm64/build: Use common DISCARDS in linker script
  arm64/build: Remove .eh_frame* sections due to unwind tables
  ...
2020-10-12 13:39:19 -07:00
Christoph Hellwig 9f4df96b87 dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
Move more nitty gritty DMA implementation details into the common
internal header.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:06 +02:00
Christoph Hellwig 0b1abd1fb7 dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:04 +02:00
Christoph Hellwig 0a0f0d8be7 dma-mapping: split <linux/dma-mapping.h>
Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers.  That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-10-06 07:07:03 +02:00
Will Deacon baab853229 Merge branch 'for-next/mte' into for-next/core
Add userspace support for the Memory Tagging Extension introduced by
Armv8.5.

(Catalin Marinas and others)
* for-next/mte: (30 commits)
  arm64: mte: Fix typo in memory tagging ABI documentation
  arm64: mte: Add Memory Tagging Extension documentation
  arm64: mte: Kconfig entry
  arm64: mte: Save tags when hibernating
  arm64: mte: Enable swap of tagged pages
  mm: Add arch hooks for saving/restoring tags
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
  arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
  arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Allow user control of the generated random tags via prctl()
  arm64: mte: Allow user control of the tag check mode via prctl()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Introduce arch_validate_flags()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Tags-aware aware memcmp_pages() implementation
  arm64: Avoid unnecessary clear_user_page() indirection
  ...
2020-10-02 12:16:11 +01:00
Will Deacon 57b8b1b435 Merge branches 'for-next/acpi', 'for-next/boot', 'for-next/bpf', 'for-next/cpuinfo', 'for-next/fpsimd', 'for-next/misc', 'for-next/mm', 'for-next/pci', 'for-next/perf', 'for-next/ptrauth', 'for-next/sdei', 'for-next/selftests', 'for-next/stacktrace', 'for-next/svm', 'for-next/topology', 'for-next/tpyos' and 'for-next/vdso' into for-next/core
Remove unused functions and parameters from ACPI IORT code.
(Zenghui Yu via Lorenzo Pieralisi)
* for-next/acpi:
  ACPI/IORT: Remove the unused inline functions
  ACPI/IORT: Drop the unused @ops of iort_add_device_replay()

Remove redundant code and fix documentation of caching behaviour for the
HVC_SOFT_RESTART hypercall.
(Pingfan Liu)
* for-next/boot:
  Documentation/kvm/arm: improve description of HVC_SOFT_RESTART
  arm64/relocate_kernel: remove redundant code

Improve reporting of unexpected kernel traps due to BPF JIT failure.
(Will Deacon)
* for-next/bpf:
  arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM

Improve robustness of user-visible HWCAP strings and their corresponding
numerical constants.
(Anshuman Khandual)
* for-next/cpuinfo:
  arm64/cpuinfo: Define HWCAP name arrays per their actual bit definitions

Cleanups to handling of SVE and FPSIMD register state in preparation
for potential future optimisation of handling across syscalls.
(Julien Grall)
* for-next/fpsimd:
  arm64/sve: Implement a helper to load SVE registers from FPSIMD state
  arm64/sve: Implement a helper to flush SVE registers
  arm64/fpsimdmacros: Allow the macro "for" to be used in more cases
  arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN
  arm64/signal: Update the comment in preserve_sve_context
  arm64/fpsimd: Update documentation of do_sve_acc

Miscellaneous changes.
(Tian Tao and others)
* for-next/misc:
  arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
  arm64: mm: Fix missing-prototypes in pageattr.c
  arm64/fpsimd: Fix missing-prototypes in fpsimd.c
  arm64: hibernate: Remove unused including <linux/version.h>
  arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()
  arm64: Remove the unused include statements
  arm64: get rid of TEXT_OFFSET
  arm64: traps: Add str of description to panic() in die()

Memory management updates and cleanups.
(Anshuman Khandual and others)
* for-next/mm:
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64/mm: Unify CONT_PMD_SHIFT
  arm64/mm: Unify CONT_PTE_SHIFT
  arm64/mm: Remove CONT_RANGE_OFFSET
  arm64/mm: Enable THP migration
  arm64/mm: Change THP helpers to comply with generic MM semantics
  arm64/mm/ptdump: Add address markers for BPF regions

Allow prefetchable PCI BARs to be exposed to userspace using normal
non-cacheable mappings.
(Clint Sbisa)
* for-next/pci:
  arm64: Enable PCI write-combine resources under sysfs

Perf/PMU driver updates.
(Julien Thierry and others)
* for-next/perf:
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm_pmu: arm64: Use NMIs for PMU
  arm_pmu: Introduce pmu_irq_ops
  KVM: arm64: pmu: Make overflow handler NMI safe
  arm64: perf: Defer irq_work to IPI_IRQ_WORK
  arm64: perf: Remove PMU locking
  arm64: perf: Avoid PMXEV* indirection
  arm64: perf: Add missing ISB in armv8pmu_enable_counter()
  perf: Add Arm CMN-600 PMU driver
  perf: Add Arm CMN-600 DT binding
  arm64: perf: Add support caps under sysfs
  drivers/perf: thunderx2_pmu: Fix memory resource error handling
  drivers/perf: xgene_pmu: Fix uninitialized resource struct
  perf: arm_dsu: Support DSU ACPI devices
  arm64: perf: Remove unnecessary event_idx check
  drivers/perf: hisi: Add missing include of linux/module.h
  arm64: perf: Add general hardware LLC events for PMUv3

Support for the Armv8.3 Pointer Authentication enhancements.
(By Amit Daniel Kachhap)
* for-next/ptrauth:
  arm64: kprobe: clarify the comment of steppable hint instructions
  arm64: kprobe: disable probe of fault prone ptrauth instruction
  arm64: cpufeature: Modify address authentication cpufeature to exact
  arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancements
  arm64: traps: Allow force_signal_inject to pass esr error code
  arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions

Tonnes of cleanup to the SDEI driver.
(Gavin Shan)
* for-next/sdei:
  firmware: arm_sdei: Remove _sdei_event_unregister()
  firmware: arm_sdei: Remove _sdei_event_register()
  firmware: arm_sdei: Introduce sdei_do_local_call()
  firmware: arm_sdei: Cleanup on cross call function
  firmware: arm_sdei: Remove while loop in sdei_event_unregister()
  firmware: arm_sdei: Remove while loop in sdei_event_register()
  firmware: arm_sdei: Remove redundant error message in sdei_probe()
  firmware: arm_sdei: Remove duplicate check in sdei_get_conduit()
  firmware: arm_sdei: Unregister driver on error in sdei_init()
  firmware: arm_sdei: Avoid nested statements in sdei_init()
  firmware: arm_sdei: Retrieve event number from event instance
  firmware: arm_sdei: Common block for failing path in sdei_event_create()
  firmware: arm_sdei: Remove sdei_is_err()

Selftests for Pointer Authentication and FPSIMD/SVE context-switching.
(Mark Brown and Boyan Karatotev)
* for-next/selftests:
  selftests: arm64: Add build and documentation for FP tests
  selftests: arm64: Add wrapper scripts for stress tests
  selftests: arm64: Add utility to set SVE vector lengths
  selftests: arm64: Add stress tests for FPSMID and SVE context switching
  selftests: arm64: Add test for the SVE ptrace interface
  selftests: arm64: Test case for enumeration of SVE vector lengths
  kselftests/arm64: add PAuth tests for single threaded consistency and differently initialized keys
  kselftests/arm64: add PAuth test for whether exec() changes keys
  kselftests/arm64: add nop checks for PAuth tests
  kselftests/arm64: add a basic Pointer Authentication test

Implementation of ARCH_STACKWALK for unwinding.
(Mark Brown)
* for-next/stacktrace:
  arm64: Move console stack display code to stacktrace.c
  arm64: stacktrace: Convert to ARCH_STACKWALK
  arm64: stacktrace: Make stack walk callback consistent with generic code
  stacktrace: Remove reliable argument from arch_stack_walk() callback

Support for ASID pinning, which is required when sharing page-tables with
the SMMU.
(Jean-Philippe Brucker)
* for-next/svm:
  arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  arm64: mm: Pin down ASIDs for sharing mm with devices

Rely on firmware tables for establishing CPU topology.
(Valentin Schneider)
* for-next/topology:
  arm64: topology: Stop using MPIDR for topology information

Spelling fixes.
(Xiaoming Ni and Yanfei Xu)
* for-next/tpyos:
  arm64/numa: Fix a typo in comment of arm64_numa_init
  arm64: fix some spelling mistakes in the comments by codespell

vDSO cleanups.
(Will Deacon)
* for-next/vdso:
  arm64: vdso: Fix unusual formatting in *setup_additional_pages()
  arm64: vdso32: Remove a bunch of #ifdef CONFIG_COMPAT_VDSO guards
2020-10-02 12:01:41 +01:00
Will Deacon 6a1bdb173f arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
Our use of broadcast TLB maintenance means that spurious page-faults
that have been handled already by another CPU do not require additional
TLB maintenance.

Make flush_tlb_fix_spurious_fault() a no-op and rely on the existing TLB
invalidation instead. Add an explicit flush_tlb_page() when making a page
dirty, as the TLB is permitted to cache the old read-only entry.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200728092220.GA21800@willie-the-truck
Signed-off-by: Will Deacon <will@kernel.org>
2020-10-01 09:45:32 +01:00
Jean-Philippe Brucker 48118151d8 arm64: mm: Pin down ASIDs for sharing mm with devices
To enable address space sharing with the IOMMU, introduce
arm64_mm_context_get() and arm64_mm_context_put(), that pin down a
context and ensure that it will keep its ASID after a rollover. Export
the symbols to let the modular SMMUv3 driver use them.

Pinning is necessary because a device constantly needs a valid ASID,
unlike tasks that only require one when running. Without pinning, we would
need to notify the IOMMU when we're about to use a new ASID for a task,
and it would get complicated when a new task is assigned a shared ASID.
Consider the following scenario with no ASID pinned:

1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
   We would now have to immediately generate a new ASID for t1, notify
   the IOMMU, and finally enable task tn. We are holding the lock during
   all that time, since we can't afford having another CPU trigger a
   rollover. The IOMMU issues invalidation commands that can take tens of
   milliseconds.

It gets needlessly complicated. All we wanted to do was schedule task tn,
that has no business with the IOMMU. By letting the IOMMU pin tasks when
needed, we avoid stalling the slow path, and let the pinning fail when
we're out of shareable ASIDs.

After a rollover, the allocator expects at least one ASID to be available
in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
1) is the maximum number of ASIDs that can be shared with the IOMMU.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-28 22:15:38 +01:00
Zhengyuan Liu a194c5f2d2 arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
The @node passed to cpumask_of_node() can be NUMA_NO_NODE, in that
case it will trigger the following WARN_ON(node >= nr_node_ids) due to
mismatched data types of @node and @nr_node_ids. Actually we should
return cpu_all_mask just like most other architectures do if passed
NUMA_NO_NODE.

Also add a similar check to the inline cpumask_of_node() in numa.h.

Signed-off-by: Zhengyuan Liu <liuzhengyuan@tj.kylinos.cn>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20200921023936.21846-1-liuzhengyuan@tj.kylinos.cn
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-21 22:05:01 +01:00
Will Deacon 0fdb64c2a3 arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM
When generating instructions at runtime, for example due to kernel text
patching or the BPF JIT, we can emit a trapping BRK instruction if we
are asked to encode an invalid instruction such as an out-of-range]
branch. This is indicative of a bug in the caller, and will result in a
crash on executing the generated code. Unfortunately, the message from
the crash is really unhelpful, and mumbles something about ptrace:

  | Unexpected kernel BRK exception at EL1
  | Internal error: ptrace BRK handler: f2000100 [#1] SMP

We can do better than this. Install a break handler for FAULT_BRK_IMM,
which is the immediate used to encode the "I've been asked to generate
an invalid instruction" error, and triage the faulting PC to determine
whether or not the failure occurred in the BPF JIT.

Link: https://lore.kernel.org/r/20200915141707.GB26439@willie-the-truck
Reported-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18 16:35:54 +01:00
Tian Tao 152d75d664 arm64: mm: Fix missing-prototypes in pageattr.c
Fix the following warnings.
‘set_memory_valid’ [-Wmissing-prototypes]
int set_memory_valid(unsigned long addr, int numpages, int enable)
     ^
‘set_direct_map_invalid_noflush’ [-Wmissing-prototypes]
int set_direct_map_invalid_noflush(struct page *page)
     ^
‘set_direct_map_default_noflush’ [-Wmissing-prototypes]
int set_direct_map_default_noflush(struct page *page)
     ^

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/1600222847-56792-1-git-send-email-tiantao6@hisilicon.com
arch/arm64/mm/pageattr.c:138:5: warning: no previous prototype for
arch/arm64/mm/pageattr.c:150:5: warning: no previous prototype for
arch/arm64/mm/pageattr.c:165:5: warning: no previous prototype for
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-18 14:33:46 +01:00
Anshuman Khandual c048ddf86c arm64/mm/ptdump: Add address markers for BPF regions
Kernel virtual region [BPF_JIT_REGION_START..BPF_JIT_REGION_END] is missing
from address_markers[], hence relevant page table entries are not displayed
with /sys/kernel/debug/kernel_page_tables. This adds those missing markers.
While here, also rename arch/arm64/mm/dump.c which sounds bit ambiguous, as
arch/arm64/mm/ptdump.c instead.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1599208259-11191-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07 22:04:27 +01:00
Yanfei Xu 9a747c91e8 arm64/numa: Fix a typo in comment of arm64_numa_init
Fix a typo in comment of arm64_numa_init. 'encomapssing' should
be 'encompassing'.

Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Link: https://lore.kernel.org/r/20200901091154.10112-1-yanfei.xu@windriver.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-09-07 14:20:06 +01:00
Steven Price 36943aba91 arm64: mte: Enable swap of tagged pages
When swapping pages out to disk it is necessary to save any tags that
have been set, and restore when swapping back in. Make use of the new
page flag (PG_ARCH_2, locally named PG_mte_tagged) to identify pages
with tags. When swapping out these pages the tags are stored in memory
and later restored when the pages are brought back in. Because shmem can
swap pages back in without restoring the userspace PTE it is also
necessary to add a hook for shmem.

Signed-off-by: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: move function prototypes to mte.h]
[catalin.marinas@arm.com: drop '_tags' from arch_swap_restore_tags()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:07 +01:00
Catalin Marinas 738c8780fc arm64: Avoid unnecessary clear_user_page() indirection
Since clear_user_page() calls clear_page() directly, avoid the
unnecessary indirection.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:06 +01:00
Vincenzo Frascino 2563776b41 arm64: mte: Tags-aware copy_{user_,}highpage() implementations
When the Memory Tagging Extension is enabled, the tags need to be
preserved across page copy (e.g. for copy-on-write, page migration).

Introduce MTE-aware copy_{user_,}highpage() functions to copy tags to
the destination if the source page has the PG_mte_tagged flag set.
copy_user_page() does not need to handle tag copying since, with this
patch, it is only called by the DAX code where there is no source page
structure (and no source tags).

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:06 +01:00
Vincenzo Frascino 637ec831ea arm64: mte: Handle synchronous and asynchronous tag check faults
The Memory Tagging Extension has two modes of notifying a tag check
fault at EL0, configurable through the SCTLR_EL1.TCF0 field:

1. Synchronous raising of a Data Abort exception with DFSC 17.
2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.

Add the exception handler for the synchronous exception and handling of
the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in
do_notify_resume().

On a tag check failure in user-space, whether synchronous or
asynchronous, a SIGSEGV will be raised on the faulting thread.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
2020-09-04 12:46:06 +01:00
Vincenzo Frascino 3b714d24ef arm64: mte: CPU feature detection and initial sysreg configuration
Add the cpufeature and hwcap entries to detect the presence of MTE. Any
secondary CPU not supporting the feature, if detected on the boot CPU,
will be parked.

Add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling MTE. The Normal
Tagged memory type is configured in MAIR_EL1 before the MMU is enabled
in order to avoid disrupting other CPUs in the CnP domain.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
2020-09-03 17:26:32 +01:00
Catalin Marinas 0178dc7613 arm64: mte: Use Normal Tagged attributes for the linear map
Once user space is given access to tagged memory, the kernel must be
able to clear/save/restore tags visible to the user. This is done via
the linear mapping, therefore map it as such. The new MT_NORMAL_TAGGED
index for MAIR_EL1 is initially mapped as Normal memory and later
changed to Normal Tagged via the cpufeature infrastructure. From a
mismatched attribute aliases perspective, the Tagged memory is
considered a permission and it won't lead to undefined behaviour.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
2020-09-03 17:26:31 +01:00
Kees Cook b4ca91027d arm64/mm: Remove needless section quotes
Fix a case of needless quotes in __section(), which Clang doesn't like.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-9-keescook@chromium.org
2020-09-01 09:50:36 +02:00
Barry Song c6303ab9b9 arm64: mm: reserve per-numa CMA to localize coherent dma buffers
Right now, smmu is using dma_alloc_coherent() to get memory to save queues
and tables. Typically, on ARM64 server, there is a default CMA located at
node0, which could be far away from node2, node3 etc.
with this patch, smmu will get memory from local numa node to save command
queues and page tables. that means dma_unmap latency will be shrunk much.
Meanwhile, when iommu.passthrough is on, device drivers which call dma_
alloc_coherent() will also get local memory and avoid the travel between
numa nodes.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-01 09:19:37 +02:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Peter Xu 6a1bb025d2 mm/arm64: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.  To do this, we pass pt_regs
pointer into __do_page_fault().

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-6-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:03 -07:00
Peter Xu bce617edec mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b98270 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:02 -07:00
Jia He d622ecec5f mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid()
This is to introduce a general dummy helper.  memory_add_physaddr_to_nid()
is a fallback option to get the nid in case NUMA_NO_NID is detected.

After this patch, arm64/sh/s390 can simply use the general dummy version.
PowerPC/x86/ia64 will still use their specific version.

This is the preparation to set a fallback value for dev_dax->target_node.

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chuhong Yuan <hslester96@gmail.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: Kaly Xin <Kaly.Xin@arm.com>
Link: http://lkml.kernel.org/r/20200710031619.18762-2-justin.he@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:57 -07:00
Mike Rapoport c89ab04feb mm/sparse: cleanup the code surrounding memory_present()
After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
functions that call memory_present() for each region in memblock.memory:
sparse_memory_present_with_active_regions() and membocks_present().

Moreover, all architectures have a call to either of these functions
preceding the call to sparse_init() and in the most cases they are called
one after the other.

Mark the regions from memblock.memory as present during sparce_init() by
making sparse_init() call memblocks_present(), make memblocks_present()
and memory_present() functions static and remove redundant
sparse_memory_present_with_active_regions() function.

Also remove no longer required HAVE_MEMORY_PRESENT configuration option.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Anshuman Khandual eee07935bb arm64/mm: enable vmem_altmap support for vmemmap mappings
Device memory ranges when getting hot added into ZONE_DEVICE, might
require their vmemmap mapping's backing memory to be allocated from their
own range instead of consuming system memory.  This prevents large system
memory usage for potentially large device memory ranges.  Device driver
communicates this request via vmem_altmap structure.  Architecture needs
to take this request into account while creating and tearing down vemmmap
mappings.

This enables vmem_altmap support in vmemmap_populate() and vmemmap_free()
which includes vmemmap_populate_basepages() used for ARM64_16K_PAGES and
ARM64_64K_PAGES configs.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1594004178-8861-4-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Anshuman Khandual 56993b4e14 mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf()
There are many instances where vmemap allocation is often switched between
regular memory and device memory just based on whether altmap is available
or not.  vmemmap_alloc_block_buf() is used in various platforms to
allocate vmemmap mappings.  Lets also enable it to handle altmap based
device memory allocation along with existing regular memory allocations.
This will help in avoiding the altmap based allocation switch in many
places.  To summarize there are two different methods to call
vmemmap_alloc_block_buf().

vmemmap_alloc_block_buf(size, node, NULL)   /* Allocate from system RAM */
vmemmap_alloc_block_buf(size, node, altmap) /* Allocate from altmap */

This converts altmap_alloc_block_buf() into a static function, drops it's
entry from the header and updates Documentation/vm/memory-model.rst.

Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/1594004178-8861-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Anshuman Khandual 1d9cfee753 mm/sparsemem: enable vmem_altmap support in vmemmap_populate_basepages()
Patch series "arm64: Enable vmemmap mapping from device memory", v4.

This series enables vmemmap backing memory allocation from device memory
ranges on arm64.  But before that, it enables vmemmap_populate_basepages()
and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
alocation requests.

This patch (of 3):

vmemmap_populate_basepages() is used across platforms to allocate backing
memory for vmemmap mapping.  This is used as a standard default choice or
as a fallback when intended huge pages allocation fails.  This just
creates entire vmemmap mapping with base pages (PAGE_SIZE).

On arm64 platforms, vmemmap_populate_basepages() is called instead of the
platform specific vmemmap_populate() when ARM64_SWAPPER_USES_SECTION_MAPS
is not enabled as in case for ARM64_16K_PAGES and ARM64_64K_PAGES configs.

At present vmemmap_populate_basepages() does not support allocating from
driver defined struct vmem_altmap while trying to create vmemmap mapping
for a device memory range.  It prevents ARM64_16K_PAGES and
ARM64_64K_PAGES configs on arm64 from supporting device memory with
vmemap_altmap request.

This enables vmem_altmap support in vmemmap_populate_basepages() unlocking
device memory allocation for vmemap mapping on arm64 platforms with 16K or
64K base page configs.

Each architecture should evaluate and decide on subscribing device memory
based base page allocation through vmemmap_populate_basepages().  Hence
lets keep it disabled on all archs in order to preserve the existing
semantics.  A subsequent patch enables it on arm64.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jia He <justin.he@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/1594004178-8861-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1594004178-8861-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Mike Rapoport ca15ca406f mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>"

Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table.  These patches add
generic versions of these functions in <asm-generic/pgalloc.h> and enable
use of the generic functions where appropriate.

In addition, functions declared and defined in <asm/pgalloc.h> headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the <asm/pgalloc.h> included all over the place.
The first patch in this series removes unneeded includes of
<asm/pgalloc.h>

In the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.

This patch (of 8):

In most cases <asm/pgalloc.h> header is required only for allocations of
page table memory.  Most of the .c files that include that header do not
use symbols declared in <asm/pgalloc.h> and do not require that header.

As for the other header files that used to include <asm/pgalloc.h>, it is
possible to move that include into the .c file that actually uses symbols
from <asm/pgalloc.h> and drop the include from the header file.

The process was somewhat automated using

	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                $(grep -L -w -f /tmp/xx \
                        $(git grep -E -l '[<"]asm/pgalloc\.h'))

where /tmp/xx contains all the symbols defined in
arch/*/include/asm/pgalloc.h.

[rppt@linux.ibm.com: fix powerpc warning]

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Linus Torvalds 47ec5303d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in code TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Flowian Westphal.

35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub->ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...
2020-08-05 20:13:21 -07:00
Jean-Philippe Brucker 8008342853 bpf, arm64: Add BPF exception tables
When a tracing BPF program attempts to read memory without using the
bpf_probe_read() helper, the verifier marks the load instruction with
the BPF_PROBE_MEM flag. Since the arm64 JIT does not currently recognize
this flag it falls back to the interpreter.

Add support for BPF_PROBE_MEM, by appending an exception table to the
BPF program. If the load instruction causes a data abort, the fixup
infrastructure finds the exception table and fixes up the fault, by
clearing the destination register and jumping over the faulting
instruction.

To keep the compact exception table entry format, inspect the pc in
fixup_exception(). A more generic solution would add a "handler" field
to the table entry, like on x86 and s390.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200728152122.1292756-2-jean-philippe@linaro.org
2020-07-31 00:43:40 +02:00
Pingfan Liu c4885bbb3a arm64/mm: save memory access in check_and_switch_context() fast switch path
On arm64, smp_processor_id() reads a per-cpu `cpu_number` variable,
using the per-cpu offset stored in the tpidr_el1 system register. In
some cases we generate a per-cpu address with a sequence like:

  cpu_ptr = &per_cpu(ptr, smp_processor_id());

Which potentially incurs a cache miss for both `cpu_number` and the
in-memory `__per_cpu_offset` array. This can be written more optimally
as:

  cpu_ptr = this_cpu_ptr(ptr);

Which only needs the offset from tpidr_el1, and does not need to
load from memory.

The following two test cases show a small performance improvement measured
on a 46-cpus qualcomm machine with 5.8.0-rc4 kernel.

Test 1: (about 0.3% improvement)
    #cat b.sh
    make clean && make all -j138
    #perf stat --repeat 10 --null --sync sh b.sh

    - before this patch
     Performance counter stats for 'sh b.sh' (10 runs):

                298.62 +- 1.86 seconds time elapsed  ( +-  0.62% )

    - after this patch
     Performance counter stats for 'sh b.sh' (10 runs):

               297.734 +- 0.954 seconds time elapsed  ( +-  0.32% )

Test 2: (about 1.69% improvement)
     'perf stat -r 10 perf bench sched messaging'
        Then sum the total time of 'sched/messaging' by manual.

    - before this patch
      total 0.707 sec for 10 times
    - after this patch
      totol 0.695 sec for 10 times

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/1594389852-19949-1-git-send-email-kernelfans@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-30 12:58:40 +01:00
Anshuman Khandual abb7962adc arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs
Currently 'hugetlb_cma=' command line argument does not create CMA area on
ARM64_16K_PAGES and ARM64_64K_PAGES based platforms. Instead, it just ends
up with the following warning message. Reason being, hugetlb_cma_reserve()
never gets called for these huge page sizes.

[   64.255669] hugetlb_cma: the option isn't supported by current arch

This enables CMA areas reservation on ARM64_16K_PAGES and ARM64_64K_PAGES
configs by defining an unified arm64_hugetlb_cma_reseve() that is wrapped
in CONFIG_CMA. Call site for arm64_hugetlb_cma_reserve() is also protected
as <asm/hugetlb.h> is conditionally included and hence cannot contain stub
for the inverse config i.e !(CONFIG_HUGETLB_PAGE && CONFIG_CMA).

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Barry Song <song.bao.hua@hisilicon.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1593578521-24672-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-15 13:38:03 +01:00
Gavin Shan a1634a542f arm64/mm: Redefine CONT_{PTE, PMD}_SHIFT
Currently, the value of CONT_{PTE, PMD}_SHIFT is off from standard
{PAGE, PMD}_SHIFT. In turn, we have to consider adding {PAGE, PMD}_SHIFT
when using CONT_{PTE, PMD}_SHIFT in the function hugetlbpage_init().
It's a bit confusing.

This redefines CONT_{PTE, PMD}_SHIFT with {PAGE, PMD}_SHIFT included
so that the later values needn't be added when using the former ones
in function hugetlbpage_init(). Note that the values of CONT_{PTES, PMDS}
are unchanged.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lkml.org/lkml/2020/5/6/190
Link: https://lore.kernel.org/r/20200630062428.194235-1-gshan@redhat.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-03 17:49:58 +01:00
Anshuman Khandual 638d503130 arm64/panic: Unify all three existing notifier blocks
Currently there are three different registered panic notifier blocks. This
unifies all of them into a single one i.e arm64_panic_block, hence reducing
code duplication and required calling sequence during panic. This preserves
the existing dump sequence. While here, just use device_initcall() directly
instead of __initcall() which has been a legacy alias for the earlier. This
replacement is a pure cleanup with no functional implications.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/1593405511-7625-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-07-02 15:44:50 +01:00
Barry Song 618e07865b arm64: mm: reserve hugetlb CMA after numa_init
hugetlb_cma_reserve() is called at the wrong place. numa_init has not been
done yet. so all reserved memory will be located at node0.

Fixes: cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200617215828.25296-1-song.bao.hua@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-18 10:39:15 +01:00
Shyam Thombre 8dd4daa042 arm64: mm: reset address tag set by kasan sw tagging
KASAN sw tagging sets a random tag of 8 bits in the top byte of the pointer
returned by the memory allocating functions. So for the functions unaware
of this change, the top 8 bits of the address must be reset which is done
by the function arch_kasan_reset_tag().

Signed-off-by: Shyam Thombre <sthombre@codeaurora.org>
Link: https://lore.kernel.org/r/1591787384-5823-1-git-send-email-sthombre@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-06-15 16:58:13 +01:00
Michel Lespinasse d8ed45c5dc mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Mike Rapoport 974b9b2c68 mm: consolidate pte_index() and pte_offset_*() definitions
All architectures define pte_index() as

	(address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)

and all architectures define pte_offset_kernel() as an entry in the array
of PTEs indexed by the pte_index().

For the most architectures the pte_offset_kernel() implementation relies
on the availability of pmd_page_vaddr() that converts a PMD entry value to
the virtual address of the page containing PTEs array.

Let's move x86 definitions of the PTE accessors to the generic place in
<linux/pgtable.h> and then simply drop the respective definitions from the
other architectures.

The architectures that didn't provide pmd_page_vaddr() are updated to have
that defined.

The generic implementation of pte_offset_kernel() can be overridden by an
architecture and alpha makes use of this because it has special ordering
requirements for its version of pte_offset_kernel().

[rppt@linux.ibm.com: v2]
  Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org
[rppt@linux.ibm.com: update]
  Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org
[akpm@linux-foundation.org: fix x86 warning]
[sfr@canb.auug.org.au: fix powerpc build]
  Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Mike Rapoport 65fddcfca8 mm: reorder includes after introduction of linux/pgtable.h
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include
of the latter in the middle of asm includes.  Fix this up with the aid of
the below script and manual adjustments here and there.

	import sys
	import re

	if len(sys.argv) is not 3:
	    print "USAGE: %s <file> <header>" % (sys.argv[0])
	    sys.exit(1)

	hdr_to_move="#include <linux/%s>" % sys.argv[2]
	moved = False
	in_hdrs = False

	with open(sys.argv[1], "r") as f:
	    lines = f.readlines()
	    for _line in lines:
		line = _line.rstrip('
')
		if line == hdr_to_move:
		    continue
		if line.startswith("#include <linux/"):
		    in_hdrs = True
		elif not moved and in_hdrs:
		    moved = True
		    print hdr_to_move
		print line

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport ca5999fde0 mm: introduce include/linux/pgtable.h
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.

Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport e31cf2f4ca mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include <linux/mm.h>") ; do
		sed -i -e '/include <asm\/pgtable.h>/ d' $f
	done

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport e9f6376858 arm64: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
remove __ARCH_USE_5LEVEL_HACK.

[arnd@arndb.de: fix gcc-10 shift warning]
  Link: http://lkml.kernel.org/r/20200429185657.4085975-1-arnd@arndb.de
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: James Morse <james.morse@arm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:21 -07:00
Mike Kravetz 3823783088 hugetlbfs: remove hugetlb_add_hstate() warning for existing hstate
hugetlb_add_hstate() prints a warning if the hstate already exists.  This
was originally done as part of kernel command line parsing.  If
'hugepagesz=' was specified more than once, the warning

	pr_warn("hugepagesz= specified twice, ignoring\n");

would be printed.

Some architectures want to enable all huge page sizes.  They would call
hugetlb_add_hstate for all supported sizes.  However, this was done after
command line processing and as a result hstates could have already been
created for some sizes.  To make sure no warning were printed, there would
often be code like:

	if (!size_to_hstate(size)
		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT)

The only time we want to print the warning is as the result of command
line processing.  So, remove the warning from hugetlb_add_hstate and add
it to the single arch independent routine processing "hugepagesz=".  After
this, calls to size_to_hstate() in arch specific code can be removed and
hugetlb_add_hstate can be called without worrying about warning messages.

[mike.kravetz@oracle.com: fix hugetlb initialization]
  Link: http://lkml.kernel.org/r/4c36c6ce-3774-78fa-abc4-b7346bf24348@oracle.com
  Link: http://lkml.kernel.org/r/20200428205614.246260-5-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Mina Almasry <almasrymina@google.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
Acked-by: Will Deacon <will@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Longpeng <longpeng2@huawei.com>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200417185049.275845-4-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20200428205614.246260-4-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:46 -07:00
Mike Kravetz 359f25443a hugetlbfs: move hugepagesz= parsing to arch independent code
Now that architectures provide arch_hugetlb_valid_size(), parsing of
"hugepagesz=" can be done in architecture independent code.  Create a
single routine to handle hugepagesz= parsing and remove all arch specific
routines.  We can also remove the interface hugetlb_bad_size() as this is
no longer used outside arch independent code.

This also provides consistent behavior of hugetlbfs command line options.
The hugepagesz= option should only be specified once for a specific size,
but some architectures allow multiple instances.  This appears to be more
of an oversight when code was added by some architectures to set up ALL
huge pages sizes.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
Acked-by: Will Deacon <will@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Longpeng <longpeng2@huawei.com>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200417185049.275845-3-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20200428205614.246260-3-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:46 -07:00
Mike Kravetz ae94da8981 hugetlbfs: add arch_hugetlb_valid_size
Patch series "Clean up hugetlb boot command line processing", v4.

Longpeng(Mike) reported a weird message from hugetlb command line
processing and proposed a solution [1].  While the proposed patch does
address the specific issue, there are other related issues in command line
processing.  As hugetlbfs evolved, updates to command line processing have
been made to meet immediate needs and not necessarily in a coordinated
manner.  The result is that some processing is done in arch specific code,
some is done in arch independent code and coordination is problematic.
Semantics can vary between architectures.

The patch series does the following:
- Define arch specific arch_hugetlb_valid_size routine used to validate
  passed huge page sizes.
- Move hugepagesz= command line parsing out of arch specific code and into
  an arch independent routine.
- Clean up command line processing to follow desired semantics and
  document those semantics.

[1] https://lore.kernel.org/linux-mm/20200305033014.1152-1-longpeng2@huawei.com

This patch (of 3):

The architecture independent routine hugetlb_default_setup sets up the
default huge pages size.  It has no way to verify if the passed value is
valid, so it accepts it and attempts to validate at a later time.  This
requires undocumented cooperation between the arch specific and arch
independent code.

For architectures that support more than one huge page size, provide a
routine arch_hugetlb_valid_size to validate a huge page size.
hugetlb_default_setup can use this to validate passed values.

arch_hugetlb_valid_size will also be used in a subsequent patch to move
processing of the "hugepagesz=" in arch specific code to a common routine
in arch independent code.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
Acked-by: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Longpeng <longpeng2@huawei.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200428205614.246260-1-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20200428205614.246260-2-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20200417185049.275845-1-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20200417185049.275845-2-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:46 -07:00
Mike Rapoport 584cb13dca arm64: simplify detection of memory zone boundaries for UMA configs
The free_area_init() function only requires the definition of maximal PFN
for each of the supported zone rater than calculation of actual zone sizes
and the sizes of the holes between the zones.

After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP the free_area_init() is
available to all architectures.

Using this function instead of free_area_init_node() simplifies the zone
detection.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com>	[arm64]
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-9-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:43 -07:00
Mike Rapoport 9691a071aa mm: use free_area_init() instead of free_area_init_nodes()
free_area_init() has effectively became a wrapper for
free_area_init_nodes() and there is no point of keeping it.  Still
free_area_init() name is shorter and more general as it does not imply
necessity to initialize multiple nodes.

Rename free_area_init_nodes() to free_area_init(), update the callers and
drop old version of free_area_init().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com>	[arm64]
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-6-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:43 -07:00
Mike Rapoport d622abf74f mm: memblock: replace dereferences of memblock_region.nid with API calls
Patch series "mm: rework free_area_init*() funcitons".

After the discussion [1] about removal of CONFIG_NODES_SPAN_OTHER_NODES
and CONFIG_HAVE_MEMBLOCK_NODE_MAP options, I took it a bit further and
updated the node/zone initialization.

Since all architectures have memblock, it is possible to use only the
newer version of free_area_init_node() that calculates the zone and node
boundaries based on memblock node mapping and architectural limits on
possible zone PFNs.

The architectures that still determined zone and hole sizes can be
switched to the generic code and the old code that took those zone and
hole sizes can be simply removed.

And, since it all started from the removal of
CONFIG_NODES_SPAN_OTHER_NODES, the memmap_init() is now updated to iterate
over memblocks and so it does not need to perform early_pfn_to_nid() query
for every PFN.

[1] https://lore.kernel.org/lkml/1585420282-25630-1-git-send-email-Hoan@os.amperecomputing.com

This patch (of 21):

There are several places in the code that directly dereference
memblock_region.nid despite this field being defined only when
CONFIG_HAVE_MEMBLOCK_NODE_MAP=y.

Replace these with calls to memblock_get_region_nid() to improve code
robustness and to avoid possible breakage when
CONFIG_HAVE_MEMBLOCK_NODE_MAP will be removed.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com>	[arm64]
Reviewed-by: Baoquan He <bhe@redhat.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200412194859.12663-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-03 20:09:43 -07:00
Linus Torvalds 118d6e9829 ACPI updates for 5.8-rc1
- Update the ACPICA code in the kernel to upstream revision
    20200430:
 
    * Move acpi_gbl_next_cmd_num definition (Erik Kaneda).
 
    * Ignore AE_ALREADY_EXISTS status in the disassembler when parsing
      create operators (Erik Kaneda).
 
    * Add status checks to the dispatcher (Erik Kaneda).
 
    * Fix required parameters for _NIG and _NIH (Erik Kaneda).
 
    * Make acpi_protocol_lengths static (Yue Haibing).
 
  - Fix ACPI table reference counting errors in several places, mostly
    in error code paths (Hanjun Guo).
 
  - Extend the Generic Event Device (GED) driver to support _Exx and
    _Lxx handler methods (Ard Biesheuvel).
 
  - Add new acpi_evaluate_reg() helper and modify the ACPI PCI hotplug
    code to use it (Hans de Goede).
 
  - Add new DPTF battery participant driver and make the DPFT power
    participant driver create more sysfs device attributes (Srinivas
    Pandruvada).
 
  - Improve the handling of memory failures in APEI (James Morse).
 
  - Add new blacklist entry for Acer TravelMate 5735Z to the backlight
    driver (Paul Menzel).
 
  - Add i2c address for thermal control to the PMIC driver (Mauro
    Carvalho Chehab).
 
  - Allow the ACPI processor idle driver to work on platforms with
    only one ACPI C-state present (Zhang Rui).
 
  - Fix kobject reference count leaks in error code paths in two
    places (Qiushi Wu).
 
  - Delete unused proc filename macros and make some symbols static
    (Pascal Terjan, Zheng Zengkai, Zou Wei).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl7VHb8SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxVboQAIjYda2RhQANIlIvoEa+Qd2/FBd3HXgU
 Mv0LZ6y1xxxEZYeKne7zja1hzt5WetuZ1hZHGfg8YkXyrLqZGxfCIFbbhSA90BGG
 PGzFerGmOBNzB3I9SN6iQY7vSqoFHvQEV1PVh24d+aHWZqj2lnaRRq+GT54qbRLX
 /U3Hy5glFl8A/DCBP4cpoEjDr4IJHY68DathkDK2Ep2ybXV6B401uuqx8Su/OBd/
 MQmJTYI1UK/RYBXfdzS9TIZahnkxBbU1cnLFy08Ve2mawl5YsHPEbvm77a0yX2M6
 sOAerpgyzYNivAuOLpNIwhUZjpOY66nQuKAQaEl2cfRUkqt4nbmq7yDoH3d2MJLC
 /Ccz955rV2YyD1DtyV+PyT+HB+/EVwH/+UCZ+gsSbdHvOiwdFU6VaTc2eI1qq8K9
 4m5eEZFrAMPlvTzj/xVxr2Hfw1lbm23J5B5n7sM5HzYbT6MUWRQpvfV4zM3jTGz0
 rQd8JmcHVvZk/MV1mGrYHrN5TnGTLWpbS4Yv1lAQa6FP0N0NxzVud7KRfLKnCnJ1
 vh5yzW2fCYmVulJpuqxJDfXSqNV7n40CFrIewSp6nJRQXnWpImqHwwiA8fl51+hC
 fBL72Ey08EHGFnnNQqbebvNglsodRWJddBy43ppnMHtuLBA/2GVKYf2GihPbpEBq
 NHtX+Rd3vlWW
 =xH3i
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to upstream revision
  20200430, fix several reference counting errors related to ACPI
  tables, add _Exx / _Lxx support to the GED driver, add a new
  acpi_evaluate_reg() helper, add new DPTF battery participant driver
  and extend the DPFT power participant driver, improve the handling of
  memory failures in the APEI code, add a blacklist entry to the
  backlight driver, update the PMIC driver and the processor idle
  driver, fix two kobject reference count leaks, and make a few janitory
  changes.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20200430:

      - Move acpi_gbl_next_cmd_num definition (Erik Kaneda).

      - Ignore AE_ALREADY_EXISTS status in the disassembler when parsing
        create operators (Erik Kaneda).

      - Add status checks to the dispatcher (Erik Kaneda).

      - Fix required parameters for _NIG and _NIH (Erik Kaneda).

      - Make acpi_protocol_lengths static (Yue Haibing).

   - Fix ACPI table reference counting errors in several places, mostly
     in error code paths (Hanjun Guo).

   - Extend the Generic Event Device (GED) driver to support _Exx and
     _Lxx handler methods (Ard Biesheuvel).

   - Add new acpi_evaluate_reg() helper and modify the ACPI PCI hotplug
     code to use it (Hans de Goede).

   - Add new DPTF battery participant driver and make the DPFT power
     participant driver create more sysfs device attributes (Srinivas
     Pandruvada).

   - Improve the handling of memory failures in APEI (James Morse).

   - Add new blacklist entry for Acer TravelMate 5735Z to the backlight
     driver (Paul Menzel).

   - Add i2c address for thermal control to the PMIC driver (Mauro
     Carvalho Chehab).

   - Allow the ACPI processor idle driver to work on platforms with only
     one ACPI C-state present (Zhang Rui).

   - Fix kobject reference count leaks in error code paths in two places
     (Qiushi Wu).

   - Delete unused proc filename macros and make some symbols static
     (Pascal Terjan, Zheng Zengkai, Zou Wei)"

* tag 'acpi-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
  ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()
  ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()
  ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
  ACPI: DPTF: Add battery participant driver
  ACPI: DPTF: Additional sysfs attributes for power participant driver
  ACPI: video: Use native backlight on Acer TravelMate 5735Z
  arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  ACPI: APEI: Kick the memory_failure() queue for synchronous errors
  mm/memory-failure: Add memory_failure_queue_kick()
  ACPI / PMIC: Add i2c address for thermal control
  ACPI: GED: add support for _Exx / _Lxx handler methods
  ACPI: Delete unused proc filename macros
  ACPI: hotplug: PCI: Use the new acpi_evaluate_reg() helper
  ACPI: utils: Add acpi_evaluate_reg() helper
  ACPI: debug: Make two functions static
  ACPI: sleep: Put the FACS table after using it
  ACPI: scan: Put SPCR and STAO table after using it
  ACPI: EC: Put the ACPI table after using it
  ACPI: APEI: Put the HEST table for error path
  ACPI: APEI: Put the error record serialization table for error path
  ...
2020-06-02 13:25:52 -07:00
Linus Torvalds 94709049fb Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "A few little subsystems and a start of a lot of MM patches.

  Subsystems affected by this patch series: squashfs, ocfs2, parisc,
  vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
  swap, memcg, pagemap, memory-failure, vmalloc, kasan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
  kasan: move kasan_report() into report.c
  mm/mm_init.c: report kasan-tag information stored in page->flags
  ubsan: entirely disable alignment checks under UBSAN_TRAP
  kasan: fix clang compilation warning due to stack protector
  x86/mm: remove vmalloc faulting
  mm: remove vmalloc_sync_(un)mappings()
  x86/mm/32: implement arch_sync_kernel_mappings()
  x86/mm/64: implement arch_sync_kernel_mappings()
  mm/ioremap: track which page-table levels were modified
  mm/vmalloc: track which page-table levels were modified
  mm: add functions to track page directory modifications
  s390: use __vmalloc_node in stack_alloc
  powerpc: use __vmalloc_node in alloc_vm_stack
  arm64: use __vmalloc_node in arch_alloc_vmap_stack
  mm: remove vmalloc_user_node_flags
  mm: switch the test_vmalloc module to use __vmalloc_node
  mm: remove __vmalloc_node_flags_caller
  mm: remove both instances of __vmalloc_node_flags
  mm: remove the prot argument to __vmalloc_node
  mm: remove the pgprot argument to __vmalloc
  ...
2020-06-02 12:21:36 -07:00
Steven Price 99395ee3f7 mm: ptdump: expand type of 'val' in note_page()
The page table entry is passed in the 'val' argument to note_page(),
however this was previously an "unsigned long" which is fine on 64-bit
platforms.  But for 32 bit x86 it is not always big enough to contain a
page table entry which may be 64 bits.

Change the type to u64 to ensure that it is always big enough.

[akpm@linux-foundation.org: fix riscv]
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200521152308.33096-3-steven.price@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:10 -07:00
Linus Torvalds 533b220f7b arm64 updates for 5.8
- Branch Target Identification (BTI)
 	* Support for ARMv8.5-BTI in both user- and kernel-space. This
 	  allows branch targets to limit the types of branch from which
 	  they can be called and additionally prevents branching to
 	  arbitrary code, although kernel support requires a very recent
 	  toolchain.
 
 	* Function annotation via SYM_FUNC_START() so that assembly
 	  functions are wrapped with the relevant "landing pad"
 	  instructions.
 
 	* BPF and vDSO updates to use the new instructions.
 
 	* Addition of a new HWCAP and exposure of BTI capability to
 	  userspace via ID register emulation, along with ELF loader
 	  support for the BTI feature in .note.gnu.property.
 
 	* Non-critical fixes to CFI unwind annotations in the sigreturn
 	  trampoline.
 
 - Shadow Call Stack (SCS)
 	* Support for Clang's Shadow Call Stack feature, which reserves
 	  platform register x18 to point at a separate stack for each
 	  task that holds only return addresses. This protects function
 	  return control flow from buffer overruns on the main stack.
 
 	* Save/restore of x18 across problematic boundaries (user-mode,
 	  hypervisor, EFI, suspend, etc).
 
 	* Core support for SCS, should other architectures want to use it
 	  too.
 
 	* SCS overflow checking on context-switch as part of the existing
 	  stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
 
 - CPU feature detection
 	* Removed numerous "SANITY CHECK" errors when running on a system
 	  with mismatched AArch32 support at EL1. This is primarily a
 	  concern for KVM, which disabled support for 32-bit guests on
 	  such a system.
 
 	* Addition of new ID registers and fields as the architecture has
 	  been extended.
 
 - Perf and PMU drivers
 	* Minor fixes and cleanups to system PMU drivers.
 
 - Hardware errata
 	* Unify KVM workarounds for VHE and nVHE configurations.
 
 	* Sort vendor errata entries in Kconfig.
 
 - Secure Monitor Call Calling Convention (SMCCC)
 	* Update to the latest specification from Arm (v1.2).
 
 	* Allow PSCI code to query the SMCCC version.
 
 - Software Delegated Exception Interface (SDEI)
 	* Unexport a bunch of unused symbols.
 
 	* Minor fixes to handling of firmware data.
 
 - Pointer authentication
 	* Add support for dumping the kernel PAC mask in vmcoreinfo so
 	  that the stack can be unwound by tools such as kdump.
 
 	* Simplification of key initialisation during CPU bringup.
 
 - BPF backend
 	* Improve immediate generation for logical and add/sub
 	  instructions.
 
 - vDSO
 	- Minor fixes to the linker flags for consistency with other
 	  architectures and support for LLVM's unwinder.
 
 	- Clean up logic to initialise and map the vDSO into userspace.
 
 - ACPI
 	- Work around for an ambiguity in the IORT specification relating
 	  to the "num_ids" field.
 
 	- Support _DMA method for all named components rather than only
 	  PCIe root complexes.
 
 	- Minor other IORT-related fixes.
 
 - Miscellaneous
 	* Initialise debug traps early for KGDB and fix KDB cacheflushing
 	  deadlock.
 
 	* Minor tweaks to early boot state (documentation update, set
 	  TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
 
 	* Refactoring and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7U9csQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNLBHCACs/YU4SM7Om5f+7QnxIKao5DBr2CnGGvdC
 yTfDghFDTLQVv3MufLlfno3yBe5G8sQpcZfcc+hewfcGoMzVZXu8s7LzH6VSn9T9
 jmT3KjDMrg0RjSHzyumJp2McyelTk0a4FiKArSIIKsJSXUyb1uPSgm7SvKVDwEwU
 JGDzL9IGilmq59GiXfDzGhTZgmC37QdwRoRxDuqtqWQe5CHoRXYexg87HwBKOQxx
 HgU9L7ehri4MRZfpyjaDrr6quJo3TVnAAKXNBh3mZAskVS9ZrfKpEH0kYWYuqybv
 znKyHRecl/rrGePV8RTMtrwnSdU26zMXE/omsVVauDfG9hqzqm+Q
 =w3qi
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "A sizeable pile of arm64 updates for 5.8.

  Summary below, but the big two features are support for Branch Target
  Identification and Clang's Shadow Call stack. The latter is currently
  arm64-only, but the high-level parts are all in core code so it could
  easily be adopted by other architectures pending toolchain support

  Branch Target Identification (BTI):

   - Support for ARMv8.5-BTI in both user- and kernel-space. This allows
     branch targets to limit the types of branch from which they can be
     called and additionally prevents branching to arbitrary code,
     although kernel support requires a very recent toolchain.

   - Function annotation via SYM_FUNC_START() so that assembly functions
     are wrapped with the relevant "landing pad" instructions.

   - BPF and vDSO updates to use the new instructions.

   - Addition of a new HWCAP and exposure of BTI capability to userspace
     via ID register emulation, along with ELF loader support for the
     BTI feature in .note.gnu.property.

   - Non-critical fixes to CFI unwind annotations in the sigreturn
     trampoline.

  Shadow Call Stack (SCS):

   - Support for Clang's Shadow Call Stack feature, which reserves
     platform register x18 to point at a separate stack for each task
     that holds only return addresses. This protects function return
     control flow from buffer overruns on the main stack.

   - Save/restore of x18 across problematic boundaries (user-mode,
     hypervisor, EFI, suspend, etc).

   - Core support for SCS, should other architectures want to use it
     too.

   - SCS overflow checking on context-switch as part of the existing
     stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.

  CPU feature detection:

   - Removed numerous "SANITY CHECK" errors when running on a system
     with mismatched AArch32 support at EL1. This is primarily a concern
     for KVM, which disabled support for 32-bit guests on such a system.

   - Addition of new ID registers and fields as the architecture has
     been extended.

  Perf and PMU drivers:

   - Minor fixes and cleanups to system PMU drivers.

  Hardware errata:

   - Unify KVM workarounds for VHE and nVHE configurations.

   - Sort vendor errata entries in Kconfig.

  Secure Monitor Call Calling Convention (SMCCC):

   - Update to the latest specification from Arm (v1.2).

   - Allow PSCI code to query the SMCCC version.

  Software Delegated Exception Interface (SDEI):

   - Unexport a bunch of unused symbols.

   - Minor fixes to handling of firmware data.

  Pointer authentication:

   - Add support for dumping the kernel PAC mask in vmcoreinfo so that
     the stack can be unwound by tools such as kdump.

   - Simplification of key initialisation during CPU bringup.

  BPF backend:

   - Improve immediate generation for logical and add/sub instructions.

  vDSO:

   - Minor fixes to the linker flags for consistency with other
     architectures and support for LLVM's unwinder.

   - Clean up logic to initialise and map the vDSO into userspace.

  ACPI:

   - Work around for an ambiguity in the IORT specification relating to
     the "num_ids" field.

   - Support _DMA method for all named components rather than only PCIe
     root complexes.

   - Minor other IORT-related fixes.

  Miscellaneous:

   - Initialise debug traps early for KGDB and fix KDB cacheflushing
     deadlock.

   - Minor tweaks to early boot state (documentation update, set
     TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
  KVM: arm64: Check advertised Stage-2 page size capability
  arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
  ACPI/IORT: Remove the unused __get_pci_rid()
  arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
  arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
  arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
  arm64/cpufeature: Introduce ID_MMFR5 CPU register
  arm64/cpufeature: Introduce ID_DFR1 CPU register
  arm64/cpufeature: Introduce ID_PFR2 CPU register
  arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
  arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
  arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
  arm64: mm: Add asid_gen_match() helper
  firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
  arm64: vdso: Fix CFI directives in sigreturn trampoline
  arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
  ...
2020-06-01 15:18:27 -07:00
Will Deacon 082af5ec50 Merge branch 'for-next/scs' into for-next/core
Support for Clang's Shadow Call Stack in the kernel
(Sami Tolvanen and Will Deacon)
* for-next/scs:
  arm64: entry-ftrace.S: Update comment to indicate that x18 is live
  scs: Move DEFINE_SCS macro into core code
  scs: Remove references to asm/scs.h from core code
  scs: Move scs_overflow_check() out of architecture code
  arm64: scs: Use 'scs_sp' register alias for x18
  scs: Move accounting into alloc/free functions
  arm64: scs: Store absolute SCS stack pointer value in thread_info
  efi/libstub: Disable Shadow Call Stack
  arm64: scs: Add shadow stacks for SDEI
  arm64: Implement Shadow Call Stack
  arm64: Disable SCS for hypervisor code
  arm64: vdso: Disable Shadow Call Stack
  arm64: efi: Restore register x18 if it was corrupted
  arm64: Preserve register x18 when CPU is suspended
  arm64: Reserve register x18 from general allocation with SCS
  scs: Disable when function graph tracing is enabled
  scs: Add support for stack usage debugging
  scs: Add page accounting for shadow call stack allocations
  scs: Add support for Clang's Shadow Call Stack (SCS)
2020-05-28 18:03:40 +01:00
Will Deacon d27865279f Merge branch 'for-next/bti' into for-next/core
Support for Branch Target Identification (BTI) in user and kernel
(Mark Brown and others)
* for-next/bti: (39 commits)
  arm64: vdso: Fix CFI directives in sigreturn trampoline
  arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
  arm64: bti: Fix support for userspace only BTI
  arm64: kconfig: Update and comment GCC version check for kernel BTI
  arm64: vdso: Map the vDSO text with guarded pages when built for BTI
  arm64: vdso: Force the vDSO to be linked as BTI when built for BTI
  arm64: vdso: Annotate for BTI
  arm64: asm: Provide a mechanism for generating ELF note for BTI
  arm64: bti: Provide Kconfig for kernel mode BTI
  arm64: mm: Mark executable text as guarded pages
  arm64: bpf: Annotate JITed code for BTI
  arm64: Set GP bit in kernel page tables to enable BTI for the kernel
  arm64: asm: Override SYM_FUNC_START when building the kernel with BTI
  arm64: bti: Support building kernel C code using BTI
  arm64: Document why we enable PAC support for leaf functions
  arm64: insn: Report PAC and BTI instructions as skippable
  arm64: insn: Don't assume unrecognized HINTs are skippable
  arm64: insn: Provide a better name for aarch64_insn_is_nop()
  arm64: insn: Add constants for new HINT instruction decode
  arm64: Disable old style assembly annotations
  ...
2020-05-28 18:00:51 +01:00
Will Deacon 342403bcb4 Merge branches 'for-next/acpi', 'for-next/bpf', 'for-next/cpufeature', 'for-next/docs', 'for-next/kconfig', 'for-next/misc', 'for-next/perf', 'for-next/ptr-auth', 'for-next/sdei', 'for-next/smccc' and 'for-next/vdso' into for-next/core
ACPI and IORT updates
(Lorenzo Pieralisi)
* for-next/acpi:
  ACPI/IORT: Remove the unused __get_pci_rid()
  ACPI/IORT: Fix PMCG node single ID mapping handling
  ACPI: IORT: Add comments for not calling acpi_put_table()
  ACPI: GTDT: Put GTDT table after parsing
  ACPI: IORT: Add extra message "applying workaround" for off-by-1 issue
  ACPI/IORT: work around num_ids ambiguity
  Revert "ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()"
  ACPI/IORT: take _DMA methods into account for named components

BPF JIT optimisations for immediate value generation
(Luke Nelson)
* for-next/bpf:
  bpf, arm64: Optimize ADD,SUB,JMP BPF_K using arm64 add/sub immediates
  bpf, arm64: Optimize AND,OR,XOR,JSET BPF_K using arm64 logical immediates
  arm64: insn: Fix two bugs in encoding 32-bit logical immediates

Addition of new CPU ID register fields and removal of some benign sanity checks
(Anshuman Khandual and others)
* for-next/cpufeature: (27 commits)
  KVM: arm64: Check advertised Stage-2 page size capability
  arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
  arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
  arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
  arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
  arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
  arm64/cpufeature: Introduce ID_MMFR5 CPU register
  arm64/cpufeature: Introduce ID_DFR1 CPU register
  arm64/cpufeature: Introduce ID_PFR2 CPU register
  arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
  arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
  arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
  arm64/cpufeature: Drop open encodings while extracting parange
  arm64/cpufeature: Validate hypervisor capabilities during CPU hotplug
  arm64: cpufeature: Group indexed system register definitions by name
  arm64: cpufeature: Extend comment to describe absence of field info
  arm64: drop duplicate definitions of ID_AA64MMFR0_TGRAN constants
  arm64: cpufeature: Add an overview comment for the cpufeature framework
  ...

Minor documentation tweaks for silicon errata and booting requirements
(Rob Herring and Will Deacon)
* for-next/docs:
  arm64: silicon-errata.rst: Sort the Cortex-A55 entries
  arm64: docs: Mandate that the I-cache doesn't hold stale kernel text

Minor Kconfig cleanups
(Geert Uytterhoeven)
* for-next/kconfig:
  arm64: cpufeature: Add "or" to mitigations for multiple errata
  arm64: Sort vendor-specific errata

Miscellaneous updates
(Ard Biesheuvel and others)
* for-next/misc:
  arm64: mm: Add asid_gen_match() helper
  arm64: stacktrace: Factor out some common code into on_stack()
  arm64: Call debug_traps_init() from trap_init() to help early kgdb
  arm64: cacheflush: Fix KGDB trap detection
  arm64/cpuinfo: Move device_initcall() near cpuinfo_regs_init()
  arm64: kexec_file: print appropriate variable
  arm: mm: use __pfn_to_section() to get mem_section
  arm64: Reorder the macro arguments in the copy routines
  efi/libstub/arm64: align PE/COFF sections to segment alignment
  KVM: arm64: Drop PTE_S2_MEMATTR_MASK
  arm64/kernel: Fix range on invalidating dcache for boot page tables
  arm64: set TEXT_OFFSET to 0x0 in preparation for removing it entirely
  arm64: lib: Consistently enable crc32 extension
  arm64/mm: Use phys_to_page() to access pgtable memory
  arm64: smp: Make cpus_stuck_in_kernel static
  arm64: entry: remove unneeded semicolon in el1_sync_handler()
  arm64/kernel: vmlinux.lds: drop redundant discard/keep macros
  arm64: drop GZFLAGS definition and export
  arm64: kexec_file: Avoid temp buffer for RNG seed
  arm64: rename stext to primary_entry

Perf PMU driver updates
(Tang Bin and others)
* for-next/perf:
  pmu/smmuv3: Clear IRQ affinity hint on device removal
  drivers/perf: hisi: Permit modular builds of HiSilicon uncore drivers
  drivers/perf: hisi: Fix typo in events attribute array
  drivers/perf: arm_spe_pmu: Avoid duplicate printouts
  drivers/perf: arm_dsu_pmu: Avoid duplicate printouts

Pointer authentication updates and support for vmcoreinfo
(Amit Daniel Kachhap and Mark Rutland)
* for-next/ptr-auth:
  Documentation/vmcoreinfo: Add documentation for 'KERNELPACMASK'
  arm64/crash_core: Export KERNELPACMASK in vmcoreinfo
  arm64: simplify ptrauth initialization
  arm64: remove ptrauth_keys_install_kernel sync arg

SDEI cleanup and non-critical fixes
(James Morse and others)
* for-next/sdei:
  firmware: arm_sdei: Document the motivation behind these set_fs() calls
  firmware: arm_sdei: remove unused interfaces
  firmware: arm_sdei: Put the SDEI table after using it
  firmware: arm_sdei: Drop check for /firmware/ node and always register driver

SMCCC updates and refactoring
(Sudeep Holla)
* for-next/smccc:
  firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
  firmware: smccc: Add function to fetch SMCCC version
  firmware: smccc: Refactor SMCCC specific bits into separate file
  firmware: smccc: Drop smccc_version enum and use ARM_SMCCC_VERSION_1_x instead
  firmware: smccc: Add the definition for SMCCCv1.2 version/error codes
  firmware: smccc: Update link to latest SMCCC specification
  firmware: smccc: Add HAVE_ARM_SMCCC_DISCOVERY to identify SMCCC v1.1 and above

vDSO cleanup and non-critical fixes
(Mark Rutland and Vincenzo Frascino)
* for-next/vdso:
  arm64: vdso: Add --eh-frame-hdr to ldflags
  arm64: vdso: use consistent 'map' nomenclature
  arm64: vdso: use consistent 'abi' nomenclature
  arm64: vdso: simplify arch_vdso_type ifdeffery
  arm64: vdso: remove aarch32_vdso_pages[]
  arm64: vdso: Add '-Bsymbolic' to ldflags
2020-05-28 17:47:34 +01:00
Jean-Philippe Brucker 4fc92254bf arm64: mm: Add asid_gen_match() helper
Add a macro to check if an ASID is from the current generation, since a
subsequent patch will introduce a third user for this test.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200519175502.2504091-6-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-21 14:46:14 +01:00
James Morse 8fcc4ae6fa arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
APEI is unable to do all of its error handling work in nmi-context, so
it defers non-fatal work onto the irq_work queue. arch_irq_work_raise()
sends an IPI to the calling cpu, but this is not guaranteed to be taken
before returning to user-space.

Unless the exception interrupted a context with irqs-masked,
irq_work_run() can run immediately. Otherwise return -EINPROGRESS to
indicate ghes_notify_sea() found some work to do, but it hasn't
finished yet.

With this apei_claim_sea() returning '0' means this external-abort was
also notification of a firmware-first RAS error, and that APEI has
processed the CPER records.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Tyler Baicar <baicar@os.amperecomputing.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19 19:51:11 +02:00
Sami Tolvanen 6d37d81f44 arm64: Preserve register x18 when CPU is suspended
Don't lose the current task's shadow stack when the CPU is suspended.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-15 16:35:50 +01:00
Mark Brown 67d4a1cd09 arm64: mm: Mark executable text as guarded pages
When the kernel is built for BTI and running on a system which supports
make all executable text guarded pages to ensure that loadable module
and JITed BPF code is protected by BTI.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200506195138.22086-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-07 17:53:20 +01:00
Mark Brown c8027285e3 arm64: Set GP bit in kernel page tables to enable BTI for the kernel
Now that the kernel is built with BTI annotations enable the feature by
setting the GP bit in the stage 1 translation tables.  This is done
based on the features supported by the boot CPU so that we do not need
to rewrite the translation tables.

In order to avoid potential issues on big.LITTLE systems when there are
a mix of BTI and non-BTI capable CPUs in the system when we have enabled
kernel mode BTI we change BTI to be a _STRICT_BOOT_CPU_FEATURE when we
have kernel BTI.  This will prevent any CPUs that don't support BTI
being started if the boot CPU supports BTI rather than simply not using
BTI as we do when supporting BTI only in userspace.  The main concern is
the possibility of BTYPE being preserved by a CPU that does not
implement BTI when a thread is migrated to it resulting in an incorrect
state which could generate an exception when the thread migrates back to
a CPU that does support BTI.  If we encounter practical systems which
mix BTI and non-BTI CPUs we will need to revisit this implementation.

Since we currently do not generate landing pads in the BPF JIT we only
map the base kernel text in this way.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200506195138.22086-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-05-07 17:53:20 +01:00
Mark Rutland 027d0c7101 arm64: hugetlb: avoid potential NULL dereference
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may
pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:

|   CC      arch/arm64/mm/pageattr.o
|   CC      arch/arm64/mm/hugetlbpage.o
|                  from arch/arm64/mm/hugetlbpage.c:10:
| arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’:
| ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference]
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
|     |arch/arm64/mm/hugetlbpage.c:232:10:
|     |./arch/arm64/include/asm/pgtable-types.h:28:24:
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’

This can only occur when the kernel cannot allocate a page, and so is
unlikely to happen in practice before other systems start failing.

We can avoid this by bailing out if pmd_alloc() fails, as we do earlier
in the function if pud_alloc() fails.

Fixes: 66b3923a1a ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Kyrill Tkachov <kyrylo.tkachov@arm.com>
Cc: <stable@vger.kernel.org> # 4.5.x-
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-05-07 09:24:15 +01:00
Will Deacon 80e4e56132 Merge branch 'for-next/bti-user' into for-next/bti
Merge in user support for Branch Target Identification, which narrowly
missed the cut for 5.7 after a late ABI concern.

* for-next/bti-user:
  arm64: bti: Document behaviour for dynamically linked binaries
  arm64: elf: Fix allnoconfig kernel build with !ARCH_USE_GNU_PROPERTY
  arm64: BTI: Add Kconfig entry for userspace BTI
  mm: smaps: Report arm64 guarded pages in smaps
  arm64: mm: Display guarded pages in ptdump
  KVM: arm64: BTI: Reset BTYPE when skipping emulated instructions
  arm64: BTI: Reset BTYPE when skipping emulated instructions
  arm64: traps: Shuffle code to eliminate forward declarations
  arm64: unify native/compat instruction skipping
  arm64: BTI: Decode BYTPE bits when printing PSTATE
  arm64: elf: Enable BTI at exec based on ELF program properties
  elf: Allow arch to tweak initial mmap prot flags
  arm64: Basic Branch Target Identification support
  ELF: Add ELF program property parsing support
  ELF: UAPI and Kconfig additions for ELF program properties
2020-05-05 15:15:58 +01:00
Guixiong Wei 037d9303a7 arm: mm: use __pfn_to_section() to get mem_section
Replace the open-coded '__nr_to_section(pfn_to_section_nr(pfn))' in
pfn_valid() with a more concise call to '__pfn_to_section(pfn)'.

No functional change.

Signed-off-by: Guixiong Wei <guixiongwei@gmail.com>
Link: https://lore.kernel.org/r/20200430161858.11379-1-guixiongwei@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-30 21:13:56 +01:00
Mark Rutland 62a679cb28 arm64: simplify ptrauth initialization
Currently __cpu_setup conditionally initializes the address
authentication keys and enables them in SCTLR_EL1, doing so differently
for the primary CPU and secondary CPUs, and skipping this work for CPUs
returning from an idle state. For the latter case, cpu_do_resume
restores the keys and SCTLR_EL1 value after the MMU has been enabled.

This flow is rather difficult to follow, so instead let's move the
primary and secondary CPU initialization into their respective boot
paths. By following the example of cpu_do_resume and doing so once the
MMU is enabled, we can always initialize the keys from the values in
thread_struct, and avoid the machinery necessary to pass the keys in
secondary_data or open-coding initialization for the boot CPU.

This means we perform an additional RMW of SCTLR_EL1, but we already do
this in the cpu_do_resume path, and for other features in cpufeature.c,
so this isn't a major concern in a bringup path. Note that even while
the enable bits are clear, the key registers are accessible.

As this now renders the argument to __cpu_setup redundant, let's also
remove that entirely. Future extensions can follow a similar approach to
initialize values that differ for primary/secondary CPUs.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200423101606.37601-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28 11:23:21 +01:00
Mark Rutland d0055da526 arm64: remove ptrauth_keys_install_kernel sync arg
The 'sync' argument to ptrauth_keys_install_kernel macro is somewhat
opaque at callsites, so instead lets have regular and _nosync variants
of the macro to make this a little more obvious.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200423101606.37601-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2020-04-28 11:23:21 +01:00
Logan Gunthorpe bfeb022f8f mm/memory_hotplug: add pgprot_t to mhp_params
devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB.
However, on x86, an mtrr register will typically override this and force
the cache type to be UC-.  In the case firmware doesn't set this
register it is effectively WB and will typically result in a machine
check exception when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables.  For x86_32, set the page tables
explicitly using _set_memory_prot() (seeing they are already mapped).

For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this
should be fine, for now, seeing these architectures don't support
ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter
was set for all arches.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Logan Gunthorpe f5637d3b42 mm/memory_hotplug: rename mhp_restrictions to mhp_params
The mhp_restrictions struct really doesn't specify anything resembling a
restriction anymore so rename it to be mhp_params as it is a list of
extended parameters.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Anshuman Khandual 6cb4d9a287 mm/vma: introduce VM_ACCESS_FLAGS
There are many places where all basic VMA access flags (read, write,
exec) are initialized or checked against as a group.  One such example
is during page fault.  Existing vma_is_accessible() wrapper already
creates the notion of VMA accessibility as a group access permissions.

Hence lets just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which
will not only reduce code duplication but also extend the VMA
accessibility concept in general.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Springer <rspringer@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Roman Gushchin cf11e85fc0 mm: hugetlb: optionally allocate gigantic hugepages using cma
Commit 944d9fec8d ("hugetlb: add support for gigantic page allocation
at runtime") has added the run-time allocation of gigantic pages.

However it actually works only at early stages of the system loading,
when the majority of memory is free.  After some time the memory gets
fragmented by non-movable pages, so the chances to find a contiguous 1GB
block are getting close to zero.  Even dropping caches manually doesn't
help a lot.

At large scale rebooting servers in order to allocate gigantic hugepages
is quite expensive and complex.  At the same time keeping some constant
percentage of memory in reserved hugepages even if the workload isn't
using it is a big waste: not all workloads can benefit from using 1 GB
pages.

The following solution can solve the problem:
1) On boot time a dedicated cma area* is reserved. The size is passed
   as a kernel argument.
2) Run-time allocations of gigantic hugepages are performed using the
   cma allocator and the dedicated cma area

In this case gigantic hugepages can be allocated successfully with a
high probability, however the memory isn't completely wasted if nobody
is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
etc.

* On a multi-node machine a per-node cma area is allocated on each node.
  Following gigantic hugetlb allocation are using the first available
  numa node if the mask isn't specified by a user.

Usage:
1) configure the kernel to allocate a cma area for hugetlb allocations:
   pass hugetlb_cma=10G as a kernel argument

2) allocate hugetlb pages as usual, e.g.
   echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

If the option isn't enabled or the allocation of the cma area failed,
the current behavior of the system is preserved.

x86 and arm-64 are covered by this patch, other architectures can be
trivially added later.

The patch contains clean-ups and fixes proposed and implemented by Aslan
Bakirov and Randy Dunlap.  It also contains ideas and suggestions
proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10 15:36:21 -07:00
Peter Xu 4064b98270 mm: allow VM_FAULT_RETRY for multiple times
The idea comes from a discussion between Linus and Andrea [1].

Before this patch we only allow a page fault to retry once.  We achieved
this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
handle_mm_fault() the second time.  This was majorly used to avoid
unexpected starvation of the system by looping over forever to handle the
page fault on a single page.  However that should hardly happen, and after
all for each code path to return a VM_FAULT_RETRY we'll first wait for a
condition (during which time we should possibly yield the cpu) to happen
before VM_FAULT_RETRY is really returned.

This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
flag when we receive VM_FAULT_RETRY.  It means that the page fault handler
now can retry the page fault for multiple times if necessary without the
need to generate another page fault event.  Meanwhile we still keep the
FAULT_FLAG_TRIED flag so page fault handler can still identify whether a
page fault is the first attempt or not.

Then we'll have these combinations of fault flags (only considering
ALLOW_RETRY flag and TRIED flag):

  - ALLOW_RETRY and !TRIED:  this means the page fault allows to
                             retry, and this is the first try

  - ALLOW_RETRY and TRIED:   this means the page fault allows to
                             retry, and this is not the first try

  - !ALLOW_RETRY and !TRIED: this means the page fault does not allow
                             to retry at all

  - !ALLOW_RETRY and TRIED:  this is forbidden and should never be used

In existing code we have multiple places that has taken special care of
the first condition above by checking against (fault_flags &
FAULT_FLAG_ALLOW_RETRY).  This patch introduces a simple helper to detect
the first retry of a page fault by checking against both (fault_flags &
FAULT_FLAG_ALLOW_RETRY) and !(fault_flag & FAULT_FLAG_TRIED) because now
even the 2nd try will have the ALLOW_RETRY set, then use that helper in
all existing special paths.  One example is in __lock_page_or_retry(), now
we'll drop the mmap_sem only in the first attempt of page fault and we'll
keep it in follow up retries, so old locking behavior will be retained.

This will be a nice enhancement for current code [2] at the same time a
supporting material for the future userfaultfd-writeprotect work, since in
that work there will always be an explicit userfault writeprotect retry
for protected pages, and if that cannot resolve the page fault (e.g., when
userfaultfd-writeprotect is used in conjunction with swapped pages) then
we'll possibly need a 3rd retry of the page fault.  It might also benefit
other potential users who will have similar requirement like userfault
write-protection.

GUP code is not touched yet and will be covered in follow up patch.

Please read the thread below for more information.

[1] https://lore.kernel.org/lkml/20171102193644.GB22686@redhat.com/
[2] https://lore.kernel.org/lkml/20181230154648.GB9832@redhat.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220160246.9790-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:30 -07:00
Peter Xu dde1607248 mm: introduce FAULT_FLAG_DEFAULT
Although there're tons of arch-specific page fault handlers, most of them
are still sharing the same initial value of the page fault flags.  Say,
merely all of the page fault handlers would allow the fault to be retried,
and they also allow the fault to respond to SIGKILL.

Let's define a default value for the fault flags to replace those initial
page fault flags that were copied over.  With this, it'll be far easier to
introduce new fault flag that can be used by all the architectures instead
of touching all the archs.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220160238.9694-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:29 -07:00
Peter Xu b502f038f2 arm64/mm: use helper fault_signal_pending()
Let the arm64 fault handling to use the new fault_signal_pending() helper,
by moving the signal handling out of the retry logic.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Brian Geffon <bgeffon@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
Cc: "Dr . David Alan Gilbert" <dgilbert@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Martin Cracauer <cracauer@cons.org>
Cc: Marty McFadden <mcfadden8@llnl.gov>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Maya Gokhale <gokhale2@llnl.gov>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Link: http://lkml.kernel.org/r/20200220155927.9264-1-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02 09:35:29 -07:00
Linus Torvalds 3cd86a58f7 arm64 updates for 5.7:
- In-kernel Pointer Authentication support (previously only offered to
   user space).
 
 - ARM Activity Monitors (AMU) extension support allowing better CPU
   utilisation numbers for the scheduler (frequency invariance).
 
 - Memory hot-remove support for arm64.
 
 - Lots of asm annotations (SYM_*) in preparation for the in-kernel
   Branch Target Identification (BTI) support.
 
 - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the PMU
   init callbacks, support for new DT compatibles.
 
 - IPv6 header checksum optimisation.
 
 - Fixes: SDEI (software delegated exception interface) double-lock on
   hibernate with shared events.
 
 - Minor clean-ups and refactoring: cpu_ops accessor, cpu_do_switch_mm()
   converted to C, cpufeature finalisation helper.
 
 - sys_mremap() comment explaining the asymmetric address untagging
   behaviour.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl6DVyIACgkQa9axLQDI
 XvHkqRAAiZA2EYKiQL4M1DJ1cNTADjT7xKX9+UtYBXj7GMVhgVWdunpHVE6qtfgk
 cT6avmKrS/6PDqizJgr+Z1yX8x3Kvs57G4BvmIUKIw97mkdewvFQ9JKv6VA1vb86
 7Qrl1WzqsGg5Kj9uUfI4h+ZoT1H4C/9PQeFxJwgZRtF9DxRh8O7VeZI+JCu8Aub2
 lIkjI8rh+EpTsGT9h/PMGWUcawnKQloZ1/F+GfMAuYBvIv2RNN2xVreJtTmm4NyJ
 VcpL0KCNyAI2lGdaJg5nBLRDyGuXDm5i+PLsCSXMquI4fie00txXeD8sjbeuO0ks
 YTJ0EhmUUhbSE17go+SxYiEFE0v09i+lD5ud+B4Vmojp0KTczTta9VSgURlbb2/9
 n9biq5G3PPDNIrZqiTT2Tf4AMz1350nkbzL2gzKecM5aIzR/u3y5yII5CgfZtFnj
 7bGbyFpFpcqI7UaISPsNCxmknbTt/7ff0WM3+7SbecxI3AD2mnxsOdN9JTLyhDp+
 owjyiaWxl5zMWF9DhplLG/9BKpNWSxh3skazdOdELd8GTq2MbJlXrVG2XgXTAOh3
 y1s6RQrfw8zXh8TSqdmmzauComXIRWTum/sbVB3U8Z3AUsIeq/NTSbN5X9JyIbOP
 HOabhlVhhkI6omN1grqPX4jwUiZLZoNfn7Ez4q71549KVK/uBtA=
 =LJVX
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "The bulk is in-kernel pointer authentication, activity monitors and
  lots of asm symbol annotations. I also queued the sys_mremap() patch
  commenting the asymmetry in the address untagging.

  Summary:

   - In-kernel Pointer Authentication support (previously only offered
     to user space).

   - ARM Activity Monitors (AMU) extension support allowing better CPU
     utilisation numbers for the scheduler (frequency invariance).

   - Memory hot-remove support for arm64.

   - Lots of asm annotations (SYM_*) in preparation for the in-kernel
     Branch Target Identification (BTI) support.

   - arm64 perf updates: ARMv8.5-PMU 64-bit counters, refactoring the
     PMU init callbacks, support for new DT compatibles.

   - IPv6 header checksum optimisation.

   - Fixes: SDEI (software delegated exception interface) double-lock on
     hibernate with shared events.

   - Minor clean-ups and refactoring: cpu_ops accessor,
     cpu_do_switch_mm() converted to C, cpufeature finalisation helper.

   - sys_mremap() comment explaining the asymmetric address untagging
     behaviour"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits)
  mm/mremap: Add comment explaining the untagging behaviour of mremap()
  arm64: head: Convert install_el2_stub to SYM_INNER_LABEL
  arm64: Introduce get_cpu_ops() helper function
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Declare ACPI parking protocol CPU operation if needed
  arm64: move kimage_vaddr to .rodata
  arm64: use mov_q instead of literal ldr
  arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH
  lkdtm: arm64: test kernel pointer authentication
  arm64: compile the kernel with ptrauth return address signing
  kconfig: Add support for 'as-option'
  arm64: suspend: restore the kernel ptrauth keys
  arm64: __show_regs: strip PAC from lr in printk
  arm64: unwind: strip PAC from kernel addresses
  arm64: mask PAC bits of __builtin_return_address
  arm64: initialize ptrauth keys for kernel booting task
  arm64: initialize and switch ptrauth kernel keys
  arm64: enable ptrauth earlier
  arm64: cpufeature: handle conflicts based on capability
  arm64: cpufeature: Move cpu capability helpers inside C file
  ...
2020-03-31 10:05:01 -07:00
Catalin Marinas 44ca0e00b6 Merge branch 'for-next/kernel-ptrauth' into for-next/core
* for-next/kernel-ptrauth:
  : Return address signing - in-kernel support
  arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH
  lkdtm: arm64: test kernel pointer authentication
  arm64: compile the kernel with ptrauth return address signing
  kconfig: Add support for 'as-option'
  arm64: suspend: restore the kernel ptrauth keys
  arm64: __show_regs: strip PAC from lr in printk
  arm64: unwind: strip PAC from kernel addresses
  arm64: mask PAC bits of __builtin_return_address
  arm64: initialize ptrauth keys for kernel booting task
  arm64: initialize and switch ptrauth kernel keys
  arm64: enable ptrauth earlier
  arm64: cpufeature: handle conflicts based on capability
  arm64: cpufeature: Move cpu capability helpers inside C file
  arm64: ptrauth: Add bootup/runtime flags for __cpu_setup
  arm64: install user ptrauth keys at kernel exit time
  arm64: rename ptrauth key structures to be user-specific
  arm64: cpufeature: add pointer auth meta-capabilities
  arm64: cpufeature: Fix meta-capability cpufeature check
2020-03-25 11:11:08 +00:00
Catalin Marinas 806dc825f0 Merge branch 'for-next/asm-cleanups' into for-next/core
* for-next/asm-cleanups:
  : Various asm clean-ups (alignment, mov_q vs ldr, .idmap)
  arm64: move kimage_vaddr to .rodata
  arm64: use mov_q instead of literal ldr
2020-03-25 11:10:51 +00:00
Catalin Marinas da12d2739f Merge branches 'for-next/memory-hotremove', 'for-next/arm_sdei', 'for-next/amu', 'for-next/final-cap-helper', 'for-next/cpu_ops-cleanup', 'for-next/misc' and 'for-next/perf' into for-next/core
* for-next/memory-hotremove:
  : Memory hot-remove support for arm64
  arm64/mm: Enable memory hot remove
  arm64/mm: Hold memory hotplug lock while walking for kernel page table dump

* for-next/arm_sdei:
  : SDEI: fix double locking on return from hibernate and clean-up
  firmware: arm_sdei: clean up sdei_event_create()
  firmware: arm_sdei: Use cpus_read_lock() to avoid races with cpuhp
  firmware: arm_sdei: fix possible double-lock on hibernate error path
  firmware: arm_sdei: fix double-lock on hibernate with shared events

* for-next/amu:
  : ARMv8.4 Activity Monitors support
  clocksource/drivers/arm_arch_timer: validate arch_timer_rate
  arm64: use activity monitors for frequency invariance
  cpufreq: add function to get the hardware max frequency
  Documentation: arm64: document support for the AMU extension
  arm64/kvm: disable access to AMU registers from kvm guests
  arm64: trap to EL1 accesses to AMU counters from EL0
  arm64: add support for the AMU extension v1

* for-next/final-cap-helper:
  : Introduce cpus_have_final_cap_helper(), migrate arm64 KVM to it
  arm64: kvm: hyp: use cpus_have_final_cap()
  arm64: cpufeature: add cpus_have_final_cap()

* for-next/cpu_ops-cleanup:
  : cpu_ops[] access code clean-up
  arm64: Introduce get_cpu_ops() helper function
  arm64: Rename cpu_read_ops() to init_cpu_ops()
  arm64: Declare ACPI parking protocol CPU operation if needed

* for-next/misc:
  : Various fixes and clean-ups
  arm64: define __alloc_zeroed_user_highpage
  arm64/kernel: Simplify __cpu_up() by bailing out early
  arm64: remove redundant blank for '=' operator
  arm64: kexec_file: Fixed code style.
  arm64: add blank after 'if'
  arm64: fix spelling mistake "ca not" -> "cannot"
  arm64: entry: unmask IRQ in el0_sp()
  arm64: efi: add efi-entry.o to targets instead of extra-$(CONFIG_EFI)
  arm64: csum: Optimise IPv6 header checksum
  arch/arm64: fix typo in a comment
  arm64: remove gratuitious/stray .ltorg stanzas
  arm64: Update comment for ASID() macro
  arm64: mm: convert cpu_do_switch_mm() to C
  arm64: fix NUMA Kconfig typos

* for-next/perf:
  : arm64 perf updates
  arm64: perf: Add support for ARMv8.5-PMU 64-bit counters
  KVM: arm64: limit PMU version to PMUv3 for ARMv8.1
  arm64: cpufeature: Extract capped perfmon fields
  arm64: perf: Clean up enable/disable calls
  perf: arm-ccn: Use scnprintf() for robustness
  arm64: perf: Support new DT compatibles
  arm64: perf: Refactor PMU init callbacks
  perf: arm_spe: Remove unnecessary zero check on 'nr_pages'
2020-03-25 11:10:32 +00:00
Remi Denis-Courmont dc374b477f arm64: use mov_q instead of literal ldr
In practice, this requires only 2 instructions, or even only 1 for
the idmap_pg_dir size (with 4 or 64 KiB pages). Only the MAIR values
needed more than 2 instructions and it was already converted to mov_q
by 95b3f74bec.

Signed-off-by: Remi Denis-Courmont <remi.denis.courmont@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
2020-03-24 11:48:24 +00:00
Amit Daniel Kachhap e51f5f56dd arm64: suspend: restore the kernel ptrauth keys
This patch restores the kernel keys from current task during cpu resume
after the mmu is turned on and ptrauth is enabled.

A flag is added in macro ptrauth_keys_install_kernel to check if isb
instruction needs to be executed.

Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:21 +00:00
Kristina Martsenko 33e4523498 arm64: initialize and switch ptrauth kernel keys
Set up keys to use pointer authentication within the kernel. The kernel
will be compiled with APIAKey instructions, the other keys are currently
unused. Each task is given its own APIAKey, which is initialized during
fork. The key is changed during context switch and on kernel entry from
EL0.

The keys for idle threads need to be set before calling any C functions,
because it is not possible to enter and exit a function with different
keys.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[Amit: Modified secondary cores key structure, comments]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:20 +00:00
Kristina Martsenko 6982934e19 arm64: enable ptrauth earlier
When the kernel is compiled with pointer auth instructions, the boot CPU
needs to start using address auth very early, so change the cpucap to
account for this.

Pointer auth must be enabled before we call C functions, because it is
not possible to enter a function with pointer auth disabled and exit it
with pointer auth enabled. Note, mismatches between architected and
IMPDEF algorithms will still be caught by the cpufeature framework (the
separate *_ARCH and *_IMP_DEF cpucaps).

Note the change in behavior: if the boot CPU has address auth and a
late CPU does not, then the late CPU is parked by the cpufeature
framework. This is possible as kernel will only have NOP space intructions
for PAC so such mismatched late cpu will silently ignore those
instructions in C functions. Also, if the boot CPU does not have address
auth and the late CPU has then the late cpu will still boot but with
ptrauth feature disabled.

Leave generic authentication as a "system scope" cpucap for now, since
initially the kernel will only use address authentication.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[Amit: Re-worked ptrauth setup logic, comments]
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:19 +00:00
Amit Daniel Kachhap df3551011b arm64: ptrauth: Add bootup/runtime flags for __cpu_setup
This patch allows __cpu_setup to be invoked with one of these flags,
ARM64_CPU_BOOT_PRIMARY, ARM64_CPU_BOOT_SECONDARY or ARM64_CPU_RUNTIME.
This is required as some cpufeatures need different handling during
different scenarios.

The input parameter in x0 is preserved till the end to be used inside
this function.

There should be no functional change with this patch and is useful
for the subsequent ptrauth patch which utilizes it. Some upcoming
arm cpufeatures can also utilize these flags.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
Reviewed-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-18 09:50:19 +00:00
Mark Brown de48bb3692 arm64: mm: Display guarded pages in ptdump
v8.5-BTI introduces the GP field in stage 1 translation tables which
indicates that blocks and pages with it set are guarded pages for which
branch target identification checks should be performed. Decode this
when dumping the page tables to aid debugging.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-16 17:19:50 +00:00
Ionela Voinescu 87a1f06346 arm64: trap to EL1 accesses to AMU counters from EL0
The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture. In order to access the activity
monitors counters safely, if desired, the kernel should detect the
presence of the extension through the feature register, and mediate
the access.

Therefore, disable direct accesses to activity monitors counters
from EL0 (userspace) and trap them to EL1 (kernel).

To be noted that the ARM64_AMU_EXTN kernel config does not have an
effect on this code. Given that the amuserenr_el0 resets to an
UNKNOWN value, setting the trap of EL0 accesses to EL1 is always
attempted for safety and security considerations. Therefore firmware
should still ensure accesses to AMU registers are not trapped in
EL2/EL3 as this code cannot be bypassed if the CPU implements the
Activity Monitors Unit.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-06 16:02:50 +00:00
Anshuman Khandual bbd6ec605c arm64/mm: Enable memory hot remove
The arch code for hot-remove must tear down portions of the linear map and
vmemmap corresponding to memory being removed. In both cases the page
tables mapping these regions must be freed, and when sparse vmemmap is in
use the memory backing the vmemmap must also be freed.

This patch adds unmap_hotplug_range() and free_empty_tables() helpers which
can be used to tear down either region and calls it from vmemmap_free() and
___remove_pgd_mapping(). The free_mapped argument determines whether the
backing memory will be freed.

It makes two distinct passes over the kernel page table. In the first pass
with unmap_hotplug_range() it unmaps, invalidates applicable TLB cache and
frees backing memory if required (vmemmap) for each mapped leaf entry. In
the second pass with free_empty_tables() it looks for empty page table
sections whose page table page can be unmapped, TLB invalidated and freed.

While freeing intermediate level page table pages bail out if any of its
entries are still valid. This can happen for partially filled kernel page
table either from a previously attempted failed memory hot add or while
removing an address range which does not span the entire page table page
range.

The vmemmap region may share levels of table with the vmalloc region.
There can be conflicts between hot remove freeing page table pages with
a concurrent vmalloc() walking the kernel page table. This conflict can
not just be solved by taking the init_mm ptl because of existing locking
scheme in vmalloc(). So free_empty_tables() implements a floor and ceiling
method which is borrowed from user page table tear with free_pgd_range()
which skips freeing page table pages if intermediate address range is not
aligned or maximum floor-ceiling might not own the entire page table page.

Boot memory on arm64 cannot be removed. Hence this registers a new memory
hotplug notifier which prevents boot memory offlining and it's removal.

While here update arch_add_memory() to handle __add_pages() failures by
just unmapping recently added kernel linear mapping. Now enable memory hot
remove on arm64 platforms by default with ARCH_ENABLE_MEMORY_HOTREMOVE.

This implementation is overall inspired from kernel page table tear down
procedure on X86 architecture and user page table tear down method.

[Mike and Catalin added P4D page table level support]

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-04 15:35:44 +00:00
Anshuman Khandual bf2b59f60e arm64/mm: Hold memory hotplug lock while walking for kernel page table dump
The arm64 page table dump code can race with concurrent modification of the
kernel page tables. When a leaf entries are modified concurrently, the dump
code may log stale or inconsistent information for a VA range, but this is
otherwise not harmful.

When intermediate levels of table are freed, the dump code will continue to
use memory which has been freed and potentially reallocated for another
purpose. In such cases, the dump code may dereference bogus addresses,
leading to a number of potential problems.

Intermediate levels of table may by freed during memory hot-remove,
which will be enabled by a subsequent patch. To avoid racing with
this, take the memory hotplug lock when walking the kernel page table.

Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-04 15:35:22 +00:00
Jean-Philippe Brucker 9abd515a6e arm64: context: Fix ASID limit in boot messages
Since commit f88f42f853 ("arm64: context: Free up kernel ASIDs if KPTI
is not in use"), the NUM_USER_ASIDS macro doesn't correspond to the
effective number of ASIDs when KPTI is enabled. Get an accurate number
of available ASIDs in an arch_initcall, once we've discovered all CPUs'
capabilities and know if we still need to halve the ASID space for KPTI.

Fixes: f88f42f853 ("arm64: context: Free up kernel ASIDs if KPTI is not in use")
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-03-02 12:10:38 +00:00
Mark Rutland 25b92693a1 arm64: mm: convert cpu_do_switch_mm() to C
There's no reason that cpu_do_switch_mm() needs to be written as an
assembly function, and having it as a C function would make it easier to
maintain.

This patch converts cpu_do_switch_mm() to C, removing code that this
change makes redundant (e.g. the mmid macro). Since the header comment
was stale and the prototype now implies all the necessary information,
this comment is removed. The 'pgd_phys' argument is made a phys_addr_t
to match the return type of virt_to_phys().

At the same time, post_ttbr_update_workaround() is updated to use
IS_ENABLED(), which allows the compiler to figure out it can elide calls
for !CONFIG_CAVIUM_ERRATUM_27456 builds.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
[catalin.marinas@arm.com: change comments from asm-style to C-style]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-02-27 14:30:50 +00:00
Steven Price e47690d756 x86: mm: avoid allocating struct mm_struct on the stack
struct mm_struct is quite large (~1664 bytes) and so allocating on the
stack may cause problems as the kernel stack size is small.

Since ptdump_walk_pgd_level_core() was only allocating the structure so
that it could modify the pgd argument we can instead introduce a pgd
override in struct mm_walk and pass this down the call stack to where it
is needed.

Since the correct mm_struct is now being passed down, it is now also
unnecessary to take the mmap_sem semaphore because ptdump_walk_pgd() will
now take the semaphore on the real mm.

[steven.price@arm.com: restore missed arm64 changes]
  Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com
Link: http://lkml.kernel.org/r/20200108145710.34314-1-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:25 +00:00
Steven Price f8f0d0b6fa mm: ptdump: reduce level numbers by 1 in note_page()
Rather than having to increment the 'depth' number by 1 in ptdump_hole(),
let's change the meaning of 'level' in note_page() since that makes the
code simplier.

Note that for x86, the level numbers were previously increased by 1 in
commit 45dcd20913 ("x86/mm/dump_pagetables: Fix printout of p4d level")
and the comment "Bit 7 has a different meaning" was not updated, so this
change also makes the code match the comment again.

Link: http://lkml.kernel.org/r/20191218162402.45610-24-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:25 +00:00
Steven Price 9c7869c7f1 arm64: mm: display non-present entries in ptdump
Previously the /sys/kernel/debug/kernel_page_tables file would only show
lines for entries present in the page tables.  However it is useful to
also show non-present entries as this makes the size and level of the
holes more visible.  This aligns the behaviour with x86 which also shows
holes.

Link: http://lkml.kernel.org/r/20191218162402.45610-23-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:25 +00:00
Steven Price 102f45fdbe arm64: mm: convert mm/dump.c to use walk_page_range()
Now walk_page_range() can walk kernel page tables, we can switch the arm64
ptdump code over to using it, simplifying the code.

Link: http://lkml.kernel.org/r/20191218162402.45610-22-steven.price@arm.com
Signed-off-by: Steven Price <steven.price@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zong Li <zong.li@sifive.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:25 +00:00
Linus Torvalds 0238d3c753 arm64 updates for 5.6
- New architecture features
 	* Support for Armv8.5 E0PD, which benefits KASLR in the same way as
 	  KPTI but without the overhead. This allows KPTI to be disabled on
 	  CPUs that are not affected by Meltdown, even is KASLR is enabled.
 
 	* Initial support for the Armv8.5 RNG instructions, which claim to
 	  provide access to a high bandwidth, cryptographically secure hardware
 	  random number generator. As well as exposing these to userspace, we
 	  also use them as part of the KASLR seed and to seed the crng once
 	  all CPUs have come online.
 
 	* Advertise a bunch of new instructions to userspace, including support
 	  for Data Gathering Hint, Matrix Multiply and 16-bit floating point.
 
 - Kexec
 	* Cleanups in preparation for relocating with the MMU enabled
 	* Support for loading crash dump kernels with kexec_file_load()
 
 - Perf and PMU drivers
 	* Cleanups and non-critical fixes for a couple of system PMU drivers
 
 - FPU-less (aka broken) CPU support
 	* Considerable fixes to support CPUs without the FP/SIMD extensions,
 	  including their presence in heterogeneous systems. Good luck finding
 	  a 64-bit userspace that handles this.
 
 - Modern assembly function annotations
 	* Start migrating our use of ENTRY() and ENDPROC() over to the
 	  new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended to
 	  aid debuggers
 
 - Kbuild
 	* Cleanup detection of LSE support in the assembler by introducing
 	  'as-instr'
 
 	* Remove compressed Image files when building clean targets
 
 - IP checksumming
 	* Implement optimised IPv4 checksumming routine when hardware offload
 	  is not in use. An IPv6 version is in the works, pending testing.
 
 - Hardware errata
 	* Work around Cortex-A55 erratum #1530923
 
 - Shadow call stack
 	* Work around some issues with Clang's integrated assembler not liking
 	  our perfectly reasonable assembly code
 
 	* Avoid allocating the X18 register, so that it can be used to hold the
 	  shadow call stack pointer in future
 
 - ACPI
 	* Fix ID count checking in IORT code. This may regress broken firmware
 	  that happened to work with the old implementation, in which case we'll
 	  have to revert it and try something else
 
 	* Fix DAIF corruption on return from GHES handler with pseudo-NMIs
 
 - Miscellaneous
 	* Whitelist some CPUs that are unaffected by Spectre-v2
 
 	* Reduce frequency of ASID rollover when KPTI is compiled in but
 	  inactive
 
 	* Reserve a couple of arch-specific PROT flags that are already used by
 	  Sparc and PowerPC and are planned for later use with BTI on arm64
 
 	* Preparatory cleanup of our entry assembly code in preparation for
 	  moving more of it into C later on
 
 	* Refactoring and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl4oY+IQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNNfRB/4p3vax0hqaOnLRvmJPRXF31B8oPlivnr2u
 6HCA9LkdU5IlrgaTNOJ/sQEqJAPOPCU7v49Ol0iYw0iKL1suUE7Ikui5VB6Uybqt
 YbfF5UNzfXAMs2A86TF/hzqhxw+W+lpnZX8NVTuQeAODfHEGUB1HhTLfRi9INsER
 wKEAuoZyuSUibxTFvji+DAq7nVRniXX7CM7tE385pxDisCMuu/7E5wOl+3EZYXWz
 DTGzTbHXuVFL+UFCANFEUlAtmr3dQvPFIqAwVl/CxjRJjJ7a+/G3cYLsHFPrQCjj
 qYX4kfhAeeBtqmHL7YFNWFwFs5WaT5UcQquFO665/+uCTWSJpORY
 =AIh/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "The changes are a real mixed bag this time around.

  The only scary looking one from the diffstat is the uapi change to
  asm-generic/mman-common.h, but this has been acked by Arnd and is
  actually just adding a pair of comments in an attempt to prevent
  allocation of some PROT values which tend to get used for
  arch-specific purposes. We'll be using them for Branch Target
  Identification (a CFI-like hardening feature), which is currently
  under review on the mailing list.

  New architecture features:

   - Support for Armv8.5 E0PD, which benefits KASLR in the same way as
     KPTI but without the overhead. This allows KPTI to be disabled on
     CPUs that are not affected by Meltdown, even is KASLR is enabled.

   - Initial support for the Armv8.5 RNG instructions, which claim to
     provide access to a high bandwidth, cryptographically secure
     hardware random number generator. As well as exposing these to
     userspace, we also use them as part of the KASLR seed and to seed
     the crng once all CPUs have come online.

   - Advertise a bunch of new instructions to userspace, including
     support for Data Gathering Hint, Matrix Multiply and 16-bit
     floating point.

  Kexec:

   - Cleanups in preparation for relocating with the MMU enabled

   - Support for loading crash dump kernels with kexec_file_load()

  Perf and PMU drivers:

   - Cleanups and non-critical fixes for a couple of system PMU drivers

  FPU-less (aka broken) CPU support:

   - Considerable fixes to support CPUs without the FP/SIMD extensions,
     including their presence in heterogeneous systems. Good luck
     finding a 64-bit userspace that handles this.

  Modern assembly function annotations:

   - Start migrating our use of ENTRY() and ENDPROC() over to the
     new-fangled SYM_{CODE,FUNC}_{START,END} macros, which are intended
     to aid debuggers

  Kbuild:

   - Cleanup detection of LSE support in the assembler by introducing
     'as-instr'

   - Remove compressed Image files when building clean targets

  IP checksumming:

   - Implement optimised IPv4 checksumming routine when hardware offload
     is not in use. An IPv6 version is in the works, pending testing.

  Hardware errata:

   - Work around Cortex-A55 erratum #1530923

  Shadow call stack:

   - Work around some issues with Clang's integrated assembler not
     liking our perfectly reasonable assembly code

   - Avoid allocating the X18 register, so that it can be used to hold
     the shadow call stack pointer in future

  ACPI:

   - Fix ID count checking in IORT code. This may regress broken
     firmware that happened to work with the old implementation, in
     which case we'll have to revert it and try something else

   - Fix DAIF corruption on return from GHES handler with pseudo-NMIs

  Miscellaneous:

   - Whitelist some CPUs that are unaffected by Spectre-v2

   - Reduce frequency of ASID rollover when KPTI is compiled in but
     inactive

   - Reserve a couple of arch-specific PROT flags that are already used
     by Sparc and PowerPC and are planned for later use with BTI on
     arm64

   - Preparatory cleanup of our entry assembly code in preparation for
     moving more of it into C later on

   - Refactoring and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (73 commits)
  arm64: acpi: fix DAIF manipulation with pNMI
  arm64: kconfig: Fix alignment of E0PD help text
  arm64: Use v8.5-RNG entropy for KASLR seed
  arm64: Implement archrandom.h for ARMv8.5-RNG
  arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
  arm64: entry: Avoid empty alternatives entries
  arm64: Kconfig: select HAVE_FUTEX_CMPXCHG
  arm64: csum: Fix pathological zero-length calls
  arm64: entry: cleanup sp_el0 manipulation
  arm64: entry: cleanup el0 svc handler naming
  arm64: entry: mark all entry code as notrace
  arm64: assembler: remove smp_dmb macro
  arm64: assembler: remove inherit_daif macro
  ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()
  mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use
  arm64: Use macros instead of hard-coded constants for MAIR_EL1
  arm64: Add KRYO{3,4}XX CPU cores to spectre-v2 safe list
  arm64: kernel: avoid x18 in __cpu_soft_restart
  arm64: kvm: stop treating register x18 as caller save
  arm64/lib: copy_page: avoid x18 register in assembler code
  ...
2020-01-27 08:58:19 -08:00
Will Deacon aa246c056c Merge branch 'for-next/asm-annotations' into for-next/core
* for-next/asm-annotations: (6 commits)
  arm64: kernel: Correct annotation of end of el0_sync
  ...
2020-01-22 11:34:21 +00:00
Will Deacon 4f6cdf296c Merge branches 'for-next/acpi', 'for-next/cpufeatures', 'for-next/csum', 'for-next/e0pd', 'for-next/entry', 'for-next/kbuild', 'for-next/kexec/cleanup', 'for-next/kexec/file-kdump', 'for-next/misc', 'for-next/nofpsimd', 'for-next/perf' and 'for-next/scs' into for-next/core
* for-next/acpi:
  ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()

* for-next/cpufeatures: (2 commits)
  arm64: Introduce ID_ISAR6 CPU register
  ...

* for-next/csum: (2 commits)
  arm64: csum: Fix pathological zero-length calls
  ...

* for-next/e0pd: (7 commits)
  arm64: kconfig: Fix alignment of E0PD help text
  ...

* for-next/entry: (5 commits)
  arm64: entry: cleanup sp_el0 manipulation
  ...

* for-next/kbuild: (4 commits)
  arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean'
  ...

* for-next/kexec/cleanup: (11 commits)
  Revert "arm64: kexec: make dtb_mem always enabled"
  ...

* for-next/kexec/file-kdump: (2 commits)
  arm64: kexec_file: add crash dump support
  ...

* for-next/misc: (12 commits)
  arm64: entry: Avoid empty alternatives entries
  ...

* for-next/nofpsimd: (7 commits)
  arm64: nofpsmid: Handle TIF_FOREIGN_FPSTATE flag cleanly
  ...

* for-next/perf: (2 commits)
  perf/imx_ddr: Fix cpu hotplug state cleanup
  ...

* for-next/scs: (6 commits)
  arm64: kernel: avoid x18 in __cpu_soft_restart
  ...
2020-01-22 11:32:31 +00:00
Catalin Marinas 95b3f74bec arm64: Use macros instead of hard-coded constants for MAIR_EL1
Currently, the arm64 __cpu_setup has hard-coded constants for the memory
attributes that go into the MAIR_EL1 register. Define proper macros in
asm/sysreg.h and make use of them in proc.S.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17 12:48:33 +00:00
Sami Tolvanen e3ec6582d1 arm64: mm: avoid x18 in idmap_kpti_install_ng_mappings
idmap_kpti_install_ng_mappings uses x18 as a temporary register, which
will result in a conflict when x18 is reserved. Use x16 and x17 instead
where needed.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-16 17:32:56 +00:00
Vladimir Murzin f88f42f853 arm64: context: Free up kernel ASIDs if KPTI is not in use
We can extend user ASID space if it turns out that system does not
require KPTI. We start with kernel ASIDs reserved because CPU caps are
not finalized yet and free them up lazily on the next rollover if we
confirm than KPTI is not in use.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-16 12:45:05 +00:00
Pan Zhang edf9081827 mm: change_memory_common: add spaces for `*` operator
Leaves one space before and after a binary operator both, it may be more elegant.

Signed-off-by: Pan Zhang <zhangpan26@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08 17:30:33 +00:00
Mark Brown f4659254a3 arm64: mm: Use modern annotations for assembly functions
In an effort to clarify and simplify the annotation of assembly functions
in the kernel new macros have been introduced. These replace ENTRY and
ENDPROC and also add a new annotation for static functions which previously
had no ENTRY equivalent. Update the annotations in the mm code to the
new macros. Even the functions called from non-standard environments
like idmap have no special requirements on their environments so can be
treated like regular functions.

Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-08 12:23:38 +00:00
Catalin Marinas 24cecc3774 arm64: Revert support for execute-only user mappings
The ARMv8 64-bit architecture supports execute-only user permissions by
clearing the PTE_USER and PTE_UXN bits, practically making it a mostly
privileged mapping but from which user running at EL0 can still execute.

The downside, however, is that the kernel at EL1 inadvertently reading
such mapping would not trip over the PAN (privileged access never)
protection.

Revert the relevant bits from commit cab15ce604 ("arm64: Introduce
execute-only page access permissions") so that PROT_EXEC implies
PROT_READ (and therefore PTE_USER) until the architecture gains proper
support for execute-only user mappings.

Fixes: cab15ce604 ("arm64: Introduce execute-only page access permissions")
Cc: <stable@vger.kernel.org> # 4.9.x-
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-06 10:10:07 -08:00
David Hildenbrand feee6b2989 mm/memory_hotplug: shrink zones when offlining memory
We currently try to shrink a single zone when removing memory.  We use
the zone of the first page of the memory we are removing.  If that
memmap was never initialized (e.g., memory was never onlined), we will
read garbage and can trigger kernel BUGs (due to a stale pointer):

    BUG: unable to handle page fault for address: 000000000000353d
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0002) - not-present page
    PGD 0 P4D 0
    Oops: 0002 [#1] SMP PTI
    CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    RIP: 0010:clear_zone_contiguous+0x5/0x10
    Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
    RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
    RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
    RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
    R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
    FS:  0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     __remove_pages+0x4b/0x640
     arch_remove_memory+0x63/0x8d
     try_remove_memory+0xdb/0x130
     __remove_memory+0xa/0x11
     acpi_memory_device_remove+0x70/0x100
     acpi_bus_trim+0x55/0x90
     acpi_device_hotplug+0x227/0x3a0
     acpi_hotplug_work_fn+0x1a/0x30
     process_one_work+0x221/0x550
     worker_thread+0x50/0x3b0
     kthread+0x105/0x140
     ret_from_fork+0x3a/0x50
    Modules linked in:
    CR2: 000000000000353d

Instead, shrink the zones when offlining memory or when onlining failed.
Introduce and use remove_pfn_range_from_zone(() for that.  We now
properly shrink the zones, even if we have DIMMs whereby

 - Some memory blocks fall into no zone (never onlined)

 - Some memory blocks fall into multiple zones (offlined+re-onlined)

 - Multiple memory blocks that fall into different zones

Drop the zone parameter (with a potential dubious value) from
__remove_pages() and __remove_section().

Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
Fixes: f1dd2cd13c ("mm, memory_hotplug: do not associate hotadded memory to zones until online")	[visible after d0dc12e86b]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: <stable@vger.kernel.org>	[5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-04 13:55:08 -08:00
Linus Torvalds 9888428102 arm64 updates for 5.5:
- ZONE_DMA32 initialisation fix when memblocks fall entirely within the
   first GB (used by ZONE_DMA in 5.5 for Raspberry Pi 4).
 
 - Couple of ftrace fixes following the FTRACE_WITH_REGS patchset.
 
 - access_ok() fix for the Tagged Address ABI when called from from a
   kernel thread (asynchronous I/O): the kthread does not have the TIF
   flags of the mm owner, so untag the user address unconditionally.
 
 - KVM compute_layout() called before the alternatives code patching.
 
 - Minor clean-ups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl3qdr8ACgkQa9axLQDI
 XvEKNA/9FWyqesV612qaFkhTFCg0f6byLP4wQf0ZqcZf74L5J6zeUCIc7Mqgdno3
 jWROZNAW69gOYfwHxcBNyPVBpbAcx3k3WBBRWRJPnnJmeQk0oBLVdRziqdGKLxfw
 GhhWGsndnpxDOmyBJOWxZLemZjKFqYX3dhQ9Zi6pvkZgAeFEtIESw6S/iqf42nWL
 1o/LgyQ5kjR6eto1OVW9QOQ83/TlXXlbsvoNwNFGghX1yHjQ6mZ3LITbiFdrbFlT
 FLLwsqlsPzDQcKagszTEHZTbXBf0RojKXWh3HKSEnmPwpacestvLJcKHTD9mtDmY
 Z+rLfyiolZmXoNU9LT6uGTzVD4cRWfzz6eHSYHiufM1UztGSV+dOhIqfPuEOLEu3
 2Xf8sKmQMtuICgbol6Q6ISrjXKH/UNvK2CuuuJSNmOHDlyHKvNfJtoyEhZ5rHUpT
 HQy0qxDCEU7rFCP7clOD/94EGA8gYrV8j5NauY8/VsLpRCMBwoLNglI049qJydaZ
 jL9dPxo+GG7kh7S8VmYwBKtPhqlDNFCzw/HmBBURFhkM1j0nCNt5dKHx+kdLNurg
 nbzRvJ+W42eDze2lmVf33eOfrAy2MfcGr+VuJ5QdmL30bQENCemPrreIy+VnVVR8
 ydeK3lyknJjmX4q8a5o/URsAKvk13crwimNPa0OSoYzDKmWd8SA=
 =vhnZ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - ZONE_DMA32 initialisation fix when memblocks fall entirely within the
   first GB (used by ZONE_DMA in 5.5 for Raspberry Pi 4).

 - Couple of ftrace fixes following the FTRACE_WITH_REGS patchset.

 - access_ok() fix for the Tagged Address ABI when called from from a
   kernel thread (asynchronous I/O): the kthread does not have the TIF
   flags of the mm owner, so untag the user address unconditionally.

 - KVM compute_layout() called before the alternatives code patching.

 - Minor clean-ups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: entry: refine comment of stack overflow check
  arm64: ftrace: fix ifdeffery
  arm64: KVM: Invoke compute_layout() before alternatives are applied
  arm64: Validate tagged addresses in access_ok() called from kernel threads
  arm64: mm: Fix column alignment for UXN in kernel_page_tables
  arm64: insn: consistently handle exit text
  arm64: mm: Fix initialisation of DMA zones on non-NUMA systems
2019-12-06 14:18:01 -08:00
Mark Brown cba779d80a arm64: mm: Fix column alignment for UXN in kernel_page_tables
UXN is the only individual PTE bit other than the PTE_ATTRINDX_MASK ones
which doesn't have both a set and a clear value provided, meaning that the
columns in the table won't all be aligned. The PTE_ATTRINDX_MASK values
are all both mutually exclusive and longer so are listed last to make a
single final column for those values. Ensure everything is aligned by
providing a clear value for UXN.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-12-04 11:44:29 +00:00
Will Deacon 93b90414c3 arm64: mm: Fix initialisation of DMA zones on non-NUMA systems
John reports that the recently merged commit 1a8e1cef76 ("arm64: use
both ZONE_DMA and ZONE_DMA32") breaks the boot on his DB845C board:

  | Booting Linux on physical CPU 0x0000000000 [0x517f803c]
  | Linux version 5.4.0-mainline-10675-g957a03b9e38f
  | Machine model: Thundercomm Dragonboard 845c
  | [...]
  | Built 1 zonelists, mobility grouping on.  Total pages: -188245
  | Kernel command line: earlycon
  | firmware_class.path=/vendor/firmware/ androidboot.hardware=db845c
  | init=/init androidboot.boot_devices=soc/1d84000.ufshc
  | printk.devkmsg=on buildvariant=userdebug root=/dev/sda2
  | androidboot.bootdevice=1d84000.ufshc androidboot.serialno=c4e1189c
  | androidboot.baseband=sda
  | msm_drm.dsi_display0=dsi_lt9611_1080_video_display:
  | androidboot.slot_suffix=_a skip_initramfs rootwait ro init=/init
  |
  | <hangs indefinitely here>

This is because, when CONFIG_NUMA=n, zone_sizes_init() fails to handle
memblocks that fall entirely within the ZONE_DMA region and erroneously ends up
trying to add a negatively-sized region into the following ZONE_DMA32, which is
later interpreted as a large unsigned region by the core MM code.

Rework the non-NUMA implementation of zone_sizes_init() so that the start
address of the memblock being processed is adjusted according to the end of the
previous zone, which is then range-checked before updating the hole information
of subsequent zones.

Cc: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/lkml/CALAqxLVVcsmFrDKLRGRq7GewcW405yTOxG=KR3csVzQ6bXutkA@mail.gmail.com
Fixes: 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-12-04 11:25:20 +00:00
Linus Torvalds 81b6b96475 dma-mapping updates for 5.5-rc1
- improve dma-debug scalability (Eric Dumazet)
  - tiny dma-debug cleanup (Dan Carpenter)
  - check for vmap memory in dma_map_single (Kees Cook)
  - check for dma_addr_t overflows in dma-direct when using
    DMA offsets (Nicolas Saenz Julienne)
  - switch the x86 sta2x11 SOC to use more generic DMA code
    (Nicolas Saenz Julienne)
  - fix arm-nommu dma-ranges handling (Vladimir Murzin)
  - use __initdata in CMA (Shyam Saini)
  - replace the bus dma mask with a limit (Nicolas Saenz Julienne)
  - merge the remapping helpers into the main dma-direct flow (me)
  - switch xtensa to the generic dma remap handling (me)
  - various cleanups around dma_capable (me)
  - remove unused dev arguments to various dma-noncoherent helpers (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl3f+eULHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPyPg/+PVHCrhmepudQQFHu6wfurE5U77iNnoUifvG+b5z5
 5mHmTMkQwyox6rKDe8NuFApAhz1VJDSUgSelPmvTSOIEIGXCvX1p+GqRSVS5YQON
 aLzGvbWKE8hCpaPdDHKYDauD1FZGMM8L2P5oOMF9X9fQ94xxRqfqJM6c8iD16Sgg
 +aOgPNzTnxQHJFF/Dbt/mjJrKXWI+XF+bgUbH+l9yKa7Dd7ibmJR8yl9hs1jmp0H
 1CZ+CizwnAs57rCd1a6Ybc6gj59tySc03NMnnbTko+KDxrcbD3Ee2tpqHVkkCjYz
 Yl0m4FIpbotrpokL/FIS727bVvkjbWgoeM+kiVPoYzmZea3pq/tFDr6tp/BxDhFj
 TZXSFfgQljlYMD3ppSoklFlfjGriVWV0tPO3arPXwuuMF5EX/IMQmvxei05jpc8n
 iELNXOP9iZZkY4tLHy2hn2uWrxBRrS1WQwlLg9hahlNRzyfFSyHeP0zWlVDt+RgF
 5CCbEI+HQcUqg1FApB30lQNWTn1+dJftrpKVBlgNBIyIa/z2rFbt8GdSnItxjfQX
 /XX8EZbFvF6AcXkgURkYFIoKM/EbYShOSLcYA3PTUtcuTnF6Kk5eimySiGWZTVCS
 prruSFDZJOvL3SnOIMIiYVmBdB7lEbDyLI/VYuhoECXEDCJpVmRktNkJNg4q6/E+
 fjQ=
 =e5wO
 -----END PGP SIGNATURE-----

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux; tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - improve dma-debug scalability (Eric Dumazet)

 - tiny dma-debug cleanup (Dan Carpenter)

 - check for vmap memory in dma_map_single (Kees Cook)

 - check for dma_addr_t overflows in dma-direct when using DMA offsets
   (Nicolas Saenz Julienne)

 - switch the x86 sta2x11 SOC to use more generic DMA code (Nicolas
   Saenz Julienne)

 - fix arm-nommu dma-ranges handling (Vladimir Murzin)

 - use __initdata in CMA (Shyam Saini)

 - replace the bus dma mask with a limit (Nicolas Saenz Julienne)

 - merge the remapping helpers into the main dma-direct flow (me)

 - switch xtensa to the generic dma remap handling (me)

 - various cleanups around dma_capable (me)

 - remove unused dev arguments to various dma-noncoherent helpers (me)

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux:

* tag 'dma-mapping-5.5' of git://git.infradead.org/users/hch/dma-mapping: (22 commits)
  dma-mapping: treat dev->bus_dma_mask as a DMA limit
  dma-direct: exclude dma_direct_map_resource from the min_low_pfn check
  dma-direct: don't check swiotlb=force in dma_direct_map_resource
  dma-debug: clean up put_hash_bucket()
  powerpc: remove support for NULL dev in __phys_to_dma / __dma_to_phys
  dma-direct: avoid a forward declaration for phys_to_dma
  dma-direct: unify the dma_capable definitions
  dma-mapping: drop the dev argument to arch_sync_dma_for_*
  x86/PCI: sta2x11: use default DMA address translation
  dma-direct: check for overflows on 32 bit DMA addresses
  dma-debug: increase HASH_SIZE
  dma-debug: reorder struct dma_debug_entry fields
  xtensa: use the generic uncached segment support
  dma-mapping: merge the generic remapping helpers into dma-direct
  dma-direct: provide mmap and get_sgtable method overrides
  dma-direct: remove the dma_handle argument to __dma_direct_alloc_pages
  dma-direct: remove __dma_direct_free_pages
  usb: core: Remove redundant vmap checks
  kernel: dma-contiguous: mark CMA parameters __initdata/__initconst
  dma-debug: add a schedule point in debug_dma_dump_mappings()
  ...
2019-11-28 11:16:43 -08:00
Linus Torvalds 6e9f879684 ACPI updates for 5.5-rc1
- Update the ACPICA code in the kernel to upstream revision 20191018
    including:
 
    * Fixes for Clang warnings (Bob Moore).
 
    * Fix for possible overflow in get_tick_count() (Bob Moore).
 
    * Introduction of acpi_unload_table() (Bob Moore).
 
    * Debugger and utilities updates (Erik Schmauss).
 
    * Fix for unloading tables loaded via configfs (Nikolaus Voss).
 
  - Add support for EFI specific purpose memory to optionally allow
    either application-exclusive or core-kernel-mm managed access to
    differentiated memory (Dan Williams).
 
  - Fix and clean up processing of the HMAT table (Brice Goglin,
    Qian Cai, Tao Xu).
 
  - Update the ACPI EC driver to make it work on systems with
    hardware-reduced ACPI (Daniel Drake).
 
  - Always build in support for the Generic Event Device (GED) to
    allow one kernel binary to work both on systems with full
    hardware ACPI and hardware-reduced ACPI (Arjan van de Ven).
 
  - Fix the table unload mechanism to unregister platform devices
    created when the given table was loaded (Andy Shevchenko).
 
  - Rework the lid blacklist handling in the button driver and add
    more lid quirks to it (Hans de Goede).
 
  - Improve ACPI-based device enumeration for some platforms based
    on Intel BayTrail SoCs (Hans de Goede).
 
  - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC
    and prevent handlers from being registered for unhandled PMIC
    OpRegions (Hans de Goede).
 
  - Unify ACPI _HID/_UID matching (Andy Shevchenko).
 
  - Clean up documentation and comments (Cao jin, James Pack, Kacper
    Piwiński).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl3dHNkSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx/NkP/2y6DWjslA6UW4gjZwaRBcjYoyWExMtQ
 Z86goiRJtP+/NqOwm09wHFcV6FdZ4kitUno3UgMCDZJjrURapg1D0rxb1lSYtMzs
 mGr2FBZlVsJ9erOVSzKj1x2afVhdgl0Rl0fxPzoKgCFt8tCJar6cXy4CVEQKdeLs
 eUui2ksXMIEODGhpN/tr/fJqY4O4jlLmPY6gKWfFpSTsv6lnZmzcCxLf5EvUU7JW
 O91/jXdWz4Vl6IdP32sce6dGDjkvwnY105c7HeBf5EQWUe9RHFuSex982qhCD8U+
 iE+JzlhoYpUb03EktJSXbL++IKUHvoUpTanbhka6unMhazC86x0hDf7ruUtYo2Bk
 V8347CFeQ1x2O5IabfJNnUfKaMYhYmOXIoFHJTLKFO5mcCJmP8KOOyDAYilC1psb
 RJpl1fDoAhk7NqhMttyBqfxiotP0kMoKuqtAAl8Y0hTF0DwR9IfKntuTtp1yTGds
 R4dpJrizUDzw1/o4fCWbc3dFZQR3NFGpL/EAyfPzqjGaeaBBkLoNYstqkal5XHwT
 CILmQg2WHoNuQLXZ4NFFDrM2k2G+VUAjQdkYcb/MCOFbw+aTVPu1wyQq37RLtbMo
 9UwGeeT6SXW3iA1nyMoM+YvitjmxS7gHPPPl+b9G6kBubAzBPp91Ra0Mj9dPIGRB
 Evv5nzOIh8Hi
 =7Cqr
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These update the ACPICA code in the kernel to upstream revision
  20191018, add support for EFI specific purpose memory, update the ACPI
  EC driver to make it work on systems with hardware-reduced ACPI,
  improve ACPI-based device enumeration for some platforms, rework the
  lid blacklist handling in the button driver and add more lid quirks to
  it, unify ACPI _HID/_UID matching, fix assorted issues and clean up
  the code and documentation.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20191018
     including:
      * Fixes for Clang warnings (Bob Moore)
      * Fix for possible overflow in get_tick_count() (Bob Moore)
      * Introduction of acpi_unload_table() (Bob Moore)
      * Debugger and utilities updates (Erik Schmauss)
      * Fix for unloading tables loaded via configfs (Nikolaus Voss)

   - Add support for EFI specific purpose memory to optionally allow
     either application-exclusive or core-kernel-mm managed access to
     differentiated memory (Dan Williams)

   - Fix and clean up processing of the HMAT table (Brice Goglin, Qian
     Cai, Tao Xu)

   - Update the ACPI EC driver to make it work on systems with
     hardware-reduced ACPI (Daniel Drake)

   - Always build in support for the Generic Event Device (GED) to allow
     one kernel binary to work both on systems with full hardware ACPI
     and hardware-reduced ACPI (Arjan van de Ven)

   - Fix the table unload mechanism to unregister platform devices
     created when the given table was loaded (Andy Shevchenko)

   - Rework the lid blacklist handling in the button driver and add more
     lid quirks to it (Hans de Goede)

   - Improve ACPI-based device enumeration for some platforms based on
     Intel BayTrail SoCs (Hans de Goede)

   - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC and
     prevent handlers from being registered for unhandled PMIC OpRegions
     (Hans de Goede)

   - Unify ACPI _HID/_UID matching (Andy Shevchenko)

   - Clean up documentation and comments (Cao jin, James Pack, Kacper
     Piwiński)"

* tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
  ACPI: OSI: Shoot duplicate word
  ACPI: HMAT: use %u instead of %d to print u32 values
  ACPI: NUMA: HMAT: fix a section mismatch
  ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm
  ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device
  ACPI: NUMA: HMAT: Register HMAT at device_initcall level
  device-dax: Add a driver for "hmem" devices
  dax: Fix alloc_dax_region() compile warning
  lib: Uplevel the pmem "region" ida to a global allocator
  x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP
  arm/efi: EFI soft reservation to memblock
  x86/efi: EFI soft reservation to E820 enumeration
  efi: Common enable/disable infrastructure for EFI soft reservation
  x86/efi: Push EFI_MEMMAP check into leaf routines
  efi: Enumerate EFI_MEMORY_SP
  ACPI: NUMA: Establish a new drivers/acpi/numa/ directory
  ACPICA: Update version to 20191018
  ACPICA: debugger: remove leading whitespaces when converting a string to a buffer
  ACPICA: acpiexec: initialize all simple types and field units from user input
  ACPICA: debugger: add field unit support for acpi_db_get_next_token
  ...
2019-11-26 19:25:25 -08:00
Christoph Hellwig d7293f79ca Merge branch 'for-next/zone-dma' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into dma-mapping-for-next
Pull in a stable branch from the arm64 tree that adds the zone_dma_bits
variable to avoid creating hard to resolve conflicts with that addition.
2019-11-21 18:13:03 +01:00
Christoph Hellwig 56e35f9c5b dma-mapping: drop the dev argument to arch_sync_dma_for_*
These are pure cache maintainance routines, so drop the unused
struct device argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-11-20 20:31:38 +01:00
Catalin Marinas 6be22809e5 Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', 'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core
* for-next/elf-hwcap-docs:
  : Update the arm64 ELF HWCAP documentation
  docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s]
  docs/arm64: cpu-feature-registers: Documents missing visible fields
  docs/arm64: elf_hwcaps: Document HWCAP_SB
  docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value

* for-next/smccc-conduit-cleanup:
  : SMC calling convention conduit clean-up
  firmware: arm_sdei: use common SMCCC_CONDUIT_*
  firmware/psci: use common SMCCC_CONDUIT_*
  arm: spectre-v2: use arm_smccc_1_1_get_conduit()
  arm64: errata: use arm_smccc_1_1_get_conduit()
  arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit()

* for-next/zone-dma:
  : Reintroduction of ZONE_DMA for Raspberry Pi 4 support
  arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
  dma/direct: turn ARCH_ZONE_DMA_BITS into a variable
  arm64: Make arm64_dma32_phys_limit static
  arm64: mm: Fix unused variable warning in zone_sizes_init
  mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type'
  arm64: use both ZONE_DMA and ZONE_DMA32
  arm64: rename variables used to calculate ZONE_DMA32's size
  arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys()

* for-next/relax-icc_pmr_el1-sync:
  : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear
  arm64: Document ICC_CTLR_EL3.PMHE setting requirements
  arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear

* for-next/double-page-fault:
  : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag
  mm: fix double page fault on arm64 if PTE_AF is cleared
  x86/mm: implement arch_faults_on_old_pte() stub on x86
  arm64: mm: implement arch_faults_on_old_pte() on arm64
  arm64: cpufeature: introduce helper cpu_has_hw_af()

* for-next/misc:
  : Various fixes and clean-ups
  arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist
  arm64: mm: Remove MAX_USER_VA_BITS definition
  arm64: mm: simplify the page end calculation in __create_pgd_mapping()
  arm64: print additional fault message when executing non-exec memory
  arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
  arm64: pgtable: Correct typo in comment
  arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1
  arm64: cpufeature: Fix typos in comment
  arm64/mm: Poison initmem while freeing with free_reserved_area()
  arm64: use generic free_initrd_mem()
  arm64: simplify syscall wrapper ifdeffery

* for-next/kselftest-arm64-signal:
  : arm64-specific kselftest support with signal-related test-cases
  kselftest: arm64: fake_sigreturn_misaligned_sp
  kselftest: arm64: fake_sigreturn_bad_size
  kselftest: arm64: fake_sigreturn_duplicated_fpsimd
  kselftest: arm64: fake_sigreturn_missing_fpsimd
  kselftest: arm64: fake_sigreturn_bad_size_for_magic0
  kselftest: arm64: fake_sigreturn_bad_magic
  kselftest: arm64: add helper get_current_context
  kselftest: arm64: extend test_init functionalities
  kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]
  kselftest: arm64: mangle_pstate_invalid_daif_bits
  kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils
  kselftest: arm64: extend toplevel skeleton Makefile

* for-next/kaslr-diagnostics:
  : Provide diagnostics on boot for KASLR
  arm64: kaslr: Check command line before looking for a seed
  arm64: kaslr: Announce KASLR status on boot
2019-11-08 17:46:11 +00:00
Dan Williams 16993c0f0a arm/efi: EFI soft reservation to memblock
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the
interpretation of the EFI Memory Types as "reserved for a specific
purpose".

The proposed Linux behavior for specific purpose memory is that it is
reserved for direct-access (device-dax) by default and not available for
any kernel usage, not even as an OOM fallback.  Later, through udev
scripts or another init mechanism, these device-dax claimed ranges can
be reconfigured and hot-added to the available System-RAM with a unique
node identifier. This device-dax management scheme implements "soft" in
the "soft reserved" designation by allowing some or all of the
reservation to be recovered as typical memory. This policy can be
disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with
efi=nosoftreserve.

For this patch, update the ARM paths that consider
EFI_CONVENTIONAL_MEMORY to optionally take the EFI_MEMORY_SP attribute
into account as a reservation indicator. Publish the soft reservation as
IORES_DESC_SOFT_RESERVED memory, similar to x86.

(Based on an original patch by Ard)

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-07 15:44:19 +01:00
Nicolas Saenz Julienne bff3b04460 arm64: mm: reserve CMA and crashkernel in ZONE_DMA32
With the introduction of ZONE_DMA in arm64 we moved the default CMA and
crashkernel reservation into that area. This caused a regression on big
machines that need big CMA and crashkernel reservations. Note that
ZONE_DMA is only 1GB big.

Restore the previous behavior as the wide majority of devices are OK
with reserving these in ZONE_DMA32. The ones that need them in ZONE_DMA
will configure it explicitly.

Fixes: 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-07 11:22:20 +00:00
Masahiro Yamada 32d1870877 arm64: mm: simplify the page end calculation in __create_pgd_mapping()
Calculate the page-aligned end address more simply.

The local variable, "length" is unneeded.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-06 11:17:09 +00:00
Nicolas Saenz Julienne 8b5369ea58 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable
Some architectures, notably ARM, are interested in tweaking this
depending on their runtime DMA addressing limitations.

Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-11-01 09:41:18 +00:00
Xiang Zheng e44ec4a35d arm64: print additional fault message when executing non-exec memory
When attempting to executing non-executable memory, the fault message
shows:

  Unable to handle kernel read from unreadable memory at virtual address
  ffff802dac469000

This may confuse someone, so add a new fault message for instruction
abort.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-29 15:41:14 +00:00
Catalin Marinas 8301ae822d Merge branch 'for-next/entry-s-to-c' into for-next/core
Move the synchronous exception paths from entry.S into a C file to
improve the code readability.

* for-next/entry-s-to-c:
  arm64: entry-common: don't touch daif before bp-hardening
  arm64: Remove asmlinkage from updated functions
  arm64: entry: convert el0_sync to C
  arm64: entry: convert el1_sync to C
  arm64: add local_daif_inherit()
  arm64: Add prototypes for functions called by entry.S
  arm64: remove __exception annotations
2019-10-28 17:02:56 +00:00
Catalin Marinas 4686da5140 arm64: Make arm64_dma32_phys_limit static
This variable is only used in the arch/arm64/mm/init.c file for
ZONE_DMA32 initialisation, no need to expose it.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 16:46:43 +00:00
James Morse bfe298745a arm64: entry-common: don't touch daif before bp-hardening
The previous patches mechanically transformed the assembly version of
entry.S to entry-common.c for synchronous exceptions.

The C version of local_daif_restore() doesn't quite do the same thing
as the assembly versions if pseudo-NMI is in use. In particular,
| local_daif_restore(DAIF_PROCCTX_NOIRQ)
will still allow pNMI to be delivered. This is not the behaviour
do_el0_ia_bp_hardening() and do_sp_pc_abort() want as it should not
be possible for the PMU handler to run as an NMI until the bp-hardening
sequence has run.

The bp-hardening calls were placed where they are because this was the
first C code to run after the relevant exceptions. As we've now moved
that point earlier, move the checks and calls earlier too.

This makes it clearer that this stuff runs before any kind of exception,
and saves modifying PSTATE twice.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 11:22:54 +00:00
James Morse afa7c0e5b9 arm64: Remove asmlinkage from updated functions
Now that the callers of these functions have moved into C, they no longer
need the asmlinkage annotation. Remove it.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 11:22:51 +00:00
James Morse b6e43c0e31 arm64: remove __exception annotations
Since commit 7326749801 ("arm64: unwind: reference pt_regs via embedded
stack frame") arm64 has not used the __exception annotation to dump
the pt_regs during stack tracing. in_exception_text() has no callers.

This annotation is only used to blacklist kprobes, it means the same as
__kprobes.

Section annotations like this require the functions to be grouped
together between the start/end markers, and placed according to
the linker script. For kprobes we also have NOKPROBE_SYMBOL() which
logs the symbol address in a section that kprobes parses and
blacklists at boot.

Using NOKPROBE_SYMBOL() instead lets kprobes publish the list of
blacklisted symbols, and saves us from having an arm64 specific
spelling of __kprobes.

do_debug_exception() already has a NOKPROBE_SYMBOL() annotation.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28 11:22:38 +00:00
Mark Rutland 3813733595 arm64: mm: fix inverted PAR_EL1.F check
When detecting a spurious EL1 translation fault, we have the CPU retry
the translation using an AT S1E1R instruction, and inspect PAR_EL1 to
determine if the fault was spurious.

When PAR_EL1.F == 0, the AT instruction successfully translated the
address without a fault, which implies the original fault was spurious.
However, in this case we return false and treat the original fault as if
it was not spurious.

Invert the return value so that we treat such a case as spurious.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 42f91093b0 ("arm64: mm: Ignore spurious translation faults taken from the kernel")
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-16 09:58:03 -07:00
Nathan Chancellor 4399d43070 arm64: mm: Fix unused variable warning in zone_sizes_init
When building arm64 allnoconfig, CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32
get disabled so there is a warning about max_dma being unused.

../arch/arm64/mm/init.c:215:16: warning: unused variable 'max_dma'
[-Wunused-variable]
        unsigned long max_dma = min;
                      ^
1 warning generated.

Add __maybe_unused to make this clear to the compiler.

Fixes: 1a8e1cef76 ("arm64: use both ZONE_DMA and ZONE_DMA32")
Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16 16:06:27 +01:00
Anshuman Khandual 6ec939f8b8 arm64/mm: Poison initmem while freeing with free_reserved_area()
Platform implementation for free_initmem() should poison the memory while
freeing it up. Hence pass across POISON_FREE_INITMEM while calling into
free_reserved_area(). The same is being followed in the generic fallback
for free_initmem() and some other platforms overriding it.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16 13:56:30 +01:00
Mike Rapoport 899ee4afe5 arm64: use generic free_initrd_mem()
arm64 calls memblock_free() for the initrd area in its implementation of
free_initrd_mem(), but this call has no actual effect that late in the boot
process. By the time initrd is freed, all the reserved memory is managed by
the page allocator and the memblock.reserved is unused, so the only purpose
of the memblock_free() call is to keep track of initrd memory for debugging
and accounting.

Without the memblock_free() call the only difference between arm64 and the
generic versions of free_initrd_mem() is the memory poisoning.

Move memblock_free() call to the generic code, enable it there
for the architectures that define ARCH_KEEP_MEMBLOCK and use the generic
implementation of free_initrd_mem() on arm64.

Tested-by: Anshuman Khandual <anshuman.khandual@arm.com>	#arm64
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-16 13:55:25 +01:00
Nicolas Saenz Julienne 1a8e1cef76 arm64: use both ZONE_DMA and ZONE_DMA32
So far all arm64 devices have supported 32 bit DMA masks for their
peripherals. This is not true anymore for the Raspberry Pi 4 as most of
it's peripherals can only address the first GB of memory on a total of
up to 4 GB.

This goes against ZONE_DMA32's intent, as it's expected for ZONE_DMA32
to be addressable with a 32 bit mask. So it was decided to re-introduce
ZONE_DMA in arm64.

ZONE_DMA will contain the lower 1G of memory, which is currently the
memory area addressable by any peripheral on an arm64 device.
ZONE_DMA32 will contain the rest of the 32 bit addressable memory.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14 10:56:28 +01:00
Nicolas Saenz Julienne a573cdd797 arm64: rename variables used to calculate ZONE_DMA32's size
Let the name indicate that they are used to calculate ZONE_DMA32's size
as opposed to ZONE_DMA.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14 10:56:28 +01:00
Nicolas Saenz Julienne ae970dc096 arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys()
By the time we call zones_sizes_init() arm64_dma_phys_limit already
contains the result of max_zone_dma_phys(). We use the variable instead
of calling the function directly to save some precious cpu time.

Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-14 10:56:28 +01:00
Mark Rutland 308c515617 arm64: mm: fix spurious fault detection
When detecting a spurious EL1 translation fault, we attempt to compare
ESR_EL1.DFSC with PAR_EL1.FST. We erroneously use FIELD_PREP() to
extract PAR_EL1.FST, when we should be using FIELD_GET().

In the wise words of Robin Murphy:

| FIELD_GET() is a UBFX, FIELD_PREP() is a BFI

Using FIELD_PREP() means that that dfsc & ESR_ELx_FSC_TYPE is always
zero, and hence not equal to ESR_ELx_FSC_FAULT. Thus we detect any
unhandled translation fault as spurious.

... so let's use FIELD_GET() to ensure we don't decide all translation
faults are spurious. ESR_EL1.DFSC occupies bits [5:0], and requires no
shifting.

Fixes: 42f91093b0 ("arm64: mm: Ignore spurious translation faults taken from the kernel")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Robin Murphy <robin.murphy@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-07 11:07:16 +01:00
Mark Rutland e4365f968f arm64: mm: avoid virt_to_phys(init_mm.pgd)
If we take an unhandled fault in the kernel, we call show_pte() to dump
the {PGDP,PGD,PUD,PMD,PTE} values for the corresponding page table walk,
where the PGDP value is virt_to_phys(mm->pgd).

The boot-time and runtime kernel page tables, init_pg_dir and
swapper_pg_dir respectively, are kernel symbols. Thus, it is not valid
to call virt_to_phys() on either of these, though we'll do so if we take
a fault on a TTBR1 address.

When CONFIG_DEBUG_VIRTUAL is not selected, virt_to_phys() will silently
fix this up. However, when CONFIG_DEBUG_VIRTUAL is selected, this
results in splats as below. Depending on when these occur, they can
happen to suppress information needed to debug the original unhandled
fault, such as the backtrace:

| Unable to handle kernel paging request at virtual address ffff7fffec73cf0f
| Mem abort info:
|   ESR = 0x96000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
| Data abort info:
|   ISV = 0, ISS = 0x00000004
|   CM = 0, WnR = 0
| ------------[ cut here ]------------
| virt_to_phys used for non-linear address: 00000000102c9dbe (swapper_pg_dir+0x0/0x1000)
| WARNING: CPU: 1 PID: 7558 at arch/arm64/mm/physaddr.c:15 __virt_to_phys+0xe0/0x170 arch/arm64/mm/physaddr.c:12
| Kernel panic - not syncing: panic_on_warn set ...
| SMP: stopping secondary CPUs
| Dumping ftrace buffer:
|    (ftrace buffer empty)
| Kernel Offset: disabled
| CPU features: 0x0002,23000438
| Memory Limit: none
| Rebooting in 1 seconds..

We can avoid this by ensuring that we call __pa_symbol() for
init_mm.pgd, as this will always be a kernel symbol. As the dumped
{PGD,PUD,PMD,PTE} values are the raw values from the relevant entries we
don't need to handle these specially.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-10-04 11:43:55 +01:00
Mark Rutland b4ed71f557 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming
The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
people, and until recently arm64 used these erroneously/pointlessly for
other levels of page table.

To make it incredibly clear that these only apply to the PTE level, and to
align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
to pgtable_pte_page_{ctor,dtor}().

These changes were generated with the following shell script:

----
git grep -lw 'pgtable_page_.tor' | while read FILE; do
    sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
    sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
done
----

... with the documentation re-flowed to remain under 80 columns, and
whitespace fixed up in macros to keep backslashes aligned.

There should be no functional change as a result of this patch.

Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-26 10:10:44 -07:00
Alexandre Ghiti 67f3977f80 arm64, mm: move generic mmap layout functions to mm
arm64 handles top-down mmap layout in a way that can be easily reused by
other architectures, so make it available in mm.  It then introduces a new
config ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT that can be set by other
architectures to benefit from those functions.  Note that this new config
depends on MMU being enabled, if selected without MMU support, a warning
will be thrown.

Link: http://lkml.kernel.org/r/20190730055113.23635-5-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Alexandre Ghiti e8d54b62c5 arm64: consider stack randomization for mmap base only when necessary
Do not offset mmap base address because of stack randomization if current
task does not want randomization.  Note that x86 already implements this
behaviour.

Link: http://lkml.kernel.org/r/20190730055113.23635-4-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Alexandre Ghiti 28058ed61f arm64: make use of is_compat_task instead of hardcoding this test
Each architecture has its own way to determine if a task is a compat task,
by using is_compat_task in arch_mmap_rnd, it allows more genericity and
then it prepares its moving to mm/.

Link: http://lkml.kernel.org/r/20190730055113.23635-3-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:11 -07:00
Mike Rapoport 782de70c42 mm: consolidate pgtable_cache_init() and pgd_cache_init()
Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem
cache for page table allocations on several architectures that do not use
PAGE_SIZE tables for one or more levels of the page table hierarchy.

Most architectures do not implement these functions and use __weak default
NOP implementation of pgd_cache_init().  Since there is no such default
for pgtable_cache_init(), its empty stub is duplicated among most
architectures.

Rename the definitions of pgd_cache_init() to pgtable_cache_init() and
drop empty stubs of pgtable_cache_init().

Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will@kernel.org>		[arm64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>	[x86]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:09 -07:00
Matthew Wilcox (Oracle) a50b854e07 mm: introduce page_size()
Patch series "Make working with compound pages easier", v2.

These three patches add three helpers and convert the appropriate
places to use them.

This patch (of 3):

It's unnecessarily hard to find out the size of a potentially huge page.
Replace 'PAGE_SIZE << compound_order(page)' with page_size(page).

Link: http://lkml.kernel.org/r/20190721104612.19120-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-24 15:54:08 -07:00
Linus Torvalds 671df18953 dma-mapping updates for 5.4:
- add dma-mapping and block layer helpers to take care of IOMMU
    merging for mmc plus subsequent fixups (Yoshihiro Shimoda)
  - rework handling of the pgprot bits for remapping (me)
  - take care of the dma direct infrastructure for swiotlb-xen (me)
  - improve the dma noncoherent remapping infrastructure (me)
  - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me)
  - cleanup mmaping of coherent DMA allocations (me)
  - various misc cleanups (Andy Shevchenko, me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl2CSucLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPfrhAAgXZA/EdFPvkkCoDrmgtf3XkudX9gajeCd9g4NZy6
 ZBQElTVvm4S0sQj7IXgALnMumDMbbTibW5SQLX5GwQDe+XXBpZ8ajpAnJAXc8a5T
 qaFQ4SInr4CgBZf9nZKDkbSBZ1Tu3AQm1c0QI8riRCkrVTuX4L06xpCef4Yh4mgO
 rwWEjIioYpQiKZMmu98riXh3ZNfFG3mVJRhKt8B6XJbBgnUnjDOPYGgaUwp6CU20
 tFBKL2GaaV0vdLJ5wYhIGXT4DJ8tp9T5n3IYGZv1Ux889RaZEHlCrMxzelYeDbCT
 KhZbhcSECGnddsh73t/UX7/KhytuqnfKa9n+Xo6AWuA47xO4c36quOOcTk9M0vE5
 TfGDmewgL6WIv4lzokpRn5EkfDhyL33j8eYJrJ8e0ldcOhSQIFk4ciXnf2stWi6O
 JrlzzzSid+zXxu48iTfoPdnMr7psTpiMvvRvKfEeMp2FX9Fg6EdMzJYLTEl+COHB
 0WwNacZmY3P01+b5EZXEgqKEZevIIdmPKbyM9rPtTjz8BjBwkABHTpN3fWbVBf7/
 Ax6OPYyW40xp1fnJuzn89m3pdOxn88FpDdOaeLz892Zd+Qpnro1ayulnFspVtqGM
 mGbzA9whILvXNRpWBSQrvr2IjqMRjbBxX3BVACl3MMpOChgkpp5iANNfSDjCftSF
 Zu8=
 =/wGv
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - add dma-mapping and block layer helpers to take care of IOMMU merging
   for mmc plus subsequent fixups (Yoshihiro Shimoda)

 - rework handling of the pgprot bits for remapping (me)

 - take care of the dma direct infrastructure for swiotlb-xen (me)

 - improve the dma noncoherent remapping infrastructure (me)

 - better defaults for ->mmap, ->get_sgtable and ->get_required_mask
   (me)

 - cleanup mmaping of coherent DMA allocations (me)

 - various misc cleanups (Andy Shevchenko, me)

* tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
  mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
  mmc: queue: Fix bigger segments usage
  arm64: use asm-generic/dma-mapping.h
  swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
  swiotlb-xen: simplify cache maintainance
  swiotlb-xen: use the same foreign page check everywhere
  swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
  xen: remove the exports for xen_{create,destroy}_contiguous_region
  xen/arm: remove xen_dma_ops
  xen/arm: simplify dma_cache_maint
  xen/arm: use dev_is_dma_coherent
  xen/arm: consolidate page-coherent.h
  xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
  arm: remove wrappers for the generic dma remap helpers
  dma-mapping: introduce a dma_common_find_pages helper
  dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
  vmalloc: lift the arm flag for coherent mappings to common code
  dma-mapping: provide a better default ->get_required_mask
  dma-mapping: remove the dma_declare_coherent_memory export
  remoteproc: don't allow modular build
  ...
2019-09-19 13:27:23 -07:00
Linus Torvalds e77fafe9af arm64 updates for 5.4:
- 52-bit virtual addressing in the kernel
 
 - New ABI to allow tagged user pointers to be dereferenced by syscalls
 
 - Early RNG seeding by the bootloader
 
 - Improve robustness of SMP boot
 
 - Fix TLB invalidation in light of recent architectural clarifications
 
 - Support for i.MX8 DDR PMU
 
 - Remove direct LSE instruction patching in favour of static keys
 
 - Function error injection using kprobes
 
 - Support for the PPTT "thread" flag introduced by ACPI 6.3
 
 - Move PSCI idle code into proper cpuidle driver
 
 - Relaxation of implicit I/O memory barriers
 
 - Build with RELR relocations when toolchain supports them
 
 - Numerous cleanups and non-critical fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl1yYREQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNAM3CAChqDFQkryXoHwdeEcaukMRVNxtxOi4pM4g
 5xqkb7PoqRJssIblsuhaXjrSD97yWCgaqCmFe6rKoes++lP4bFcTe22KXPPyPBED
 A+tK4nTuKKcZfVbEanUjI+ihXaHJmKZ/kwAxWsEBYZ4WCOe3voCiJVNO2fHxqg1M
 8TskZ2BoayTbWMXih0eJg2MCy/xApBq4b3nZG4bKI7Z9UpXiKN1NYtDh98ZEBK4V
 d/oNoHsJ2ZvIQsztoBJMsvr09DTCazCijWZiECadm6l41WEPFizngrACiSJLLtYo
 0qu4qxgg9zgFlvBCRQmIYSggTuv35RgXSfcOwChmW5DUjHG+f9GK
 =Ru4B
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Although there isn't tonnes of code in terms of line count, there are
  a fair few headline features which I've noted both in the tag and also
  in the merge commits when I pulled everything together.

  The part I'm most pleased with is that we had 35 contributors this
  time around, which feels like a big jump from the usual small group of
  core arm64 arch developers. Hopefully they all enjoyed it so much that
  they'll continue to contribute, but we'll see.

  It's probably worth highlighting that we've pulled in a branch from
  the risc-v folks which moves our CPU topology code out to where it can
  be shared with others.

  Summary:

   - 52-bit virtual addressing in the kernel

   - New ABI to allow tagged user pointers to be dereferenced by
     syscalls

   - Early RNG seeding by the bootloader

   - Improve robustness of SMP boot

   - Fix TLB invalidation in light of recent architectural
     clarifications

   - Support for i.MX8 DDR PMU

   - Remove direct LSE instruction patching in favour of static keys

   - Function error injection using kprobes

   - Support for the PPTT "thread" flag introduced by ACPI 6.3

   - Move PSCI idle code into proper cpuidle driver

   - Relaxation of implicit I/O memory barriers

   - Build with RELR relocations when toolchain supports them

   - Numerous cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (114 commits)
  arm64: remove __iounmap
  arm64: atomics: Use K constraint when toolchain appears to support it
  arm64: atomics: Undefine internal macros after use
  arm64: lse: Make ARM64_LSE_ATOMICS depend on JUMP_LABEL
  arm64: asm: Kill 'asm/atomic_arch.h'
  arm64: lse: Remove unused 'alt_lse' assembly macro
  arm64: atomics: Remove atomic_ll_sc compilation unit
  arm64: avoid using hard-coded registers for LSE atomics
  arm64: atomics: avoid out-of-line ll/sc atomics
  arm64: Use correct ll/sc atomic constraints
  jump_label: Don't warn on __exit jump entries
  docs/perf: Add documentation for the i.MX8 DDR PMU
  perf/imx_ddr: Add support for AXI ID filtering
  arm64: kpti: ensure patched kernel text is fetched from PoU
  arm64: fix fixmap copy for 16K pages and 48-bit VA
  perf/smmuv3: Validate groups for global filtering
  perf/smmuv3: Validate group size
  arm64: Relax Documentation/arm64/tagged-pointers.rst
  arm64: kvm: Replace hardcoded '1' with SYS_PAR_EL1_F
  arm64: mm: Ignore spurious translation faults taken from the kernel
  ...
2019-09-16 14:31:40 -07:00
Christoph Hellwig 5489c8e0cf arm64: use asm-generic/dma-mapping.h
Now that the Xen special cases are gone nothing worth mentioning is
left in the arm64 <asm/dma-mapping.h> file, so switch to use the
asm-generic version instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2019-09-11 12:43:27 +02:00
Christoph Hellwig 0e0d26e779 xen/arm: remove xen_dma_ops
arm and arm64 can just use xen_swiotlb_dma_ops directly like x86, no
need for a pointer indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2019-09-11 12:43:26 +02:00
Christoph Hellwig e376897f42 arm64: remove __iounmap
No need to indirect iounmap for arm64.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will@kernel.org>
2019-09-04 13:12:26 +01:00
Will Deacon ac12cf85d6 Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', 'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core
* for-next/52-bit-kva: (25 commits)
  Support for 52-bit virtual addressing in kernel space

* for-next/cpu-topology: (9 commits)
  Move CPU topology parsing into core code and add support for ACPI 6.3

* for-next/error-injection: (2 commits)
  Support for function error injection via kprobes

* for-next/perf: (8 commits)
  Support for i.MX8 DDR PMU and proper SMMUv3 group validation

* for-next/psci-cpuidle: (7 commits)
  Move PSCI idle code into a new CPUidle driver

* for-next/rng: (4 commits)
  Support for 'rng-seed' property being passed in the devicetree

* for-next/smpboot: (3 commits)
  Reduce fragility of secondary CPU bringup in debug configurations

* for-next/tbi: (10 commits)
  Introduce new syscall ABI with relaxed requirements for pointer tags

* for-next/tlbi: (6 commits)
  Handle spurious page faults arising from kernel space
2019-08-30 12:46:12 +01:00
Christoph Hellwig 8e3a68fb55 dma-mapping: make dma_atomic_pool_init self-contained
The memory allocated for the atomic pool needs to have the same
mapping attributes that we use for remapping, so use
pgprot_dmacoherent instead of open coding it.  Also deduct a
suitable zone to allocate the memory from based on the presence
of the DMA zones.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-08-29 16:43:33 +02:00
Christoph Hellwig 419e2f1838 dma-mapping: remove arch_dma_mmap_pgprot
arch_dma_mmap_pgprot is used for two things:

 1) to override the "normal" uncached page attributes for mapping
    memory coherent to devices that can't snoop the CPU caches
 2) to provide the special DMA_ATTR_WRITE_COMBINE semantics on older
    arm systems and some mips platforms

Replace one with the pgprot_dmacoherent macro that is already provided
by arm and much simpler to use, and lift the DMA_ATTR_WRITE_COMBINE
handling to common code with an explicit arch opt-in.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	# m68k
Acked-by: Paul Burton <paul.burton@mips.com>		# mips
2019-08-29 16:43:22 +02:00
Mark Rutland f32c7a8e45 arm64: kpti: ensure patched kernel text is fetched from PoU
While the MMUs is disabled, I-cache speculation can result in
instructions being fetched from the PoC. During boot we may patch
instructions (e.g. for alternatives and jump labels), and these may be
dirty at the PoU (and stale at the PoC).

Thus, while the MMU is disabled in the KPTI pagetable fixup code we may
load stale instructions into the I-cache, potentially leading to
subsequent crashes when executing regions of code which have been
modified at runtime.

Similarly to commit:

  8ec4198743 ("arm64: mm: ensure patched kernel text is fetched from PoU")

... we can invalidate the I-cache after enabling the MMU to prevent such
issues.

The KPTI pagetable fixup code itself should be clean to the PoC per the
boot protocol, so no maintenance is required for this code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-28 13:40:08 +01:00
Mark Rutland b333b0ba23 arm64: fix fixmap copy for 16K pages and 48-bit VA
With 16K pages and 48-bit VAs, the PGD level of table has two entries,
and so the fixmap shares a PGD with the kernel image. Since commit:

  f9040773b7 ("arm64: move kernel image to base of vmalloc area")

... we copy the existing fixmap to the new fine-grained page tables at
the PUD level in this case. When walking to the new PUD, we forgot to
offset the PGD entry and always used the PGD entry at index 0, but this
worked as the kernel image and fixmap were in the low half of the TTBR1
address space.

As of commit:

  14c127c957 ("arm64: mm: Flip kernel VA space")

... the kernel image and fixmap are in the high half of the TTBR1
address space, and hence use the PGD at index 1, but we didn't update
the fixmap copying code to account for this.

Thus, we'll erroneously try to copy the fixmap slots into a PUD under
the PGD entry at index 0. At the point we do so this PGD entry has not
been initialised, and thus we'll try to write a value to a small offset
from physical address 0, causing a number of potential problems.

Fix this be correctly offsetting the PGD. This is split over a few steps
for legibility.

Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Reported-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-28 12:10:03 +01:00
Will Deacon 42f91093b0 arm64: mm: Ignore spurious translation faults taken from the kernel
Thanks to address translation being performed out of order with respect to
loads and stores, it is possible for a CPU to take a translation fault when
accessing a page that was mapped by a different CPU.

For example, in the case that one CPU maps a page and then sets a flag to
tell another CPU:

	CPU 0
	-----

	MOV	X0, <valid pte>
	STR	X0, [Xptep]	// Store new PTE to page table
	DSB	ISHST
	ISB
	MOV	X1, #1
	STR	X1, [Xflag]	// Set the flag

	CPU 1
	-----

loop:	LDAR	X0, [Xflag]	// Poll flag with Acquire semantics
	CBZ	X0, loop
	LDR	X1, [X2]	// Translates using the new PTE

then the final load on CPU 1 can raise a translation fault because the
translation can be performed speculatively before the read of the flag and
marked as "faulting" by the CPU. This isn't quite as bad as it sounds
since, in reality, code such as:

	CPU 0				CPU 1
	-----				-----
	spin_lock(&lock);		spin_lock(&lock);
	*ptr = vmalloc(size);		if (*ptr)
	spin_unlock(&lock);			foo = **ptr;
					spin_unlock(&lock);

will not trigger the fault because there is an address dependency on CPU 1
which prevents the speculative translation. However, more exotic code where
the virtual address is known ahead of time, such as:

	CPU 0				CPU 1
	-----				-----
	spin_lock(&lock);		spin_lock(&lock);
	set_fixmap(0, paddr, prot);	if (mapped)
	mapped = true;				foo = *fix_to_virt(0);
	spin_unlock(&lock);		spin_unlock(&lock);

could fault. This can be avoided by any of:

	* Introducing broadcast TLB maintenance on the map path
	* Adding a DSB;ISB sequence after checking a flag which indicates
	  that a virtual address is now mapped
	* Handling the spurious fault

Given that we have never observed a problem due to this under Linux and
future revisions of the architecture are being tightened so that
translation table walks are effectively ordered in the same way as explicit
memory accesses, we no longer treat spurious kernel faults as fatal if an
AT instruction indicates that the access does not trigger a translation
fault.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-27 17:38:36 +01:00
Hsin-Yi Wang e112b032a7 arm64: map FDT as RW for early_init_dt_scan()
Currently in arm64, FDT is mapped to RO before it's passed to
early_init_dt_scan(). However, there might be some codes
(eg. commit "fdt: add support for rng-seed") that need to modify FDT
during init. Map FDT to RO after early fixups are done.

Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-23 16:39:16 +01:00
Christoph Hellwig d225bb8d8a arm64: unexport set_memory_x and set_memory_nx
No module currently messed with clearing or setting the execute
permission of kernel memory, and none really should.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-15 12:58:26 +01:00
Mark Rutland 77ad4ce693 arm64: memory: rename VA_START to PAGE_END
Prior to commit:

  14c127c957 ("arm64: mm: Flip kernel VA space")

... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.

Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.

To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 17:06:58 +01:00
Mark Rutland 233947ef16 arm64: memory: fix flipped VA space fallout
VA_START used to be the start of the TTBR1 address space, but now it's a
point midway though. In a couple of places we still use VA_START to get
the start of the TTBR1 address space, so let's fix these up to use
PAGE_OFFSET instead.

Fixes: 14c127c957 ("arm64: mm: Flip kernel VA space")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-14 17:05:11 +01:00
Christoph Hellwig 33dcb37cef dma-mapping: fix page attributes for dma_mmap_*
All the way back to introducing dma_common_mmap we've defaulted to mark
the pages as uncached.  But this is wrong for DMA coherent devices.
Later on DMA_ATTR_WRITE_COMBINE also got incorrect treatment as that
flag is only treated special on the alloc side for non-coherent devices.

Introduce a new dma_pgprot helper that deals with the check for coherent
devices so that only the remapping cases ever reach arch_dma_mmap_pgprot
and we thus ensure no aliasing of page attributes happens, which makes
the powerpc version of arch_dma_mmap_pgprot obsolete and simplifies the
remaining ones.

Note that this means arch_dma_mmap_pgprot is a bit misnamed now, but
we'll phase it out soon.

Fixes: 64ccc9c033 ("common: dma-mapping: add support for generic dma_mmap_* calls")
Reported-by: Shawn Anastasio <shawn@anastas.io>
Reported-by: Gavin Li <git@thegavinli.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> # arm64
2019-08-10 19:52:45 +02:00
Steve Capper 2c624fe687 arm64: mm: Remove vabits_user
Previous patches have enabled 52-bit kernel + user VAs and there is no
longer any scenario where user VA != kernel VA size.

This patch removes the, now redundant, vabits_user variable and replaces
usage with vabits_actual where appropriate.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:27 +01:00
Steve Capper b6d00d47e8 arm64: mm: Introduce 52-bit Kernel VAs
Most of the machinery is now in place to enable 52-bit kernel VAs that
are detectable at boot time.

This patch adds a Kconfig option for 52-bit user and kernel addresses
and plumbs in the requisite CONFIG_ macros as well as sets TCR.T1SZ,
physvirt_offset and vmemmap at early boot.

To simplify things this patch also removes the 52-bit user/48-bit kernel
kconfig option.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:26 +01:00
Steve Capper c8b6d2ccf9 arm64: mm: Separate out vmemmap
vmemmap is a preprocessor definition that depends on a variable,
memstart_addr. In a later patch we will need to expand the size of
the VMEMMAP region and optionally modify vmemmap depending upon
whether or not hardware support is available for 52-bit virtual
addresses.

This patch changes vmemmap to be a variable. As the old definition
depended on a variable load, this should not affect performance
noticeably.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:25 +01:00
Steve Capper c812026c54 arm64: mm: Logic to make offset_ttbr1 conditional
When running with a 52-bit userspace VA and a 48-bit kernel VA we offset
ttbr1_el1 to allow the kernel pagetables with a 52-bit PTRS_PER_PGD to
be used for both userspace and kernel.

Moving on to a 52-bit kernel VA we no longer require this offset to
ttbr1_el1 should we be running on a system with HW support for 52-bit
VAs.

This patch introduces conditional logic to offset_ttbr1 to query
SYS_ID_AA64MMFR2_EL1 whenever 52-bit VAs are selected. If there is HW
support for 52-bit VAs then the ttbr1 offset is skipped.

We choose to read a system register rather than vabits_actual because
offset_ttbr1 can be called in places where the kernel data is not
actually mapped.

Calls to offset_ttbr1 appear to be made from rarely called code paths so
this extra logic is not expected to adversely affect performance.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:24 +01:00
Steve Capper 5383cc6efe arm64: mm: Introduce vabits_actual
In order to support 52-bit kernel addresses detectable at boot time, one
needs to know the actual VA_BITS detected. A new variable vabits_actual
is introduced in this commit and employed for the KVM hypervisor layout,
KASAN, fault handling and phys-to/from-virt translation where there
would normally be compile time constants.

In order to maintain performance in phys_to_virt, another variable
physvirt_offset is introduced.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:21 +01:00
Steve Capper 90ec95cda9 arm64: mm: Introduce VA_BITS_MIN
In order to support 52-bit kernel addresses detectable at boot time, the
kernel needs to know the most conservative VA_BITS possible should it
need to fall back to this quantity due to lack of hardware support.

A new compile time constant VA_BITS_MIN is introduced in this patch and
it is employed in the KASAN end address, KASLR, and EFI stub.

For Arm, if 52-bit VA support is unavailable the fallback is to 48-bits.

In other words: VA_BITS_MIN = min (48, VA_BITS)

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:16 +01:00
Steve Capper 99426e5e8c arm64: dump: De-constify VA_START and KASAN_SHADOW_START
The kernel page table dumper assumes that the placement of VA regions is
constant and determined at compile time. As we are about to introduce
variable VA logic, we need to be able to determine certain regions at
boot time.

Specifically the VA_START and KASAN_SHADOW_START will depend on whether
or not the system is booted with 52-bit kernel VAs.

This patch adds logic to the kernel page table dumper s.t. these regions
can be computed at boot time.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:17:16 +01:00
Steve Capper 14c127c957 arm64: mm: Flip kernel VA space
In order to allow for a KASAN shadow that changes size at boot time, one
must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
start address. Also, it is highly desirable to maintain the same
function addresses in the kernel .text between VA sizes. Both of these
requirements necessitate us to flip the kernel address space halves s.t.
the direct linear map occupies the lower addresses.

This patch puts the direct linear map in the lower addresses of the
kernel VA range and everything else in the higher ranges.

We need to adjust:
 *) KASAN shadow region placement logic,
 *) KASAN_SHADOW_OFFSET computation logic,
 *) virt_to_phys, phys_to_virt checks,
 *) page table dumper.

These are all small changes, that need to take place atomically, so they
are bundled into this commit.

As part of the re-arrangement, a guard region of 2MB (to preserve
alignment for fixed map) is added after the vmemmap. Otherwise the
vmemmap could intersect with IS_ERR pointers.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-09 11:16:51 +01:00
Miles Chen 2951d5efaf arm64: mm: print hexadecimal EC value in mem_abort_decode()
This change prints the hexadecimal EC value in mem_abort_decode(),
which makes it easier to lookup the corresponding EC in
the ARM Architecture Reference Manual.

The commit 1f9b8936f3 ("arm64: Decode information from ESR upon mem
faults") prints useful information when memory abort occurs. It would
be easier to lookup "0x25" instead of "DABT" in the document. Then we
can check the corresponding ISS.

For example:
Current	info	  	Document
		  	EC	Exception class
"CP15 MCR/MRC"		0x3	"MCR or MRC access to CP15a..."
"ASIMD"			0x7	"Access to SIMD or floating-point..."
"DABT (current EL)" 	0x25	"Data Abort taken without..."
...

Before:
Unable to handle kernel paging request at virtual address 000000000000c000
Mem abort info:
  ESR = 0x96000046
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000046
  CM = 0, WnR = 1

After:
Unable to handle kernel paging request at virtual address 000000000000c000
Mem abort info:
  ESR = 0x96000046
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000046
  CM = 0, WnR = 1

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <Mark.rutland@arm.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-07 16:20:57 +01:00
Junhua Huang 13776f9d40 arm64: mm: free the initrd reserved memblock in a aligned manner
We should free the initrd reserved memblock in an aligned manner,
because the initrd reserves the memblock in an aligned manner
in arm64_memblock_init().
Otherwise there are some fragments in memblock_reserved regions
after free_initrd_mem(). e.g.:
/sys/kernel/debug/memblock # cat reserved
   0: 0x0000000080080000..0x00000000817fafff
   1: 0x0000000083400000..0x0000000083ffffff
   2: 0x0000000090000000..0x000000009000407f
   3: 0x00000000b0000000..0x00000000b000003f
   4: 0x00000000b26184ea..0x00000000b2618fff
The fragments like the ranges from b0000000 to b000003f and
from b26184ea to b2618fff should be freed.

And we can do free_reserved_area() after memblock_free(),
as free_reserved_area() calls __free_pages(), once we've done
that it could be allocated somewhere else,
but memblock and iomem still say this is reserved memory.

Fixes: 05c58752f9 ("arm64: To remove initrd reserved area entry from memblock")
Signed-off-by: Junhua Huang <huang.junhua@zte.com.cn>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 12:35:34 +01:00
Chuhong Yuan b3e089cd44 arm64: Replace strncmp with str_has_prefix
In commit b6b2735514
("tracing: Use str_has_prefix() instead of using fixed sizes")
the newly introduced str_has_prefix() was used
to replace error-prone strncmp(str, const, len).
Here fix codes with the same pattern.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05 11:06:34 +01:00
Masami Hiramatsu d8bb6718c4 arm64: Make debug exception handlers visible from RCU
Make debug exceptions visible from RCU so that synchronize_rcu()
correctly track the debug exception handler.

This also introduces sanity checks for user-mode exceptions as same
as x86's ist_enter()/ist_exit().

The debug exception can interrupt in idle task. For example, it warns
if we put a kprobe on a function called from idle task as below.
The warning message showed that the rcu_read_lock() caused this
problem. But actually, this means the RCU is lost the context which
is already in NMI/IRQ.

  /sys/kernel/debug/tracing # echo p default_idle_call >> kprobe_events
  /sys/kernel/debug/tracing # echo 1 > events/kprobes/enable
  /sys/kernel/debug/tracing # [  135.122237]
  [  135.125035] =============================
  [  135.125310] WARNING: suspicious RCU usage
  [  135.125581] 5.2.0-08445-g9187c508bdc7 #20 Not tainted
  [  135.125904] -----------------------------
  [  135.126205] include/linux/rcupdate.h:594 rcu_read_lock() used illegally while idle!
  [  135.126839]
  [  135.126839] other info that might help us debug this:
  [  135.126839]
  [  135.127410]
  [  135.127410] RCU used illegally from idle CPU!
  [  135.127410] rcu_scheduler_active = 2, debug_locks = 1
  [  135.128114] RCU used illegally from extended quiescent state!
  [  135.128555] 1 lock held by swapper/0/0:
  [  135.128944]  #0: (____ptrval____) (rcu_read_lock){....}, at: call_break_hook+0x0/0x178
  [  135.130499]
  [  135.130499] stack backtrace:
  [  135.131192] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-08445-g9187c508bdc7 #20
  [  135.131841] Hardware name: linux,dummy-virt (DT)
  [  135.132224] Call trace:
  [  135.132491]  dump_backtrace+0x0/0x140
  [  135.132806]  show_stack+0x24/0x30
  [  135.133133]  dump_stack+0xc4/0x10c
  [  135.133726]  lockdep_rcu_suspicious+0xf8/0x108
  [  135.134171]  call_break_hook+0x170/0x178
  [  135.134486]  brk_handler+0x28/0x68
  [  135.134792]  do_debug_exception+0x90/0x150
  [  135.135051]  el1_dbg+0x18/0x8c
  [  135.135260]  default_idle_call+0x0/0x44
  [  135.135516]  cpu_startup_entry+0x2c/0x30
  [  135.135815]  rest_init+0x1b0/0x280
  [  135.136044]  arch_call_rest_init+0x14/0x1c
  [  135.136305]  start_kernel+0x4d4/0x500
  [  135.136597]

So make debug exception visible to RCU can fix this warning.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-08-02 11:56:01 +01:00
David Hildenbrand 80ec922dbd mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE
We want to improve error handling while adding memory by allowing to use
arch_remove_memory() and __remove_pages() even if
CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like:

	arch_add_memory()
	rc = do_something();
	if (rc) {
		arch_remove_memory();
	}

We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require
quite some dependencies for memory offlining.

Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-18 17:08:06 -07:00
David Hildenbrand 22eb634632 arm64/mm: add temporary arch_remove_memory() implementation
A proper arch_remove_memory() implementation is on its way, which also
cleanly removes page tables in arch_add_memory() in case something goes
wrong.

As we want to use arch_remove_memory() in case something goes wrong
during memory hotplug after arch_add_memory() finished, let's add a
temporary hack that is sufficient enough until we get a proper
implementation that cleans up page table entries.

We will remove CONFIG_MEMORY_HOTREMOVE around this code in follow up
patches.

Link: http://lkml.kernel.org/r/20190527111152.16324-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-18 17:08:06 -07:00
Anshuman Khandual b98cca444d mm, kprobes: generalize and rename notify_page_fault() as kprobe_page_fault()
Architectures which support kprobes have very similar boilerplate around
calling kprobe_fault_handler().  Use a helper function in kprobes.h to
unify them, based on the x86 code.

This changes the behaviour for other architectures when preemption is
enabled.  Previously, they would have disabled preemption while calling
the kprobe handler.  However, preemption would be disabled if this fault
was due to a kprobe, so we know the fault was not due to a kprobe
handler and can simply return failure.

This behaviour was introduced in commit a980c0ef9f ("x86/kprobes:
Refactor kprobes_fault() like kprobe_exceptions_notify()")

[anshuman.khandual@arm.com: export kprobe_fault_handler()]
  Link: http://lkml.kernel.org/r/1561133358-8876-1-git-send-email-anshuman.khandual@arm.com
Link: http://lkml.kernel.org/r/1560420444-25737-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:22 -07:00
Anshuman Khandual 0f472d04f5 mm/ioremap: probe platform for p4d huge map support
Finish up what commit c2febafc67 ("mm: convert generic code to 5-level
paging") started while levelling up P4D huge mapping support at par with
PUD and PMD.  A new arch call back arch_ioremap_p4d_supported() is added
which just maintains status quo (P4D huge map not supported) on x86,
arm64 and powerpc.

When HAVE_ARCH_HUGE_VMAP is enabled its just a simple check from the
arch about the support, hence runtime effects are minimal.

Link: http://lkml.kernel.org/r/1561699231-20991-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:22 -07:00
Anshuman Khandual 8b1e0f81fb mm/pgtable: drop pgtable_t variable from pte_fn_t functions
Drop the pgtable_t variable from all implementation for pte_fn_t as none
of them use it.  apply_to_pte_range() should stop computing it as well.
Should help us save some cycles.

Link: http://lkml.kernel.org/r/1556803126-26596-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <jglisse@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:46 -07:00
Mike Rapoport 50f11a8a46 arm64: switch to generic version of pte allocation
The PTE allocations in arm64 are identical to the generic ones modulo the
GFP flags.

Using the generic pte_alloc_one() functions ensures that the user page
tables are allocated with __GFP_ACCOUNT set.

The arm64 definition of PGALLOC_GFP is removed and replaced with
GFP_PGTABLE_USER for p[gum]d_alloc_one() for the user page tables and
GFP_PGTABLE_KERNEL for the kernel page tables. The KVM memory cache is now
using GFP_PGTABLE_USER.

The mappings created with create_pgd_mapping() are now using
GFP_PGTABLE_KERNEL.

The conversion to the generic version of pte_free_kernel() removes the NULL
check for pte.

The pte_free() version on arm64 is identical to the generic one and
can be simply dropped.

[cai@lca.pw: fix a bogus GFP flag in pgd_alloc()]
  Link: https://lore.kernel.org/r/1559656836-24940-1-git-send-email-cai@lca.pw/
[and fix it more]
  Link: https://lore.kernel.org/linux-mm/20190617151252.GF16810@rapoport-lnx/
Link: http://lkml.kernel.org/r/1557296232-15361-5-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sam Creasey <sammy@sammy.net>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:45 -07:00
Linus Torvalds 6b04014f3f IOMMU Updates for Linux v5.3
Including:
 
 	- Patches to make the dma-iommu code more generic so that it can
 	  be used outside of the ARM context with other IOMMU drivers.
 	  Goal is to make use of it on x86 too.
 
 	- Generic IOMMU domain support for the Intel VT-d driver. This
 	  driver now makes more use of common IOMMU code to allocate
 	  default domains for the devices it handles.
 
 	- An IOMMU fault reporting API to userspace. With that the IOMMU
 	  fault handling can be done in user-space, for example to
 	  forward the faults to a VM.
 
 	- Better handling for reserved regions requested by the
 	  firmware. These can be 'relaxed' now, meaning that those don't
 	  prevent a device being attached to a VM.
 
 	- Suspend/Resume support for the Renesas IOMMU driver.
 
 	- Added support for dumping SVA related fields of the DMAR table
 	  in the Intel VT-d driver via debugfs.
 
 	- A pile of smaller fixes and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl0jaOAACgkQK/BELZcB
 GuNYTRAAjXuNs1OX/ROJ4TByT3YBWj5BLZVMrpGx75MAEvO68a7rTdaCpGuHv09w
 5JneXxA3H2O1q6JCGguLm//Dy5QycJQIn0WuaBGi5Vjo+cGe47sf48Hb6AhIoAFf
 exjhfmG6kL3ZzEkLU/7iJWqz+iiCBxRU0yMINJvzFFzliJkfyGWsjoruxobmUEhm
 XdEsdeAQFXoPMNVCDiorM+B+3VCl7BfwfhdlFrmX3fpNrMZ5ytlnPTgfJdXAGgHn
 DAoK8qGoyYtRnZnivn1y56M13qv5y3XvicKqx/GRKK8CyJpz/vYSBLktI5HO4tOF
 y/VzNg+ctDSjQgFyKIq+38k6qu4YWctyxZNObmPVpioNf9j9GaZWT97fxGzXMNS4
 G7TqOeq4yxjotAOcd+DdH265uVBtGmLqk7XobigxCZ0xzGVVnmPy4oJtkWwjaMgN
 HtNSMPRRirjprK4BUoCiXwP8YyWZs0a93oJmh2orD406yZkHNuqLNaicu24WsKKf
 Ze5BsFbNUb0qYdyL7ofOugQrOvYYoWYuREiz8ClF3U4mCcOk4FJ9K2Vf1kSfKBX8
 1Skz75NeSZdzXg9JcznNMuiqH2mN3lx/88xKJg3LJxEeFxU++VDMX3Z8kEmpj5w+
 9m/v+VRclUXL6g8s6QFNIlIenMknAEjtWw6hVWPhWLaULH6fR6Y=
 =Kvk5
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - Make the dma-iommu code more generic so that it can be used outside
   of the ARM context with other IOMMU drivers. Goal is to make use of
   it on x86 too.

 - Generic IOMMU domain support for the Intel VT-d driver. This driver
   now makes more use of common IOMMU code to allocate default domains
   for the devices it handles.

 - An IOMMU fault reporting API to userspace. With that the IOMMU fault
   handling can be done in user-space, for example to forward the faults
   to a VM.

 - Better handling for reserved regions requested by the firmware. These
   can be 'relaxed' now, meaning that those don't prevent a device being
   attached to a VM.

 - Suspend/Resume support for the Renesas IOMMU driver.

 - Added support for dumping SVA related fields of the DMAR table in the
   Intel VT-d driver via debugfs.

 - A pile of smaller fixes and cleanups.

* tag 'iommu-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (90 commits)
  iommu/omap: No need to check return value of debugfs_create functions
  iommu/arm-smmu-v3: Invalidate ATC when detaching a device
  iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=n
  iommu/vt-d: Cleanup unused variable
  iommu/amd: Flush not present cache in iommu_map_page
  iommu/amd: Only free resources once on init error
  iommu/amd: Move gart fallback to amd_iommu_init
  iommu/amd: Make iommu_disable safer
  iommu/io-pgtable: Support non-coherent page tables
  iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu-v3: Increase maximum size of queues
  iommu/vt-d: Silence a variable set but not used
  iommu/vt-d: Remove an unused variable "length"
  iommu: Fix integer truncation
  iommu: Add padding to struct iommu_fault
  iommu/vt-d: Consolidate domain_init() to avoid duplication
  iommu/vt-d: Cleanup after delegating DMA domain to generic iommu
  iommu/vt-d: Fix suspicious RCU usage in probe_acpi_namespace_devices()
  iommu/vt-d: Allow DMA domain attaching to rmrr locked device
  ...
2019-07-09 09:21:02 -07:00
Linus Torvalds dfd437a257 arm64 updates for 5.3:
- arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}
 
 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly
 
 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)
 
 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new XAFLAG
   and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)
 
 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)
 
 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic
 
 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms
 
 - perf: DDR performance monitor support for iMX8QXP
 
 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers
 
 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent
 
 - arm64 do_page_fault() and hugetlb cleanups
 
 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)
 
 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the 'arm_boot_flags'
   introduced in 5.1)
 
 - CONFIG_RANDOMIZE_BASE now enabled in defconfig
 
 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area
 
 - Make ZONE_DMA32 configurable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl0eHqcACgkQa9axLQDI
 XvFyNA/+L+bnkz8m3ncydlqqfXomQn4eJJVQ8Uksb0knJz+1+3CUxxbO4ry4jXZN
 fMkbggYrDPRKpDbsUl0lsRipj7jW9bqan+N37c3SWqCkgb6HqDaHViwxdx6Ec/Uk
 gHudozDSPh/8c7hxGcSyt/CFyuW6b+8eYIQU5rtIgz8aVY2BypBvS/7YtYCbIkx0
 w4CFleRTK1zXD5mJQhrc6jyDx659sVkrAvdhf6YIymOY8nBTv40vwdNo3beJMYp8
 Po/+0Ixu+VkHUNtmYYZQgP/AGH96xiTcRnUqd172JdtRPpCLqnLqwFokXeVIlUKT
 KZFMDPzK+756Ayn4z4huEePPAOGlHbJje8JVNnFyreKhVVcCotW7YPY/oJR10bnc
 eo7yD+DxABTn+93G2yP436bNVa8qO1UqjOBfInWBtnNFJfANIkZweij/MQ6MjaTA
 o7KtviHnZFClefMPoiI7HDzwL8XSmsBDbeQ04s2Wxku1Y2xUHLx4iLmadwLQ1ZPb
 lZMTZP3N/T1554MoURVA1afCjAwiqU3bt1xDUGjbBVjLfSPBAn/25IacsG9Li9AF
 7Rp1M9VhrfLftjFFkB2HwpbhRASOxaOSx+EI3kzEfCtM2O9I1WHgP3rvCdc3l0HU
 tbK0/IggQicNgz7GSZ8xDlWPwwSadXYGLys+xlMZEYd3pDIOiFc=
 =0TDT
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}

 - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
   manage the permissions of executable vmalloc regions more strictly

 - Slight performance improvement by keeping softirqs enabled while
   touching the FPSIMD/SVE state (kernel_neon_begin/end)

 - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
   XAFLAG and AXFLAG instructions for floating point comparison flags
   manipulation) and FRINT (rounding floating point numbers to integers)

 - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
   BROKEN due to some bugs (now fixed)

 - Improve parking of stopped CPUs and implement an arm64-specific
   panic_smp_self_stop() to avoid warning on not being able to stop
   secondary CPUs during panic

 - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
   platforms

 - perf: DDR performance monitor support for iMX8QXP

 - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
   cope with a system cache info not exposed via the CPUID registers

 - Avoid warning on hardware cache line size greater than
   ARCH_DMA_MINALIGN if the system is fully coherent

 - arm64 do_page_fault() and hugetlb cleanups

 - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)

 - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
   'arm_boot_flags' introduced in 5.1)

 - CONFIG_RANDOMIZE_BASE now enabled in defconfig

 - Allow the selection of ARM64_MODULE_PLTS, currently only done via
   RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
   over into the vmalloc area

 - Make ZONE_DMA32 configurable

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  perf: arm_spe: Enable ACPI/Platform automatic module loading
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  x86/entry: Simplify _TIF_SYSCALL_EMU handling
  arm64: rename dump_instr as dump_kernel_instr
  arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
  arm64: Implement panic_smp_self_stop()
  arm64: Improve parking of stopped CPUs
  arm64: Expose FRINT capabilities to userspace
  arm64: Expose ARMv8.5 CondM capability to userspace
  arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
  arm64: ARM64_MODULES_PLTS must depend on MODULES
  arm64: bpf: do not allocate executable memory
  arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
  arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
  arm64: module: create module allocations without exec permissions
  arm64: Allow user selection of ARM64_MODULE_PLTS
  acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
  arm64: Allow selecting Pseudo-NMI again
  ...
2019-07-08 09:54:55 -07:00
Ard Biesheuvel 4739d53fcd arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
Wire up the special helper functions to manipulate aliases of vmalloc
regions in the linear map.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-24 18:10:39 +01:00
Joerg Roedel ceedd5f74d Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into generic-dma-ops

Linux 5.2-rc6
2019-06-24 10:23:16 +02:00
Thomas Gleixner d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Thomas Gleixner caab277b1d treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not see http www gnu org
  licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:07 +02:00
Masayoshi Mizuma 8f5c9037a5 arm64/mm: Correct the cache line size warning with non coherent device
If the cache line size is greater than ARCH_DMA_MINALIGN (128),
the warning shows and it's tainted as TAINT_CPU_OUT_OF_SPEC.

However, it's not good because as discussed in the thread [1], the cpu
cache line size will be problem only on non-coherent devices.

Since the coherent flag is already introduced to struct device,
show the warning only if the device is non-coherent device and
ARCH_DMA_MINALIGN is smaller than the cpu cache size.

[1] https://lore.kernel.org/linux-arm-kernel/20180514145703.celnlobzn3uh5tc2@localhost/

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Tested-by: Zhang Lei <zhang.lei@jp.fujitsu.com>
[catalin.marinas@arm.com: removed 'if' block for WARN_TAINT]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-17 11:52:47 +01:00
Anshuman Khandual 4745224b45 arm64/mm: Refactor __do_page_fault()
__do_page_fault() is over complicated with multiple goto statements. This
cleans up the code flow and while there drops local variable vm_fault_t.

Reviewed-by: Mark Rutland <Mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-07 14:35:12 +01:00
Anshuman Khandual c49bd02f4c arm64/mm: Document write abort detection from ESR
This patch adds an is_write_abort() wrapper and documents the detection
of the abort type on cache maintenance operations.

Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
[catalin.marinas@arm.com: only keep the is_write_abort() wrapper]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-07 14:34:24 +01:00
Odin Ugedal 8e01076afd arm64: Fix comment after #endif
The config value used in the if was changed in
b433dce056, but the comment on the
corresponding end was not changed.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-07 10:41:12 +01:00
Thomas Gleixner b886d83c5b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 315 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:17 +02:00
Thomas Gleixner 97fb5e8d9b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 284
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 and
  only version 2 as published by the free software foundation this
  program is distributed in the hope that it will be useful but
  without any warranty without even the implied warranty of
  merchantability or fitness for a particular purpose see the gnu
  general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 294 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141900.825281744@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Anshuman Khandual 6168103600 arm64/mm: Drop task_struct argument from __do_page_fault()
The task_struct argument is not getting used in __do_page_fault(). Hence
just drop it and use current or cuurent->mm instead where ever required.
This does not change any functionality.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 15:26:47 +01:00
Anshuman Khandual a0509313d5 arm64/mm: Drop mmap_sem before calling __do_kernel_fault()
There is an inconsistency between down_read_trylock() success and failure
paths while dealing with kernel access for non exception table areas where
it calls __do_kernel_fault(). In case of failure it just bails out without
holding mmap_sem but when it succeeds it does so while holding mmap_sem.
Fix this inconsistency by just dropping mmap_sem in success path as well.

__do_kernel_fault() calls die_kernel_fault() which then calls show_pte().
show_pte() in this path might become bit more unreliable without holding
mmap_sem. But there are already instances [1] in do_page_fault() where
die_kernel_fault() gets called without holding mmap_sem. show_pte() can
be made more robust independently but in a later patch.

[1] Conditional block for (is_ttbr0_addr && is_el1_permission_fault)

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 15:26:18 +01:00
Anshuman Khandual 01de1776f6 arm64/mm: Identify user instruction aborts
We don't currently set the FAULT_FLAG_INSTRUCTION mm flag for EL0
instruction aborts. This has no functional impact, as we don't override
arch_vma_access_permitted(), and the default implementation always returns
true. However, it would be helpful to provide the flag so that it can be
consumed by tracepoints such as dax_pmd_fault.

This patch sets the FAULT_FLAG_INSTRUCTION flag for EL0 instruction aborts.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 14:55:15 +01:00
Anshuman Khandual 87dedf7c61 arm64/mm: Change BUG_ON() to VM_BUG_ON() in [pmd|pud]_set_huge()
There are no callers for the functions which will pass unaligned physical
addresses. Hence just change these BUG_ON() checks into VM_BUG_ON() which
gets compiled out unless CONFIG_VM_DEBUG is enabled.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 14:53:18 +01:00
Miles Chen 0c1f14ed12 arm64: mm: make CONFIG_ZONE_DMA32 configurable
This change makes CONFIG_ZONE_DMA32 defuly y and allows users
to overwrite it only when CONFIG_EXPERT=y.

For the SoCs that do not need CONFIG_ZONE_DMA32, this is the
first step to manage all available memory by a single
zone(normal zone) to reduce the overhead of multiple zones.

The change also fixes a build error when CONFIG_NUMA=y and
CONFIG_ZONE_DMA32=n.

arch/arm64/mm/init.c:195:17: error: use of undeclared identifier 'ZONE_DMA32'
                max_zone_pfns[ZONE_DMA32] = PFN_DOWN(max_zone_dma_phys());

Change since v1:
1. only expose CONFIG_ZONE_DMA32 when CONFIG_EXPERT=y
2. remove redundant IS_ENABLED(CONFIG_ZONE_DMA32)

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 13:58:28 +01:00
Anshuman Khandual f7f0097af6 arm64/mm: Simplify protection flag creation for kernel huge mappings
Even though they have got the same value, PMD_TYPE_SECT and PUD_TYPE_SECT
get used for kernel huge mappings. But before that first the table bit gets
cleared using leaf level PTE_TABLE_BIT. Though functionally they are same,
we should use page table level specific macros to be consistent as per the
MMU specifications. Create page table level specific wrappers for kernel
huge mapping entries and just drop mk_sect_prot() which does not have any
other user.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-04 13:49:58 +01:00
Anshuman Khandual 441a627806 arm64/hugetlb: Use macros for contiguous huge page sizes
Replace all open encoded contiguous huge page size computations with
available macro encodings CONT_PTE_SIZE and CONT_PMD_SIZE. There are other
instances where these macros are used in the file and this change makes it
consistently use the same mnemonic.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-06-03 16:58:37 +01:00
Thomas Gleixner 1802d0beec treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 655 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070034.575739538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:41 -07:00
Christoph Hellwig a84cc69eb5 arm64: trim includes in dma-mapping.c
With most of the previous functionality now elsewhere a lot of the
headers included in this file are not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-05-27 17:31:12 +02:00
Christoph Hellwig b5f75a3639 arm64: switch copyright boilerplace to SPDX in dma-mapping.c
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-05-27 17:31:12 +02:00
Christoph Hellwig 06d60728ff iommu/dma: move the arm64 wrappers to common code
There is nothing really arm64 specific in the iommu_dma_ops
implementation, so move it to dma-iommu.c and keep a lot of symbols
self-contained.  Note the implementation does depend on the
DMA_DIRECT_REMAP infrastructure for now, so we'll have to make the
DMA_IOMMU support depend on it, but this will be relaxed soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-05-27 17:31:10 +02:00
Christoph Hellwig af751d4308 iommu/dma: Remove the flush_page callback
We now have a arch_dma_prep_coherent architecture hook that is used
for the generic DMA remap allocator, and we should use the same
interface for the dma-iommu code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-05-27 17:31:10 +02:00
Linus Torvalds 0a72ef8990 Second round of arm64 fixes for -rc2
- Fix incorrect LDADD instruction encoding in our disassembly macros
 
 - Disable the broken ARM64_PSEUDO_NMI support for now
 
 - Add workaround for Cortex-A76 CPU erratum #1463225
 
 - Handle Cortex-A76/Neoverse-N1 erratum #1418040 w/ existing workaround
 
 - Fix IORT build failure if IOMMU_SUPPORT=n
 
 - Fix place-relative module relocation range checking and its
   interaction with KASLR
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzoC8MACgkQt6xw3ITB
 YzQfiAf+MXFzrAd3o7v40CnZu6ELw+ldedPh34oBjD7h6we3hroxi5Fss2nbwH0o
 BmAm4Nv1/Njk5+hA7Mlp3/mRn0vcd3NDP+FyH3inLjUU7owc41thp0SKlCOfFdZk
 K8sVCOeCWt7GEEPcnFsPO0nU+7f3ZKDDNBo0L+qJPxrMOTDcbQ3cIjW/ua7vQRHv
 pIDGF+iJAhHeNoc1Wjq08F8Q+Dq7dYvhtokeyDivSn4NulmRvdL+z581gMmj7ExT
 ARB6WtHGoOo+8UdjBJIDnXRKhJLfGexQaoAojk+IogaV0ACDtz6CuqsSIh1e5SFC
 oPqRSP5ITTbXEDS5uaUW1pYlwmGTaw==
 =ynUz
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull more arm64 fixes from Will Deacon:

 - Fix incorrect LDADD instruction encoding in our disassembly macros

 - Disable the broken ARM64_PSEUDO_NMI support for now

 - Add workaround for Cortex-A76 CPU erratum #1463225

 - Handle Cortex-A76/Neoverse-N1 erratum #1418040 w/ existing workaround

 - Fix IORT build failure if IOMMU_SUPPORT=n

 - Fix place-relative module relocation range checking and its
   interaction with KASLR

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: insn: Add BUILD_BUG_ON() for invalid masks
  arm64: insn: Fix ldadd instruction encoding
  arm64: Kconfig: Make ARM64_PSEUDO_NMI depend on BROKEN for now
  arm64: Handle erratum 1418040 as a superset of erratum 1188873
  arm64/module: deal with ambiguity in PRELxx relocation ranges
  ACPI/IORT: Fix build error when IOMMU_SUPPORT is disabled
  arm64/kernel: kaslr: reduce module randomization range to 2 GB
  arm64: errata: Add workaround for Cortex-A76 erratum #1463225
  arm64: Remove useless message during oops
2019-05-24 11:03:26 -07:00
Will Deacon 969f5ea627 arm64: errata: Add workaround for Cortex-A76 erratum #1463225
Revisions of the Cortex-A76 CPU prior to r4p0 are affected by an erratum
that can prevent interrupts from being taken when single-stepping.

This patch implements a software workaround to prevent userspace from
effectively being able to disable interrupts.

Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-23 11:38:10 +01:00
Linus Torvalds 54dee40637 First round of arm64 fixes for -rc2
- Fix SPE probe failure when backing auxbuf with high-order pages
 
 - Fix handling of DMA allocations from outside of the vmalloc area
 
 - Fix generation of build-id ELF section for vDSO object
 
 - Disable huge I/O mappings if kernel page table dumping is enabled
 
 - A few other minor fixes (comments, kconfig etc)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzlRT0ACgkQt6xw3ITB
 YzRGOwgArUDryBedDdkxAvjx7fk8O+qjtWctAhdPtyuXIvVLOc3tpiKlayCguF/a
 clqr4qAfxswoDLHRMwhh7xdv955A2vraHQWlzvGUj2O2M4mG8RdbVJLm3NxpA09m
 dufjSuFcwxcou2c4rXbSXSB4AYJXPmQJiad04VsWj68+TVehy0P45zaPcjHsPNPI
 D9sTa9XhBlNa0qpJG7tP9T8FS/QP/hpWHn8v0z/DQ4QetKRTstkpwD5kmJox8WmM
 Bw593bvQQ2+5q9g+z0FM3M/7yHwTJw2RLnnIb29YsW8MxM3rUeqt+FMA2OALBgbi
 0m7WoTZwO9hDQuPU1DDvZUtw3iOpeg==
 =buiS
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix SPE probe failure when backing auxbuf with high-order pages

 - Fix handling of DMA allocations from outside of the vmalloc area

 - Fix generation of build-id ELF section for vDSO object

 - Disable huge I/O mappings if kernel page table dumping is enabled

 - A few other minor fixes (comments, kconfig etc)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: vdso: Explicitly add build-id option
  arm64/mm: Inhibit huge-vmap with ptdump
  arm64: Print physical address of page table base in show_pte()
  arm64: don't trash config with compat symbol if COMPAT is disabled
  arm64: assembler: Update comment above cond_yield_neon() macro
  drivers/perf: arm_spe: Don't error on high-order pages for aux buf
  arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
2019-05-22 08:36:16 -07:00
Mark Rutland 7ba36eccb3 arm64/mm: Inhibit huge-vmap with ptdump
The arm64 ptdump code can race with concurrent modification of the
kernel page tables. At the time this was added, this was sound as:

* Modifications to leaf entries could result in stale information being
  logged, but would not result in a functional problem.

* Boot time modifications to non-leaf entries (e.g. freeing of initmem)
  were performed when the ptdump code cannot be invoked.

* At runtime, modifications to non-leaf entries only occurred in the
  vmalloc region, and these were strictly additive, as intermediate
  entries were never freed.

However, since commit:

  commit 324420bf91 ("arm64: add support for ioremap() block mappings")

... it has been possible to create huge mappings in the vmalloc area at
runtime, and as part of this existing intermediate levels of table my be
removed and freed.

It's possible for the ptdump code to race with this, and continue to
walk tables which have been freed (and potentially poisoned or
reallocated). As a result of this, the ptdump code may dereference bogus
addresses, which could be fatal.

Since huge-vmap is a TLB and memory optimization, we can disable it when
the runtime ptdump code is in use to avoid this problem.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 324420bf91 ("arm64: add support for ioremap() block mappings")
Acked-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-16 11:24:09 +01:00
Masahiro Yamada 87dfb311b7 treewide: replace #include <asm/sizes.h> with #include <linux/sizes.h>
Since commit dccd2304cc ("ARM: 7430/1: sizes.h: move from asm-generic
to <linux/sizes.h>"), <asm/sizes.h> and <asm-generic/sizes.h> are just
wrappers of <linux/sizes.h>.

This commit replaces all <asm/sizes.h> and <asm-generic/sizes.h> to
prepare for the removal.

Link: http://lkml.kernel.org/r/1553267665-27228-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 19:52:52 -07:00
Michal Hocko 940519f0c8 mm, memory_hotplug: provide a more generic restrictions for memory hotplug
arch_add_memory, __add_pages take a want_memblock which controls whether
the newly added memory should get the sysfs memblock user API (e.g.
ZONE_DEVICE users do not want/need this interface).  Some callers even
want to control where do we allocate the memmap from by configuring
altmap.

Add a more generic hotplug context for arch_add_memory and __add_pages.
struct mhp_restrictions contains flags which contains additional features
to be enabled by the memory hotplug (MHP_MEMBLOCK_API currently) and
altmap for alternative memmap allocator.

This patch shouldn't introduce any functional change.

[akpm@linux-foundation.org: build fix]
Link: http://lkml.kernel.org/r/20190408082633.2864-3-osalvador@suse.de
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:49 -07:00
Christoph Hellwig d8ae8a3765 initramfs: move the legacy keepinitrd parameter to core code
No need to handle the freeing disable in arch code when we already have a
core hook (and a different name for the option) for it.

Link: http://lkml.kernel.org/r/20190213174621.29297-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Steven Price <steven.price@arm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:47 -07:00
Will Deacon 48caebf7e1 arm64: Print physical address of page table base in show_pte()
When dumping the page table in response to an unexpected kernel page
fault, we print the virtual (hashed) address of the page table base, but
display physical addresses for everything else.

Make the page table dumping code in show_pte() consistent, by printing
the page table base pointer as a physical address.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-14 12:25:28 +01:00
Christoph Hellwig a98d9ae937 arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable
DMA allocations that can't sleep may return non-remapped addresses, but
we do not properly handle them in the mmap and get_sgtable methods.
Resolve non-vmalloc addresses using virt_to_page to handle this corner
case.

Cc: <stable@vger.kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-05-07 10:06:16 +01:00
Linus Torvalds c620f7bd0b arm64 updates for 5.2
Mostly just incremental improvements here:
 
 - Introduce AT_HWCAP2 for advertising CPU features to userspace
 
 - Expose SVE2 availability to userspace
 
 - Support for "data cache clean to point of deep persistence" (DC PODP)
 
 - Honour "mitigations=off" on the cmdline and advertise status via sysfs
 
 - CPU timer erratum workaround (Neoverse-N1 #1188873)
 
 - Introduce perf PMU driver for the SMMUv3 performance counters
 
 - Add config option to disable the kuser helpers page for AArch32 tasks
 
 - Futex modifications to ensure liveness under contention
 
 - Rework debug exception handling to seperate kernel and user handlers
 
 - Non-critical fixes and cleanup
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzMFGgACgkQt6xw3ITB
 YzTicAf/TX1h1+ecbx4WJAa4qeiOCPoNpG9efldQumqJhKL44MR5bkhuShna5mwE
 ptm5qUXkZCxLTjzssZKnbdbgwa3t+emW8Of3D91IfI9akiZbMoDx5FGgcNbqjazb
 RLrhOFHwgontA38yppZN+DrL+sXbvif/CVELdHahkEx6KepSGaS2lmPXRmz/W56v
 4yIRy/zxc3Dhjgfm3wKh72nBwoZdLiIc4mchd5pthNlR9E2idrYkQegG1C+gA00r
 o8uZRVOWgoh7H+QJE+xLUc8PaNCg8xqRRXOuZYg9GOz6hh7zSWhm+f1nRz9S2tIR
 gIgsCHNqoO2I3E1uJpAQXDGtt2kFhA==
 =ulpJ
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Mostly just incremental improvements here:

   - Introduce AT_HWCAP2 for advertising CPU features to userspace

   - Expose SVE2 availability to userspace

   - Support for "data cache clean to point of deep persistence" (DC PODP)

   - Honour "mitigations=off" on the cmdline and advertise status via
     sysfs

   - CPU timer erratum workaround (Neoverse-N1 #1188873)

   - Introduce perf PMU driver for the SMMUv3 performance counters

   - Add config option to disable the kuser helpers page for AArch32 tasks

   - Futex modifications to ensure liveness under contention

   - Rework debug exception handling to seperate kernel and user
     handlers

   - Non-critical fixes and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
  Documentation: Add ARM64 to kernel-parameters.rst
  arm64/speculation: Support 'mitigations=' cmdline option
  arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB
  arm64: enable generic CPU vulnerabilites support
  arm64: add sysfs vulnerability show for speculative store bypass
  arm64: Fix size of __early_cpu_boot_status
  clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters
  clocksource/arm_arch_timer: Remove use of workaround static key
  clocksource/arm_arch_timer: Drop use of static key in arch_timer_reg_read_stable
  clocksource/arm_arch_timer: Direcly assign set_next_event workaround
  arm64: Use arch_timer_read_counter instead of arch_counter_get_cntvct
  watchdog/sbsa: Use arch_timer_read_counter instead of arch_counter_get_cntvct
  ARM: vdso: Remove dependency with the arch_timer driver internals
  arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1
  arm64: Add part number for Neoverse N1
  arm64: Make ARM64_ERRATUM_1188873 depend on COMPAT
  arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32
  arm64: mm: Remove pte_unmap_nested()
  arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable
  arm64: compat: Reduce address limit for 64K pages
  ...
2019-05-06 17:54:22 -07:00
Bjorn Andersson d4d18e3ec6 arm64: mm: Ensure tail of unaligned initrd is reserved
In the event that the start address of the initrd is not aligned, but
has an aligned size, the base + size will not cover the entire initrd
image and there is a chance that the kernel will corrupt the tail of the
image.

By aligning the end of the initrd to a page boundary and then
subtracting the adjusted start address the memblock reservation will
cover all pages that contains the initrd.

Fixes: c756c592e4 ("arm64: Utilize phys_initrd_start/phys_initrd_size")
Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-04-23 10:56:24 +01:00
Jean-Philippe Brucker 827a108e35 arm64: Save and restore OSDLR_EL1 across suspend/resume
When the CPU comes out of suspend, the firmware may have modified the OS
Double Lock Register. Save it in an unused slot of cpu_suspend_ctx, and
restore it on resume.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-09 12:39:13 +01:00
Will Deacon 475ba3fc19 arm64: mm: Consolidate early page table allocation
The logic for early allocation of page tables is duplicated between
pgd_kernel_pgtable_alloc() and pgd_pgtable_alloc(). Drop the duplication
by calling one from the other and renaming pgd_kernel_pgtable_alloc()
accordingly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-09 11:21:50 +01:00
Yu Zhao 369aaab845 arm64: mm: don't call page table ctors for init_mm
init_mm doesn't require page table lock to be initialized at
any level. Add a separate page table allocator for it, and the
new one skips page table ctors.

The ctors allocate memory when ALLOC_SPLIT_PTLOCKS is set. Not
calling them avoids memory leak in case we call pte_free_kernel()
on init_mm.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-09 11:21:50 +01:00
Yu Zhao 90292aca98 arm64: mm: use appropriate ctors for page tables
For pte page, use pgtable_page_ctor(); for pmd page, use
pgtable_pmd_page_ctor(); and for the rest (pud, p4d and pgd),
don't use any.

For now, we don't select ARCH_ENABLE_SPLIT_PMD_PTLOCK and
pgtable_pmd_page_ctor() is a nop. When we do in patch 3, we
make sure pmd is not folded so we won't mistakenly call
pgtable_pmd_page_ctor() on pud or p4d.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-09 11:21:50 +01:00
Will Deacon 52c6d145da arm64: debug: Remove unused return value from do_debug_exception()
do_debug_exception() goes out of its way to return a value that isn't
ever used, so just make the thing void.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-09 11:21:12 +01:00
Will Deacon 70b3d237bd arm64: mm: Ensure we ignore the initrd if it is placed out of range
If the initrd payload isn't completely accessible via the linear map,
then we print a warning during boot and nobble the virtual address of
the payload so that we ignore it later on.

Unfortunately, since commit c756c592e4 ("arm64: Utilize
phys_initrd_start/phys_initrd_size"), the virtual address isn't
initialised until later anyway, so we need to nobble the size of the
payload to ensure that we don't try to use it later on.

Fixes: c756c592e4 ("arm64: Utilize phys_initrd_start/phys_initrd_size")
Reported-by: Pierre Kuo <vichy.kuo@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-03 18:35:08 +01:00
Will Deacon 7048a5973e arm64: mm: Make show_pte() a static function
show_pte() doesn't have any external callers, so make it static.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-03 13:36:54 +01:00
Miles Chen 19d6242ece arm64: setup min_low_pfn
When debugging with CONFIG_PAGE_OWNER, I noticed that the min_low_pfn
on arm64 is always zero and the page owner scanning has to start from zero.
We have to loop a while before we see the first valid pfn.
(see: read_page_owner())

Setup min_low_pfn to save some loops.

Before setting min_low_pfn:

[   21.265602] min_low_pfn=0, *ppos=0
Page allocated via order 0, mask 0x100cca(GFP_HIGHUSER_MOVABLE)
PFN 262144 type Movable Block 512 type Movable Flags 0x8001e
referenced|uptodate|dirty|lru|swapbacked)
prep_new_page+0x13c/0x140
get_page_from_freelist+0x254/0x1068
__alloc_pages_nodemask+0xd4/0xcb8

After setting min_low_pfn:

[   11.025787] min_low_pfn=262144, *ppos=0
Page allocated via order 0, mask 0x100cca(GFP_HIGHUSER_MOVABLE)
PFN 262144 type Movable Block 512 type Movable Flags 0x8001e
referenced|uptodate|dirty|lru|swapbacked)
prep_new_page+0x13c/0x140
get_page_from_freelist+0x254/0x1068
__alloc_pages_nodemask+0xd4/0xcb8
shmem_alloc_page+0x7c/0xa0
shmem_alloc_and_acct_page+0x124/0x1e8
shmem_getpage_gfp.isra.7+0x118/0x878
shmem_write_begin+0x38/0x68

Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-03 13:07:59 +01:00
Qian Cai b1ce45e86b arm64/mm: fix kernel-doc comments
Building a kernel with W=1 generates several warnings due to abuse of
kernel-doc comments:

  | arch/arm64/mm/numa.c:281: warning: Cannot understand  *
  |   on line 281 - I thought it was a doc line

Tidy up the comments to remove the warnings.

Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms.")
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-03 12:46:27 +01:00
Muchun Song 344bf332ce arm64: mm: fix incorrect assignment of 'max_mapnr'
Although we don't actually make use of the 'max_mapnr' global variable,
we do set it to a junk value for !CONFIG_FLATMEM configurations that
leave mem_map uninitialised.

To avoid somebody tripping over this in future, set 'max_mapnr' using
max_pfn, which is calculated directly from the memblock information.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Muchun Song <smuchun@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-01 15:35:48 +01:00
Mike Rapoport 8a7f97b902 treewide: add checks for the return value of memblock_alloc*()
Add check for the return value of memblock_alloc*() functions and call
panic() in case of error.  The panic message repeats the one used by
panicing memblock allocators with adjustment of parameters to include
only relevant ones.

The replacement was mostly automated with semantic patches like the one
below with manual massaging of format strings.

  @@
  expression ptr, size, align;
  @@
  ptr = memblock_alloc(size, align);
  + if (!ptr)
  + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);

[anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
  Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
[rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
  Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
[rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
  Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
[akpm@linux-foundation.org: fix xtensa printk warning]
Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Guo Ren <ren_guo@c-sky.com>		[c-sky]
Acked-by: Paul Burton <paul.burton@mips.com>		[MIPS]
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390]
Reviewed-by: Juergen Gross <jgross@suse.com>		[Xen]
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Acked-by: Max Filippov <jcmvbkbc@gmail.com>		[xtensa]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-12 10:04:02 -07:00
Mike Rapoport ecc3e771f4 memblock: memblock_phys_alloc(): don't panic
Make the memblock_phys_alloc() function an inline wrapper for
memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
to check the returned value and panic in case of error.

Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com>			[Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-12 10:04:01 -07:00
Mike Rapoport 337555744e memblock: memblock_phys_alloc_try_nid(): don't panic
The memblock_phys_alloc_try_nid() function tries to allocate memory from
the requested node and then falls back to allocation from any node in
the system.  The memblock_alloc_base() fallback used by this function
panics if the allocation fails.

Replace the memblock_alloc_base() fallback with the direct call to
memblock_alloc_range_nid() and update the memblock_phys_alloc_try_nid()
callers to check the returned value and panic in case of error.

Link: http://lkml.kernel.org/r/1548057848-15136-7-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>		[powerpc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com>			[Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-12 10:04:01 -07:00
Linus Torvalds 3d8dfe75ef arm64 updates for 5.1:
- Pseudo NMI support for arm64 using GICv3 interrupt priorities
 
 - uaccess macros clean-up (unsafe user accessors also merged but
   reverted, waiting for objtool support on arm64)
 
 - ptrace regsets for Pointer Authentication (ARMv8.3) key management
 
 - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the
   riscv maintainers)
 
 - arm64/perf updates: PMU bindings converted to json-schema, unused
   variable and misleading comment removed
 
 - arm64/debug fixes to ensure checking of the triggering exception level
   and to avoid the propagation of the UNKNOWN FAR value into the si_code
   for debug signals
 
 - Workaround for Fujitsu A64FX erratum 010001
 
 - lib/raid6 ARM NEON optimisations
 
 - NR_CPUS now defaults to 256 on arm64
 
 - Minor clean-ups (documentation/comments, Kconfig warning, unused
   asm-offsets, clang warnings)
 
 - MAINTAINERS update for list information to the ARM64 ACPI entry
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlyCl0cACgkQa9axLQDI
 XvEyKxAAiogBZLbyhcy8bTUHVzVoJE0FyAkdO2wWnnaff2Ohkhy1Y/npv33IeK2q
 RknxqDIx2DUUVPJNRZGoI/WwBtTZdKaAnW4rIKG84yC1eAkFcd96WQasaZzcp1qY
 HmvbJiYXM0bh+0J7i3Wgry/QzOkrltJFJW2kp6Wd5aFE+R1WyWyxT6d+Fp0J3vlA
 bT70jlpBK6LXEOmmBS+04Ml02+8MvaGxIl8EInBHSfDLRLErj5E8n41rRHKUiSWz
 maWI+kVoLYwOE68xiZlDftUBEeQpUSWgg2nxeK+640QSl1wJmVcRcY9nm6TZeMG2
 AiZTR9a7cP5rrdSN5suUmb7d4AMMVlVMisGDlwb+9oCxeTRDzg0uwACaVgHfPqQr
 UeBdHbL9nStN7uBH23H8L9mKk+tqpFmk0sgzdrKejOwysAiqWV8aazb/Na3qnVRl
 J1B5opxMnGOsjXmHvtG/tiZl281Uwz5ZmzfLmIY3gUZgUgdA3511Egp0ry5y1dzJ
 SkYC4Hmzb2ybQvXGIDDa3OzCwXXiqyqKsO+O8Egg1k4OIwbp3w+NHE7gKeA+dMgD
 gjN7zEalCUi46Q28xiCPEb+88BpQ18czIWGQLb9mAnmYeZPjqqenXKXuRHr4lgVe
 jPURJ/vqvFEglZJN1RDuQHKzHEcm5f2XE566sMZYdSoeiUCb0QM=
 =2U56
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Pseudo NMI support for arm64 using GICv3 interrupt priorities

 - uaccess macros clean-up (unsafe user accessors also merged but
   reverted, waiting for objtool support on arm64)

 - ptrace regsets for Pointer Authentication (ARMv8.3) key management

 - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by
   the riscv maintainers)

 - arm64/perf updates: PMU bindings converted to json-schema, unused
   variable and misleading comment removed

 - arm64/debug fixes to ensure checking of the triggering exception
   level and to avoid the propagation of the UNKNOWN FAR value into the
   si_code for debug signals

 - Workaround for Fujitsu A64FX erratum 010001

 - lib/raid6 ARM NEON optimisations

 - NR_CPUS now defaults to 256 on arm64

 - Minor clean-ups (documentation/comments, Kconfig warning, unused
   asm-offsets, clang warnings)

 - MAINTAINERS update for list information to the ARM64 ACPI entry

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
  arm64: mmu: drop paging_init comments
  arm64: debug: Ensure debug handlers check triggering exception level
  arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
  Revert "arm64: uaccess: Implement unsafe accessors"
  arm64: avoid clang warning about self-assignment
  arm64: Kconfig.platforms: fix warning unmet direct dependencies
  lib/raid6: arm: optimize away a mask operation in NEON recovery routine
  lib/raid6: use vdupq_n_u8 to avoid endianness warnings
  arm64: io: Hook up __io_par() for inX() ordering
  riscv: io: Update __io_[p]ar() macros to take an argument
  asm-generic/io: Pass result of I/O accessor to __io_[p]ar()
  arm64: Add workaround for Fujitsu A64FX erratum 010001
  arm64: Rename get_thread_info()
  arm64: Remove documentation about TIF_USEDFPU
  arm64: irqflags: Fix clang build warnings
  arm64: Enable the support of pseudo-NMIs
  arm64: Skip irqflags tracing for NMI in IRQs disabled context
  arm64: Skip preemption when exiting an NMI
  arm64: Handle serror in NMI context
  irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI
  ...
2019-03-10 10:17:23 -07:00
Linus Torvalds d276709ce6 ACPI updates for 5.1-rc1
- Update the ACPICA code in the kernel to upstream revision 20190215
    including ACPI 6.3 support and more:
    * New predefined methods: _NBS, _NCH, _NIC, _NIH, and _NIG (Erik
      Schmauss).
    * Update of the PCC Identifier structure in PDTT (Erik Schmauss).
    * Support for new Generic Affinity Structure subtable in SRAT
      (Erik Schmauss).
    * New PCC operation region support (Erik Schmauss).
    * Support for GICC statistical profiling for MADT (Erik Schmauss).
    * New Error Disconnect Recover notification support (Erik Schmauss).
    * New PPTT Processor Structure Flags fields support (Erik Schmauss).
    * ACPI 6.3 HMAT updates (Erik Schmauss).
    * GTDT Revision 3 support (Erik Schmauss).
    * Legacy module-level code (MLC) support removal (Erik Schmauss).
    * Update/clarification of messages for control method failures
      (Bob Moore).
    * Warning on creation of a zero-length opregion (Bob Moore).
    * acpiexec option to dump extra info for memory leaks (Bob Moore).
    * More ACPI error to firmware error conversions (Bob Moore).
    * Debugger fix (Bob Moore).
    * Copyrights update (Bob Moore).
 
  - Clean up sleep states support code in ACPICA (Christoph Hellwig).
 
  - Rework in_nmi() handling in the APEI code and add suppor for the
    ARM Software Delegated Exception Interface (SDEI) to it (James
    Morse).
 
  - Fix possible out-of-bounds accesses in BERT-related core (Ross
    Lagerwall).
 
  - Fix the APEI code parsing HEST that includes a Deferred Machine
    Check subtable (Yazen Ghannam).
 
  - Use DEFINE_DEBUGFS_ATTRIBUTE for APEI-related debugfs files
    (YueHaibing).
 
  - Switch the APEI ERST code to the new generic UUID API (Andy
    Shevchenko).
 
  - Update the MAINTAINERS entry for APEI (Borislav Petkov).
 
  - Fix and clean up the ACPI EC driver (Rafael Wysocki, Zhang Rui).
 
  - Fix DMI checks handling in the ACPI backlight driver and add the
    "Lunch Box" chassis-type check to it (Hans de Goede).
 
  - Add support for using ACPI table overrides included in built-in
    initrd images (Shunyong Yang).
 
  - Update ACPI device enumeration to treat the PWM2 device as "always
    present" on Lenovo Yoga Book (Yauhen Kharuzhy).
 
  - Fix up the enumeration of device objects with the PRP0001 device
    ID (Andy Shevchenko).
 
  - Clean up PPTT parsing error messages (John Garry).
 
  - Clean up debugfs files creation handling (Greg Kroah-Hartman,
    Rafael Wysocki).
 
  - Clean up the ACPI DPTF Makefile (Masahiro Yamada).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJcfSIaAAoJEILEb/54YlRxvL8P/2oiG+u3tm3JahQ2tk9iiX3S
 4yjYMB5Gmhua3w/t6tnRHHhy3pjjgI6xH5S7WB0VPTMp57E91EQihcbLJNFiJ1Jf
 zjeZtWSmoxvcVwHAXq0DZHFMRK9Xgc/1ckzWNH/pwVlBSgaYazuLr6bwtZhtorci
 eNWi82abWfAp6kAXjzJkcFbEp9+H6JzseewKcT8VAKn63KZizCEzxT0PuE9c54km
 QnILVB9we0aGD2i0w2BRpbz99Wse0vnoUkBcrDw0LFHCaEQjfyAa94YFVQVrkE1Q
 ynH26+yQanyzH00q/HWuH7N7YdcYMYT1CgZoIKR5XtJ+CbTc63VQez4csLOgOFMM
 VEwmuv5SdRQ+tLCNFn71dxRheAttKI/nGBAZWMRTLQkp412IrQP4BtWw4wFM8SHZ
 3G7eReR/bBeS4u1T5KR8CVVxchinDdwnTvqQII1uEniX80AmsHsQZxtU+JdPDp+w
 N6gUE+lPF8e4iT+YsrWFMoNsJ9/MoXbSPQK1oYIcL0f5+PjFMxjTbA53wDiMHAhS
 9AqVW1fdSPX0ImV3DuDqHph3ekAt26QHKxIA2xj5WTRWKf+29ijO2+5zU8isT7kI
 RfGzpvsSYdvPyIRLUqc/Q3d5u/ElacAaaKJNT+6gUT4AkINAZJKQRiw2dWO1g82O
 HVuSc5hRfnAJ5ALfCdIG
 =r6fU
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These are ACPICA updates including ACPI 6.3 support among other
  things, APEI updates including the ARM Software Delegated Exception
  Interface (SDEI) support, ACPI EC driver fixes and cleanups and other
  assorted improvements.

  Specifics:

   - Update the ACPICA code in the kernel to upstream revision 20190215
     including ACPI 6.3 support and more:
      * New predefined methods: _NBS, _NCH, _NIC, _NIH, and _NIG (Erik
        Schmauss).
      * Update of the PCC Identifier structure in PDTT (Erik Schmauss).
      * Support for new Generic Affinity Structure subtable in SRAT
        (Erik Schmauss).
      * New PCC operation region support (Erik Schmauss).
      * Support for GICC statistical profiling for MADT (Erik Schmauss).
      * New Error Disconnect Recover notification support (Erik
        Schmauss).
      * New PPTT Processor Structure Flags fields support (Erik
        Schmauss).
      * ACPI 6.3 HMAT updates (Erik Schmauss).
      * GTDT Revision 3 support (Erik Schmauss).
      * Legacy module-level code (MLC) support removal (Erik Schmauss).
      * Update/clarification of messages for control method failures
        (Bob Moore).
      * Warning on creation of a zero-length opregion (Bob Moore).
      * acpiexec option to dump extra info for memory leaks (Bob Moore).
      * More ACPI error to firmware error conversions (Bob Moore).
      * Debugger fix (Bob Moore).
      * Copyrights update (Bob Moore)

   - Clean up sleep states support code in ACPICA (Christoph Hellwig)

   - Rework in_nmi() handling in the APEI code and add suppor for the
     ARM Software Delegated Exception Interface (SDEI) to it (James
     Morse)

   - Fix possible out-of-bounds accesses in BERT-related core (Ross
     Lagerwall)

   - Fix the APEI code parsing HEST that includes a Deferred Machine
     Check subtable (Yazen Ghannam)

   - Use DEFINE_DEBUGFS_ATTRIBUTE for APEI-related debugfs files
     (YueHaibing)

   - Switch the APEI ERST code to the new generic UUID API (Andy
     Shevchenko)

   - Update the MAINTAINERS entry for APEI (Borislav Petkov)

   - Fix and clean up the ACPI EC driver (Rafael Wysocki, Zhang Rui)

   - Fix DMI checks handling in the ACPI backlight driver and add the
     "Lunch Box" chassis-type check to it (Hans de Goede)

   - Add support for using ACPI table overrides included in built-in
     initrd images (Shunyong Yang)

   - Update ACPI device enumeration to treat the PWM2 device as "always
     present" on Lenovo Yoga Book (Yauhen Kharuzhy)

   - Fix up the enumeration of device objects with the PRP0001 device ID
     (Andy Shevchenko)

   - Clean up PPTT parsing error messages (John Garry)

   - Clean up debugfs files creation handling (Greg Kroah-Hartman,
     Rafael Wysocki)

   - Clean up the ACPI DPTF Makefile (Masahiro Yamada)"

* tag 'acpi-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
  ACPI / bus: Respect PRP0001 when retrieving device match data
  ACPICA: Update version to 20190215
  ACPI/ACPICA: Trivial: fix spelling mistakes and fix whitespace formatting
  ACPICA: ACPI 6.3: add GTDT Revision 3 support
  ACPICA: ACPI 6.3: HMAT updates
  ACPICA: ACPI 6.3: PPTT add additional fields in Processor Structure Flags
  ACPICA: ACPI 6.3: add Error Disconnect Recover Notification value
  ACPICA: ACPI 6.3: MADT: add support for statistical profiling in GICC
  ACPICA: ACPI 6.3: add PCC operation region support for AML interpreter
  efi: cper: Fix possible out-of-bounds access
  ACPI: APEI: Fix possible out-of-bounds access to BERT region
  ACPICA: ACPI 6.3: SRAT: add Generic Affinity Structure subtable
  ACPICA: ACPI 6.3: Add Trigger order to PCC Identifier structure in PDTT
  ACPICA: ACPI 6.3: Adding predefined methods _NBS, _NCH, _NIC, _NIH, and _NIG
  ACPICA: Update/clarify messages for control method failures
  ACPICA: Debugger: Fix possible fault with the "test objects" command
  ACPICA: Interpreter: Emit warning for creation of a zero-length op region
  ACPICA: Remove legacy module-level code support
  ACPI / x86: Make PWM2 device always present at Lenovo Yoga Book
  ACPI / video: Extend chassis-type detection with a "Lunch Box" check
  ..
2019-03-06 13:33:11 -08:00
Alexey Dobriyan b9726c26dc numa: make "nr_node_ids" unsigned int
Number of NUMA nodes can't be negative.

This saves a few bytes on x86_64:

	add/remove: 0/0 grow/shrink: 4/21 up/down: 27/-265 (-238)
	Function                                     old     new   delta
	hv_synic_alloc.cold                           88     110     +22
	prealloc_shrinker                            260     262      +2
	bootstrap                                    249     251      +2
	sched_init_numa                             1566    1567      +1
	show_slab_objects                            778     777      -1
	s_show                                      1201    1200      -1
	kmem_cache_init                              346     345      -1
	__alloc_workqueue_key                       1146    1145      -1
	mem_cgroup_css_alloc                        1614    1612      -2
	__do_sys_swapon                             4702    4699      -3
	__list_lru_init                              655     651      -4
	nic_probe                                   2379    2374      -5
	store_user_store                             118     111      -7
	red_zone_store                               106      99      -7
	poison_store                                 106      99      -7
	wq_numa_init                                 348     338     -10
	__kmem_cache_empty                            75      65     -10
	task_numa_free                               186     173     -13
	merge_across_nodes_store                     351     336     -15
	irq_create_affinity_masks                   1261    1246     -15
	do_numa_crng_init                            343     321     -22
	task_numa_fault                             4760    4737     -23
	swapfile_init                                179     156     -23
	hv_synic_alloc                               536     492     -44
	apply_wqattrs_prepare                        746     695     -51

Link: http://lkml.kernel.org/r/20190201223029.GA15820@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:19 -08:00
David Hildenbrand d9fa9d9517 arm64: kdump: no need to mark crashkernel pages manually PG_reserved
The crashkernel is reserved via memblock_reserve().  memblock_free_all()
will call free_low_memory_core_early(), which will go over all reserved
memblocks, marking the pages as PG_reserved.

So manually marking pages as PG_reserved is not necessary, they are
already in the desired state (otherwise they would have been handed over
to the buddy as free pages and bad things would happen).

Link: http://lkml.kernel.org/r/20190114125903.24845-8-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Greg Hackmann <ghackmann@android.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kristina Martsenko <kristina.martsenko@arm.com>
Cc: CHANDAN VN <chandan.vn@samsung.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:19 -08:00
Anshuman Khandual 5480280d3f arm64/mm: enable HugeTLB migration for contiguous bit HugeTLB pages
Let arm64 subscribe to the previously added framework in which
architecture can inform whether a given huge page size is supported for
migration.  This just overrides the default function
arch_hugetlb_migration_supported() and enables migration for all
possible HugeTLB page sizes on arm64.

With this, HugeTLB migration support on arm64 now covers all possible
HugeTLB options.

          CONT PTE    PMD    CONT PMD    PUD
          --------    ---    --------    ---
  4K:        64K      2M        32M      1G
  16K:        2M     32M         1G
  64K:        2M    512M        16G

Link: http://lkml.kernel.org/r/1545121450-1663-6-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-03-05 21:07:15 -08:00
Rafael J. Wysocki dcaed592b2 Merge branch 'acpi-apei'
* acpi-apei: (29 commits)
  efi: cper: Fix possible out-of-bounds access
  ACPI: APEI: Fix possible out-of-bounds access to BERT region
  MAINTAINERS: Add James Morse to the list of APEI reviewers
  ACPI / APEI: Add support for the SDEI GHES Notification type
  firmware: arm_sdei: Add ACPI GHES registration helper
  ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  ACPI / APEI: Only use queued estatus entry during in_nmi_queue_one_entry()
  ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
  ACPI / APEI: Make GHES estatus header validation more user friendly
  ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: Move locking to the notification helper
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  ACPI / APEI: Don't allow ghes_ack_error() to mask earlier errors
  ACPI / APEI: Generalise the estatus queue's notify code
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  ...
2019-03-04 11:16:35 +01:00
Peng Fan b855b58ac1 arm64: mmu: drop paging_init comments
The comments could not reflect the code, and it is easy to get
what this function does from a straight-line reading of the code.
So let's drop the comments

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 16:40:07 +00:00
Will Deacon b9a4b9d084 arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals
FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by
taking a hardware watchpoint. Unfortunately, if a debug handler returns
a non-zero value, then we will propagate the UNKNOWN FAR value to
userspace via the si_addr field of the SIGTRAP siginfo_t.

Instead, let's set si_addr to take on the PC of the faulting instruction,
which we have available in the current pt_regs.

Cc: <stable@vger.kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-03-01 16:23:17 +00:00
Zhang Lei 3e32131abc arm64: Add workaround for Fujitsu A64FX erratum 010001
On the Fujitsu-A64FX cores ver(1.0, 1.1), memory access may cause
an undefined fault (Data abort, DFSC=0b111111). This fault occurs under
a specific hardware condition when a load/store instruction performs an
address translation. Any load/store instruction, except non-fault access
including Armv8 and SVE might cause this undefined fault.

The TCR_ELx.NFD1 bit is used by the kernel when CONFIG_RANDOMIZE_BASE
is enabled to mitigate timing attacks against KASLR where the kernel
address space could be probed using the FFR and suppressed fault on
SVE loads.

Since this erratum causes spurious exceptions, which may corrupt
the exception registers, we clear the TCR_ELx.NFDx=1 bits when
booting on an affected CPU.

Signed-off-by: Zhang Lei <zhang.lei@jp.fujitsu.com>
[Generated MIDR value/mask for __cpu_setup(), removed spurious-fault handler
 and always disabled the NFDx bits on affected CPUs]
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-28 16:24:25 +00:00
Andrey Konovalov 3f41b60938 kasan: fix random seed generation for tag-based mode
There are two issues with assigning random percpu seeds right now:

1. We use for_each_possible_cpu() to iterate over cpus, but cpumask is
   not set up yet at the moment of kasan_init(), and thus we only set
   the seed for cpu #0.

2. A call to get_random_u32() always returns the same number and produces
   a message in dmesg, since the random subsystem is not yet initialized.

Fix 1 by calling kasan_init_tags() after cpumask is set up.

Fix 2 by using get_cycles() instead of get_random_u32(). This gives us
lower quality random numbers, but it's good enough, as KASAN is meant to
be used as a debugging tool and not a mitigation.

Link: http://lkml.kernel.org/r/1f815cc914b61f3516ed4cc9bfd9eeca9bd5d9de.1550677973.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-21 09:01:00 -08:00
James Morse d44f1b8dd7 arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
To split up APEIs in_nmi() path, the caller needs to always be
in_nmi(). Add a helper to do the work and claim the notification.

When KVM or the arch code takes an exception that might be a RAS
notification, it asks the APEI firmware-first code whether it wants
to claim the exception. A future kernel-first mechanism may be queried
afterwards, and claim the notification, otherwise we fall through
to the existing default behaviour.

The NOTIFY_SEA code was merged before considering multiple, possibly
interacting, NMI-like notifications and the need to consider kernel
first in the future. Make the 'claiming' behaviour explicit.

Restructuring the APEI code to allow multiple NMI-like notifications
means any notification that might interrupt interrupts-masked
code must always be wrapped in nmi_enter()/nmi_exit(). This will
allow APEI to use in_nmi() to use the right fixmap entries.

Mask SError over this window to prevent an asynchronous RAS error
arriving and tripping 'nmi_enter()'s BUG_ON(in_nmi()).

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-07 23:10:45 +01:00
James Morse 0db5e02230 KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
To split up APEIs in_nmi() path, the caller needs to always be
in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing
out into a header file.

Currently guest synchronous external aborts are claimed as RAS
notifications by handle_guest_sea(), which is hidden in the arch codes
mm/fault.c. 32bit gets a dummy declaration in system_misc.h.

There is going to be more of this in the future if/when the kernel
supports the SError-based firmware-first notification mechanism and/or
kernel-first notifications for both synchronous external abort and
SError. Each of these will come with some Kconfig symbols and a
handful of header files.

Create a header file for all this.

This patch gives handle_guest_sea() a 'kvm_' prefix, and moves the
declarations to kvm_ras.h as preparation for a future patch that moves
the ACPI-specific RAS code out of mm/fault.c.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-02-07 23:10:45 +01:00
Julien Thierry a9806aa259 arm64: Unmask PMR before going idle
CPU does not received signals for interrupts with a priority masked by
ICC_PMR_EL1. This means the CPU might not come back from a WFI
instruction.

Make sure ICC_PMR_EL1 does not mask interrupts when doing a WFI.

Since the logic of cpu_do_idle is becoming a bit more complex than just
two instructions, lets turn it from ASM to C.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-02-06 10:05:18 +00:00
Will Deacon d23c808c6f arm64: ptdump: Don't iterate kernel page tables using PTRS_PER_PXX
When 52-bit virtual addressing is enabled for userspace
(CONFIG_ARM64_USER_VA_BITS_52=y), the kernel continues to utilise 48-bit
virtual addressing in TTBR1. Consequently, PTRS_PER_PGD reflects the
larger page table size for userspace and the pgd pointer for kernel page
tables is offset before being written to TTBR1.

This means that we can't use PTRS_PER_PGD to iterate over kernel page
tables unless we apply the same offset, which is fiddly to get right and
leads to some non-idiomatic walking code. Instead, just follow the usual
pattern when walking page tables by using a while loop driven by
pXd_offset() and pXd_addr_end().

Reported-by: Qian Cai <cai@lca.pw>
Tested-by: Qian Cai <cai@lca.pw>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-02-04 14:37:38 +00:00
Catalin Marinas 132fdc379e arm64: Do not issue IPIs for user executable ptes
Commit 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache
for kernel mappings") was aimed at fixing the I-cache invalidation for
kernel mappings. However, it inadvertently caused all cache maintenance
for user mappings via set_pte_at() -> __sync_icache_dcache() ->
sync_icache_aliases() to call kick_all_cpus_sync().

Reported-by: Shijith Thotton <sthotton@marvell.com>
Tested-by: Shijith Thotton <sthotton@marvell.com>
Reported-by: Wandun Chen <chenwandun@huawei.com>
Fixes: 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Cc: <stable@vger.kernel.org> # 4.19.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-02-01 14:06:50 +00:00
Greg Kroah-Hartman e2a2e56e40 arm64: dump: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peng Donglin <dolinux.peng@gmail.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-01-31 17:38:19 +00:00
Christoph Hellwig 60d8cd572f arm64/xen: fix xen-swiotlb cache flushing
Xen-swiotlb hooks into the arm/arm64 arch code through a copy of the DMA
DMA mapping operations stored in the struct device arch data.

Switching arm64 to use the direct calls for the merged DMA direct /
swiotlb code broke this scheme.  Replace the indirect calls with
direct-calls in xen-swiotlb as well to fix this problem.

Fixes: 356da6d0cd ("dma-mapping: bypass indirect calls for dma-direct")
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2019-01-23 22:14:56 +01:00
Logan Gunthorpe a2c801c53d arm64: mm: make use of new memblocks_present() helper
Cleanup the arm64_memory_present() function seeing it's very
similar to other arches.

memblocks_present() is a direct replacement of arm64_memory_present()

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-01-21 17:34:29 +00:00
Will Deacon 83504032e6 arm64: Remove asm/memblock.h
The arm64 asm/memblock.h header exists only to provide a function
prototype for arm64_memblock_init(), which is called only from
setup_arch().

Move the declaration into mmu.h, where it can live alongside other
init functions such as paging_init() and bootmem_init() without the
need for its own special header file.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-01-21 17:31:15 +00:00
Linus Torvalds 078a5a4faf arm64 fixes for -rc1
- Prevent KASLR from mapping the top page of the virtual address space
 
 - Fix device-tree probing of SDEI driver
 
 - Fix incorrect register offset definition in Hisilicon DDRC PMU driver
 
 - Fix compilation issue with older binutils not liking unsigned immediates
 
 - Fix uapi headers so that libc can provide its own sigcontext definition
 
 - Fix handling of private compat syscalls
 
 - Hook up compat io_pgetevents() syscall for 32-bit tasks
 
 - Cleanup to arm64 Makefile (including now to avoid silly conflicts)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJcL3d+AAoJELescNyEwWM0PNcIAIdjWQeBQYMBc8C/A2dBqL2s
 tWBI+ormmZO72eAOVuGr1ZBqPhIpqXPQQquchnPDEzL+vZiq5Y6HP6ND8a+ISN2c
 0NmWH2aURR+SZG5Mfpa9PffUlDu1LVbssbzt3Vk89BmOEFwBbr5w9FEO96c8drJC
 MJ5NICtHnTvuI9jRs9zQoJOk+LKAL1Ei3v7EEyJGKVlRahtaYGZIkfx9t1BmFXzB
 SFCA7Zf8kHQItKAwfGWsocd7CP7hQZcmpFcn/GfjXML2FQ+sa9Slys+u+8mvSziQ
 EiU5os5krKPUpXXmyOeWXzEukZSJMRm2f9FBR2YquYm5RJ7Y0xQH1pB4aLsCR0g=
 =LvTk
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "I'm safely chained back up to my desk, so please pull these arm64
  fixes for -rc1 that address some issues that cropped up during the
  merge window:

   - Prevent KASLR from mapping the top page of the virtual address
     space

   - Fix device-tree probing of SDEI driver

   - Fix incorrect register offset definition in Hisilicon DDRC PMU
     driver

   - Fix compilation issue with older binutils not liking unsigned
     immediates

   - Fix uapi headers so that libc can provide its own sigcontext
     definition

   - Fix handling of private compat syscalls

   - Hook up compat io_pgetevents() syscall for 32-bit tasks

   - Cleanup to arm64 Makefile (including now to avoid silly conflicts)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: compat: Hook up io_pgetevents() for 32-bit tasks
  arm64: compat: Don't pull syscall number from regs in arm_compat_syscall
  arm64: compat: Avoid sending SIGILL for unallocated syscall numbers
  arm64/sve: Disentangle <uapi/asm/ptrace.h> from <uapi/asm/sigcontext.h>
  arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition
  drivers/perf: hisi: Fixup one DDRC PMU register offset
  arm64: replace arm64-obj-* in Makefile with obj-*
  arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
  firmware: arm_sdei: Fix DT platform device creation
  firmware: arm_sdei: fix wrong of_node_put() in init function
  arm64: entry: remove unused register aliases
  arm64: smp: Fix compilation error
2019-01-05 11:28:39 -08:00
Yueyi Li c8a43c18a9 arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region
When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the top 4K of kernel
virtual address space may be mapped to physical addresses despite being
reserved for ERR_PTR values.

Fix the randomization of the linear region so that we avoid mapping the
last page of the virtual address space.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: liyueyi <liyueyi@live.com>
[will: rewrote commit message; merged in suggestion from Ard]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-03 18:03:54 +00:00
Linus Torvalds 030672aea8 Merge tag 'devicetree-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull Devicetree updates from Rob Herring:
 "The biggest highlight here is the start of using json-schema for DT
  bindings. Being able to validate bindings has been discussed for years
  with little progress.

   - Initial support for DT bindings using json-schema language. This is
     the start of converting DT bindings from free-form text to a
     structured format.

   - Reworking of initrd address initialization. This moves to using the
     phys address instead of virt addr in the DT parsing code. This
     rework was motivated by CONFIG_DEV_BLK_INITRD causing unnecessary
     rebuilding of lots of files.

   - Fix stale phandle entries in phandle cache

   - DT overlay validation improvements. This exposed several memory
     leak bugs which have been fixed.

   - Use node name and device_type helper functions in DT code

   - Last remaining conversions to using %pOFn printk specifier instead
     of device_node.name directly

   - Create new common RTC binding doc and move all trivial RTC devices
     out of trivial-devices.txt.

   - New bindings for Freescale MAG3110 magnetometer, Cadence Sierra
     PHY, and Xen shared memory

   - Update dtc to upstream version v1.4.7-57-gf267e674d145"

* tag 'devicetree-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (68 commits)
  of: __of_detach_node() - remove node from phandle cache
  of: of_node_get()/of_node_put() nodes held in phandle cache
  gpio-omap.txt: add reg and interrupts properties
  dt-bindings: mrvl,intc: fix a trivial typo
  dt-bindings: iio: magnetometer: add dt-bindings for freescale mag3110
  dt-bindings: Convert trivial-devices.txt to json-schema
  dt-bindings: arm: mrvl: amend Browstone compatible string
  dt-bindings: arm: Convert Tegra board/soc bindings to json-schema
  dt-bindings: arm: Convert ZTE board/soc bindings to json-schema
  dt-bindings: arm: Add missing Xilinx boards
  dt-bindings: arm: Convert Xilinx board/soc bindings to json-schema
  dt-bindings: arm: Convert VIA board/soc bindings to json-schema
  dt-bindings: arm: Convert ST STi board/soc bindings to json-schema
  dt-bindings: arm: Convert SPEAr board/soc bindings to json-schema
  dt-bindings: arm: Convert CSR SiRF board/soc bindings to json-schema
  dt-bindings: arm: Convert QCom board/soc bindings to json-schema
  dt-bindings: arm: Convert TI nspire board/soc bindings to json-schema
  dt-bindings: arm: Convert TI davinci board/soc bindings to json-schema
  dt-bindings: arm: Convert Calxeda board/soc bindings to json-schema
  dt-bindings: arm: Convert Altera board/soc bindings to json-schema
  ...
2018-12-28 20:08:34 -08:00
Linus Torvalds f346b0becb Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - large KASAN update to use arm's "software tag-based mode"

 - a few misc things

 - sh updates

 - ocfs2 updates

 - just about all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (167 commits)
  kernel/fork.c: mark 'stack_vm_area' with __maybe_unused
  memcg, oom: notify on oom killer invocation from the charge path
  mm, swap: fix swapoff with KSM pages
  include/linux/gfp.h: fix typo
  mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm
  hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
  hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
  memory_hotplug: add missing newlines to debugging output
  mm: remove __hugepage_set_anon_rmap()
  include/linux/vmstat.h: remove unused page state adjustment macro
  mm/page_alloc.c: allow error injection
  mm: migrate: drop unused argument of migrate_page_move_mapping()
  blkdev: avoid migration stalls for blkdev pages
  mm: migrate: provide buffer_migrate_page_norefs()
  mm: migrate: move migrate_page_lock_buffers()
  mm: migrate: lock buffers before migrate_page_move_mapping()
  mm: migration: factor out code to compute expected number of page references
  mm, page_alloc: enable pcpu_drain with zone capability
  kmemleak: add config to select auto scan
  mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
  ...
2018-12-28 16:55:46 -08:00
Linus Torvalds af7ddd8a62 DMA mapping updates for Linux 4.21
A huge update this time, but a lot of that is just consolidating or
 removing code:
 
  - provide a common DMA_MAPPING_ERROR definition and avoid indirect
    calls for dma_map_* error checking
  - use direct calls for the DMA direct mapping case, avoiding huge
    retpoline overhead for high performance workloads
  - merge the swiotlb dma_map_ops into dma-direct
  - provide a generic remapping DMA consistent allocator for architectures
    that have devices that perform DMA that is not cache coherent. Based
    on the existing arm64 implementation and also used for csky now.
  - improve the dma-debug infrastructure, including dynamic allocation
    of entries (Robin Murphy)
  - default to providing chaining scatterlist everywhere, with opt-outs
    for the few architectures (alpha, parisc, most arm32 variants) that
    can't cope with it
  - misc sparc32 dma-related cleanups
  - remove the dma_mark_clean arch hook used by swiotlb on ia64 and
    replace it with the generic noncoherent infrastructure
  - fix the return type of dma_set_max_seg_size (Niklas Söderlund)
  - move the dummy dma ops for not DMA capable devices from arm64 to
    common code (Robin Murphy)
  - ensure dma_alloc_coherent returns zeroed memory to avoid kernel data
    leaks through userspace.  We already did this for most common
    architectures, but this ensures we do it everywhere.
    dma_zalloc_coherent has been deprecated and can hopefully be
    removed after -rc1 with a coccinelle script.
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlwctQgLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMxgQ//dBpAfS4/J76CdAbYry2zqgcOUU9hIrD6NHiEMWov
 ltJxyvEl3LsUmIdEj3aCrYL9jZN0qsnCzn5BVj2c3jDIVgD64fAr7HDf/PbEEfKb
 j6/GgEnVLPZV+sQMvhNA5jOzHrkseaqPa4/pNLFZ/l8jnuZ2d+btusDWJpMoVDer
 TXVwtIfgeIu0gTygYOShLYXd5qptWKWsZEpbTZOO2sE6+x+ZJX7yQYUxYDTlcOIj
 JWVO2l5QNHPc5T9o2at+6L5aNUvnZOxT79sWgyZLn0Kc+FagKAVwfLqUEl0v7foG
 8k/xca5/8p3afB1DfrIrtplJqis7cVgdyGxriwuuoO8X4F0nPyWwpGmxsBhrWwwl
 xTqC4UorEJ7QwoP6Azopk/vYI2QXIUBLjuCJCuFXZj9+2BGf4IfvBY1S2cLM9qLs
 HMcxQonuXJii044KEFS96ePEuiT+igVINweIFBKWcgNCEG0UQtyL6RQ1U5297ipF
 JiWZAqD+p9X52UdKS+oKfAiZEekMXn6Xyo97+YCiNpfOo0GP5eEcwhL+JpY4AiRq
 apPXtsRy2o1s8yfjdraUIM2Mc2n62vFKb35oUbGCd/QO9piPrFQHl6T0HHcHk4YR
 XrUXcHieFZBCYqh7ZVa4RL8Msq1wvGuTL4Dxl43mXdsMoUFRR6eSNWLoAV4IpOLZ
 WgA=
 =in72
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping

Pull DMA mapping updates from Christoph Hellwig:
 "A huge update this time, but a lot of that is just consolidating or
  removing code:

   - provide a common DMA_MAPPING_ERROR definition and avoid indirect
     calls for dma_map_* error checking

   - use direct calls for the DMA direct mapping case, avoiding huge
     retpoline overhead for high performance workloads

   - merge the swiotlb dma_map_ops into dma-direct

   - provide a generic remapping DMA consistent allocator for
     architectures that have devices that perform DMA that is not cache
     coherent. Based on the existing arm64 implementation and also used
     for csky now.

   - improve the dma-debug infrastructure, including dynamic allocation
     of entries (Robin Murphy)

   - default to providing chaining scatterlist everywhere, with opt-outs
     for the few architectures (alpha, parisc, most arm32 variants) that
     can't cope with it

   - misc sparc32 dma-related cleanups

   - remove the dma_mark_clean arch hook used by swiotlb on ia64 and
     replace it with the generic noncoherent infrastructure

   - fix the return type of dma_set_max_seg_size (Niklas Söderlund)

   - move the dummy dma ops for not DMA capable devices from arm64 to
     common code (Robin Murphy)

   - ensure dma_alloc_coherent returns zeroed memory to avoid kernel
     data leaks through userspace. We already did this for most common
     architectures, but this ensures we do it everywhere.
     dma_zalloc_coherent has been deprecated and can hopefully be
     removed after -rc1 with a coccinelle script"

* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
  dma-mapping: fix inverted logic in dma_supported
  dma-mapping: deprecate dma_zalloc_coherent
  dma-mapping: zero memory returned from dma_alloc_*
  sparc/iommu: fix ->map_sg return value
  sparc/io-unit: fix ->map_sg return value
  arm64: default to the direct mapping in get_arch_dma_ops
  PCI: Remove unused attr variable in pci_dma_configure
  ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
  dma-mapping: bypass indirect calls for dma-direct
  vmd: use the proper dma_* APIs instead of direct methods calls
  dma-direct: merge swiotlb_dma_ops into the dma_direct code
  dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
  dma-direct: improve addressability error reporting
  swiotlb: remove dma_mark_clean
  swiotlb: remove SWIOTLB_MAP_ERROR
  ACPI / scan: Refactor _CCA enforcement
  dma-mapping: factor out dummy DMA ops
  dma-mapping: always build the direct mapping code
  dma-mapping: move dma_cache_sync out of line
  dma-mapping: move various slow path functions out of line
  ...
2018-12-28 14:12:21 -08:00
Will Deacon 8e2d43405b lib/ioremap: ensure break-before-make is used for huge p4d mappings
Whilst no architectures actually enable support for huge p4d mappings in
the vmap area, the code that is implemented should be using
break-before-make, as we do for pud and pmd huge entries.

Link: http://lkml.kernel.org/r/1544120495-17438-6-git-send-email-will.deacon@arm.com
Signed-off-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:50 -08:00
Will Deacon 9c006972c3 arm64: mmu: drop pXd_present() checks from pXd_free_pYd_table()
The core code already has a check for pXd_none(), so remove it from the
architecture implementation.

Link: http://lkml.kernel.org/r/1544120495-17438-3-git-send-email-will.deacon@arm.com
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Chintan Pandya <cpandya@codeaurora.org>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:50 -08:00
Qian Cai fed84c7852 mm/memblock.c: skip kmemleak for kasan_init()
Kmemleak does not play well with KASAN (tested on both HPE Apollo 70 and
Huawei TaiShan 2280 aarch64 servers).

After calling start_kernel()->setup_arch()->kasan_init(), kmemleak early
log buffer went from something like 280 to 260000 which caused kmemleak
disabled and crash dump memory reservation failed.  The multitude of
kmemleak_alloc() calls is from nested loops while KASAN is setting up full
memory mappings, so let early kmemleak allocations skip those
memblock_alloc_internal() calls came from kasan_init() given that those
early KASAN memory mappings should not reference to other memory.  Hence,
no kmemleak false positives.

kasan_init
  kasan_map_populate [1]
    kasan_pgd_populate [2]
      kasan_pud_populate [3]
        kasan_pmd_populate [4]
          kasan_pte_populate [5]
            kasan_alloc_zeroed_page
              memblock_alloc_try_nid
                memblock_alloc_internal
                  kmemleak_alloc

[1] for_each_memblock(memory, reg)
[2] while (pgdp++, addr = next, addr != end)
[3] while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)))
[4] while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)))
[5] while (ptep++, addr = next, addr != end && pte_none(READ_ONCE(*ptep)))

Link: http://lkml.kernel.org/r/1543442925-17794-1-git-send-email-cai@gmx.us
Signed-off-by: Qian Cai <cai@gmx.us>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:49 -08:00
Andrey Konovalov 21696c1613 kasan, arm64: enable top byte ignore for the kernel
Tag-based KASAN uses the Top Byte Ignore feature of arm64 CPUs to store a
pointer tag in the top byte of each pointer.  This commit enables the
TCR_TBI1 bit, which enables Top Byte Ignore for the kernel, when tag-based
KASAN is used.

Link: http://lkml.kernel.org/r/f51eca084c8cdb2f3a55195fe342dc8953b7aead.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Andrey Konovalov 356607f21e kasan, arm64: fix up fault handling logic
Right now arm64 fault handling code removes pointer tags from addresses
covered by TTBR0 in faults taken from both EL0 and EL1, but doesn't do
that for pointers covered by TTBR1.

This patch adds two helper functions is_ttbr0_addr() and is_ttbr1_addr(),
where the latter one accounts for the fact that TTBR1 pointers might be
tagged when tag-based KASAN is in use, and uses these helper functions to
perform pointer checks in arch/arm64/mm/fault.c.

Link: http://lkml.kernel.org/r/3f349b0e9e48b5df3298a6b4ae0634332274494a.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Andrey Konovalov 3c9e3aa110 kasan: add tag related helper functions
This commit adds a few helper functions, that are meant to be used to work
with tags embedded in the top byte of kernel pointers: to set, to get or
to reset the top byte.

Link: http://lkml.kernel.org/r/f6c6437bb8e143bc44f42c3c259c62e734be7935.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Andrey Konovalov 080eb83f54 kasan: initialize shadow to 0xff for tag-based mode
A tag-based KASAN shadow memory cell contains a memory tag, that
corresponds to the tag in the top byte of the pointer, that points to that
memory.  The native top byte value of kernel pointers is 0xff, so with
tag-based KASAN we need to initialize shadow memory to 0xff.

[cai@lca.pw: arm64: skip kmemleak for KASAN again\
  Link: http://lkml.kernel.org/r/20181226020550.63712-1-cai@lca.pw
Link: http://lkml.kernel.org/r/5cc1b789aad7c99cf4f3ec5b328b147ad53edb40.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Andrey Konovalov 9577dd7486 kasan: rename kasan_zero_page to kasan_early_shadow_page
With tag based KASAN mode the early shadow value is 0xff and not 0x00, so
this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to
kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion.

Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-28 12:11:43 -08:00
Linus Torvalds 5694cecdb0 arm64 festive updates for 4.21
In the end, we ended up with quite a lot more than I expected:
 
 - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
   kernel-side support to come later)
 
 - Support for per-thread stack canaries, pending an update to GCC that
   is currently undergoing review
 
 - Support for kexec_file_load(), which permits secure boot of a kexec
   payload but also happens to improve the performance of kexec
   dramatically because we can avoid the sucky purgatory code from
   userspace. Kdump will come later (requires updates to libfdt).
 
 - Optimisation of our dynamic CPU feature framework, so that all
   detected features are enabled via a single stop_machine() invocation
 
 - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
   they can benefit from global TLB entries when KASLR is not in use
 
 - 52-bit virtual addressing for userspace (kernel remains 48-bit)
 
 - Patch in LSE atomics for per-cpu atomic operations
 
 - Custom preempt.h implementation to avoid unconditional calls to
   preempt_schedule() from preempt_enable()
 
 - Support for the new 'SB' Speculation Barrier instruction
 
 - Vectorised implementation of XOR checksumming and CRC32 optimisations
 
 - Workaround for Cortex-A76 erratum #1165522
 
 - Improved compatibility with Clang/LLD
 
 - Support for TX2 system PMUS for profiling the L3 cache and DMC
 
 - Reflect read-only permissions in the linear map by default
 
 - Ensure MMIO reads are ordered with subsequent calls to Xdelay()
 
 - Initial support for memory hotplug
 
 - Tweak the threshold when we invalidate the TLB by-ASID, so that
   mremap() performance is improved for ranges spanning multiple PMDs.
 
 - Minor refactoring and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul
 Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S
 B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv
 Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0
 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR
 O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8=
 =sYdt
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 festive updates from Will Deacon:
 "In the end, we ended up with quite a lot more than I expected:

   - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and
     kernel-side support to come later)

   - Support for per-thread stack canaries, pending an update to GCC
     that is currently undergoing review

   - Support for kexec_file_load(), which permits secure boot of a kexec
     payload but also happens to improve the performance of kexec
     dramatically because we can avoid the sucky purgatory code from
     userspace. Kdump will come later (requires updates to libfdt).

   - Optimisation of our dynamic CPU feature framework, so that all
     detected features are enabled via a single stop_machine()
     invocation

   - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that
     they can benefit from global TLB entries when KASLR is not in use

   - 52-bit virtual addressing for userspace (kernel remains 48-bit)

   - Patch in LSE atomics for per-cpu atomic operations

   - Custom preempt.h implementation to avoid unconditional calls to
     preempt_schedule() from preempt_enable()

   - Support for the new 'SB' Speculation Barrier instruction

   - Vectorised implementation of XOR checksumming and CRC32
     optimisations

   - Workaround for Cortex-A76 erratum #1165522

   - Improved compatibility with Clang/LLD

   - Support for TX2 system PMUS for profiling the L3 cache and DMC

   - Reflect read-only permissions in the linear map by default

   - Ensure MMIO reads are ordered with subsequent calls to Xdelay()

   - Initial support for memory hotplug

   - Tweak the threshold when we invalidate the TLB by-ASID, so that
     mremap() performance is improved for ranges spanning multiple PMDs.

   - Minor refactoring and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits)
  arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset()
  arm64: sysreg: Use _BITUL() when defining register bits
  arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches
  arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4
  arm64: docs: document pointer authentication
  arm64: ptr auth: Move per-thread keys from thread_info to thread_struct
  arm64: enable pointer authentication
  arm64: add prctl control for resetting ptrauth keys
  arm64: perf: strip PAC when unwinding userspace
  arm64: expose user PAC bit positions via ptrace
  arm64: add basic pointer authentication support
  arm64/cpufeature: detect pointer authentication
  arm64: Don't trap host pointer auth use to EL2
  arm64/kvm: hide ptrauth from guests
  arm64/kvm: consistently handle host HCR_EL2 flags
  arm64: add pointer authentication register bits
  arm64: add comments about EC exception levels
  arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned
  arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field
  arm64: enable per-task stack canaries
  ...
2018-12-25 17:41:56 -08:00
Logan Gunthorpe d1402fc708 mm: introduce common STRUCT_PAGE_MAX_SHIFT define
This define is used by arm64 to calculate the size of the vmemmap
region.  It is defined as the log2 of the upper bound on the size of a
struct page.

We move it into mm_types.h so it can be defined properly instead of set
and checked with a build bug.  This also allows us to use the same
define for riscv.

Link: http://lkml.kernel.org/r/20181107205433.3875-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-14 15:05:45 -08:00
Christoph Hellwig 356da6d0cd dma-mapping: bypass indirect calls for dma-direct
Avoid expensive indirect calls in the fast path DMA mapping
operations by directly calling the dma_direct_* ops if we are using
the directly mapped DMA operations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:18 +01:00
Christoph Hellwig 55897af630 dma-direct: merge swiotlb_dma_ops into the dma_direct code
While the dma-direct code is (relatively) clean and simple we actually
have to use the swiotlb ops for the mapping on many architectures due
to devices with addressing limits.  Instead of keeping two
implementations around this commit allows the dma-direct
implementation to call the swiotlb bounce buffering functions and
thus share the guts of the mapping implementation.  This also
simplified the dma-mapping setup on a few architectures where we
don't have to differenciate which implementation to use.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
2018-12-13 21:06:17 +01:00
Robin Murphy 90ac706e98 dma-mapping: factor out dummy DMA ops
The dummy DMA ops are currently used by arm64 for any device which has
an invalid ACPI description and is thus barred from using DMA due to not
knowing whether is is cache-coherent or not. Factor these out into
general dma-mapping code so that they can be referenced from other
common code paths. In the process, we can prune all the optional
callbacks which just do the same thing as the default behaviour, and
fill in .map_resource for completeness.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[hch: moved to a separate source file]
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13 21:06:12 +01:00
Rob Herring acc2038738 Merge branch 'yaml-bindings-for-v4.21' into dt/next 2018-12-13 11:20:36 -06:00
Robin Murphy 4ab2150615 arm64: Add memory hotplug support
Wire up the basic support for hot-adding memory. Since memory hotplug
is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
cross-check the presence of a section in the manner of the generic
implementation, before falling back to memblock to check for no-map
regions within a present section as before. By having arch_add_memory(()
create the linear mapping first, this then makes everything work in the
way that __add_section() expects.

We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
should be safe from races by virtue of the global device hotplug lock.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-12 14:43:43 +00:00
Robin Murphy 3238c359ac arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
We need to invalidate the caches *before* clearing the buffer via the
non-cacheable alias, else in the worst case __dma_flush_area() may
write back dirty lines over the top of our nice new zeros.

Fixes: dd65a941f6 ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag")
Cc: <stable@vger.kernel.org> # 4.18.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-11 11:55:32 +00:00
Will Deacon 4a1daf29d3 arm64: mm: EXPORT vabits_user to modules
TASK_SIZE is defined using the vabits_user variable for 64-bit tasks,
so ensure that this variable is exported to modules to avoid the
following build breakage with allmodconfig:

 | ERROR: "vabits_user" [lib/test_user_copy.ko] undefined!
 | ERROR: "vabits_user" [drivers/misc/lkdtm/lkdtm.ko] undefined!
 | ERROR: "vabits_user" [drivers/infiniband/hw/mlx5/mlx5_ib.ko] undefined!

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 19:20:23 +00:00
Will Deacon 68d23da437 arm64: Kconfig: Re-jig CONFIG options for 52-bit VA
Enabling 52-bit VAs for userspace is pretty confusing, since it requires
you to select "48-bit" virtual addressing in the Kconfig.

Rework the logic so that 52-bit user virtual addressing is advertised in
the "Virtual address space size" choice, along with some help text to
describe its interaction with Pointer Authentication. The EXPERT-only
option to force all user mappings to the 52-bit range is then made
available immediately below the VA size selection.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:18 +00:00
Steve Capper 67e7fdfcc6 arm64: mm: introduce 52-bit userspace support
On arm64 there is optional support for a 52-bit virtual address space.
To exploit this one has to be running with a 64KB page size and be
running on hardware that supports this.

For an arm64 kernel supporting a 48 bit VA with a 64KB page size,
some changes are needed to support a 52-bit userspace:
 * TCR_EL1.T0SZ needs to be 12 instead of 16,
 * TASK_SIZE needs to reflect the new size.

This patch implements the above when the support for 52-bit VAs is
detected at early boot time.

On arm64 userspace addresses translation is controlled by TTBR0_EL1. As
well as userspace, TTBR0_EL1 controls:
 * The identity mapping,
 * EFI runtime code.

It is possible to run a kernel with an identity mapping that has a
larger VA size than userspace (and for this case __cpu_set_tcr_t0sz()
would set TCR_EL1.T0SZ as appropriate). However, when the conditions for
52-bit userspace are met; it is possible to keep TCR_EL1.T0SZ fixed at
12. Thus in this patch, the TCR_EL1.T0SZ size changing logic is
disabled.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Steve Capper e842dfb5a2 arm64: mm: Offset TTBR1 to allow 52-bit PTRS_PER_PGD
Enabling 52-bit VAs on arm64 requires that the PGD table expands from 64
entries (for the 48-bit case) to 1024 entries. This quantity,
PTRS_PER_PGD is used as follows to compute which PGD entry corresponds
to a given virtual address, addr:

pgd_index(addr) -> (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)

Userspace addresses are prefixed by 0's, so for a 48-bit userspace
address, uva, the following is true:
(uva >> PGDIR_SHIFT) & (1024 - 1) == (uva >> PGDIR_SHIFT) & (64 - 1)

In other words, a 48-bit userspace address will have the same pgd_index
when using PTRS_PER_PGD = 64 and 1024.

Kernel addresses are prefixed by 1's so, given a 48-bit kernel address,
kva, we have the following inequality:
(kva >> PGDIR_SHIFT) & (1024 - 1) != (kva >> PGDIR_SHIFT) & (64 - 1)

In other words a 48-bit kernel virtual address will have a different
pgd_index when using PTRS_PER_PGD = 64 and 1024.

If, however, we note that:
kva = 0xFFFF << 48 + lower (where lower[63:48] == 0b)
and, PGDIR_SHIFT = 42 (as we are dealing with 64KB PAGE_SIZE)

We can consider:
(kva >> PGDIR_SHIFT) & (1024 - 1) - (kva >> PGDIR_SHIFT) & (64 - 1)
 = (0xFFFF << 6) & 0x3FF - (0xFFFF << 6) & 0x3F	// "lower" cancels out
 = 0x3C0

In other words, one can switch PTRS_PER_PGD to the 52-bit value globally
provided that they increment ttbr1_el1 by 0x3C0 * 8 = 0x1E00 bytes when
running with 48-bit kernel VAs (TCR_EL1.T1SZ = 16).

For kernel configuration where 52-bit userspace VAs are possible, this
patch offsets ttbr1_el1 and sets PTRS_PER_PGD corresponding to the
52-bit value.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
[will: added comment to TTBR1_BADDR_4852_OFFSET calculation]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Steve Capper 363524d2b1 arm64: mm: Introduce DEFAULT_MAP_WINDOW
We wish to introduce a 52-bit virtual address space for userspace but
maintain compatibility with software that assumes the maximum VA space
size is 48 bit.

In order to achieve this, on 52-bit VA systems, we make mmap behave as
if it were running on a 48-bit VA system (unless userspace explicitly
requests a VA where addr[51:48] != 0).

On a system running a 52-bit userspace we need TASK_SIZE to represent
the 52-bit limit as it is used in various places to distinguish between
kernelspace and userspace addresses.

Thus we need a new limit for mmap, stack, ELF loader and EFI (which uses
TTBR0) to represent the non-extended VA space.

This patch introduces DEFAULT_MAP_WINDOW and DEFAULT_MAP_WINDOW_64 and
switches the appropriate logic to use that instead of TASK_SIZE.

Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 18:42:17 +00:00
Will Deacon 33309ecda0 arm64: Fix minor issues with the dcache_by_line_op macro
The dcache_by_line_op macro suffers from a couple of small problems:

First, the GAS directives that are currently being used rely on
assembler behavior that is not documented, and probably not guaranteed
to produce the correct behavior going forward. As a result, we end up
with some undefined symbols in cache.o:

$ nm arch/arm64/mm/cache.o
         ...
         U civac
         ...
         U cvac
         U cvap
         U cvau

This is due to the fact that the comparisons used to select the
operation type in the dcache_by_line_op macro are comparing symbols
not strings, and even though it seems that GAS is doing the right
thing here (undefined symbols by the same name are equal to each
other), it seems unwise to rely on this.

Second, when patching in a DC CVAP instruction on CPUs that support it,
the fallback path consists of a DC CVAU instruction which may be
affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.

Solve these issues by unrolling the various maintenance routines and
using the conditional directives that are documented as operating on
strings. To avoid the complexity of nested alternatives, we move the
DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
to __clean_dcache_area_poc if DCPOP is not supported by the CPU.

Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 15:03:51 +00:00
Mark Rutland 03ef055fd3 arm64: move memstart_addr export inline
Since we define memstart_addr in a C file, we can have the export
immediately after the definition of the symbol, as we do elsewhere.

As a step towards removing arm64ksyms.c, move the export of
memstart_addr to init.c, where the symbol is defined.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-10 11:50:11 +00:00
Allen Pais a21b0b78ea arm64: hugetlb: Register hugepages during arch init
Add hstate for each supported hugepage size using arch initcall.

* no hugepage parameters

  Without hugepage parameters, only a default hugepage size is
  available for dynamic allocation.  It's different, for example, from
  x86_64 and sparc64 where all supported hugepage sizes are available.

* only default_hugepagesz= is specified and set not to HPAGE_SIZE

  In spite of the fact that default_hugepagesz= is set to a valid
  hugepage size, it's treated as unsupported and reverted to
  HPAGE_SIZE.  Such behaviour is also different from x86_64 and
  sparc64.

Acked-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Dmitry Klochkov <dmitry.klochkov@oracle.com>
Signed-off-by: Allen Pais <allen.pais@oracle.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-12-06 17:01:13 +00:00
Christoph Hellwig cad34be747 iommu/dma-iommu: remove the mapping_error dma_map_ops method
Return DMA_MAPPING_ERROR instead of 0 on a dma mapping failure and let
the core dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06 06:56:48 -08:00
Christoph Hellwig 52f0b3ee0b arm64: remove the dummy_dma_ops mapping_error method
Just return DMA_MAPPING_ERROR from __dummy_map_page and let the core
dma-mapping code handle the rest.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-06 06:56:41 -08:00
Christoph Hellwig 0c3b3171ce dma-mapping: move the arm64 noncoherent alloc/free support to common code
The arm64 codebase to implement coherent dma allocation for architectures
with non-coherent DMA is a good start for a generic implementation, given
that is uses the generic remap helpers, provides the atomic pool for
allocations that can't sleep and still is realtively simple and well
tested.  Move it to kernel/dma and allow architectures to opt into it
using a config symbol.  Architectures just need to provide a new
arch_dma_prep_coherent helper to writeback an invalidate the caches
for any memory that gets remapped for uncached access.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-12-01 18:07:11 +01:00
Florian Fainelli 229c55ccb4 arch: Move initrd= parsing into do_mounts_initrd.c
ARC, ARM, ARM64 and Unicore32 are all capable of parsing the "initrd="
command line parameter to allow specifying the physical address and size
of an initrd. Move that parsing into init/do_mounts_initrd.c such that
we no longer duplicate that logic.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-11-26 15:50:45 -06:00
Florian Fainelli c756c592e4 arm64: Utilize phys_initrd_start/phys_initrd_size
ARM64 is the only architecture that re-defines
__early_init_dt_declare_initrd() in order for that function to populate
initrd_start/initrd_end with physical addresses instead of virtual
addresses. Instead of having an override we can leverage
drivers/of/fdt.c populating phys_initrd_start/phys_initrd_size to
populate those variables for us.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2018-11-26 15:50:31 -06:00
Ard Biesheuvel c55191e96c arm64: mm: apply r/o permissions of VM areas to its linear alias as well
On arm64, we use block mappings and contiguous hints to map the linear
region, to minimize the TLB footprint. However, this means that the
entire region is mapped using read/write permissions, which we cannot
modify at page granularity without having to take intrusive measures to
prevent TLB conflicts.

This means the linear aliases of pages belonging to read-only mappings
(executable or otherwise) in the vmalloc region are also mapped read/write,
and could potentially be abused to modify things like module code, bpf JIT
code or other read-only data.

So let's fix this, by extending the set_memory_ro/rw routines to take
the linear alias into account. The consequence of enabling this is
that we can no longer use block mappings or contiguous hints, so in
cases where the TLB footprint of the linear region is a bottleneck,
performance may be affected.

Therefore, allow this feature to be runtime en/disabled, by setting
rodata=full (or 'on' to disable just this enhancement, or 'off' to
disable read-only mappings for code and r/o data entirely) on the
kernel command line. Also, allow the default value to be set via a
Kconfig option.

Tested-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-20 11:38:26 +00:00
Ard Biesheuvel b34d2ef0c6 arm64: mm: purge lazily unmapped vm regions before changing permissions
Call vm_unmap_aliases() every time we apply any changes to permission
attributes of mappings in the vmalloc region. This avoids any potential
issues resulting from lingering writable or executable aliases of
mappings that should be read-only or non-executable, respectively.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-11-20 11:38:26 +00:00
Ard Biesheuvel 24cc61d8cb arm64: memblock: don't permit memblock resizing until linear mapping is up
Bhupesh reports that having numerous memblock reservations at early
boot may result in the following crash:

  Unable to handle kernel paging request at virtual address ffff80003ffe0000
  ...
  Call trace:
   __memcpy+0x110/0x180
   memblock_add_range+0x134/0x2e8
   memblock_reserve+0x70/0xb8
   memblock_alloc_base_nid+0x6c/0x88
   __memblock_alloc_base+0x3c/0x4c
   memblock_alloc_base+0x28/0x4c
   memblock_alloc+0x2c/0x38
   early_pgtable_alloc+0x20/0xb0
   paging_init+0x28/0x7f8

This is caused by the fact that we permit memblock resizing before the
linear mapping is up, and so the memblock_reserved() array is moved
into memory that is not mapped yet.

So let's ensure that this crash can no longer occur, by deferring to
call to memblock_allow_resize() to after the linear mapping has been
created.

Reported-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-08 17:54:03 +00:00
Christoph Hellwig f62717551b arm64: fix warnings without CONFIG_IOMMU_DMA
__swiotlb_get_sgtable_page and __swiotlb_mmap_pfn are not only misnamed
but also only used if CONFIG_IOMMU_DMA is set.  Just add a simple ifdef
for now, given that we plan to remove them entirely for the next merge
window.

Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2018-11-02 19:27:22 +01:00
Mike Rapoport 57c8a661d9 mm: remove include/linux/bootmem.h
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Mike Rapoport c6ffc5ca8f memblock: rename free_all_bootmem to memblock_free_all
The conversion is done using

sed -i 's@free_all_bootmem@memblock_free_all@' \
    $(git grep -l free_all_bootmem)

Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Mike Rapoport 2013288f72 memblock: replace free_bootmem{_node} with memblock_free
The free_bootmem and free_bootmem_node are merely wrappers for
memblock_free. Replace their usage with a call to memblock_free using the
following semantic patch:

@@
expression e1, e2, e3;
@@
(
- free_bootmem(e1, e2)
+ memblock_free(e1, e2)
|
- free_bootmem_node(e1, e2, e3)
+ memblock_free(e2, e3)
)

Link: http://lkml.kernel.org/r/1536927045-23536-24-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:16 -07:00
Mike Rapoport eb31d559f1 memblock: remove _virt from APIs returning virtual address
The conversion is done using

sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
	$(git grep -l memblock_virt_alloc)

Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:15 -07:00
Mike Rapoport 9a8dd708d5 memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc*
Make it explicit that the caller gets a physical address rather than a
virtual one.

This will also allow using meblock_alloc prefix for memblock allocations
returning virtual address, which is done in the following patches.

The conversion is done using the following semantic patch:

@@
expression e1, e2, e3;
@@
(
- memblock_alloc(e1, e2)
+ memblock_phys_alloc(e1, e2)
|
- memblock_alloc_nid(e1, e2, e3)
+ memblock_phys_alloc_nid(e1, e2, e3)
|
- memblock_alloc_try_nid(e1, e2, e3)
+ memblock_phys_alloc_try_nid(e1, e2, e3)
)

Link: http://lkml.kernel.org/r/1536927045-23536-7-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-31 08:54:15 -07:00
Linus Torvalds befa936331 Second batch of dma-mapping updates for 4.20:
- various swiotlb cleanups
  - do not dip into the ѕwiotlb pool for dma coherent allocations
  - add support for not cache coherent DMA to swiotlb
  - switch ARM64 to use the generic swiotlb_dma_ops
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlvTDNELHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYOfdg//a4nGw5bTwj2uol/oS/vOdpE0otR3XpnNC3oopHd3
 3xbb0LUR12A0YJmV5Ao9iIRFc60boLNblZglho8Diqks3P9+nYWCQ2rQ+o7ZsfK9
 xGBNP0HrdGKPBzs9JodvhP/WBNWEReNbb6KwMe63LtDqUqNDNFmlYfxUgQ+fbgST
 ZscR7iHJT1Nl0PoJ0mVziPFDA5DFB6JN7z0nU9uddVXLM7MPOFVX/WsFdLZLg/8W
 /jgQ1gQ+gRMBSyctR8u+LG5KTxtYofiPo46lbyuSa5sx+cnZmjo7Uk5OZgs3soHn
 gnobSjxgNbPgt8ttaKoFtOpyuO/3cLQeObS8i2MQuzvYpUD/nBT5q6l8XZBAFQCc
 BkOe0qaBa2BrSG2O7BRgagHuSUjpgkxR3vyVT5e6eyr7uFk+LWslJKx4Uem3uIRf
 yfNQzx91Xaw7ypaF7//jVPeWucLFZuwYsy13ZAg5D71vYpZBkfgHoMSCrmZQT9o/
 V1ijmJy2ZDlGIKH95A0Utr6QKrci1drigg8Zce5A2E2EPAzTiro90ZeGT2tcJmVs
 LOrbxEN1Dctkkk5l6PJE3k4zYCP7rW4lbW7wPcLZrJoptsm3sQprE10zaAFGKks6
 vPLY615FK/JydJ9AIjx6PhFiY6M6Zb2er6CbRo18BjpqjKHgmK0A/uCbLhN+Qnac
 IoE=
 =pI5p
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-4.20-1' of git://git.infradead.org/users/hch/dma-mapping

Pull more dma-mapping updates from Christoph Hellwig:

 - various swiotlb cleanups

 - do not dip into the ѕwiotlb pool for dma coherent allocations

 - add support for not cache coherent DMA to swiotlb

 - switch ARM64 to use the generic swiotlb_dma_ops

* tag 'dma-mapping-4.20-1' of git://git.infradead.org/users/hch/dma-mapping:
  arm64: use the generic swiotlb_dma_ops
  swiotlb: add support for non-coherent DMA
  swiotlb: don't dip into swiotlb pool for coherent allocations
  swiotlb: refactor swiotlb_map_page
  swiotlb: use swiotlb_map_page in swiotlb_map_sg_attrs
  swiotlb: merge swiotlb_unmap_page and unmap_single
  swiotlb: remove the overflow buffer
  swiotlb: do not panic on mapping failures
  swiotlb: mark is_swiotlb_buffer static
  swiotlb: remove a pointless comment
2018-10-26 11:29:17 -07:00
Linus Torvalds d1f2b1710d IOMMU Updates for Linux v4.20
These updates bring:
 
 	- Debugfs support for the Intel VT-d driver. When enabled, it
 	  now also exposes some of its internal data structures to
 	  user-space for debugging purposes.
 
 	- ARM-SMMU driver now uses the generic deferred flushing
 	  and fast-path iova allocation code. This is expected to be a
 	  major performance improvement, as this allocation path scales
 	  a lot better.
 
 	- Support for r8a7744 in the Renesas iommu driver
 
 	- Couple of minor fixes and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJb0vixAAoJECvwRC2XARrj0lkQALur432cGae8225gLNG+Ab1B
 lDGz/8uJeV4V552r58msq/yFpVascoMYOCgS+5N5J/jn5UiPnWxk//Uz2lvvCsFn
 3Z4HswSbmNLSuEHmN3/1CK28An44LjYxtnH/zAEaHRJgWNmC05lO4glPXaSIBwVS
 ve6ULymHJittCHFNNAstNBvMYirYV2y+FYxoq6EteTuCruNNXR78KQV7TqPYI+uZ
 0DwaXUyxO+HZbVeLpOnj/WHZ6+EUY0cHwHuk8U6ZCHnINZ+k9knt+WUvYu7wPCtj
 jGIyJXW5BG0rjJZnVUQs9BFXFSJLV2Ap8M3zKVIyFAUAyStEtGHct0YMRC29GX/J
 e45GPbElAZqx1NWRGGTV0xTsH5Gn85S2nP3p7iiPhj5zUhX/6SreZBDQdC+brtsB
 8HG85xohsUkVmRq/ez4hu0yqXtB66ppV7TcOjyixybG+ixRPtUwTbiaYUxbvkZTr
 hcYUVLGcpJX463VjUKGoRPFL/jZ6BXUWdLVllZPYgDT+IBXtQx1TB20DDtj5V2mR
 3m7B0xLQJDWdarhdA9Oj0FQj7ivmwmitcJ9EoNvHSRdEoE1iIy1vHv/7v/GokRVS
 J1YT5ZYAsGHBgZIsL7FpVA37i9t3JPVvgakUV/ZfLDyG3v+P0+eS3gNhECYt5luS
 D8G7Jy+2vsitO/ZCyu/r
 =q1HJ
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

 - Debugfs support for the Intel VT-d driver.

   When enabled, it now also exposes some of its internal data
   structures to user-space for debugging purposes.

 - ARM-SMMU driver now uses the generic deferred flushing and fast-path
   iova allocation code.

   This is expected to be a major performance improvement, as this
   allocation path scales a lot better.

 - Support for r8a7744 in the Renesas iommu driver

 - Couple of minor fixes and improvements all over the place

* tag 'iommu-updates-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (39 commits)
  iommu/arm-smmu-v3: Remove unnecessary wrapper function
  iommu/arm-smmu-v3: Add SPDX header
  iommu/amd: Add default branch in amd_iommu_capable()
  dt-bindings: iommu: ipmmu-vmsa: Add r8a7744 support
  iommu/amd: Move iommu_init_pci() to .init section
  iommu/arm-smmu: Support non-strict mode
  iommu/io-pgtable-arm-v7s: Add support for non-strict mode
  iommu/arm-smmu-v3: Add support for non-strict mode
  iommu/io-pgtable-arm: Add support for non-strict mode
  iommu: Add "iommu.strict" command line option
  iommu/dma: Add support for non-strict mode
  iommu/arm-smmu: Ensure that page-table updates are visible before TLBI
  iommu/arm-smmu-v3: Implement flush_iotlb_all hook
  iommu/arm-smmu-v3: Avoid back-to-back CMD_SYNC operations
  iommu/arm-smmu-v3: Fix unexpected CMD_SYNC timeout
  iommu/io-pgtable-arm: Fix race handling in split_blk_unmap()
  iommu/arm-smmu-v3: Fix a couple of minor comment typos
  iommu: Fix a typo
  iommu: Remove .domain_{get,set}_windows
  iommu: Tidy up window attributes
  ...
2018-10-26 10:50:10 -07:00
Linus Torvalds ba9f6f8954 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo updates from Eric Biederman:
 "I have been slowly sorting out siginfo and this is the culmination of
  that work.

  The primary result is in several ways the signal infrastructure has
  been made less error prone. The code has been updated so that manually
  specifying SEND_SIG_FORCED is never necessary. The conversion to the
  new siginfo sending functions is now complete, which makes it
  difficult to send a signal without filling in the proper siginfo
  fields.

  At the tail end of the patchset comes the optimization of decreasing
  the size of struct siginfo in the kernel from 128 bytes to about 48
  bytes on 64bit. The fundamental observation that enables this is by
  definition none of the known ways to use struct siginfo uses the extra
  bytes.

  This comes at the cost of a small user space observable difference.
  For the rare case of siginfo being injected into the kernel only what
  can be copied into kernel_siginfo is delivered to the destination, the
  rest of the bytes are set to 0. For cases where the signal and the
  si_code are known this is safe, because we know those bytes are not
  used. For cases where the signal and si_code combination is unknown
  the bits that won't fit into struct kernel_siginfo are tested to
  verify they are zero, and the send fails if they are not.

  I made an extensive search through userspace code and I could not find
  anything that would break because of the above change. If it turns out
  I did break something it will take just the revert of a single change
  to restore kernel_siginfo to the same size as userspace siginfo.

  Testing did reveal dependencies on preferring the signo passed to
  sigqueueinfo over si->signo, so bit the bullet and added the
  complexity necessary to handle that case.

  Testing also revealed bad things can happen if a negative signal
  number is passed into the system calls. Something no sane application
  will do but something a malicious program or a fuzzer might do. So I
  have fixed the code that performs the bounds checks to ensure negative
  signal numbers are handled"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
  signal: Guard against negative signal numbers in copy_siginfo_from_user32
  signal: Guard against negative signal numbers in copy_siginfo_from_user
  signal: In sigqueueinfo prefer sig not si_signo
  signal: Use a smaller struct siginfo in the kernel
  signal: Distinguish between kernel_siginfo and siginfo
  signal: Introduce copy_siginfo_from_user and use it's return value
  signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
  signal: Fail sigqueueinfo if si_signo != sig
  signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
  signal/unicore32: Use force_sig_fault where appropriate
  signal/unicore32: Generate siginfo in ucs32_notify_die
  signal/unicore32: Use send_sig_fault where appropriate
  signal/arc: Use force_sig_fault where appropriate
  signal/arc: Push siginfo generation into unhandled_exception
  signal/ia64: Use force_sig_fault where appropriate
  signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
  signal/ia64: Use the generic force_sigsegv in setup_frame
  signal/arm/kvm: Use send_sig_mceerr
  signal/arm: Use send_sig_fault where appropriate
  signal/arm: Use force_sig_fault where appropriate
  ...
2018-10-24 11:22:39 +01:00
Linus Torvalds 5289851171 arm64 updates for 4.20:
- Core mmu_gather changes which allow tracking the levels of page-table
   being cleared together with the arm64 low-level flushing routines
 
 - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
   mitigate Spectre-v4 dynamically without trapping to EL3 firmware
 
 - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack
 
 - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4
 
 - Support for Common Not Private (CnP) translations allowing threads of
   the same CPU to share the TLB entries
 
 - Accelerated crc32 routines
 
 - Move swapper_pg_dir to the rodata section
 
 - Trap WFI instruction executed in user space
 
 - ARM erratum 1188874 workaround (arch_timer)
 
 - Miscellaneous fixes and clean-ups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlvKGdEACgkQa9axLQDI
 XvGSQBAAiOH6aQABL4TB7c5KIc7C+Unjm6QCFCoaeGWoHuemnM6cFJ7RQsi0GqnP
 dVEX5V/FKfmeTWO5g24Ah+MbTm3Bt6+81gywAmi1rrHhmCaCIPjT7xDqy/WsLlvt
 7WtgegSGvQ7DIMj2dbfFav6+ra67qAiYZTc46jvuynVl6DrE3BCiyTDbXAWt2nzP
 Xf3un4AHRbg3UEMUZTLqU5q4z0tbM6rEAZru8O0UOTnD2q7uttUqW3Ab7fpuEkkj
 lEVrMWD3h8SJg+Df9CbXmCNOjh4VhwBwDb5LgO8vA/AcyV/YLEF5b2OUAk/28qwo
 0GBwjqRyI4+YQ9LPg41MhGzrlnta0HCdYoeNLgLQZiDcUkuSfGhoA+MNZNOR8B08
 sCWF7F6f8UIQm8KMMBiYYdlVyUYgHLsWE/1+CyeLV0oIoWT5k3c+Xe3pho9KpVb0
 Co04TqMlqalry0sbevHz5c55H7iWIjB1Tpo3SxM105dVJVibXRPXkz+WZ5iPO+xa
 ex2j1kjNdA/AUzrSCZ5lh22zhg0WsfwD++E5meAaJMxieim8FeZDRga43rowJ0BA
 zMbSNB/+NDFZ9EhC40VaUfKk8Tkgiug9J5swv0+v7hy1QLDyydHhbOecTuIueauM
 6taiT2Iuov5yFng1eonYj4htvouVF4WOhPGthFPJMOcrB9mLMhs=
 =3Mc8
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "Apart from some new arm64 features and clean-ups, this also contains
  the core mmu_gather changes for tracking the levels of the page table
  being cleared and a minor update to the generic
  compat_sys_sigaltstack() introducing COMPAT_SIGMINSKSZ.

  Summary:

   - Core mmu_gather changes which allow tracking the levels of
     page-table being cleared together with the arm64 low-level flushing
     routines

   - Support for the new ARMv8.5 PSTATE.SSBS bit which can be used to
     mitigate Spectre-v4 dynamically without trapping to EL3 firmware

   - Introduce COMPAT_SIGMINSTKSZ for use in compat_sys_sigaltstack

   - Optimise emulation of MRS instructions to ID_* registers on ARMv8.4

   - Support for Common Not Private (CnP) translations allowing threads
     of the same CPU to share the TLB entries

   - Accelerated crc32 routines

   - Move swapper_pg_dir to the rodata section

   - Trap WFI instruction executed in user space

   - ARM erratum 1188874 workaround (arch_timer)

   - Miscellaneous fixes and clean-ups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (78 commits)
  arm64: KVM: Guests can skip __install_bp_hardening_cb()s HYP work
  arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
  arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  Documentation/arm64: HugeTLB page implementation
  arm64: mm: Use __pa_symbol() for set_swapper_pgd()
  arm64: Add silicon-errata.txt entry for ARM erratum 1188873
  Revert "arm64: uaccess: implement unsafe accessors"
  arm64: mm: Drop the unused cpu parameter
  MAINTAINERS: fix bad sdei paths
  arm64: mm: Use #ifdef for the __PAGETABLE_P?D_FOLDED defines
  arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
  arm64: xen: Use existing helper to check interrupt status
  arm64: Use daifflag_restore after bp_hardening
  arm64: daifflags: Use irqflags functions for daifflags
  arm64: arch_timer: avoid unused function warning
  arm64: Trap WFI executed in userspace
  arm64: docs: Document SSBS HWCAP
  arm64: docs: Fix typos in ELF hwcaps
  arm64/kprobes: remove an extra semicolon in arch_prepare_kprobe
  ...
2018-10-22 17:30:06 +01:00
Christoph Hellwig 886643b766 arm64: use the generic swiotlb_dma_ops
Now that the generic swiotlb code supports non-coherent DMA we can switch
to it for arm64.  For that we need to refactor the existing
alloc/free/mmap/pgprot helpers to be used as the architecture hooks,
and implement the standard arch_sync_dma_for_{device,cpu} hooks for
cache maintaincance in the streaming dma hooks, which also implies
using the generic dma_coherent flag in struct device.

Note that we need to keep the old is_device_dma_coherent function around
for now, so that the shared arm/arm64 Xen code keeps working.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-19 08:53:24 +02:00
Christoph Hellwig fafadcd165 swiotlb: don't dip into swiotlb pool for coherent allocations
All architectures that support swiotlb also have a zone that backs up
these less than full addressing allocations (usually ZONE_DMA32).

Because of that it is rather pointless to fall back to the global swiotlb
buffer if the normal dma direct allocation failed - the only thing this
will do is to eat up bounce buffers that would be more useful to serve
streaming mappings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-10-19 08:48:28 +02:00
Christoph Hellwig dff8d6c1ed swiotlb: remove the overflow buffer
Like all other dma mapping drivers just return an error code instead
of an actual memory buffer.  The reason for the overflow buffer was
that at the time swiotlb was invented there was no way to check for
dma mapping errors, but this has long been fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-10-19 08:43:46 +02:00
James Morse 26a6f87ef5 arm64: mm: Use __pa_symbol() for set_swapper_pgd()
commit 2330b7ca78 ("arm64/mm: use fixmap to modify
swapper_pg_dir") modifies the swapper_pg_dir via the fixmap
as the kernel page tables have been moved to a read-only
part of the kernel mapping.

Using __pa() to setup the fixmap causes CONFIG_DEBUG_VIRTUAL
to fire, as this function is used on the kernel-image swapper
address. The in_swapper_pgdir() test before each call of this
function means set_swapper_pgd() will only ever be called when
pgdp points somewhere in the kernel-image mapping of
swapper_pd_dir. Use __pa_symbol().

Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Jun Yao <yaojun8558363@gmail.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-10 17:57:06 +01:00
Joerg Roedel 2f2fbfb71e Merge branches 'arm/renesas', 'arm/smmu', 'ppc/pamu', 'x86/vt-d', 'x86/amd' and 'core' into next 2018-10-10 18:09:37 +02:00
Shaokun Zhang 742fafa50b arm64: mm: Drop the unused cpu parameter
Cpu parameter is never used in flush_context, remove it.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-09 17:17:23 +01:00
Kyrylo Tkachov 0293c8ba80 arm64: Fix typo in a comment in arch/arm64/mm/kasan_init.c
"bellow" -> "below"

The recommendation from kegel.com/kerspell is to only fix the howlers.
"Bellow" is a synonym of "howl" so this should be appropriate.

Signed-off-by: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-05 11:49:32 +01:00
Julien Thierry 9a0c032825 arm64: Use daifflag_restore after bp_hardening
For EL0 entries requiring bp_hardening, daif status is kept at
DAIF_PROCCTX_NOIRQ until after hardening has been done. Then interrupts
are enabled through local_irq_enable().

Before using local_irq_* functions, daifflags should be properly restored
to a state where IRQs are enabled.

Enable IRQs by restoring DAIF_PROCCTX state after bp hardening.

Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-03 16:12:21 +01:00
Anshuman Khandual 52338088ef arm64/numa: Unify common error path in numa_init()
At present numa_free_distance() is being called before numa_distance is
even initialized with numa_alloc_distance() which is really pointless.
Instead lets call numa_free_distance() on the common error path inside
numa_init() after numa_alloc_distance() has been successful.

Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms")
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 13:49:52 +01:00
Anshuman Khandual 77cfe95090 arm64/numa: Report correct memblock range for the dummy node
The dummy node ID is marked into all memory ranges on the system. So the
dummy node really extends the entire memblock.memory. Hence report correct
extent information for the dummy node using memblock range helper functions
instead of the range [0LLU, PFN_PHYS(max_pfn) - 1)].

Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms")
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 13:48:54 +01:00
Anshuman Khandual 359048f91d arm64/mm: Define esr_to_debug_fault_info()
fault_info[] and debug_fault_info[] are static arrays defining memory abort
exception handling functions looking into ESR fault status code encodings.
As esr_to_fault_info() is already available providing fault_info[] array
lookup, it really makes sense to have a corresponding debug_fault_info[]
array lookup function as well. This just adds an equivalent helper function
esr_to_debug_fault_info().

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 13:48:23 +01:00
Anshuman Khandual dbfe3828a6 arm64/mm: Reorganize arguments for is_el1_permission_fault()
Most memory abort exception handling related functions have the arguments
in the order (addr, esr, regs) except is_el1_permission_fault(). This
changes the argument order in this function as (addr, esr, regs) like
others.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 13:47:31 +01:00
Anshuman Khandual 00bbd5d901 arm64/mm: Use ESR_ELx_FSC macro while decoding fault exception
Just replace hard code value of 63 (0x111111) with an existing macro
ESR_ELx_FSC when parsing for the status code during fault exception.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-10-01 13:47:12 +01:00
Eric W. Biederman b4d5557caa signal/arm64: Add and use arm64_force_sig_mceerr as appropriate
Add arm64_force_sig_mceerr for consistency with arm64_force_sig_fault,
and use it in the one location that can take advantage of it.

This removes the fiddly filling out of siginfo before sending a signal
reporting an memory error to userspace.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:54:51 +02:00
Eric W. Biederman feca355b3d signal/arm64: Add and use arm64_force_sig_fault where appropriate
Wrap force_sig_fault with a helper that calls arm64_show_signal
and call arm64_force_sig_fault where appropraite.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2018-09-27 21:54:43 +02:00
Eric W. Biederman 559d8d91a8 signal/arm64: Only call set_thread_esr once in do_page_fault
This code is truly common between the signal sending cases so share it.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2018-09-27 21:54:33 +02:00
Eric W. Biederman 2d2837fab5 signal/arm64: Only perform one esr_to_fault_info call in do_page_fault
As this work is truly common between all of the signal sending cases
there is no need to repeat it between the different cases.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2018-09-27 21:54:26 +02:00
Eric W. Biederman effb093ad2 signal/arm64: Expand __do_user_fault and remove it
Not all of the signals passed to __do_user_fault can be handled
the same way so expand the now tiny __do_user_fault in it's callers
and remove it.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:54:18 +02:00
Eric W. Biederman aefab2b4c0 signal/arm64: For clarity separate the 3 signal sending cases in do_page_fault
It gets easy to confuse what is going on when some code is shared and some not
so stop sharing the trivial bits of signal generation to make future updates
easier to understand.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:54:10 +02:00
Eric W. Biederman 9ea3a9743c signal/arm64: Consolidate the two hwpoison cases in do_page_fault
These two cases are practically the same and use siginfo differently
from the other signals sent from do_page_fault.  So consolidate them
to make future changes easier.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:54:02 +02:00
Eric W. Biederman f29ad209e4 signal/arm64: Factor set_thread_esr out of __do_user_fault
This pepares for sending signals with something other than
arm64_force_sig_info.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:53:54 +02:00
Eric W. Biederman 24b8f79dd8 signal/arm64: Remove unneeded tsk parameter from arm64_force_sig_info
Every caller passes in current for tsk so there is no need to pass
tsk.  Instead make tsk a local variable initialized to current.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:53:35 +02:00
Eric W. Biederman 6fa998e83e signal/arm64: Push siginfo generation into arm64_notify_die
Instead of generating a struct siginfo before calling arm64_notify_die
pass the signal number, tne sicode and the fault address into
arm64_notify_die and have it call force_sig_fault instead of
force_sig_info to let the generic code generate the struct siginfo.

This keeps code passing just the needed information into
siginfo generating code, making it easier to see what
is happening and harder to get wrong.  Further by letting
the generic code handle the generation of struct siginfo
it reduces the number of sites generating struct siginfo
making it possible to review them and verify that all
of the fiddly details for a structure passed to userspace
are handled properly.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-09-27 21:52:54 +02:00
Christoph Hellwig 1a0afc14b5 Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"
This reverts commit 46053c7368.

This change breaks architectures setting up dma_ops in their own magic
way and not using arch_setup_dma_ops, so revert it.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-09-25 15:12:26 -07:00
Jun Yao 2330b7ca78 arm64/mm: use fixmap to modify swapper_pg_dir
Once swapper_pg_dir is in the rodata section, it will not be possible to
modify it directly, but we will need to modify it in some cases.

To enable this, we can use the fixmap when deliberately modifying
swapper_pg_dir. As the pgd is only transiently mapped, this provides
some resilience against illicit modification of the pgd, e.g. for
Kernel Space Mirror Attack (KSMA).

Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
[Mark: simplify ifdeffery, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25 15:10:55 +01:00
Jun Yao 2b5548b681 arm64/mm: Separate boot-time page tables from swapper_pg_dir
Since the address of swapper_pg_dir is fixed for a given kernel image,
it is an attractive target for manipulation via an arbitrary write. To
mitigate this we'd like to make it read-only by moving it into the
rodata section.

We require that swapper_pg_dir is at a fixed offset from tramp_pg_dir
and reserved_ttbr0, so these will also need to move into rodata.
However, swapper_pg_dir is allocated along with some transient page
tables used for boot which we do not want to move into rodata.

As a step towards this, this patch separates the boot-time page tables
into a new init_pg_dir, and reduces swapper_pg_dir to the single page it
needs to be. This allows us to retain the relationship between
swapper_pg_dir, tramp_pg_dir, and swapper_pg_dir, while cleanly
separating these from the boot-time page tables.

The init_pg_dir holds all of the pgd/pud/pmd/pte levels needed during
boot, and all of these levels will be freed when we switch to the
swapper_pg_dir, which is initialized by the existing code in
paging_init(). Since we start off on the init_pg_dir, we no longer need
to allocate a transient page table in paging_init() in order to ensure
that swapper_pg_dir isn't live while we initialize it.

There should be no functional change as a result of this patch.

Signed-off-by: Jun Yao <yaojun8558363@gmail.com>
Reviewed-by: James Morse <james.morse@arm.com>
[Mark: place init_pg_dir after BSS, fold mm changes, commit message]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-25 15:10:54 +01:00
Robin Murphy 7adb562c3e arm64/dma-mapping: Mildly optimise non-coherent IOMMU ops
Whilst the symmetry of deferring to the existing sync callback in
__iommu_map_page() is nice, taking a round-trip through
iommu_iova_to_phys() is a pretty heavyweight way to get an address we
can trivially compute from the page we already have. Tweaking it to just
perform the cache maintenance directly when appropriate doesn't really
make the code any more complicated, and the runtime efficiency gain can
only be a benefit.

Furthermore, the sync operations themselves know they can only be
invoked on a managed DMA ops domain, so can use the fast specific domain
lookup to avoid excessive manipulation of the group refcount
(particularly in the scatterlist cases).

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-09-25 10:23:16 +02:00
Steve Capper 031e6e6b4e arm64: hugetlb: Avoid unnecessary clearing in huge_ptep_set_access_flags
For contiguous hugetlb, huge_ptep_set_access_flags performs a
get_clear_flush (which then flushes the TLBs) even when no change of ptes
is necessary.

Unfortunately, this behaviour can lead to back-to-back page faults being
generated when running with multiple threads that access the same
contiguous huge page.

Thread 1                     |  Thread 2
-----------------------------+------------------------------
hugetlb_fault                |
huge_ptep_set_access_flags   |
  -> invalidate pte range    | hugetlb_fault
continue processing          | wait for hugetlb_fault_mutex
release mutex and return     | huge_ptep_set_access_flags
                             |   -> invalidate pte range
hugetlb_fault
...

This patch changes huge_ptep_set_access_flags s.t. we first read the
contiguous range of ptes (whilst preserving dirty information); the pte
range is only then invalidated where necessary and this prevents further
spurious page faults.

Fixes: d8bdcff287 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Reported-by: Lei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-24 17:51:50 +01:00
Steve Capper 469ed9d823 arm64: hugetlb: Fix handling of young ptes
In the contiguous bit hugetlb break-before-make code we assume that all
hugetlb pages are young.

In fact, remove_migration_pte is able to place an old hugetlb pte so
this assumption is not valid.

This patch fixes the contiguous hugetlb scanning code to preserve young
ptes.

Fixes: d8bdcff287 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-24 17:51:50 +01:00
James Morse 8a695a5873 arm64: Kconfig: Remove ARCH_HAS_HOLES_MEMORYMODEL
include/linux/mmzone.h describes ARCH_HAS_HOLES_MEMORYMODEL as
relevant when parts the memmap have been free()d. This would
happen on systems where memory is smaller than a sparsemem-section,
and the extra struct pages are expensive. pfn_valid() on these
systems returns true for the whole sparsemem-section, so an extra
memmap_valid_within() check is needed.

On arm64 we have nomap memory, so always provide pfn_valid() to test
for nomap pages. This means ARCH_HAS_HOLES_MEMORYMODEL's extra checks
are already rolled up into pfn_valid().

Remove it.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-21 12:02:45 +01:00
Vladimir Murzin 5ffdfaedfa arm64: mm: Support Common Not Private translations
Common Not Private (CNP) is a feature of ARMv8.2 extension which
allows translation table entries to be shared between different PEs in
the same inner shareable domain, so the hardware can use this fact to
optimise the caching of such entries in the TLB.

CNP occupies one bit in TTBRx_ELy and VTTBR_EL2, which advertises to
the hardware that the translation table entries pointed to by this
TTBR are the same as every PE in the same inner shareable domain for
which the equivalent TTBR also has CNP bit set. In case CNP bit is set
but TTBR does not point at the same translation table entries for a
given ASID and VMID, then the system is mis-configured, so the results
of translations are UNPREDICTABLE.

For kernel we postpone setting CNP till all cpus are up and rely on
cpufeature framework to 1) patch the code which is sensitive to CNP
and 2) update TTBR1_EL1 with CNP bit set. TTBR1_EL1 can be
reprogrammed as result of hibernation or cpuidle (via __enable_mmu).
For these two cases we restore CnP bit via __cpu_suspend_exit().

There are a few cases we need to care of changes in TTBR0_EL1:
  - a switch to idmap
  - software emulated PAN

we rule out latter via Kconfig options and for the former we make
sure that CNP is set for non-zero ASIDs only.

Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
[catalin.marinas@arm.com: default y for CONFIG_ARM64_CNP]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-18 12:02:27 +01:00
Will Deacon b8925ee2e1 arm64: cpu: Move errata and feature enable callbacks closer to callers
The cpu errata and feature enable callbacks are only called via their
respective arm64_cpu_capabilities structure and therefore shouldn't
exist in the global namespace.

Move the PAN, RAS and cache maintenance emulation enable callbacks into
the same files as their corresponding arm64_cpu_capabilities structures,
making them static in the process.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-14 17:46:22 +01:00
Will Deacon 4733c7c79e arm64: dump: Use consistent capitalisation for page-table dumps
Being consistent in our capitalisation for page-table dumps helps when
grepping for things like "end".

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-09-10 16:15:23 +01:00
Christoph Hellwig 46053c7368 dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops
There is no reason to leave the per-device dma_ops around when
deconfiguring a device, so move this code from arm64 into the
common code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-09-08 11:19:38 +02:00
Mark Rutland fac880c7d0 arm64: fix erroneous warnings in page freeing functions
In pmd_free_pte_page() and pud_free_pmd_page() we try to warn if they
hit a present non-table entry. In both cases we'll warn for non-present
entries, as the VM_WARN_ON() only checks the entry is not a table entry.

This has been observed to result in warnings when booting a v4.19-rc2
kernel under qemu.

Fix this by bailing out earlier for non-present entries.

Fixes: ec28bb9c9b ("arm64: Implement page table free interfaces")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-06 18:01:13 +01:00
Linus Torvalds 6ada4e2826 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc things

 - a few Y2038 fixes

 - ntfs fixes

 - arch/sh tweaks

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
  mm/hmm.c: remove unused variables align_start and align_end
  fs/userfaultfd.c: remove redundant pointer uwq
  mm, vmacache: hash addresses based on pmd
  mm/list_lru: introduce list_lru_shrink_walk_irq()
  mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one()
  mm/list_lru.c: move locking from __list_lru_walk_one() to its caller
  mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node()
  mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP
  mm/sparse: delete old sparse_init and enable new one
  mm/sparse: add new sparse_init_nid() and sparse_init()
  mm/sparse: move buffer init/fini to the common place
  mm/sparse: use the new sparse buffer functions in non-vmemmap
  mm/sparse: abstract sparse buffer allocations
  mm/hugetlb.c: don't zero 1GiB bootmem pages
  mm, page_alloc: double zone's batchsize
  mm/oom_kill.c: document oom_lock
  mm/hugetlb: remove gigantic page support for HIGHMEM
  mm, oom: remove sleep from under oom_lock
  kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
  mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
  ...
2018-08-17 16:49:31 -07:00
Marek Szyprowski d834c5ab83 kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
The CMA memory allocator doesn't support standard gfp flags for memory
allocation, so there is no point having it as a parameter for
dma_alloc_from_contiguous() function.  Replace it by a boolean no_warn
argument, which covers all the underlaying cma_alloc() function
supports.

This will help to avoid giving false feeling that this function supports
standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer,
what has already been an issue: see commit dd65a941f6 ("arm64:
dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

Link: http://lkml.kernel.org/r/20180709122020eucas1p21a71b092975cb4a3b9954ffc63f699d1~-sqUFoa-h2939329393eucas1p2Y@eucas1p2.samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Michał Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:32 -07:00
Souptick Joarder 50a7ca3c6f mm: convert return type of handle_mm_fault() caller to vm_fault_t
Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-> commit 1c8f422059 ("mm: change return type to vm_fault_t")

In this patch all the caller of handle_mm_fault() are changed to return
vm_fault_t type.

Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17 16:20:28 -07:00
Linus Torvalds edb0a20009 A couple of arm64 fixes
- Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled
 - Fix address truncation in pfn_valid() implementation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJbdp+EAAoJELescNyEwWM0Ld8H/iJqjvPwNLRC0KGL/rCQJH70
 D80qlNBnwlrs2eUJTeNeRVZC+t2l9vJIoT17W938WkjxV+DSGDsfFDy3/BQ7VTji
 7e33mwFBNoH+feAfMYmzht3sRlvyZ0oqXSIq/GrdZ8a4Gg/6iNVz7K1kpboBVFXp
 LFnFIN4I7mNwdl1nAyNmnU081MMWfyvgRB82Xd9eS00KCAm3ueHfkwBNcwkfulDg
 RT2ZXPzwd3Yxsdy3Z+r1vyXMHAw2GjcYpL5pjvHf34zMdvqkk03sMsx2yReuSR1U
 M6MpNCdZfWHgMlFWbsEoEOd0g0CF5s6TQK3hBqoUEE3AUVNrQ8ixZMip326axoQ=
 =C2YW
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "A couple of arm64 fixes

   - Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled

   - Fix address truncation in pfn_valid() implementation"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
  arm64: Avoid calling stop_machine() when patching jump labels
2018-08-17 11:48:04 -07:00
Greg Hackmann 5ad356eabc arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
before seeing if the PFN is valid.  This leads to false positives when
some of the upper bits are set, but the lower bits match a valid PFN.

For example, the following userspace code looks up a bogus entry in
/proc/kpageflags:

    int pagemap = open("/proc/self/pagemap", O_RDONLY);
    int pageflags = open("/proc/kpageflags", O_RDONLY);
    uint64_t pfn, val;

    lseek64(pagemap, [...], SEEK_SET);
    read(pagemap, &pfn, sizeof(pfn));
    if (pfn & (1UL << 63)) {        /* valid PFN */
        pfn &= ((1UL << 55) - 1);   /* clear flag bits */
        pfn |= (1UL << 55);
        lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
        read(pageflags, &val, sizeof(val));
    }

On ARM64 this causes the userspace process to crash with SIGSEGV rather
than reading (1 << KPF_NOPAGE).  kpageflags_read() treats the offset as
valid, and stable_page_flags() will try to access an address between the
user and kernel address ranges.

Fixes: c1cc155261 ("arm64: MMU initialisation")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-08-17 10:27:14 +01:00
Linus Torvalds 1202f4fdbc arm64 updates for 4.19
A bunch of good stuff in here:
 
 - Wire up support for qspinlock, replacing our trusty ticket lock code
 
 - Add an IPI to flush_icache_range() to ensure that stale instructions
   fetched into the pipeline are discarded along with the I-cache lines
 
 - Support for the GCC "stackleak" plugin
 
 - Support for restartable sequences, plus an arm64 port for the selftest
 
 - Kexec/kdump support on systems booting with ACPI
 
 - Rewrite of our syscall entry code in C, which allows us to zero the
   GPRs on entry from userspace
 
 - Support for chained PMU counters, allowing 64-bit event counters to be
   constructed on current CPUs
 
 - Ensure scheduler topology information is kept up-to-date with CPU
   hotplug events
 
 - Re-enable support for huge vmalloc/IO mappings now that the core code
   has the correct hooks to use break-before-make sequences
 
 - Miscellaneous, non-critical fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJbbV41AAoJELescNyEwWM0WoEIALhrKtsIn6vqFlSs/w6aDuJL
 cMWmFxjTaKLmIq2+cJIdFLOJ3CH80Pu9gB+nEv/k+cZdCTfUVKfRf28HTpmYWsht
 bb4AhdHMC7yFW752BHk+mzJspeC8h/2Rm8wMuNVplZ3MkPrwo3vsiuJTofLhVL/y
 BihlU3+5sfBvCYIsWnuEZIev+/I/s/qm1ASiqIcKSrFRZP6VTt5f9TC75vFI8seW
 7yc3odKb0CArexB8yBjiPNziehctQF42doxQyL45hezLfWw4qdgHOSiwyiOMxEz9
 Fwwpp8Tx33SKLNJgqoqYznGW9PhYJ7n2Kslv19uchJrEV+mds82vdDNaWRULld4=
 =kQn6
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "A bunch of good stuff in here. Worth noting is that we've pulled in
  the x86/mm branch from -tip so that we can make use of the core
  ioremap changes which allow us to put down huge mappings in the
  vmalloc area without screwing up the TLB. Much of the positive
  diffstat is because of the rseq selftest for arm64.

  Summary:

   - Wire up support for qspinlock, replacing our trusty ticket lock
     code

   - Add an IPI to flush_icache_range() to ensure that stale
     instructions fetched into the pipeline are discarded along with the
     I-cache lines

   - Support for the GCC "stackleak" plugin

   - Support for restartable sequences, plus an arm64 port for the
     selftest

   - Kexec/kdump support on systems booting with ACPI

   - Rewrite of our syscall entry code in C, which allows us to zero the
     GPRs on entry from userspace

   - Support for chained PMU counters, allowing 64-bit event counters to
     be constructed on current CPUs

   - Ensure scheduler topology information is kept up-to-date with CPU
     hotplug events

   - Re-enable support for huge vmalloc/IO mappings now that the core
     code has the correct hooks to use break-before-make sequences

   - Miscellaneous, non-critical fixes and cleanups"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits)
  arm64: alternative: Use true and false for boolean values
  arm64: kexec: Add comment to explain use of __flush_icache_range()
  arm64: sdei: Mark sdei stack helper functions as static
  arm64, kaslr: export offset in VMCOREINFO ELF notes
  arm64: perf: Add cap_user_time aarch64
  efi/libstub: Only disable stackleak plugin for arm64
  arm64: drop unused kernel_neon_begin_partial() macro
  arm64: kexec: machine_kexec should call __flush_icache_range
  arm64: svc: Ensure hardirq tracing is updated before return
  arm64: mm: Export __sync_icache_dcache() for xen-privcmd
  drivers/perf: arm-ccn: Use devm_ioremap_resource() to map memory
  arm64: Add support for STACKLEAK gcc plugin
  arm64: Add stack information to on_accessible_stack
  drivers/perf: hisi: update the sccl_id/ccl_id when MT is supported
  arm64: fix ACPI dependencies
  rseq/selftests: Add support for arm64
  arm64: acpi: fix alignment fault in accessing ACPI
  efi/arm: map UEFI memory map even w/o runtime services enabled
  efi/arm: preserve early mapping of UEFI memory map longer for BGRT
  drivers: acpi: add dependency of EFI for arm64
  ...
2018-08-14 16:39:13 -07:00