Commit Graph

3831 Commits

Author SHA1 Message Date
Linus Torvalds f1947d7c8a Random number generator fixes for Linux 6.1-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmNHYD0ACgkQSfxwEqXe
 A655AA//dJK0PdRghqrKQsl18GOCffV5TUw5i1VbJQbI9d8anfxNjVUQiNGZi4et
 qUwZ8OqVXxYx1Z1UDgUE39PjEDSG9/cCvOpMUWqN20/+6955WlNZjwA7Fk6zjvlM
 R30fz5CIJns9RFvGT4SwKqbVLXIMvfg/wDENUN+8sxt36+VD2gGol7J2JJdngEhM
 lW+zqzi0ABqYy5so4TU2kixpKmpC08rqFvQbD1GPid+50+JsOiIqftDErt9Eg1Mg
 MqYivoFCvbAlxxxRh3+UHBd7ZpJLtp1UFEOl2Rf00OXO+ZclLCAQAsTczucIWK9M
 8LCZjb7d4lPJv9RpXFAl3R1xvfc+Uy2ga5KeXvufZtc5G3aMUKPuIU7k28ZyblVS
 XXsXEYhjTSd0tgi3d0JlValrIreSuj0z2QGT5pVcC9utuAqAqRIlosiPmgPlzXjr
 Us4jXaUhOIPKI+Musv/fqrxsTQziT0jgVA3Njlt4cuAGm/EeUbLUkMWwKXjZLTsv
 vDsBhEQFmyZqxWu4pYo534VX2mQWTaKRV1SUVVhQEHm57b00EAiZohoOvweB09SR
 4KiJapikoopmW4oAUFotUXUL1PM6yi+MXguTuc1SEYuLz/tCFtK8DJVwNpfnWZpE
 lZKvXyJnHq2Sgod/hEZq58PMvT6aNzTzSg7YzZy+VabxQGOO5mc=
 =M+mV
 -----END PGP SIGNATURE-----

Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds 1df046ab1c arm64 fixes:
- Cortex-A55 errata workaround (repeat TLBI).
 
 - AMPERE1 added to the Spectre-BHB affected list.
 
 - MTE fix to avoid setting PG_mte_tagged if no tags have been touched on
   a page.
 
 - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has
   other typos).
 
 - perf: return value check in ali_drw_pmu_probe(),
   ALIBABA_UNCORE_DRW_PMU dependency on ACPI.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmNJrpAACgkQa9axLQDI
 XvFwWQ/+O71bVQPXf43p+O3LapX3IsOxDCWLTMjNcxVgSBSK+TPwtjXN7vIPZDvv
 Ibx4Y10vmHo8Copbs7C8USx+hGo7hzknk/s2zoeqJQX13WQkqpuAwTDshzMp60La
 nQoJXab3KapQ3UIPL5El/cbvAD9+DGJSiWdyvC8GBHwtWKWi1WDpSNFN3WMJm97P
 uQqERiWaf3XOI9BhsuOlCzQE5eemCllycdWoRBelCjIQByuo6SaDPEpTUZDICCPp
 f4Ji7U1hfORmXg/DJcjSJbtkSshVRqjhSAtAmP/sWUic7+kWGBiC+zJQ0PxwiNQH
 Bfryz90ETa/INA65hA1iC51lE7hvt1DKueZAMKjozxYSCSVAxUNonSkEOfKegPeU
 hLhTowmveryqxYGuQ75p5tZjdpvML0Sa/lx7p/GUEhaV77dca/EJ0B68x8WrBpO5
 TCsW3iDq2V+ErWgYL7n6nFoMhZQnNvq9jxhAPuJ8Y47ZkeQ8HcvooKLHUSDSjMk2
 f/7A7rUJh0piYf0FEPSjRBTO/HyPb1D90n1t2wJoCqwrICZ/mmWzVqua0fgmrbvS
 H33YQiSEIkwsfLktIIJRGknYgC0P/JALKlAQPAcmsd+njWsThXJ/WwwRrpvCZdMj
 9CVuDfhw7Ipt4Iz5Tg61lLDkzi7bPRqPpEKc8zzsI3nmY0KC/iA=
 =vjYu
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Cortex-A55 errata workaround (repeat TLBI)

 - AMPERE1 added to the Spectre-BHB affected list

 - MTE fix to avoid setting PG_mte_tagged if no tags have been touched
   on a page

 - Fixed typo in the SCTLR_EL1.SPINTMASK bit naming (the commit log has
   other typos)

 - perf: return value check in ali_drw_pmu_probe(),
   ALIBABA_UNCORE_DRW_PMU dependency on ACPI

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Add AMPERE1 to the Spectre-BHB affected list
  arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored
  MAINTAINERS: rectify file entry in ALIBABA PMU DRIVER
  drivers/perf: ALIBABA_UNCORE_DRW_PMU should depend on ACPI
  drivers/perf: fix return value check in ali_drw_pmu_probe()
  arm64: errata: Add Cortex-A55 to the repeat tlbi list
  arm64/sysreg: Fix typo in SCTR_EL1.SPINTMASK
2022-10-14 12:38:03 -07:00
Linus Torvalds 676cb49573 - hfs and hfsplus kmap API modernization from Fabio Francesco
- Valentin Schneider makes crash-kexec work properly when invoked from
   an NMI-time panic.
 
 - ntfs bugfixes from Hawkins Jiawei
 
 - Jiebin Sun improves IPC msg scalability by replacing atomic_t's with
   percpu counters.
 
 - nilfs2 cleanups from Minghao Chi
 
 - lots of other single patches all over the tree!
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0Yf0gAKCRDdBJ7gKXxA
 joapAQDT1d1zu7T8yf9cQXkYnZVuBKCjxKE/IsYvqaq1a42MjQD/SeWZg0wV05B8
 DhJPj9nkEp6R3Rj3Mssip+3vNuceAQM=
 =lUQY
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
2022-10-12 11:00:22 -07:00
D Scott Phillips 0e5d5ae837 arm64: Add AMPERE1 to the Spectre-BHB affected list
Per AmpereOne erratum AC03_CPU_12, "Branch history may allow control of
speculative execution across software contexts," the AMPERE1 core needs the
bhb clearing loop to mitigate Spectre-BHB, with a loop iteration count of
11.

Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20221011022140.432370-1-scott@os.amperecomputing.com
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-12 17:36:06 +01:00
Catalin Marinas a8e5e5146a arm64: mte: Avoid setting PG_mte_tagged if no tags cleared or restored
Prior to commit 69e3b846d8 ("arm64: mte: Sync tags for pages where PTE
is untagged"), mte_sync_tags() was only called for pte_tagged() entries
(those mapped with PROT_MTE). Therefore mte_sync_tags() could safely use
test_and_set_bit(PG_mte_tagged, &page->flags) without inadvertently
setting PG_mte_tagged on an untagged page.

The above commit was required as guests may enable MTE without any
control at the stage 2 mapping, nor a PROT_MTE mapping in the VMM.
However, the side-effect was that any page with a PTE that looked like
swap (or migration) was getting PG_mte_tagged set automatically. A
subsequent page copy (e.g. migration) copied the tags to the destination
page even if the tags were owned by KASAN.

This issue was masked by the page_kasan_tag_reset() call introduced in
commit e5b8d92189 ("arm64: mte: reset the page tag in page->flags").
When this commit was reverted (20794545c1), KASAN started reporting
access faults because the overriding tags in a page did not match the
original page->flags (with CONFIG_KASAN_HW_TAGS=y):

  BUG: KASAN: invalid-access in copy_page+0x10/0xd0 arch/arm64/lib/copy_page.S:26
  Read at addr f5ff000017f2e000 by task syz-executor.1/2218
  Pointer tag: [f5], memory tag: [f2]

Move the PG_mte_tagged bit setting from mte_sync_tags() to the actual
place where tags are cleared (mte_sync_page_tags()) or restored
(mte_restore_tags()).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: syzbot+c2c79c6d6eddc5262b77@syzkaller.appspotmail.com
Fixes: 69e3b846d8 ("arm64: mte: Sync tags for pages where PTE is untagged")
Cc: <stable@vger.kernel.org> # 5.14.x
Cc: Steven Price <steven.price@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/0000000000004387dc05e5888ae5@google.com/
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20221006163354.3194102-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-12 10:00:19 +01:00
Jason A. Donenfeld 7e3cf0843f treewide: use get_random_{u8,u16}() when possible, part 1
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done
mechanically with this coccinelle script:

@@
expression E;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u16;
typedef __be16;
typedef __le16;
typedef u8;
@@
(
- (get_random_u32() & 0xffff)
+ get_random_u16()
|
- (get_random_u32() & 0xff)
+ get_random_u8()
|
- (get_random_u32() % 65536)
+ get_random_u16()
|
- (get_random_u32() % 256)
+ get_random_u8()
|
- (get_random_u32() >> 16)
+ get_random_u16()
|
- (get_random_u32() >> 24)
+ get_random_u8()
|
- (u16)get_random_u32()
+ get_random_u16()
|
- (u8)get_random_u32()
+ get_random_u8()
|
- (__be16)get_random_u32()
+ (__be16)get_random_u16()
|
- (__le16)get_random_u32()
+ (__le16)get_random_u16()
|
- prandom_u32_max(65536)
+ get_random_u16()
|
- prandom_u32_max(256)
+ get_random_u8()
|
- E->inet_id = get_random_u32()
+ E->inet_id = get_random_u16()
)

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u16;
identifier v;
@@
- u16 v = get_random_u32();
+ u16 v = get_random_u16();

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u8;
identifier v;
@@
- u8 v = get_random_u32();
+ u8 v = get_random_u8();

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u16;
u16 v;
@@
-  v = get_random_u32();
+  v = get_random_u16();

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u8;
u8 v;
@@
-  v = get_random_u32();
+  v = get_random_u8();

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p & (LITERAL))

// Examine limits
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value < 256:
        coccinelle.RESULT = cocci.make_ident("get_random_u8")
elif value < 65536:
        coccinelle.RESULT = cocci.make_ident("get_random_u16")
else:
        print("Skipping large mask of %s" % (literal))
        cocci.include_match(False)

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
identifier add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p & (LITERAL))
+       (RESULT() & LITERAL)

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Jason A. Donenfeld 81895a65ec treewide: use prandom_u32_max() when possible, part 1
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value & (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p & (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:55 -06:00
Linus Torvalds 27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds 8afc66e8d4 Kbuild updates for v6.1
- Remove potentially incomplete targets when Kbuid is interrupted by
    SIGINT etc. in case GNU Make may miss to do that when stderr is piped
    to another program.
 
  - Rewrite the single target build so it works more correctly.
 
  - Fix rpm-pkg builds with V=1.
 
  - List top-level subdirectories in ./Kbuild.
 
  - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms.
 
  - Avoid two different modules in lib/zstd/ having shared code, which
    potentially causes building the common code as build-in and modular
    back-and-forth.
 
  - Unify two modpost invocations to optimize the build process.
 
  - Remove head-y syntax in favor of linker scripts for placing particular
    sections in the head of vmlinux.
 
  - Bump the minimal GNU Make version to 3.82.
 
  - Clean up misc Makefiles and scripts.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmM+4vcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGY2IQAInr0JUNnkkxwUSXtOcQuA3IK8RJ
 FbU9HXJRoV9H+7+l3SMlN7mIbrs5eE5fTY3iwQ3CVe139d1+1q7nvTMRv8owywJx
 GBgzswncuu1lk7iQQ//CxiqMwSCG8GJdYn1uDVy4I5jg3o+DtFZJtyq2Wb7pqsMm
 ZhZ4PozRN+idYQJSF6Vx/zEVLHI7quMBwfe4CME8/0Kg2+hnYzbXV/aUf0ED2emq
 zdCMDQgIOK5AhY+8qgMXKYnBUJMTqBp6LoR4p3ApfUkwRFY0sGa0/LK3U/B22OE7
 uWyR4fCUExGyerlcHEVev+9eBfmsLLPyqlchNwpSDOPf5OSdnKmgqJEBR/Cvx0eh
 URerPk7EHxyH3G8yi+cU2GtofNTGc5RHPRgJE2ADsQEi5TAUKGmbXMlsFRL/51Vn
 lTANZObBNa1d4enljF6TfTL5nuccOa+DKvXnH9fQ49t0QdtSikv6J/lGwilwm1Sr
 BctmCsySPuURZfkpI9OQnLuouloMXl9f7Q/+S39haS/tSgvPpyITyO71nxDnXn/s
 BbFObZJUk9QkqOACjBP1hNErTLt83uBxQ9z+rDCw/SbLIe4nw0wyneuygfHI5rI8
 3RZB2DbGauuJHX2Zs6YGS14SLSY33IsLqKR1/Vy3LrPvOHuEvNiOR8LITq5E0YCK
 OffK2Y5cIlXR0QWf
 =DHiN
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Remove potentially incomplete targets when Kbuid is interrupted by
   SIGINT etc in case GNU Make may miss to do that when stderr is piped
   to another program.

 - Rewrite the single target build so it works more correctly.

 - Fix rpm-pkg builds with V=1.

 - List top-level subdirectories in ./Kbuild.

 - Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in
   kallsyms.

 - Avoid two different modules in lib/zstd/ having shared code, which
   potentially causes building the common code as build-in and modular
   back-and-forth.

 - Unify two modpost invocations to optimize the build process.

 - Remove head-y syntax in favor of linker scripts for placing
   particular sections in the head of vmlinux.

 - Bump the minimal GNU Make version to 3.82.

 - Clean up misc Makefiles and scripts.

* tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
  docs: bump minimal GNU Make version to 3.82
  ia64: simplify esi object addition in Makefile
  Revert "kbuild: Check if linker supports the -X option"
  kbuild: rebuild .vmlinux.export.o when its prerequisite is updated
  kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o
  zstd: Fixing mixed module-builtin objects
  kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols
  kallsyms: take the input file instead of reading stdin
  kallsyms: drop duplicated ignore patterns from kallsyms.c
  kbuild: reuse mksysmap output for kallsyms
  mksysmap: update comment about __crc_*
  kbuild: remove head-y syntax
  kbuild: use obj-y instead extra-y for objects placed at the head
  kbuild: hide error checker logs for V=1 builds
  kbuild: re-run modpost when it is updated
  kbuild: unify two modpost invocations
  kbuild: move vmlinux.o rule to the top Makefile
  kbuild: move .vmlinux.objs rule to Makefile.modpost
  kbuild: list sub-directories in ./Kbuild
  Makefile.compiler: replace cc-ifversion with compiler-specific macros
  ...
2022-10-10 12:00:45 -07:00
Linus Torvalds 2e64066dab RISC-V Patches for the 6.1 Merge Window, Part 1
* Improvements to the CPU topology subsystem, which fix some issues
   where RISC-V would report bad topology information.
 * The default NR_CPUS has increased to XLEN, and the maximum
   configurable value is 512.
 * The CD-ROM filesystems have been enabled in the defconfig.
 * Support for THP_SWAP has been added for rv64 systems.
 
 There are also a handful of cleanups and fixes throughout the tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
 cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
 InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
 7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
 ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
 oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
 6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
 ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
 +/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
 JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
 yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
 5JoE1jLMVhjcKQ==
 =LJg6
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Improvements to the CPU topology subsystem, which fix some issues
   where RISC-V would report bad topology information.

 - The default NR_CPUS has increased to XLEN, and the maximum
   configurable value is 512.

 - The CD-ROM filesystems have been enabled in the defconfig.

 - Support for THP_SWAP has been added for rv64 systems.

There are also a handful of cleanups and fixes throughout the tree.

* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: enable THP_SWAP for RV64
  RISC-V: Print SSTC in canonical order
  riscv: compat: s/failed/unsupported if compat mode isn't supported
  RISC-V: Increase range and default value of NR_CPUS
  cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
  perf: RISC-V: throttle perf events
  perf: RISC-V: exclude invalid pmu counters from SBI calls
  riscv: enable CD-ROM file systems in defconfig
  riscv: topology: fix default topology reporting
  arm64: topology: move store_cpu_topology() to shared code
2022-10-09 13:24:01 -07:00
Linus Torvalds 0e470763d8 EFI updates for v6.1
- implement EFI boot support for LoongArch
 - implement generic EFI compressed boot support for arm64, RISC-V and
   LoongArch, none of which implement a decompressor today
 - measure the kernel command line into the TPM if measured boot is in
   effect
 - refactor the EFI stub code in order to isolate DT dependencies for
   architectures other than x86
 - avoid calling SetVirtualAddressMap() on arm64 if the configured size
   of the VA space guarantees that doing so is unnecessary
 - move some ARM specific code out of the generic EFI source files
 - unmap kernel code from the x86 mixed mode 1:1 page tables
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmM5mfEACgkQw08iOZLZ
 jySnJwv9G2nBheSlK9bbWKvCpnDvVIExtlL+mg1wB64oxPrGiWRgjxeyA9+92bT0
 Y6jYfKbGOGKnxkEJQl19ik6C3JfEwtGm4SnOVp4+osFeDRB7lFemfcIYN5dqz111
 wkZA/Y15rnz3tZeGaXnq2jMoFuccQDXPJtOlqbdVqFQ5Py6YT92uMyuI079pN0T+
 GSu7VVOX+SBsv4nGaUKIpSVwAP0gXkS/7s7CTf47QiR2+j8WMTlQEYZVjOKZjMJZ
 /7hXY2/mduxnuVuT7cfx0mpZKEryUREJoBL5nDzjTnlhLb5X8cHKiaE1lx0aJ//G
 JYTR8lDklJZl/7RUw/IW/YodcKcofr3F36NMzWB5vzM+KHOOpv4qEZhoGnaXv94u
 auqhzYA83heaRjz7OISlk6kgFxdlIRE1VdrkEBXSlQeCQUv1woS+ZNVGYcKqgR0B
 48b31Ogm2A0pAuba89+U9lz/n33lhIDtYvJqLO6AAPLGiVacD9ZdapN5kMftVg/1
 SfhFqNzy
 =d8Ps
 -----END PGP SIGNATURE-----

Merge tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI updates from Ard Biesheuvel:
 "A bit more going on than usual in the EFI subsystem. The main driver
  for this has been the introduction of the LoonArch architecture last
  cycle, which inspired some cleanup and refactoring of the EFI code.
  Another driver for EFI changes this cycle and in the future is
  confidential compute.

  The LoongArch architecture does not use either struct bootparams or DT
  natively [yet], and so passing information between the EFI stub and
  the core kernel using either of those is undesirable. And in general,
  overloading DT has been a source of issues on arm64, so using DT for
  this on new architectures is a to avoid for the time being (even if we
  might converge on something DT based for non-x86 architectures in the
  future). For this reason, in addition to the patch that enables EFI
  boot for LoongArch, there are a number of refactoring patches applied
  on top of which separate the DT bits from the generic EFI stub bits.
  These changes are on a separate topich branch that has been shared
  with the LoongArch maintainers, who will include it in their pull
  request as well. This is not ideal, but the best way to manage the
  conflicts without stalling LoongArch for another cycle.

  Another development inspired by LoongArch is the newly added support
  for EFI based decompressors. Instead of adding yet another
  arch-specific incarnation of this pattern for LoongArch, we are
  introducing an EFI app based on the existing EFI libstub
  infrastructure that encapulates the decompression code we use on other
  architectures, but in a way that is fully generic. This has been
  developed and tested in collaboration with distro and systemd folks,
  who are eager to start using this for systemd-boot and also for arm64
  secure boot on Fedora. Note that the EFI zimage files this introduces
  can also be decompressed by non-EFI bootloaders if needed, as the
  image header describes the location of the payload inside the image,
  and the type of compression that was used. (Note that Fedora's arm64
  GRUB is buggy [0] so you'll need a recent version or switch to
  systemd-boot in order to use this.)

  Finally, we are adding TPM measurement of the kernel command line
  provided by EFI. There is an oversight in the TCG spec which results
  in a blind spot for command line arguments passed to loaded images,
  which means that either the loader or the stub needs to take the
  measurement. Given the combinatorial explosion I am anticipating when
  it comes to firmware/bootloader stacks and firmware based attestation
  protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now
  when it comes to EFI measured boot, which is that the kernel measures
  the initrd and command line. Intermediate loaders can measure
  additional assets if needed, but with the baseline in place, we can
  deploy measured boot in a meaningful way even if you boot into Linux
  straight from the EFI firmware.

  Summary:

   - implement EFI boot support for LoongArch

   - implement generic EFI compressed boot support for arm64, RISC-V and
     LoongArch, none of which implement a decompressor today

   - measure the kernel command line into the TPM if measured boot is in
     effect

   - refactor the EFI stub code in order to isolate DT dependencies for
     architectures other than x86

   - avoid calling SetVirtualAddressMap() on arm64 if the configured
     size of the VA space guarantees that doing so is unnecessary

   - move some ARM specific code out of the generic EFI source files

   - unmap kernel code from the x86 mixed mode 1:1 page tables"

* tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
  efi/arm64: libstub: avoid SetVirtualAddressMap() when possible
  efi: zboot: create MemoryMapped() device path for the parent if needed
  efi: libstub: fix up the last remaining open coded boot service call
  efi/arm: libstub: move ARM specific code out of generic routines
  efi/libstub: measure EFI LoadOptions
  efi/libstub: refactor the initrd measuring functions
  efi/loongarch: libstub: remove dependency on flattened DT
  efi: libstub: install boot-time memory map as config table
  efi: libstub: remove DT dependency from generic stub
  efi: libstub: unify initrd loading between architectures
  efi: libstub: remove pointless goto kludge
  efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap
  efi: libstub: avoid efi_get_memory_map() for allocating the virt map
  efi: libstub: drop pointless get_memory_map() call
  efi: libstub: fix type confusion for load_options_size
  arm64: efi: enable generic EFI compressed boot
  loongarch: efi: enable generic EFI compressed boot
  riscv: efi: enable generic EFI compressed boot
  efi/libstub: implement generic EFI zboot
  efi/libstub: move efi_system_table global var into separate object
  ...
2022-10-09 08:56:54 -07:00
James Morse 171df58028 arm64: errata: Add Cortex-A55 to the repeat tlbi list
Cortex-A55 is affected by an erratum where in rare circumstances the
CPUs may not handle a race between a break-before-make sequence on one
CPU, and another CPU accessing the same page. This could allow a store
to a page that has been unmapped.

Work around this by adding the affected CPUs to the list that needs
TLB sequences to be done twice.

Signed-off-by: James Morse <james.morse@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220930131959.3082594-1-james.morse@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-10-07 14:42:20 +01:00
Linus Torvalds 18fd049731 arm64 updates for 6.1:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.
 
 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
   to match the actual kernel behaviour (zeroing the registers on syscall
   rather than "zeroed or preserved" previously).
 
 - More conversions to automatic system registers generation.
 
 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.
 
 - arm64 stacktrace cleanups and improvements.
 
 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.
 
 - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
   handling, better EL1 undefs reporting.
 
 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.
 
 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.
 
 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).
 
 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.
 
 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include larger
   SVE and SME vector lengths in signal tests, various cleanups.
 
 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.
 
 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
 XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
 kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
 eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
 qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
 xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
 6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
 SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
 MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
 /ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
 MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
 LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
 =jl3Z
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
   vector granule register added to the user regs together with SVE perf
   extensions documentation.

 - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
   documentation to match the actual kernel behaviour (zeroing the
   registers on syscall rather than "zeroed or preserved" previously).

 - More conversions to automatic system registers generation.

 - vDSO: use self-synchronising virtual counter access in gettimeofday()
   if the architecture supports it.

 - arm64 stacktrace cleanups and improvements.

 - arm64 atomics improvements: always inline assembly, remove LL/SC
   trampolines.

 - Improve the reporting of EL1 exceptions: rework BTI and FPAC
   exception handling, better EL1 undefs reporting.

 - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
   result.

 - arm64 defconfig updates: build CoreSight as a module, enable options
   necessary for docker, memory hotplug/hotremove, enable all PMUs
   provided by Arm.

 - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
   extensions).

 - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove
   unused function.

 - kselftest updates for arm64: simple HWCAP validation, FP stress test
   improvements, validation of ZA regs in signal handlers, include
   larger SVE and SME vector lengths in signal tests, various cleanups.

 - arm64 alternatives (code patching) improvements to robustness and
   consistency: replace cpucap static branches with equivalent
   alternatives, associate callback alternatives with a cpucap.

 - Miscellaneous updates: optimise kprobe performance of patching
   single-step slots, simplify uaccess_mask_ptr(), move MTE registers
   initialisation to C, support huge vmalloc() mappings, run softirqs on
   the per-CPU IRQ stack, compat (arm32) misalignment fixups for
   multiword accesses.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
  arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
  arm64/kprobe: Optimize the performance of patching single-step slot
  arm64: defconfig: Add Coresight as module
  kselftest/arm64: Handle EINTR while reading data from children
  kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
  kselftest/arm64: Don't repeat termination handler for fp-stress
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
  arm64: fix the build with binutils 2.27
  kselftest/arm64: Don't enable v8.5 for MTE selftest builds
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  kselftest/arm64: Fix typo in hwcap check
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64/sve: Add Perf extensions documentation
  ...
2022-10-06 11:51:49 -07:00
Linus Torvalds 865dad2022 kcfi updates for v6.1-rc1
This replaces the prior support for Clang's standard Control Flow
 Integrity (CFI) instrumentation, which has required a lot of special
 conditions (e.g. LTO) and work-arounds. The current implementation
 ("Kernel CFI") is specific to C, directly designed for the Linux kernel,
 and takes advantage of architectural features like x86's IBT. This
 series retains arm64 support and adds x86 support. Additional "generic"
 architectural support is expected soon:
 https://github.com/samitolvanen/llvm-project/commits/kcfi_generic
 
 - treewide: Remove old CFI support details
 
 - arm64: Replace Clang CFI support with Clang KCFI support
 
 - x86: Introduce Clang KCFI support
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmM4aAUWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJkgWD/4mUgb7xewNIG/+fuipGd620Iao
 K0T8q4BNxLNRltOxNc3Q0WMDCggX0qJGCeds7EdFQJQOGxWcbifM8MAS4idAGM0G
 fc3Gxl1imC/oF6goCAbQgndA6jYFIWXGsv8LsRjAXRidWLFr3GFAqVqYJyokSySr
 8zMQsEDuF4I1gQnOhEWdtPZbV3MQ4ZjfFzpv+33agbq6Gb72vKvDh3G6g2VXlxjt
 1qnMtS+eEpbBU65cJkOi4MSLgymWbnIAeTMb0dbsV4kJ08YoTl8uz1B+weeH6GgT
 WP73ZJ4nqh1kkkT9EqS9oKozNB9fObhvCokEuAjuQ7i1eCEZsbShvRc0iL7OKTGG
 UfuTJa5qQ4h7Z0JS35FCSJETa+fcG0lTyEd133nLXLMZP9K2antf+A6O//fd0J1V
 Jg4VN7DQmZ+UNGOzRkL6dTtQUy4PkxhniIloaClfSYXxhNirA+v//sHTnTK3z2Bl
 6qceYqmFmns2Laual7+lvnZgt6egMBcmAL/MOdbU74+KIR9Xw76wxQjifktHX+WF
 FEUQkUJDB5XcUyKlbvHoqobRMxvEZ8RIlC5DIkgFiPRE3TI0MqfzNSFnQ/6+lFNg
 Y0AS9HYJmcj8sVzAJ7ji24WPFCXzsbFn6baJa9usDNbWyQZokYeiv7ZPNPHPDVrv
 YEBP6aYko0lVSUS9qw==
 =Li4D
 -----END PGP SIGNATURE-----

Merge tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kcfi updates from Kees Cook:
 "This replaces the prior support for Clang's standard Control Flow
  Integrity (CFI) instrumentation, which has required a lot of special
  conditions (e.g. LTO) and work-arounds.

  The new implementation ("Kernel CFI") is specific to C, directly
  designed for the Linux kernel, and takes advantage of architectural
  features like x86's IBT. This series retains arm64 support and adds
  x86 support.

  GCC support is expected in the future[1], and additional "generic"
  architectural support is expected soon[2].

  Summary:

   - treewide: Remove old CFI support details

   - arm64: Replace Clang CFI support with Clang KCFI support

   - x86: Introduce Clang KCFI support"

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1]
Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2]

* tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  x86: Add support for CONFIG_CFI_CLANG
  x86/purgatory: Disable CFI
  x86: Add types to indirectly called assembly functions
  x86/tools/relocs: Ignore __kcfi_typeid_ relocations
  kallsyms: Drop CONFIG_CFI_CLANG workarounds
  objtool: Disable CFI warnings
  objtool: Preserve special st_shndx indexes in elf_update_symbol
  treewide: Drop __cficanonical
  treewide: Drop WARN_ON_FUNCTION_MISMATCH
  treewide: Drop function_nocfi
  init: Drop __nocfi from __init
  arm64: Drop unneeded __nocfi attributes
  arm64: Add CFI error handling
  arm64: Add types to indirect called assembly functions
  psci: Fix the function type for psci_initcall_t
  lkdtm: Emit an indirect call for CFI tests
  cfi: Add type helper macros
  cfi: Switch to -fsanitize=kcfi
  cfi: Drop __CFI_ADDRESSABLE
  cfi: Remove CONFIG_CFI_CLANG_SHADOW
  ...
2022-10-03 17:11:07 -07:00
Masahiro Yamada 3216484550 kbuild: use obj-y instead extra-y for objects placed at the head
The objects placed at the head of vmlinux need special treatments:

 - arch/$(SRCARCH)/Makefile adds them to head-y in order to place
   them before other archives in the linker command line.

 - arch/$(SRCARCH)/kernel/Makefile adds them to extra-y instead of
   obj-y to avoid them going into built-in.a.

This commit gets rid of the latter.

Create vmlinux.a to collect all the objects that are unconditionally
linked to vmlinux. The objects listed in head-y are moved to the head
of vmlinux.a by using 'ar m'.

With this, arch/$(SRCARCH)/kernel/Makefile can consistently use obj-y
for builtin objects.

There is no *.o that is directly linked to vmlinux. Drop unneeded code
in scripts/clang-tools/gen_compile_commands.py.

$(AR) mPi needs 'T' to workaround the llvm-ar bug. The fix was suggested
by Nathan Chancellor [1].

[1]: https://lore.kernel.org/llvm/YyjjT5gQ2hGMH0ni@dev-arch.thelio-3990X/

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-02 18:04:05 +09:00
Catalin Marinas 53630a1f61 Merge branch 'for-next/misc' into for-next/core
* for-next/misc:
  : Miscellaneous patches
  arm64/kprobe: Optimize the performance of patching single-step slot
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64: support huge vmalloc mappings
  arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually
  arm64: run softirqs on the per-CPU IRQ stack
  arm64: compat: Implement misalignment fixups for multiword loads
2022-09-30 09:18:26 +01:00
Catalin Marinas c704cf27a1 Merge branch 'for-next/alternatives' into for-next/core
* for-next/alternatives:
  : Alternatives (code patching) improvements
  arm64: fix the build with binutils 2.27
  arm64: avoid BUILD_BUG_ON() in alternative-macros
  arm64: alternatives: add shared NOP callback
  arm64: alternatives: add alternative_has_feature_*()
  arm64: alternatives: have callbacks take a cap
  arm64: alternatives: make alt_region const
  arm64: alternatives: hoist print out of __apply_alternatives()
  arm64: alternatives: proton-pack: prepare for cap changes
  arm64: alternatives: kvm: prepare for cap changes
  arm64: cpufeature: make cpus_have_cap() noinstr-safe
2022-09-30 09:18:22 +01:00
Catalin Marinas b23ec74cbd Merge branches 'for-next/doc', 'for-next/sve', 'for-next/sysreg', 'for-next/gettimeofday', 'for-next/stacktrace', 'for-next/atomics', 'for-next/el1-exceptions', 'for-next/a510-erratum-2658417', 'for-next/defconfig', 'for-next/tpidr2_el0' and 'for-next/ftrace', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  arm64/sve: Add Perf extensions documentation
  perf: arm64: Add SVE vector granule register to user regs
  MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver
  drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC
  docs: perf: Add description for Alibaba's T-Head PMU driver

* for-next/doc:
  : Documentation/arm64 updates
  arm64/sve: Document our actual ABI for clearing registers on syscall

* for-next/sve:
  : SVE updates
  arm64/sysreg: Add hwcap for SVE EBF16

* for-next/sysreg: (35 commits)
  : arm64 system registers generation (more conversions)
  arm64/sysreg: Fix a few missed conversions
  arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64FDR0_EL1 to automatic generation
  arm64/sysreg: Use feature numbering for PMU and SPE revisions
  arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names
  arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture
  arm64/sysreg: Add defintion for ALLINT
  arm64/sysreg: Convert SCXTNUM_EL1 to automatic generation
  arm64/sysreg: Convert TIPDR_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64PFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64PFR0_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR2_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR1_EL1 to automatic generation
  arm64/sysreg: Convert ID_AA64MMFR0_EL1 to automatic generation
  arm64/sysreg: Convert HCRX_EL2 to automatic generation
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 SME enumeration
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 BTI enumeration
  arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 fractional version fields
  arm64/sysreg: Standardise naming for MTE feature enumeration
  ...

* for-next/gettimeofday:
  : Use self-synchronising counter access in gettimeofday() (if FEAT_ECV)
  arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday
  arm64: alternative: patch alternatives in the vDSO
  arm64: module: move find_section to header

* for-next/stacktrace:
  : arm64 stacktrace cleanups and improvements
  arm64: stacktrace: track hyp stacks in unwinder's address space
  arm64: stacktrace: track all stack boundaries explicitly
  arm64: stacktrace: remove stack type from fp translator
  arm64: stacktrace: rework stack boundary discovery
  arm64: stacktrace: add stackinfo_on_stack() helper
  arm64: stacktrace: move SDEI stack helpers to stacktrace code
  arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record()
  arm64: stacktrace: simplify unwind_next_common()
  arm64: stacktrace: fix kerneldoc comments

* for-next/atomics:
  : arm64 atomics improvements
  arm64: atomic: always inline the assembly
  arm64: atomics: remove LL/SC trampolines

* for-next/el1-exceptions:
  : Improve the reporting of EL1 exceptions
  arm64: rework BTI exception handling
  arm64: rework FPAC exception handling
  arm64: consistently pass ESR_ELx to die()
  arm64: die(): pass 'err' as long
  arm64: report EL1 UNDEFs better

* for-next/a510-erratum-2658417:
  : Cortex-A510: 2658417: remove BF16 support due to incorrect result
  arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510
  arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c
  arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space

* for-next/defconfig:
  : arm64 defconfig updates
  arm64: defconfig: Add Coresight as module
  arm64: Enable docker support in defconfig
  arm64: defconfig: Enable memory hotplug and hotremove config
  arm64: configs: Enable all PMUs provided by Arm

* for-next/tpidr2_el0:
  : arm64 ptrace() support for TPIDR2_EL0
  kselftest/arm64: Add coverage of TPIDR2_EL0 ptrace interface
  arm64/ptrace: Support access to TPIDR2_EL0
  arm64/ptrace: Document extension of NT_ARM_TLS to cover TPIDR2_EL0
  kselftest/arm64: Add test coverage for NT_ARM_TLS

* for-next/ftrace:
  : arm64 ftraces updates/fixes
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
2022-09-30 09:17:57 +01:00
Liao Chang a0caebbd04 arm64/kprobe: Optimize the performance of patching single-step slot
Single-step slot would not be used until kprobe is enabled, that means
no race condition occurs on it under SMP, hence it is safe to pacth ss
slot without stopping machine.

Since I and D caches are coherent within single-step slot from
aarch64_insn_patch_text_nosync(), hence no need to do it again via
flush_icache_range().

Acked-by: Will Deacon <will@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Link: https://lore.kernel.org/r/20220927022435.129965-4-liaochang1@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-30 09:17:15 +01:00
Xiu Jianfeng 8c6e3657be ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
Add missing __init/__exit annotations to module init/exit funcs.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Link: https://lore.kernel.org/r/20220911034747.132098-1-xiujianfeng@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-29 18:04:25 +01:00
Mark Rutland 8cfb08575c arm64: ftrace: fix module PLTs with mcount
Li Huafei reports that mcount-based ftrace with module PLTs was broken
by commit:

  a625357997 ("arm64: ftrace: consistently handle PLTs.")

When a module PLTs are used and a module is loaded sufficiently far away
from the kernel, we'll create PLTs for any branches which are
out-of-range. These are separate from the special ftrace trampoline
PLTs, which the module PLT code doesn't directly manipulate.

When mcount is in use this is a problem, as each mcount callsite in a
module will be initialized to point to a module PLT, but since commit
a625357997 ftrace_make_nop() will assume that the callsite has
been initialized to point to the special ftrace trampoline PLT, and
ftrace_find_callable_addr() rejects other cases.

This means that when ftrace tries to initialize a callsite via
ftrace_make_nop(), the call to ftrace_find_callable_addr() will find
that the `_mcount` stub is out-of-range and is not handled by the ftrace
PLT, resulting in a splat:

| ftrace_test: loading out-of-tree module taints kernel.
| ftrace: no module PLT for _mcount
| ------------[ ftrace bug ]------------
| ftrace failed to modify
| [<ffff800029180014>] 0xffff800029180014
|  actual:   44:00:00:94
| Initializing ftrace call sites
| ftrace record flags: 2000000
|  (0)
|  expected tramp: ffff80000802eb3c
| ------------[ cut here ]------------
| WARNING: CPU: 3 PID: 157 at kernel/trace/ftrace.c:2120 ftrace_bug+0x94/0x270
| Modules linked in:
| CPU: 3 PID: 157 Comm: insmod Tainted: G           O       6.0.0-rc6-00151-gcd722513a189-dirty #22
| Hardware name: linux,dummy-virt (DT)
| pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : ftrace_bug+0x94/0x270
| lr : ftrace_bug+0x21c/0x270
| sp : ffff80000b2bbaf0
| x29: ffff80000b2bbaf0 x28: 0000000000000000 x27: ffff0000c4d38000
| x26: 0000000000000001 x25: ffff800009d7e000 x24: ffff0000c4d86e00
| x23: 0000000002000000 x22: ffff80000a62b000 x21: ffff8000098ebea8
| x20: ffff0000c4d38000 x19: ffff80000aa24158 x18: ffffffffffffffff
| x17: 0000000000000000 x16: 0a0d2d2d2d2d2d2d x15: ffff800009aa9118
| x14: 0000000000000000 x13: 6333626532303830 x12: 3030303866666666
| x11: 203a706d61727420 x10: 6465746365707865 x9 : 3362653230383030
| x8 : c0000000ffffefff x7 : 0000000000017fe8 x6 : 000000000000bff4
| x5 : 0000000000057fa8 x4 : 0000000000000000 x3 : 0000000000000001
| x2 : ad2cb14bb5438900 x1 : 0000000000000000 x0 : 0000000000000022
| Call trace:
|  ftrace_bug+0x94/0x270
|  ftrace_process_locs+0x308/0x430
|  ftrace_module_init+0x44/0x60
|  load_module+0x15b4/0x1ce8
|  __do_sys_init_module+0x1ec/0x238
|  __arm64_sys_init_module+0x24/0x30
|  invoke_syscall+0x54/0x118
|  el0_svc_common.constprop.4+0x84/0x100
|  do_el0_svc+0x3c/0xd0
|  el0_svc+0x1c/0x50
|  el0t_64_sync_handler+0x90/0xb8
|  el0t_64_sync+0x15c/0x160
| ---[ end trace 0000000000000000 ]---
| ---------test_init-----------

Fix this by reverting to the old behaviour of ignoring the old
instruction when initialising an mcount callsite in a module, which was
the behaviour prior to commit a625357997.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: a625357997 ("arm64: ftrace: consistently handle PLTs.")
Reported-by: Li Huafei <lihuafei1@huawei.com>
Link: https://lore.kernel.org/linux-arm-kernel/20220929094134.99512-1-lihuafei1@huawei.com
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220929134525.798593-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-29 17:47:54 +01:00
Li Huafei 3fb420f56c arm64: module: Make plt_equals_entry() static
Since commit 4e69ecf4da ("arm64/module: ftrace: deal with place
relative nature of PLTs"), plt_equals_entry() is not used outside of
module-plts.c, so make it static.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220929094134.99512-2-lihuafei1@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-29 17:47:18 +01:00
Liam R. Howlett ef770d180e arm64: Change elfcore for_each_mte_vma() to use VMA iterator
Rework for_each_mte_vma() to use a VMA iterator instead of an explicit
linked-list.

Link: https://lkml.kernel.org/r/20220906194824.2110408-32-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220218023650.672072-1-Liam.Howlett@oracle.com
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:19 -07:00
Matthew Wilcox (Oracle) de2b84d24b arm64: remove mmap linked list from vdso
Use the VMA iterator instead.

Link: https://lkml.kernel.org/r/20220906194824.2110408-31-Liam.Howlett@oracle.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:19 -07:00
Sami Tolvanen 607289a7cd treewide: Drop function_nocfi
With -fsanitize=kcfi, we no longer need function_nocfi() as
the compiler won't change function references to point to a
jump table. Remove all implementations and uses of the macro.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-14-samitolvanen@google.com
2022-09-26 10:13:14 -07:00
Sami Tolvanen 5f20997c19 arm64: Drop unneeded __nocfi attributes
With -fsanitize=kcfi, CONFIG_CFI_CLANG no longer has issues
with address space confusion in functions that switch to linear
mapping. Now that the indirectly called assembly functions have
type annotations, drop the __nocfi attributes.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-12-samitolvanen@google.com
2022-09-26 10:13:14 -07:00
Sami Tolvanen b26e484b8b arm64: Add CFI error handling
With -fsanitize=kcfi, CFI always traps. Add arm64 support for handling CFI
failures. The registers containing the target address and the expected type
are encoded in the first ten bits of the ESR as follows:

 - 0-4: n, where the register Xn contains the target address
 - 5-9: m, where the register Wm contains the type hash

This produces the following oops on CFI failure (generated using lkdtm):

[   21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
(target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
[   21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP
[   21.891060] Modules linked in: lkdtm
[   21.893363] CPU: 0 PID: 151 Comm: sh Not tainted
5.19.0-rc1-00021-g852f4e48dbab #1
[   21.895560] Hardware name: linux,dummy-virt (DT)
[   21.896543] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   21.897583] pc : lkdtm_indirect_call+0x2c/0x44 [lkdtm]
[   21.898551] lr : lkdtm_CFI_FORWARD_PROTO+0x3c/0x6c [lkdtm]
[   21.899520] sp : ffff8000083a3c50
[   21.900191] x29: ffff8000083a3c50 x28: ffff0000027e0ec0 x27: 0000000000000000
[   21.902453] x26: 0000000000000000 x25: ffffc2aa3d07e7b0 x24: 0000000000000002
[   21.903736] x23: ffffc2aa3d079088 x22: ffffc2aa3d07e7b0 x21: ffff000003379000
[   21.905062] x20: ffff8000083a3dc0 x19: 0000000000000012 x18: 0000000000000000
[   21.906371] x17: 000000007e0c52a5 x16: 000000003ad55aca x15: ffffc2aa60d92138
[   21.907662] x14: ffffffffffffffff x13: 2e2e2e2065707974 x12: 0000000000000018
[   21.909775] x11: ffffc2aa62322b88 x10: ffffc2aa62322aa0 x9 : c7e305fb5195d200
[   21.911898] x8 : ffffc2aa3d077e20 x7 : 6d20676e696c6c61 x6 : 43203a6d74646b6c
[   21.913108] x5 : ffffc2aa6266c9df x4 : ffffc2aa6266c9e1 x3 : ffff8000083a3968
[   21.914358] x2 : 80000000fffff122 x1 : 00000000fffff122 x0 : ffffc2aa3d07e8f8
[   21.915827] Call trace:
[   21.916375]  lkdtm_indirect_call+0x2c/0x44 [lkdtm]
[   21.918060]  lkdtm_CFI_FORWARD_PROTO+0x3c/0x6c [lkdtm]
[   21.919030]  lkdtm_do_action+0x34/0x4c [lkdtm]
[   21.919920]  direct_entry+0x170/0x1ac [lkdtm]
[   21.920772]  full_proxy_write+0x84/0x104
[   21.921759]  vfs_write+0x188/0x3d8
[   21.922387]  ksys_write+0x78/0xe8
[   21.922986]  __arm64_sys_write+0x1c/0x2c
[   21.923696]  invoke_syscall+0x58/0x134
[   21.924554]  el0_svc_common+0xb4/0xf4
[   21.925603]  do_el0_svc+0x2c/0xb4
[   21.926563]  el0_svc+0x2c/0x7c
[   21.927147]  el0t_64_sync_handler+0x84/0xf0
[   21.927985]  el0t_64_sync+0x18c/0x190
[   21.929133] Code: 728a54b1 72afc191 6b11021f 54000040 (d4304500)
[   21.930690] ---[ end trace 0000000000000000 ]---
[   21.930971] Kernel panic - not syncing: Oops - CFI: Fatal exception

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-11-samitolvanen@google.com
2022-09-26 10:13:14 -07:00
Sami Tolvanen c50d32859e arm64: Add types to indirect called assembly functions
With CONFIG_CFI_CLANG, assembly functions indirectly called from C
code must be annotated with type identifiers to pass CFI checking. Use
SYM_TYPED_FUNC_START for the indirectly called functions, and ensure
we emit `bti c` also with SYM_TYPED_FUNC_START.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-10-samitolvanen@google.com
2022-09-26 10:13:13 -07:00
Sami Tolvanen f143ff397a treewide: Filter out CC_FLAGS_CFI
In preparation for removing CC_FLAGS_CFI from CC_FLAGS_LTO, explicitly
filter out CC_FLAGS_CFI in all the makefiles where we currently filter
out CC_FLAGS_LTO.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-2-samitolvanen@google.com
2022-09-26 10:13:12 -07:00
Peter Collingbourne 973b9e3733 arm64: mte: move register initialization to C
If FEAT_MTE2 is disabled via the arm64.nomte command line argument on a
CPU that claims to support FEAT_MTE2, the kernel will use Tagged Normal
in the MAIR. If we interpret arm64.nomte to mean that the CPU does not
in fact implement FEAT_MTE2, setting the system register like this may
lead to UNSPECIFIED behavior. Fix it by arranging for MAIR to be set
in the C function cpu_enable_mte which is called based on the sanitized
version of the system register.

There is no need for the rest of the MTE-related system register
initialization to happen from assembly, with the exception of TCR_EL1,
which must be set to include at least TBI1 because the secondary CPUs
access KASan-allocated data structures early. Therefore, make the TCR_EL1
initialization unconditional and move the rest of the initialization to
cpu_enable_mte so that we no longer have a dependency on the unsanitized
ID register value.

Co-developed-by: Evgenii Stepanov <eugenis@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Signed-off-by: Evgenii Stepanov <eugenis@google.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 3b714d24ef ("arm64: mte: CPU feature detection and initial sysreg configuration")
Cc: <stable@vger.kernel.org> # 5.10.x
Link: https://lore.kernel.org/r/20220915222053.3484231-1-eugenis@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-22 18:02:50 +01:00
James Clark cbb0c02caf perf: arm64: Add SVE vector granule register to user regs
Dwarf based unwinding in a function that pushes SVE registers onto
the stack requires the unwinder to know the length of the SVE register
to calculate the stack offsets correctly. This was added to the Arm
specific Dwarf spec as the VG pseudo register[1].

Add the vector length at position 46 if it's requested by userspace and
SVE is supported. If it's not supported then fail to open the event.

The vector length must be on each sample because it can be changed
at runtime via a prctl or ptrace call. Also by adding it as a register
rather than a separate attribute, minimal changes will be required in an
unwinder that already indexes into the register list.

[1]: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst

Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20220901132658.1024635-2-james.clark@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 15:06:02 +01:00
Sergey Shtylyov d4955c0ad7 arm64: topology: fix possible overflow in amu_fie_setup()
cpufreq_get_hw_max_freq() returns max frequency in kHz as *unsigned int*,
while freq_inv_set_max_ratio() gets passed this frequency in Hz as 'u64'.
Multiplying max frequency by 1000 can potentially result in overflow --
multiplying by 1000ULL instead should avoid that...

Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.

Fixes: cd0ed03a89 ("arm64: use activity monitors for frequency invariance")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/01493d64-2bce-d968-86dc-11a122a9c07d@omp.ru
Signed-off-by: Will Deacon <will@kernel.org>
2022-09-22 12:57:06 +01:00
Mark Brown 0027d9c6c7 arm64/ptrace: Support access to TPIDR2_EL0
SME introduces an additional EL0 register, TPIDR2_EL0, intended for use
by userspace as part of the SME. Provide ptrace access to it through the
existing NT_ARM_TLS regset used for TPIDR_EL0 by expanding it to two
registers with TPIDR2_EL0 being the second one.

Existing programs that query the size of the register set will be able
to observe the increased size of the register set. Programs that assume
the register set is single register will see no change. On systems that
do not support SME TPIDR2_EL0 will read as 0 and writes will be ignored,
support for SME should be queried via hwcaps as normal.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220829154921.837871-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-21 17:26:58 +01:00
Ard Biesheuvel c82ceb440b efi/libstub: use EFI provided memcpy/memset routines
The stub is used in different execution environments, but on arm64,
RISC-V and LoongArch, we still use the core kernel's implementation of
memcpy and memset, as they are just a branch instruction away, and can
generally be reused even from code such as the EFI stub that runs in a
completely different address space.

KAsan complicates this slightly, resulting in the need for some hacks to
expose the uninstrumented, __ prefixed versions as the normal ones, as
the latter are instrumented to include the KAsan checks, which only work
in the core kernel.

Unfortunately, #define'ing memcpy to __memcpy when building C code does
not guarantee that no explicit memcpy() calls will be emitted. And with
the upcoming zboot support, which consists of a separate binary which
therefore needs its own implementation of memcpy/memset anyway, it's
better to provide one explicitly instead of linking to the existing one.

Given that EFI exposes implementations of memmove() and memset() via the
boot services table, let's wire those up in the appropriate way, and
drop the references to the core kernel ones.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-17 15:13:21 +02:00
Mark Rutland d926079f17 arm64: alternatives: add shared NOP callback
For each instance of an alternative, the compiler outputs a distinct
copy of the alternative instructions into a subsection. As the compiler
doesn't have special knowledge of alternatives, it cannot coalesce these
to save space.

In a defconfig kernel built with GCC 12.1.0, there are approximately
10,000 instances of alternative_has_feature_likely(), where the
replacement instruction is always a NOP. As NOPs are
position-independent, we don't need a unique copy per alternative
sequence.

This patch adds a callback to patch an alternative sequence with NOPs,
and make use of this in alternative_has_feature_likely(). So that this
can be used for other sites in future, this is written to patch multiple
instructions up to the original sequence length.

For NVHE, an alias is added to image-vars.h.

For modules, the callback is exported. Note that as modules are loaded
within 2GiB of the kernel, an alt_instr entry in a module can always
refer directly to the callback, and no special handling is necessary.

When building with GCC 12.1.0, the vmlinux is ~158KiB smaller, though
the resulting Image size is unchanged due to alignment constraints and
padding:

| % ls -al vmlinux-*
| -rwxr-xr-x 1 mark mark 134644592 Sep  1 14:52 vmlinux-after
| -rwxr-xr-x 1 mark mark 134486232 Sep  1 14:50 vmlinux-before
| % ls -al Image-*
| -rw-r--r-- 1 mark mark 37108224 Sep  1 14:52 Image-after
| -rw-r--r-- 1 mark mark 37108224 Sep  1 14:50 Image-before

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
Mark Rutland 21fb26bfb0 arm64: alternatives: add alternative_has_feature_*()
Currrently we use a mixture of alternative sequences and static branches
to handle features detected at boot time. For ease of maintenance we
generally prefer to use static branches in C code, but this has a few
downsides:

* Each static branch has metadata in the __jump_table section, which is
  not discarded after features are finalized. This wastes some space,
  and slows down the patching of other static branches.

* The static branches are patched at a different point in time from the
  alternatives, so changes are not atomic. This leaves a transient
  period where there could be a mismatch between the behaviour of
  alternatives and static branches, which could be problematic for some
  features (e.g. pseudo-NMI).

* More (instrumentable) kernel code is executed to patch each static
  branch, which can be risky when patching certain features (e.g.
  irqflags management for pseudo-NMI).

* When CONFIG_JUMP_LABEL=n, static branches are turned into a load of a
  flag and a conditional branch. This means it isn't safe to use such
  static branches in an alternative address space (e.g. the NVHE/PKVM
  hyp code), where the generated address isn't safe to acccess.

To deal with these issues, this patch introduces new
alternative_has_feature_*() helpers, which work like static branches but
are patched using alternatives. This ensures the patching is performed
at the same time as other alternative patching, allows the metadata to
be freed after patching, and is safe for use in alternative address
spaces.

Note that all supported toolchains have asm goto support, and since
commit:

  a0a12c3ed0 ("asm goto: eradicate CC_HAS_ASM_GOTO)"

... the CC_HAS_ASM_GOTO Kconfig symbol has been removed, so no feature
check is necessary, and we can always make use of asm goto.

Additionally, note that:

* This has no impact on cpus_have_cap(), which is a dynamic check.

* This has no functional impact on cpus_have_const_cap(). The branches
  are patched slightly later than before this patch, but these branches
  are not reachable until caps have been finalised.

* It is now invalid to use cpus_have_final_cap() in the window between
  feature detection and patching. All existing uses are only expected
  after patching anyway, so this should not be a problem.

* The LSE atomics will now be enabled during alternatives patching
  rather than immediately before. As the LL/SC an LSE atomics are
  functionally equivalent this should not be problematic.

When building defconfig with GCC 12.1.0, the resulting Image is 64KiB
smaller:

| % ls -al Image-*
| -rw-r--r-- 1 mark mark 37108224 Aug 23 09:56 Image-after
| -rw-r--r-- 1 mark mark 37173760 Aug 23 09:54 Image-before

According to bloat-o-meter.pl:

| add/remove: 44/34 grow/shrink: 602/1294 up/down: 39692/-61108 (-21416)
| Function                                     old     new   delta
| [...]
| Total: Before=16618336, After=16596920, chg -0.13%
| add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-1296 (-1296)
| Data                                         old     new   delta
| arm64_const_caps_ready                        16       -     -16
| cpu_hwcap_keys                              1280       -   -1280
| Total: Before=8987120, After=8985824, chg -0.01%
| add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
| RO Data                                      old     new   delta
| Total: Before=18408, After=18408, chg +0.00%

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
Mark Rutland 4c0bd995d7 arm64: alternatives: have callbacks take a cap
Today, callback alternatives are special-cased within
__apply_alternatives(), and are applied alongside patching for system
capabilities as ARM64_NCAPS is not part of the boot_capabilities feature
mask.

This special-casing is less than ideal. Giving special meaning to
ARM64_NCAPS for this requires some structures and loops to use
ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS.
It's also not immediately clear callback alternatives are only applied
when applying alternatives for system-wide features.

To make this a bit clearer, changes the way that callback alternatives
are identified to remove the special-casing of ARM64_NCAPS, and to allow
callback alternatives to be associated with a cpucap as with all other
alternatives.

New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM are added which
are always detected alongside boot cpu capabilities and system
capabilities respectively. All existing callback alternatives are made
to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point
during the boot flow as before.

Subsequent patches will make more use of these new cpucaps.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
Mark Rutland b723edf3a1 arm64: alternatives: make alt_region const
We never alter a struct alt_region after creation, and we open-code the
bounds of the kernel alternatives region in two functions. The
duplication is a bit unfortunate for clarity (and in future we're likely
to have more functions altering alternative regions), and to avoid
accidents it would be good to make the structure const.

This patch adds a shared struct `kernel_alternatives` alt_region for the
main kernel image, and marks the alt_regions as const to prevent
unintentional modification.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
Mark Rutland c5ba03260c arm64: alternatives: hoist print out of __apply_alternatives()
Printing in the middle of __apply_alternatives() is potentially unsafe
and not all that helpful given these days we practically always patch
*something*.

Hoist the print out of __apply_alternatives(), and add separate prints
to __apply_alternatives() and apply_alternatives_all(), which will make
it easier to spot if either patching call goes wrong.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
Mark Rutland 747ad8d557 arm64: alternatives: proton-pack: prepare for cap changes
The spectre patching callbacks use cpus_have_final_cap(), and subsequent
patches will make it invalid to call cpus_have_final_cap() before
alternatives patching has completed.

In preparation for said change, this patch modifies the spectre patching
callbacks use cpus_have_cap(). This is not subject to patching, and will
dynamically check the cpu_hwcaps array, which is functionally equivalent
to the existing behaviour.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 17:15:03 +01:00
James Morse 1bdb0fbb2e arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510
Cortex-A510's erratum #2658417 causes two BF16 instructions to return the
wrong result in rare circumstances when a pair of A510 CPUs are using
shared neon hardware.

The two instructions affected are BFMMLA and VMMLA, support for these is
indicated by the BF16 HWCAP. Remove it on affected platforms.

Signed-off-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220909165938.3931307-4-james.morse@arm.com
[catalin.marinas@arm.com: add revision to the Kconfig help; remove .type]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 15:23:50 +01:00
James Morse 445c953e4a arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c
get_arm64_ftr_reg() returns the properties of a system register based
on its instruction encoding.

This is needed by erratum workaround in cpu_errata.c to modify the
user-space visible view of id registers.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220909165938.3931307-3-james.morse@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 15:16:34 +01:00
James Morse 237405ebef arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space
arm64 advertises hardware features to user-space via HWCAPs, and by
emulating access to the CPUs id registers. The cpufeature code has a
sanitised system-wide view of an id register, and a sanitised user-space
view of an id register, where some features use their 'safe' value
instead of the hardware value.

It is currently possible for a HWCAP to be advertised where the user-space
view of the id register does not show the feature as supported.
Erratum workaround need to remove both the HWCAP, and the feature from
the user-space view of the id register. This involves duplicating the
code, and spreading it over cpufeature.c and cpu_errata.c.

Make the HWCAP code use the user-space view of id registers. This ensures
the values never diverge, and allows erratum workaround to remove HWCAP
by modifying the user-space view of the id register.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220909165938.3931307-2-james.morse@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 15:16:34 +01:00
Mark Brown 121a8fc088 arm64/sysreg: Use feature numbering for PMU and SPE revisions
Currently the kernel refers to the versions of the PMU and SPE features by
the version of the architecture where those features were updated but the
ARM refers to them using the FEAT_ names for the features. To improve
consistency and help with updating for newer features and since v9 will
make our current naming scheme a bit more confusing update the macros
identfying features to use the FEAT_ based scheme.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Mark Brown fcf37b38ff arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64DFR0_EL1 to follow the convention. No functional changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Mark Brown c0357a73fa arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture
The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1
does not align well with kernel conventions, using as it does a lot of
MixedCase in various arrangements. In preparation for automatically
generating the defines for this register rename the defines used to match
what is in the architecture.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:38:57 +01:00
Mark Rutland 830a2a4d85 arm64: rework BTI exception handling
If a BTI exception is taken from EL1, the entry code will treat this as
an unhandled exception and will panic() the kernel. This is inconsistent
with the way we handle FPAC exceptions, which have a dedicated handler
and only necessarily kill the thread from which the exception was taken
from, and we don't log all the information that could be relevant to
debug the issue.

The code in do_bti() has:

	BUG_ON(!user_mode(regs));

... and it seems like the intent was to call this for EL1 BTI
exceptions, as with FPAC, but this was omitted due to an oversight.

This patch adds separate EL0 and EL1 BTI exception handlers, with the
latter calling die() directly to report the original context the BTI
exception was taken from. This matches our handling of FPAC exceptions.

Prior to this patch, a BTI failure is reported as:

| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4
| Kernel panic - not syncing: Unhandled exception
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9
| Hardware name: linux,dummy-virt (DT)
| Call trace:
|  dump_backtrace.part.0+0xcc/0xe0
|  show_stack+0x18/0x5c
|  dump_stack_lvl+0x64/0x80
|  dump_stack+0x18/0x34
|  panic+0x170/0x360
|  arm64_exit_nmi.isra.0+0x0/0x80
|  el1h_64_sync_handler+0x64/0xd0
|  el1h_64_sync+0x64/0x68
|  test_bti_callee+0x4/0x10
|  smp_cpus_done+0xb0/0xbc
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20

With this patch applied, a BTI failure is reported as:

| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8
| Hardware name: linux,dummy-virt (DT)
| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c)
| pc : test_bti_callee+0x4/0x10
| lr : test_bti_caller+0x1c/0x28
| sp : ffff80000800bdf0
| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83
| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378
| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804
| Call trace:
|  test_bti_callee+0x4/0x10
|  smp_cpus_done+0xb0/0xbc
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf)

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Mark Rutland a1fafa3b24 arm64: rework FPAC exception handling
If an FPAC exception is taken from EL1, the entry code will call
do_ptrauth_fault(), where due to:

	BUG_ON(!user_mode(regs))

... the kernel will report a problem within do_ptrauth_fault() rather
than reporting the original context the FPAC exception was taken from.
The pt_regs and ESR value reported will be from within
do_ptrauth_fault() and the code dump will be for the BRK in BUG_ON(),
which isn't sufficient to debug the cause of the original exception.

This patch makes the reporting better by having separate EL0 and EL1
FPAC exception handlers, with the latter calling die() directly to
report the original context the FPAC exception was taken from.

Note that we only need to prevent kprobes of the EL1 FPAC handler, since
the EL0 FPAC handler cannot be called recursively.

For consistency with do_el0_svc*(), I've named the split functions
do_el{0,1}_fpac() rather than do_el{0,1}_ptrauth_fault(). I've also
clarified the comment to not imply there are casues other than FPAC
exceptions.

Prior to this patch FPAC exceptions are reported as:

| kernel BUG at arch/arm64/kernel/traps.c:517!
| Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00130-g9c8a180a1cdf-dirty #12
| Hardware name: FVP Base RevC (DT)
| pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : do_ptrauth_fault+0x3c/0x40
| lr : el1_fpac+0x34/0x54
| sp : ffff80000a3bbc80
| x29: ffff80000a3bbc80 x28: ffff0008001d8000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: 0000000020400009 x22: ffff800008f70fa4 x21: ffff80000a3bbe00
| x20: 0000000072000000 x19: ffff80000a3bbcb0 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783
| x5 : ffff80000a3bbcb0 x4 : ffff0008001d8000 x3 : 0000000072000000
| x2 : 0000000000000000 x1 : 0000000020400009 x0 : ffff80000a3bbcb0
| Call trace:
|  do_ptrauth_fault+0x3c/0x40
|  el1h_64_sync_handler+0xc4/0xd0
|  el1h_64_sync+0x64/0x68
|  test_pac+0x8/0x10
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: 97fffe5e a8c17bfd d50323bf d65f03c0 (d4210000)

With this patch applied FPAC exceptions are reported as:

| Internal error: Oops - FPAC: 0000000072000000 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g78846e1c4757-dirty #11
| Hardware name: FVP Base RevC (DT)
| pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : test_pac+0x8/0x10
| lr : 0x0
| sp : ffff80000a3bbe00
| x29: ffff80000a3bbe00 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a2c8000 x22: 0000000000000000 x21: 0000000000000000
| x20: ffff8000099fa5b0 x19: ffff80000a007000 x18: fffffbfffda37000
| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000
| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000
| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000
| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783
| x5 : ffff80000a2c6000 x4 : ffff0008001d8000 x3 : ffff800009f88378
| x2 : 0000000000000000 x1 : 0000000080210000 x0 : ffff000001a90000
| Call trace:
|  test_pac+0x8/0x10
|  smp_init+0x7c/0x8c
|  kernel_init_freeable+0x128/0x28c
|  kernel_init+0x28/0x13c
|  ret_from_fork+0x10/0x20
| Code: d50323bf d65f03c0 d503233f aa1f03fe (d50323bf)

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Mark Rutland 0f2cb928a1 arm64: consistently pass ESR_ELx to die()
Currently, bug_handler() and kasan_handler() call die() with '0' as the
'err' value, whereas die_kernel_fault() passes the ESR_ELx value.

For consistency, this patch ensures we always pass the ESR_ELx value to
die(). As this is only called for exceptions taken from kernel mode,
there should be no user-visible change as a result of this patch.

For UNDEFINED exceptions, I've had to modify do_undefinstr() and its
callers to pass the ESR_ELx value. In all cases the ESR_ELx value had
already been read and was available.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00
Mark Rutland 18906ff9af arm64: die(): pass 'err' as long
Recently, we reworked a lot of code to consistentlt pass ESR_ELx as a
64-bit quantity. However, we missed that this can be passed into die()
and __die() as the 'err' parameter where it is truncated to a 32-bit
int.

As notify_die() already takes 'err' as a long, this patch changes die()
and __die() to also take 'err' as a long, ensuring that the full value
of ESR_ELx is retained.

At the same time, die() is updated to consistently log 'err' as a
zero-padded 64-bit quantity.

Subsequent patches will pass the ESR_ELx value to die() for a number of
exceptions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220913101732.3925290-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-16 12:17:03 +01:00