Commit Graph

35510 Commits

Author SHA1 Message Date
Linus Torvalds e7989789c6 Timers and timekeeping updates:
- Improve the VDSO build time checks to cover all dynamic relocations
 
     VDSO does not allow dynamic relcations, but the build time check is
     incomplete and fragile.
 
     It's based on architectures specifying the relocation types to search
     for and does not handle R_*_NONE relocation entries correctly.
     R_*_NONE relocations are injected by some GNU ld variants if they fail
     to determine the exact .rel[a]/dyn_size to cover trailing zeros.
     R_*_NONE relocations must be ignored by dynamic loaders, so they
     should be ignored in the build time check too.
 
     Remove the architecture specific relocation types to check for and
     validate strictly that no other relocations than R_*_NONE end up
     in the VSDO .so file.
 
   - Prefer signal delivery to the current thread for
     CLOCK_PROCESS_CPUTIME_ID based posix-timers
 
     Such timers prefer to deliver the signal to the main thread of a
     process even if the context in which the timer expires is the current
     task. This has the downside that it might wake up an idle thread.
 
     As there is no requirement or guarantee that the signal has to be
     delivered to the main thread, avoid this by preferring the current
     task if it is part of the thread group which shares sighand.
 
     This not only avoids waking idle threads, it also distributes the
     signal delivery in case of multiple timers firing in the context
     of different threads close to each other better.
 
   - Align the tick period properly (again)
 
     For a long time the tick was starting at CLOCK_MONOTONIC zero, which
     allowed users space applications to either align with the tick or to
     place a periodic computation so that it does not interfere with the
     tick. The alignement of the tick period was more by chance than by
     intention as the tick is set up before a high resolution clocksource is
     installed, i.e. timekeeping is still tick based and the tick period
     advances from there.
 
     The early enablement of sched_clock() broke this alignement as the time
     accumulated by sched_clock() is taken into account when timekeeping is
     initialized. So the base value now(CLOCK_MONOTONIC) is not longer a
     multiple of tick periods, which breaks applications which relied on
     that behaviour.
 
     Cure this by aligning the tick starting point to the next multiple of
     tick periods, i.e 1000ms/CONFIG_HZ.
 
  - A set of NOHZ fixes and enhancements
 
    - Cure the concurrent writer race for idle and IO sleeptime statistics
 
      The statitic values which are exposed via /proc/stat are updated from
      the CPU local idle exit and remotely by cpufreq, but that happens
      without any form of serialization. As a consequence sleeptimes can be
      accounted twice or worse.
 
      Prevent this by restricting the accumulation writeback to the CPU
      local idle exit and let the remote access compute the accumulated
      value.
 
    - Protect idle/iowait sleep time with a sequence count
 
      Reading idle/iowait sleep time, e.g. from /proc/stat, can race with
      idle exit updates. As a consequence the readout may result in random
      and potentially going backwards values.
 
      Protect this by a sequence count, which fixes the idle time
      statistics issue, but cannot fix the iowait time problem because
      iowait time accounting races with remote wake ups decrementing the
      remote runqueues nr_iowait counter. The latter is impossible to fix,
      so the only way to deal with that is to document it properly and to
      remove the assertion in the selftest which triggers occasionally due
      to that.
 
    - Restructure struct tick_sched for better cache layout
 
    - Some small cleanups and a better cache layout for struct tick_sched
 
  - Implement the missing timer_wait_running() callback for POSIX CPU timers
 
    For unknown reason the introduction of the timer_wait_running() callback
    missed to fixup posix CPU timers, which went unnoticed for almost four
    years.
 
    While initially only targeted to prevent livelocks between a timer
    deletion and the timer expiry function on PREEMPT_RT enabled kernels, it
    turned out that fixing this for mainline is not as trivial as just
    implementing a stub similar to the hrtimer/timer callbacks.
 
    The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled systems
    there is a livelock issue independent of RT.
 
    CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU timers
    out from hard interrupt context to task work, which is handled before
    returning to user space or to a VM. The expiry mechanism moves the
    expired timers to a stack local list head with sighand lock held. Once
    sighand is dropped the task can be preempted and a task which wants to
    delete a timer will spin-wait until the expiry task is scheduled back
    in. In the worst case this will end up in a livelock when the preempting
    task and the expiry task are pinned on the same CPU.
 
    The timer wheel has a timer_wait_running() mechanism for RT, which uses
    a per CPU timer-base expiry lock which is held by the expiry code and the
    task waiting for the timer function to complete blocks on that lock.
 
    This does not work in the same way for posix CPU timers as there is no
    timer base and expiry for process wide timers can run on any task
    belonging to that process, but the concept of waiting on an expiry lock
    can be used too in a slightly different way.
 
    Add a per task mutex to struct posix_cputimers_work, let the expiry task
    hold it accross the expiry function and let the deleting task which
    waits for the expiry to complete block on the mutex.
 
    In the non-contended case this results in an extra mutex_lock()/unlock()
    pair on both sides.
 
    This avoids spin-waiting on a task which is scheduled out, prevents the
    livelock and cures the problem for RT and !RT systems.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRGrj4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZhdEAC/lwfDWCnTXHC8ExQQRDIVNyXmDlLb
 EHB8ZY7Wc4gNZ8UEXEOLOXJHMG9bsbtPGctVewJwRGnXZWKVhpPwQba6kCRycyX0
 0J6l5DlvUaGGrpoOzOZwgETRmtIZE9tEArZR8xlfRScYd93a7yLhwIjO8JaV9vKs
 IQpAQMeJ/ysp6gHrS59qakYfoHU/ERUAu3Tk4GqHUtPtcyz3nX3eTlLWV8LySqs+
 00qr2yc0bQFUFoKzTCxtM8lcEi9ja9SOj1rw28348O+BXE4d0HC12Ie7eU/CDN2Y
 OAlWYxVjy4LMh24LDrRQKTzoVqx9MXDx2g+09B3t8NK5LgeS+EJIjujDhZF147/H
 5y906nplZUKa8BiZW5Rpm/HKH8tFI80T9XWSQCRBeMgTEJyRyRU1yASAwO4xw+dY
 Dn3tGmFGymcV/72o4ic9JFKQd8cTSxPjEJS3qqzMkEAtyI/zPBmKxj/Tce50OH40
 6FSZq1uU21ZQzszwSHISwgFtNr75laUSK4Z1te5OhPOOz+C7O9YqHvqS/1jwhPj2
 tMd8X17fRW3UTUBlBj+zqxqiEGBl/Yk2AvKrJIXGUtfWYCtjMJ7ieCf0kZ7NSVJx
 9ewubA0gqseMD783YomZsy8LLtMKnhclJeslUOVb1oKs1q/WF1R/k6qjy9vUwYaB
 nIJuHl8mxSetag==
 =SVnj
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timers and timekeeping updates from Thomas Gleixner:

 - Improve the VDSO build time checks to cover all dynamic relocations

   VDSO does not allow dynamic relocations, but the build time check is
   incomplete and fragile.

   It's based on architectures specifying the relocation types to search
   for and does not handle R_*_NONE relocation entries correctly.
   R_*_NONE relocations are injected by some GNU ld variants if they
   fail to determine the exact .rel[a]/dyn_size to cover trailing zeros.
   R_*_NONE relocations must be ignored by dynamic loaders, so they
   should be ignored in the build time check too.

   Remove the architecture specific relocation types to check for and
   validate strictly that no other relocations than R_*_NONE end up in
   the VSDO .so file.

 - Prefer signal delivery to the current thread for
   CLOCK_PROCESS_CPUTIME_ID based posix-timers

   Such timers prefer to deliver the signal to the main thread of a
   process even if the context in which the timer expires is the current
   task. This has the downside that it might wake up an idle thread.

   As there is no requirement or guarantee that the signal has to be
   delivered to the main thread, avoid this by preferring the current
   task if it is part of the thread group which shares sighand.

   This not only avoids waking idle threads, it also distributes the
   signal delivery in case of multiple timers firing in the context of
   different threads close to each other better.

 - Align the tick period properly (again)

   For a long time the tick was starting at CLOCK_MONOTONIC zero, which
   allowed users space applications to either align with the tick or to
   place a periodic computation so that it does not interfere with the
   tick. The alignement of the tick period was more by chance than by
   intention as the tick is set up before a high resolution clocksource
   is installed, i.e. timekeeping is still tick based and the tick
   period advances from there.

   The early enablement of sched_clock() broke this alignement as the
   time accumulated by sched_clock() is taken into account when
   timekeeping is initialized. So the base value now(CLOCK_MONOTONIC) is
   not longer a multiple of tick periods, which breaks applications
   which relied on that behaviour.

   Cure this by aligning the tick starting point to the next multiple of
   tick periods, i.e 1000ms/CONFIG_HZ.

 - A set of NOHZ fixes and enhancements:

     * Cure the concurrent writer race for idle and IO sleeptime
       statistics

       The statitic values which are exposed via /proc/stat are updated
       from the CPU local idle exit and remotely by cpufreq, but that
       happens without any form of serialization. As a consequence
       sleeptimes can be accounted twice or worse.

       Prevent this by restricting the accumulation writeback to the CPU
       local idle exit and let the remote access compute the accumulated
       value.

     * Protect idle/iowait sleep time with a sequence count

       Reading idle/iowait sleep time, e.g. from /proc/stat, can race
       with idle exit updates. As a consequence the readout may result
       in random and potentially going backwards values.

       Protect this by a sequence count, which fixes the idle time
       statistics issue, but cannot fix the iowait time problem because
       iowait time accounting races with remote wake ups decrementing
       the remote runqueues nr_iowait counter. The latter is impossible
       to fix, so the only way to deal with that is to document it
       properly and to remove the assertion in the selftest which
       triggers occasionally due to that.

     * Restructure struct tick_sched for better cache layout

     * Some small cleanups and a better cache layout for struct
       tick_sched

 - Implement the missing timer_wait_running() callback for POSIX CPU
   timers

   For unknown reason the introduction of the timer_wait_running()
   callback missed to fixup posix CPU timers, which went unnoticed for
   almost four years.

   While initially only targeted to prevent livelocks between a timer
   deletion and the timer expiry function on PREEMPT_RT enabled kernels,
   it turned out that fixing this for mainline is not as trivial as just
   implementing a stub similar to the hrtimer/timer callbacks.

   The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled
   systems there is a livelock issue independent of RT.

   CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU
   timers out from hard interrupt context to task work, which is handled
   before returning to user space or to a VM. The expiry mechanism moves
   the expired timers to a stack local list head with sighand lock held.
   Once sighand is dropped the task can be preempted and a task which
   wants to delete a timer will spin-wait until the expiry task is
   scheduled back in. In the worst case this will end up in a livelock
   when the preempting task and the expiry task are pinned on the same
   CPU.

   The timer wheel has a timer_wait_running() mechanism for RT, which
   uses a per CPU timer-base expiry lock which is held by the expiry
   code and the task waiting for the timer function to complete blocks
   on that lock.

   This does not work in the same way for posix CPU timers as there is
   no timer base and expiry for process wide timers can run on any task
   belonging to that process, but the concept of waiting on an expiry
   lock can be used too in a slightly different way.

   Add a per task mutex to struct posix_cputimers_work, let the expiry
   task hold it accross the expiry function and let the deleting task
   which waits for the expiry to complete block on the mutex.

   In the non-contended case this results in an extra
   mutex_lock()/unlock() pair on both sides.

   This avoids spin-waiting on a task which is scheduled out, prevents
   the livelock and cures the problem for RT and !RT systems

* tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Implement the missing timer_wait_running callback
  selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity
  selftests/proc: Remove idle time monotonicity assertions
  MAINTAINERS: Remove stale email address
  timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()
  timers/nohz: Add a comment about broken iowait counter update race
  timers/nohz: Protect idle/iowait sleep time under seqcount
  timers/nohz: Only ever update sleeptime from idle exit
  timers/nohz: Restructure and reshuffle struct tick_sched
  tick/common: Align tick period with the HZ tick.
  selftests/timers/posix_timers: Test delivery of signals across threads
  posix-timers: Prefer delivery of signals to the current thread
  vdso: Improve cmd_vdso_check to check all dynamic relocations
2023-04-25 11:22:46 -07:00
Linus Torvalds 15bbeec0fe Update for entry and ptrace:
Provide a ptrace set/get interface for syscall user dispatch. The main
   purpose is to enable checkpoint/restore (CRIU) to handle processes which
   utilize syscall user dispatch correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRGgIETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYockhEACWVd/KOBlQIdUMpM3jfSWsm+VZrITg
 sKN2WCKaz8MS5RA7xTAfZIEqMzkI0V+GPoj+8eK70W39XFU/PlSQo8LUFahSxVHF
 RVyz4zFKeR2XZpDa8J3ytoOvngiAnpOUflssvfA0+f3gq/B48jgLmj8XsrkmkL2T
 6txRpusYNlzVTBoza0+1uEmxBTNhRxvURXa6OR/l24Kbh2udyNd6dlAoRHBV0iOW
 qn7ILgoYIr/74ChCbrr8yZe2rZ+BqqlS1fsjDWkuUqq9AgzeuOjGJnZtMKG6WbGg
 /NBj0Ewe7gsgZwBo7t4MbKNF7bXRkLczp8BX/l9xOTe+mpZ+LyNIHvOM3/TD6O1A
 NFJNwTAGAnhU5Uoba9HzaKYZZnanqgLxuszXznJDU3zKV5pCNMNzlKxjPT73Jzsl
 T1WTCyhSydluSuhOHLU4awC38pqVEQwichx98c9agIBPo7kxkb5RcTVq223wOSeI
 h8otkecJ6U+gmjNDHnRtNBzykEIjVFjgiSBYGTr+/6ek2Myf0O/RMr13oe9OZG5R
 jaKyjcDIADbYRow1rXfEs7Bq42K8rIkbVZvEEK/auYRUFngAoQ3l090i9wj6ViXf
 7CqAjCC1K1BBxbqQwf0YLuDXCzUaXxcWfvNGEGEGs/NYDuu291QntGSFSxsJgsym
 HXvO4NzHOHi13A==
 =AS+6
 -----END PGP SIGNATURE-----

Merge tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core entry/ptrace update from Thomas Gleixner:
 "Provide a ptrace set/get interface for syscall user dispatch. The main
  purpose is to enable checkpoint/restore (CRIU) to handle processes
  which utilize syscall user dispatch correctly"

* tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftest, ptrace: Add selftest for syscall user dispatch config api
  ptrace: Provide set/get interface for syscall user dispatch
  syscall_user_dispatch: Untag selector address before access_ok()
  syscall_user_dispatch: Split up set_syscall_user_dispatch()
2023-04-25 11:05:04 -07:00
Linus Torvalds 4a4a28fca6 - Add a x86 hw vulnerabilities section to MAINTAINERS so that the folks
involved in it can get CCed on patches
 
 - Add some more CPUID leafs to the kcpuid tool and extend its
   functionality to be more useful when grepping for CPUID bits
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmRGiuYACgkQEsHwGGHe
 VUoC2BAAsbtki6d4bds6uezRapz9RoOJUGHm4eY4drMCXe2Mz0myD8/jDV6kRKVb
 WWDlVEv9fxoTOHTCO7kazjrwm6wXt/MgZ51gyZx9/WyqlS27U1SCH9REHKKhzgCK
 OduHi741mfAXOZ1h0M3atpJvgaKzqqugVYX7whaGwVbPKAFT+DLu+lendzAe5sxv
 1WG1JIhfLf0Wn4aUX9E5N9wenyGOUWHgPbE/UBSKaxO6ySi+ut4Mn2fTKZ70Lyp1
 HoZMNns2RDknXD3dMcZG3ztPaLwsNoEgRkIjLolVFMnaK2L9DrVqP+6/7mCGc5er
 GHdFeDdtnHih9CNm4WOQtrbynrdEAM93A3u531JCpySmVSTG1j+pCTd0P020mscl
 It+jnVm1f0csY7/y7tUMnzzNsQRQQszTCfpiuXTZy5Ml7z68sLcDqJQfPwFZh6fX
 OqmplDxE127VvZESakDyFbhV600PF1aC6xRoM2pkjqKcL6uZ7JDh+s4KisMRGGGK
 i39PaCH+E0aJ0mIp9EvWKI7LXYot3DPUX1A+O4zjNnsnC4dNTLu34Hb8fdBxOUB0
 egqcPVDIY23ELwtzF2+f7rMZ3DufEAQ2O3YP/aVRo8didepYPJpbWb/JKvpnaYXR
 4Zp0gqUtKmJ6xvO0o3pECsm/Y5WmCYHKeo9CuggT6T8iTbEEedo=
 =hSYX
 -----END PGP SIGNATURE-----

Merge tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:

 - Add a x86 hw vulnerabilities section to MAINTAINERS so that the folks
   involved in it can get CCed on patches

 - Add some more CPUID leafs to the kcpuid tool and extend its
   functionality to be more useful when grepping for CPUID bits

* tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add x86 hardware vulnerabilities section
  tools/x86/kcpuid: Dump the CPUID function in detailed view
  tools/x86/kcpuid: Update AMD leaf Fn80000001
  tools/x86/kcpuid: Fix avx512bw and avx512lvl fields in Fn00000007
2023-04-25 10:27:02 -07:00
Andrii Nakryiko be7dbd275d selftests/bpf: avoid mark_all_scalars_precise() trigger in one of iter tests
iter_pass_iter_ptr_to_subprog subtest is relying on actual array size
being passed as subprog parameter. This combined with recent fixes to
precision tracking in conditional jumps ([0]) is now causing verifier to
backtrack all the way to the point where sum() and fill() subprogs are
called, at which point precision backtrack bails out and forces all the
states to have precise SCALAR registers. This in turn causes each
possible value of i within fill() and sum() subprogs to cause
a different non-equivalent state, preventing iterator code to converge.

For now, change the test to assume fixed size of passed in array. Once
BPF verifier supports precision tracking across subprogram calls, these
changes will be reverted as unnecessary.

  [0] 71b547f561 ("bpf: Fix incorrect verifier pruning due to missing register precision taints")

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230424235128.1941726-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-24 17:46:44 -07:00
Linus Torvalds 97adb49f05 v6.4/vfs.open
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEn8AAKCRCRxhvAZXjc
 okJoAQC1bjJp4SQw7VQhTuyv0Ak67PACwKPNUPQyHcqMV5s5DAD/fcnMjq7+UieH
 TEk/zRBGWWI8m0wb51MMO+VVM2GeXwI=
 =EKj/
 -----END PGP SIGNATURE-----

Merge tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs open fixlet from Christian Brauner:
 "EINVAL ist keinmal: This contains the changes to make O_DIRECTORY when
  specified together with O_CREAT an invalid request.

  The wider background is that a regression report about the behavior of
  O_DIRECTORY | O_CREAT was sent to fsdevel about a behavior that was
  changed multiple years and LTS releases earlier during v5.7
  development.

  This has also been covered in

        https://lwn.net/Articles/926782/

  which provides an excellent summary of the discussion"

* tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  open: return EINVAL for O_DIRECTORY | O_CREAT
2023-04-24 14:06:41 -07:00
Dave Marchevsky 7deca5eae8 bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed
As reported by Kumar in [0], the shared ownership implementation for BPF
programs has some race conditions which need to be addressed before it
can safely be used. This patch does so in a minimal way instead of
ripping out shared ownership entirely, as proper fixes for the issues
raised will follow ASAP, at which point this patch's commit can be
reverted to re-enable shared ownership.

The patch removes the ability to call bpf_refcount_acquire_impl from BPF
programs. Programs can only bump refcount and obtain a new owning
reference using this kfunc, so removing the ability to call it
effectively disables shared ownership.

Instead of changing success / failure expectations for
bpf_refcount-related selftests, this patch just disables them from
running for now.

  [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2/

Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230424204321.2680232-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-24 14:02:11 -07:00
Linus Torvalds a632b76b42 v6.4/kernel.clone3.tests
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEuiAAKCRCRxhvAZXjc
 otFNAP4iesF3tqrzb1JLf/vKg8PT98re0Iec5tGYyfDSCjfGYAEAraeFOSry16sy
 VUn6JBjDwrkAbUeraz6zauS+rXwstQk=
 =xYmX
 -----END PGP SIGNATURE-----

Merge tag 'v6.4/kernel.clone3.tests' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull clone3 selftest fix from Christian Brauner:
 "This is a single fix to the clone3() selftstests.

  It fell through the sefltest tree cracks a few times so I'll provide
  it here. It has low urgency but we should still correctly report the
  number of tests"

* tag 'v6.4/kernel.clone3.tests' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests/clone3: fix number of tests in ksft_set_plan
2023-04-24 12:48:33 -07:00
Linus Torvalds c23f28975a Commit volume in documentation is relatively low this time, but there is
still a fair amount going on, including:
 
 - Reorganizing the architecture-specific documentation under
   Documentation/arch.  This makes the structure match the source directory
   and helps to clean up the mess that is the top-level Documentation
   directory a bit.  This work creates the new directory and moves x86 and
   most of the less-active architectures there.  The current plan is to move
   the rest of the architectures in 6.5, with the patches going through the
   appropriate subsystem trees.
 
 - Some more Spanish translations and maintenance of the Italian
   translation.
 
 - A new "Kernel contribution maturity model" document from Ted.
 
 - A new tutorial on quickly building a trimmed kernel from Thorsten.
 
 Plus the usual set of updates and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmRGze0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5Y/VsH/RyWqinorRVFZmHqRJMRhR0j7hE2pAgK5prE
 dGXYVtHHNQ+25thNaqhZTOLYFbSX6ii2NG7sLRXmyOTGIZrhUCFFXCHkuq4ZUypR
 gJpMUiKQVT4dhln3gIZ0k09NSr60gz8UTcq895N9UFpUdY1SCDhbCcLc4uXTRajq
 NrdgFaHWRkPb+gBRbXOExYm75DmCC6Ny5AyGo2rXfItV//ETjWIJVQpJhlxKrpMZ
 3LgpdYSLhEFFnFGnXJ+EAPJ7gXDi2Tg5DuPbkvJyFOTouF3j4h8lSS9l+refMljN
 xNRessv+boge/JAQidS6u8F2m2ESSqSxisv/0irgtKIMJwXaoX4=
 =1//8
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.4' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "Commit volume in documentation is relatively low this time, but there
  is still a fair amount going on, including:

   - Reorganize the architecture-specific documentation under
     Documentation/arch

     This makes the structure match the source directory and helps to
     clean up the mess that is the top-level Documentation directory a
     bit. This work creates the new directory and moves x86 and most of
     the less-active architectures there.

     The current plan is to move the rest of the architectures in 6.5,
     with the patches going through the appropriate subsystem trees.

   - Some more Spanish translations and maintenance of the Italian
     translation

   - A new "Kernel contribution maturity model" document from Ted

   - A new tutorial on quickly building a trimmed kernel from Thorsten

  Plus the usual set of updates and fixes"

* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
  media: Adjust column width for pdfdocs
  media: Fix building pdfdocs
  docs: clk: add documentation to log which clocks have been disabled
  docs: trace: Fix typo in ftrace.rst
  Documentation/process: always CC responsible lists
  docs: kmemleak: adjust to config renaming
  ELF: document some de-facto PT_* ABI quirks
  Documentation: arm: remove stih415/stih416 related entries
  docs: turn off "smart quotes" in the HTML build
  Documentation: firmware: Clarify firmware path usage
  docs/mm: Physical Memory: Fix grammar
  Documentation: Add document for false sharing
  dma-api-howto: typo fix
  docs: move m68k architecture documentation under Documentation/arch/
  docs: move parisc documentation under Documentation/arch/
  docs: move ia64 architecture docs under Documentation/arch/
  docs: Move arc architecture docs under Documentation/arch/
  docs: move nios2 documentation under Documentation/arch/
  docs: move openrisc documentation under Documentation/arch/
  docs: move superh documentation under Documentation/arch/
  ...
2023-04-24 12:35:49 -07:00
Linus Torvalds 1be89faab3 linux-kselftest-kunit-6.4-rc1
This KUnit update Linux 6.4-rc1 consists of:
 
 - several fixes to kunit tool
 - new klist structure test
 - support for m68k under QEMU
 - support for overriding the QEMU serial port
 - support for SH under QEMU
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmRFYFsACgkQCwJExA0N
 Qxw+PxAA1KHnHool3QbzZouFgLgTS2N/hxsOIoWKeUl6guUPX0XYu67FEIyt7p5k
 a1eFLjt+q4URW/heHKYdffP+Up6xhN5yVP8xJEcbn6GD13lz1clI9RAjObiPOehc
 KOV90PeAEfzosEGRIp97g4Gzu8NUMZqN7BsKBdzYJ4rEftlcjaILBVp4OfSuCyAi
 UbYBdRjK4eIOwGXuHVfhNqzH1HRSbzcoSRTywj5qW0Qhpe6KnZBRuZESXYBsxzGb
 G0nd4+OttjZyplI/xQYwaU0XGAI6roG5G4nAT5YGHLp5g8rTaHetTi+i3iK4iEru
 wEL0NgywkA0ujAge97RldOjtU97vvSFk7FwxdS9lxaMW/Ut2sN72I2ThI8dBvVRZ
 fcw8t8mmT1gUv3SCq+s1X13vz22IedXLOfvOY2o/fLk2zxOw5e8FirAz/aFeOf3K
 ++hK+IQvDmeMMv08bz0ORzdRQcjdwQNQ3klnfdrUVFN9yK+iAllOJ/nrXHLNIXu4
 c3ITlAMldcAf2W+LRWzvqqKyT4H8MCXL3L0bBc1M1reRu9nM89AZedO8MHCB0R9Q
 2ic0rOxIwZzPJuk0qPDxEVmN7Rpyx85I96YOwRemJTEfdkB/ZX+BfOU0KzinOVHC
 3qrHuIw/SyRTlUEDAr53gJ5WHbdjhKAmrd1/FuplyoOSX0w6VVA=
 =COQn
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - several fixes to kunit tool

 - new klist structure test

 - support for m68k under QEMU

 - support for overriding the QEMU serial port

 - support for SH under QEMU

* tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: add tests for using current KUnit test field
  kunit: tool: Add support for SH under QEMU
  kunit: tool: Add support for overriding the QEMU serial port
  .gitignore: Unignore .kunitconfig
  list: test: Test the klist structure
  kunit: increase KUNIT_LOG_SIZE to 2048 bytes
  kunit: Use gfp in kunit_alloc_resource() kernel-doc
  kunit: tool: fix pre-existing `mypy --strict` errors and update run_checks.py
  kunit: tool: remove unused imports and variables
  kunit: tool: add subscripts for type annotations where appropriate
  kunit: fix bug of extra newline characters in debugfs logs
  kunit: fix bug in the order of lines in debugfs logs
  kunit: fix bug in debugfs logs of parameterized tests
  kunit: tool: Add support for m68k under QEMU
2023-04-24 12:31:32 -07:00
Linus Torvalds 0f50767d7e linux-kselftest-next-6.4-rc1
linux-kselftest-next-6.4-rc1
 
 This Kselftest update for Linux 6.4-rc1 consists of:
 
 - several patches to enhance and fix resctrl test
 - nolibc support for kselftest with an addition to vprintf() to
   tools/nolibc/stdio and related test changes
 - Refactor 'peeksiginfo' ptrace test part
 - add 'malloc' failures checks in cgroup test_memcontrol
 - a new prctl test
 - enhancements sched test with additional ore schedule prctl calls
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmRFWfMACgkQCwJExA0N
 QxxXPg/9Fjo+jLEt6/CraOQeeyQsmzQkPiXutqrjUoUwaMGpOoon+En39g1eDAiZ
 5iJE8lu47ldUJNp7Pn2THfcLXJ4Le+12s0imWuXaDf6VdFUNsyd9AZMfziMmeUGa
 6GXLF+FUzT7uXbIuIrPvZmyX3QFL8WEcywmgFRlAyfLFcg37uUgiKSl7ATrgWCEw
 jlPELO2p2+t+EB0/n1VMoXup6tGD6tpuuNH50rDeRMTV+cW7wKTvJyFXbMvGThcx
 YfjzofYm+drX5gka/XQYynZehTNcbASjkvYYEqm3piwWCyfqt4Y1aOAA8fS9h+86
 jKGF/pxz1Zcl7vgZW5WixKaJ+cMf8gfCMRsny6h2x2pmqKwqSJtg+jk8XqNkJQnh
 JKwRxosLjEQMyIRPMH9PUjBQD46VuC2nvu4SY5apxkbHH2iG8SKG/DNIpSFXB2m0
 7BuziwKZe9uw671vaZU2b0r5eeCxETtuFwlHWN9Rl954g0zeueyUuBqg0ibpQois
 Jp2SvwVR+oXRLHoqDMrCEDk0WEGJ9WD/mMxW4iHGMYRwoXb1RCKzFWEoYw3dFAxw
 N/knx2IeEFshiKIAqaGnpER1chLe6ChX/5rDXkj1YLDwHEJdowHhE9/GxqGbhEUi
 zOzZIAGMVTOZp6g/TIutGYfpyxUAXj19eRoAbA/KkLmv4d+s1nY=
 =992w
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:

 - several patches to enhance and fix resctrl test

 - nolibc support for kselftest with an addition to vprintf() to
   tools/nolibc/stdio and related test changes

 - Refactor 'peeksiginfo' ptrace test part

 - add 'malloc' failures checks in cgroup test_memcontrol

 - a new prctl test

 - enhancements sched test with additional ore schedule prctl calls

* tag 'linux-kselftest-next-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits)
  selftests/resctrl: Fix incorrect error return on test complete
  selftests/resctrl: Remove duplicate codes that clear each test result file
  selftests/resctrl: Commonize the signal handler register/unregister for all tests
  selftests/resctrl: Cleanup properly when an error occurs in CAT test
  selftests/resctrl: Flush stdout file buffer before executing fork()
  selftests/resctrl: Return MBA check result and make it to output message
  selftests/resctrl: Fix set up schemata with 100% allocation on first run in MBM test
  selftests/resctrl: Use correct exit code when tests fail
  kselftest/arm64: Convert za-fork to use kselftest.h
  kselftest: Support nolibc
  tools/nolibc/stdio: Implement vprintf()
  selftests/resctrl: Correct get_llc_perf() param in function comment
  selftests/resctrl: Use remount_resctrlfs() consistently with boolean
  selftests/resctrl: Change name from CBM_MASK_PATH to INFO_PATH
  selftests/resctrl: Change initialize_llc_perf() return type to void
  selftests/resctrl: Replace obsolete memalign() with posix_memalign()
  selftests/resctrl: Check for return value after write_schemata()
  selftests/resctrl: Allow ->setup() to return errors
  selftests/resctrl: Move ->setup() call outside of test specific branches
  selftests/resctrl: Return NULL if malloc_and_init_memory() did not alloc mem
  ...
2023-04-24 12:28:34 -07:00
Linus Torvalds 5dfb75e842 RCU Changes for 6.4:
o  MAINTAINERS files additions and changes.
  o  Fix hotplug warning in nohz code.
  o  Tick dependency changes by Zqiang.
  o  Lazy-RCU shrinker fixes by Zqiang.
  o  rcu-tasks stall reporting improvements by Neeraj.
  o  Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
     name for robustness.
  o  Documentation Updates:
  o  Significant changes to srcu_struct size.
  o  Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
  o  rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
  o  Other misc changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
 5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
 96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
 eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
 yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
 XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
 +Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
 cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
 CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
 93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
 4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
 ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
 =Rgbj
 -----END PGP SIGNATURE-----

Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux

Pull RCU updates from Joel Fernandes:

 - Updates and additions to MAINTAINERS files, with Boqun being added to
   the RCU entry and Zqiang being added as an RCU reviewer.

   I have also transitioned from reviewer to maintainer; however, Paul
   will be taking over sending RCU pull-requests for the next merge
   window.

 - Resolution of hotplug warning in nohz code, achieved by fixing
   cpu_is_hotpluggable() through interaction with the nohz subsystem.

   Tick dependency modifications by Zqiang, focusing on fixing usage of
   the TICK_DEP_BIT_RCU_EXP bitmask.

 - Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
   kernels, fixed by Zqiang.

 - Improvements to rcu-tasks stall reporting by Neeraj.

 - Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
   increased robustness, affecting several components like mac802154,
   drbd, vmw_vmci, tracing, and more.

   A report by Eric Dumazet showed that the API could be unknowingly
   used in an atomic context, so we'd rather make sure they know what
   they're asking for by being explicit:

      https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/

 - Documentation updates, including corrections to spelling,
   clarifications in comments, and improvements to the srcu_size_state
   comments.

 - Better srcu_struct cache locality for readers, by adjusting the size
   of srcu_struct in support of SRCU usage by Christoph Hellwig.

 - Teach lockdep to detect deadlocks between srcu_read_lock() vs
   synchronize_srcu() contributed by Boqun.

   Previously lockdep could not detect such deadlocks, now it can.

 - Integration of rcutorture and rcu-related tools, targeted for v6.4
   from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
   module parameter, and more

 - Miscellaneous changes, various code cleanups and comment improvements

* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
  checkpatch: Error out if deprecated RCU API used
  mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
  rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
  ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
  net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
  net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
  rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
  rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
  rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
  rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
  rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
  rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
  rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
  rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
  rcu/trace: use strscpy() to instead of strncpy()
  tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
  ...
2023-04-24 12:16:14 -07:00
Linus Torvalds 5d77652fbf nolibc updates for v6.4
o     Add support for loongarch.
 
 o     Fix stack-protector issues.
 
 o     Support additional integral types and signal-related macros.
 
 o     Add support for stdin, stdout, and stderr.
 
 o     Add getuid() and geteuid().
 
 o     Allow S_I* macros to be overridden by program.
 
 o     Defer to linux/fcntl.h and linux/stat.h to avoid duplicate
 	definitions.
 
 o     Many improvements to the self tests.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmQsjLUTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jKuOD/9Y18ncS7+K7D9m89h5zwqBoPeNdP65
 fC/kSgBeLreqO9hR7fsDegPk/pC8ZcMZj7iiLMV8dQIan0N74Hcd4mnL+Nf9LdkP
 KpwMfN/DG7h+p69Ug4d6qQqSr7jwEGa1NE280NguNtugs+kanSayVoSh0lhWPp7K
 5t9s4oITOQNa69EgluB25b1qbaY36DibsBbt7vO2b/Rsmbw13UdgXQxKBQTgEOmQ
 yfNqjSYatRrl7pQQiQQDwV8xzd84jNLgNKsTDtG18kUWqYBhleMDQxt54betRFZ2
 O3SNylt7pJKVFsgjAB90TcOl4SRmzXJ0+jNLIuNYJkqlQaPjQy29N4nBBUp2ciBf
 sSkL4OBfaegJlnBWZ4LwyV+ZTPLGQ5PZ78EhfqMpQzVbEibL5Es/pPAVuUxLVK4B
 U03+sh4IAr2aK4aXcxXgDWYr/HmKP6DWrlp/oH679sW1e2mUpnjghOfuGMYwLifo
 d+a/M2VH8++KOV/0aZf/Z4p1Jbexn6/jz4pW7DTfiARduUtlaN5F69oAr85q5vrS
 T1dZ84ZjmxrWJUYEJjfs5DM+fvPzsOyho8ae9TRlxnJJtW6UOZ0TTUo+mUDfgqWQ
 OU6XteK6zTQmE5H9aVOVwO/jl0uXWzmHAoJTAynpPgEkP01I+t6Vx8e1PAK/oPbT
 fJdUIEFiMo1qMg==
 =BS2x
 -----END PGP SIGNATURE-----

Merge tag 'nolibc.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull nolibc updates from Paul McKenney:

 - Add support for loongarch

 - Fix stack-protector issues

 - Support additional integral types and signal-related macros

 - Add support for stdin, stdout, and stderr

 - Add getuid() and geteuid()

 - Allow S_I* macros to be overridden by program

 - Defer to linux/fcntl.h and linux/stat.h to avoid duplicate
   definitions

 - Many improvements to the selftests

* tag 'nolibc.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (22 commits)
  tools/nolibc: x86_64: add stackprotector support
  tools/nolibc: i386: add stackprotector support
  tools/nolibc: tests: add test for -fstack-protector
  tools/nolibc: tests: fold in no-stack-protector cflags
  tools/nolibc: add support for stack protector
  tools/nolibc: tests: constify test_names
  tools/nolibc: add helpers for wait() signal exits
  tools/nolibc: add definitions for standard fds
  selftests/nolibc: Adjust indentation for Makefile
  selftests/nolibc: Add support for LoongArch
  tools/nolibc: Add support for LoongArch
  tools/nolibc: Add statx() and make stat() rely on statx() if necessary
  tools/nolibc: Include linux/fcntl.h and remove duplicate code
  tools/nolibc: check for S_I* macros before defining them
  selftests/nolibc: skip the chroot_root and link_dir tests when not privileged
  tools/nolibc: add getuid() and geteuid()
  tools/nolibc: add tests for the integer limits in stdint.h
  tools/nolibc: enlarge column width of tests
  tools/nolibc: add integer types and integer limit macros
  tools/nolibc: add stdint.h
  ...
2023-04-24 12:09:43 -07:00
Linus Torvalds 4a4075ada6 locktorture updates for v6.4
This update adds tests for nested locking and also adds support for
 testing raw spinlocks in PREEMPT_RT kernels.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmQs8kETHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jImaEACFX7CPZyRUG32Yo6wdzxRHuZPid6cR
 Si5GyRiTJzKuS9aDgl6jMYRvFXSXE9Xx1TVX0ad6fkNW40IMAkXprmUkQwN3ZtSb
 K/pOLyOSFkm/XDrfDinPU46kh+DgSrAZtB3jhELa5doRxr9lWWSnwV4HoBx64T3/
 84LEyIi47OSVxucaUWfimDUyBbNl4Oq95hdpD3hwxyxq5nsv2Q+oLWy2syXeegOz
 3ru4Aswg40cwjYT9tjnrfZKZeteby2q55JYUDvP3kPfu/utyMyafUOda0DhHFdRB
 dT1EISkY/zyqf3orTfghLpYJEplDNkSKhVtyn2dQcRHhoUJ9e/8xnRclqVo4tkqv
 QWUZHJFar08P6iNBh9Z/YiM8D4kpeQNVCmR29h094BlQMbTLYbcZUjJ3YeE5nsz+
 Bid7Ln6aBvGb3Ui6EWq7FVfcGzrPms3MUXw6nQLh6HaQg0F2g73MKS9Wd75OjEc/
 cKPxkqzC35pM87eEf0xBlJzudZYxkYhP8Rt0bCGt/tq/pZAulCyOgnET2mcBv7Z0
 94uEIGVvswVPB9/VKyqf7mHVrk/uJeygGKD1++4pzGumdhfsaM1dl3g6DkrSgK1j
 A/kAApkhha8Zacj3oAAQuBPi8JuIqUFQvfbA8Os6d/8PXfTRaaMnV9DRS7wcohkP
 7haDPwX8pHj+Gg==
 =QAhX
 -----END PGP SIGNATURE-----

Merge tag 'locktorture.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull locktorture updates from Paul McKenney:
 "This adds tests for nested locking and also adds support for testing
  raw spinlocks in PREEMPT_RT kernels"

* tag 'locktorture.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  locktorture: Add raw_spinlock* torture tests for PREEMPT_RT kernels
  locktorture: With nested locks, occasionally skip main lock
  locktorture: Add nested locking to rtmutex torture tests
  locktorture: Add nested locking to mutex torture tests
  locktorture: Add nested_[un]lock() hooks and nlocks parameter
2023-04-24 12:05:08 -07:00
Linus Torvalds 60eb450742 LKMM scripting updates for v6.4
This update improves litmus-test documentation and improves the ability
 to do before/after tests on the https://github.com/paulmckrcu/litmus repo.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmQwtAMTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jE9AD/4pdoS4w+XmGTkOaSYWVKz0B822+FnZ
 822s/Z+4sA7ngoDEx4NSno299mSjONMS/HS8oTDXQQgGL7xXZNJc1phD1oP17dwa
 3Ic6RKqWlYLOtFLfGLZF+wvVo6Z0WLnyh4KDeA31AVcb/Cdzzb30RZTO9oz1WDZH
 ueD4egvl6ECyZPh2HfjcQ7Y2hH00Ohi1igY+WPCBiMM1FrTbPmaLrAwsRrEbhsqx
 PwnrbMdGrTvT62sgnm9LHGr/P2YKDdYxs8wUyWRg876KitdUPmZb8uy2gZ0Bpp5+
 mMB6h54mjVtDnpVtPHm8u4Viq2ir3zSlbWGmI24JxFCn3FTRFQwYQMCPBm7tlpqB
 n+08OGtWDRM3b+aLa5gYo1MogMayWtZN/vL6/9BSTF6mvjMbKLu2esi6JttU1tOV
 o4LvG+b6lO+L1ZvQctnDmzCPjmVB4QuFZvcNdRwHIVFtlG2v2ffaZ5ogaM+3uN4u
 vUeW5pOmAaD3aO0g7xJVdTwHfBasxrXfYazjYPdpvuoIXHbOeEC+LVfCaQVJRFGf
 20w0lB6hZqsE8qnaKAvHzupDi7nz3X0Ge/PAvu54o9PgOP1XKDNH+p6fCxefCx1T
 M8VnQHdgR29kuyrVy9XbQjRDgEXSPrQXrItl2B8MAoXVhaCDt6LOQ/LyGnKL3Q7w
 4sEBieegEnqLQQ==
 =kQ01
 -----END PGP SIGNATURE-----

Merge tag 'lkmm-scripting.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull Linux Kernel Memory Model scripting updates from Paul McKenney:
 "This improves litmus-test documentation and improves the ability to do
  before/after tests on the https://github.com/paulmckrcu/litmus repo"

* tag 'lkmm-scripting.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (32 commits)
  tools/memory-model: Remove out-of-date SRCU documentation
  tools/memory-model: Document LKMM test procedure
  tools/memory-model: Use "grep -E" instead of "egrep"
  tools/memory-model: Use "-unroll 0" to keep --hw runs finite
  tools/memory-model: Make judgelitmus.sh handle scripted Result: tag
  tools/memory-model: Add data-race capabilities to judgelitmus.sh
  tools/memory-model: Add checktheselitmus.sh to run specified litmus tests
  tools/memory-model: Repair parseargs.sh header comment
  tools/memory-model:  Add "--" to parseargs.sh for additional arguments
  tools/memory-model: Make history-check scripts use mselect7
  tools/memory-model: Make checkghlitmus.sh use mselect7
  tools/memory-model: Fix scripting --jobs argument
  tools/memory-model: Implement --hw support for checkghlitmus.sh
  tools/memory-model: Add -v flag to jingle7 runs
  tools/memory-model: Make runlitmus.sh check for jingle errors
  tools/memory-model: Allow herd to deduce CPU type
  tools/memory-model: Keep assembly-language litmus tests
  tools/memory-model: Move from .AArch64.litmus.out to .litmus.AArch.out
  tools/memory-model: Make runlitmus.sh generate .litmus.out for --hw
  tools/memory-model: Split runlitmus.sh out of checklitmus.sh
  ...
2023-04-24 12:02:25 -07:00
Linus Torvalds 406037351e LKMM updates for v6.4
This update improves LKMM diagnostic messages, unifies handling
 of the ordering produced by unlock/lock pairs, adds support for the
 smp_mb__after_srcu_read_unlock() macro, removes redundant members from
 the to-r relation, brings SRCU read-side semantics into alignment
 with Linux-kernel SRCU, makes ppo a subrelation of po, and improves
 documentation.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmQws9wTHHBhdWxtY2tA
 a2VybmVsLm9yZwAKCRCevxLzctn7jKNGD/9oBTUIlsscsZ0GCtx5CYkA+AsIOAzV
 ejvH3vB935jKHVXuDYtnvjDrk9km2aK98CzZaYyJT44bvMZ7gMIvMG8b/k/ZokgF
 GNkde2Vv2S2sMpgQZUxSCjCBPyqd7uRzPtixyAb4zIu7VAEHdlptHRFc+0DQ2l1z
 F4jCRI01ZXD/pk+7fcztxi+obBYXfbVwSFSieN7Kym5Vt6sQXbG/SG3nIS/ZgExA
 i52uiKScAVtfdGOQCnC4YR8hrcahFZ7ohmEx66B1mcgLknzjS/Pzy7tDNiRr6jx9
 Ong7c/ImLk7JLKWMC/52lzeXlzrwMAvdoJ9EviJO5zfAjP5ycD8Te2ErCDkg8f4P
 yoNoLinYweG5VTGrMvYFpUnRUL6iABbtZe4NrwlP3eWsfRQ3lZg2NK/QypirYW36
 aHYQPhKjj2zSpxx2+DDNQZUpd+kXGMNLngix+VT/E9hJ1k95lt1I44ID68UJdIaU
 ctIB1c1lEPJbgFnNXsBBorhcfBU1JduYrNg212Uh2C2XuMFzMiiVGCuZvckFc9nZ
 HllllQzE4WH5BhBx5o1qNmRAsbH4v1+B/UAoYpH6Er7fDtHhdXE0YtgopH4YiU11
 9UxO35qFQYGjU1+iR3063OO9dVhbeSlwRyAQ2cbIJ/GZG+ciniOtI/lg7Px8MO/y
 nZyuOytkf6x6JQ==
 =Vilg
 -----END PGP SIGNATURE-----

Merge tag 'lkmm.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull Linux Kernel Memory Model updates from Paul McKenney
 "This improves LKMM diagnostic messages, unifies handling of the
  ordering produced by unlock/lock pairs, adds support for the
  smp_mb__after_srcu_read_unlock() macro, removes redundant members from
  the to-r relation, brings SRCU read-side semantics into alignment with
  Linux-kernel SRCU, makes ppo a subrelation of po, and improves
  documentation"

* tag 'lkmm.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  Documentation: litmus-tests: Correct spelling
  tools/memory-model: Add documentation about SRCU read-side critical sections
  tools/memory-model: Make ppo a subrelation of po
  tools/memory-model: Provide exact SRCU semantics
  tools/memory-model: Restrict to-r to read-read address dependency
  tools/memory-model: Add smp_mb__after_srcu_read_unlock()
  tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-po
  tools/memory-model: Update some warning labels
2023-04-24 12:00:51 -07:00
Davidlohr Bueso fd35fdcbf7 cxl/test: Add mock test for set_timestamp
Support the command testing in a unit-test fashion.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20230423221231.6357-1-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-24 11:31:02 -07:00
James Clark d1efa4a0a6 perf cs-etm: Add separate decode paths for timeless and per-thread modes
Timeless and per-thread are orthogonal concepts that are currently
treated as if they are the same (per-thread == timeless). This breaks
when you modify the command line or itrace options to something that the
current logic doesn't expect.

For example:

  # Force timeless with Z
  --itrace=Zi10i

  # Or inconsistent record options
  -e cs_etm/timestamp=1/ --per-thread

Adding Z for decoding in per-cpu mode is particularly bad because in
per-thread mode trace channel IDs are discarded and all assumed to be 0,
which would mix trace from different CPUs in per-cpu mode.

Although the results might not be perfect in all scenarios, if the user
requests no timestamps, it should still be possible to decode in either
mode. Especially if the relative times of samples in different processes
aren't interesting, quite a bit of space can be saved by turning off
timestamps in per-cpu mode.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-8-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:42:20 -03:00
James Clark 1764ce069b perf cs-etm: Use bool type for boolean values
Using u8 for boolean values makes the code a bit more difficult to read
so be more explicit by using bool.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-7-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:42:20 -03:00
James Clark 7bfc1544d9 perf cs-etm: Allow user to override timestamp and contextid settings
Timestamps and context tracking are automatically enabled in per-core
mode and it's impossible to override this. Use the new utility function
to set them conditionally.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:42:20 -03:00
James Clark 35c51f83dd perf cs-etm: Validate options after applying them
Currently the cs_etm_set_option() function both validates and applies
the config options. Because it's only called when they are added
automatically, there are some paths where the user can apply the option
on the command line and skip the validation. By moving it to the end it
covers both cases.

Also, options don't need to be re-applied anyway, Perf handles parsing
and applying the config terms automatically.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:42:20 -03:00
James Clark 3963d84b1b perf cs-etm: Don't test full_auxtrace because it's always set
There is no path in cs-etm where this isn't true so it doesn't need to
be tested. Also re-order the beginning of cs_etm_recording_options() so
that nothing is done until the early exit is passed.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:42:20 -03:00
James Clark 6593f019c2 perf tools: Add util function for overriding user set config values
There is some duplicated code to only override config values if they
haven't already been set by the user so make a util function for this.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230424134748.228137-3-james.clark@arm.com
[ Moved evsel__set_config_if_unset() to util/pmu.c to avoid dragging stuff into the python binding ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:41:51 -03:00
Linus Torvalds a562456643 Merge branch 'x86-rep-insns': x86 user copy clarifications
Merge my x86 user copy updates branch.

This cleans up a lot of our x86 memory copy code, particularly for user
accesses.  I've been pushing for microarchitectural support for good
memory copying and clearing for a long while, and it's been visible in
how the kernel has aggressively used 'rep movs' and 'rep stos' whenever
possible.

And that micro-architectural support has been improving over the years,
to the point where on modern CPU's the best option for a memory copy
that would become a function call (as opposed to being something that
can just be turned into individual 'mov' instructions) is now to inline
the string instruction sequence instead.

However, that only makes sense when we have the modern markers for this:
the x86 FSRM and FSRS capabilities ("Fast Short REP MOVS/STOS").

So this cleans up a lot of our historical code, gets rid of the legacy
marker use ("REP_GOOD" and "ERMS") from the memcpy/memset cases, and
replaces it with that modern reality.  Note that REP_GOOD and ERMS end
up still being used by the known large cases (ie page copyin gand
clearing).

The reason much of this ends up being about user memory accesses is that
the normal in-kernel cases are done by the compiler (__builtin_memcpy()
and __builtin_memset()) and getting to the point where we can use our
instruction rewriting to inline those to be string instructions will
need some compiler support.

In contrast, the user accessor functions are all entirely controlled by
the kernel code, so we can change those arbitrarily.

Thanks to Borislav Petkov for feedback on the series, and Jens testing
some of this on micro-architectures I didn't personally have access to.

* x86-rep-insns:
  x86: rewrite '__copy_user_nocache' function
  x86: remove 'zerorest' argument from __copy_user_nocache()
  x86: set FSRS automatically on AMD CPUs that have FSRM
  x86: improve on the non-rep 'copy_user' function
  x86: improve on the non-rep 'clear_user' function
  x86: inline the 'rep movs' in user copies for the FSRM case
  x86: move stac/clac from user copy routines into callers
  x86: don't use REP_GOOD or ERMS for user memory clearing
  x86: don't use REP_GOOD or ERMS for user memory copies
  x86: don't use REP_GOOD or ERMS for small memory clearing
  x86: don't use REP_GOOD or ERMS for small memory copies
2023-04-24 10:39:27 -07:00
James Clark 449067f3fc perf cs-etm: Fix timeless decode mode detection
In this context, timeless refers to the trace data rather than the perf
event data. But when detecting whether there are timestamps in the trace
data or not, the presence of a timestamp flag on any perf event is used.

Since commit f42c0ce573 ("perf record: Always get text_poke events
with --kcore option") timestamps were added to a tracking event when
--kcore is used which breaks this detection mechanism. Fix it by
detecting if trace timestamps exist by looking at the ETM config flags.
This would have always been a more accurate way of doing it anyway.

This fixes the following error message when using --kcore with
Coresight:

  $ perf record --kcore -e cs_etm// --per-thread
  $ perf report
  The perf.data/data data has no samples!

Fixes: f42c0ce573 ("perf record: Always get text_poke events with --kcore option")
Reported-by: Yang Shi <shy828301@gmail.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: denik@google.com
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/lkml/CAHbLzkrJQTrYBtPkf=jf3OpQ-yBcJe7XkvQstX9j2frz4WF-SQ@mail.gmail.com/
Link: https://lore.kernel.org/r/20230424134748.228137-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:28:17 -03:00
Arnaldo Carvalho de Melo ce1d3bc273 perf evsel: Introduce evsel__name_is() method to check if the evsel name is equal to a given string
This makes the logic a bit clear by avoiding the !strcmp() pattern and
also a way to intercept the pointer if we need to do extra validation on
it or to do lazy setting of evsel->name via evsel__name(evsel).

Reviewed-by: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZEGLM8VehJbS0gP2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-24 14:28:11 -03:00
Rafael J. Wysocki d3f2c402e4 Merge branches 'pm-core', 'pm-sleep', 'pm-opp' and 'pm-tools'
Merge PM core changes, updates related to system sleep support,
operating performance points (OPP) changes and power management
utilities changes for 6.4-rc1:

 - Drop unnecessary (void *) conversions from the PM core (Li zeming).

 - Add sysfs files to represent time spent in a platform sleep state
   during suspend-to-idle and make AMD and Intel PMC drivers use them
   (Mario Limonciello).

 - Use of_property_present() for testing DT property presence (Rob
   Herring).

 - Add set_required_opps() callback to the 'struct opp_table', to make
   the code paths cleaner (Viresh Kumar).

 - Update the pm-graph siute of utilities to v5.11 with the following
   changes:
   * New script which allows users to install the latest pm-graph
     from the upstream github repo.
   * Update all the dmesg suspend/resume PM print formats to be able to
     process recent timelines using dmesg only.
   * Add ethtool output to the log for the system's ethernet device if
     ethtool exists.
   * Make the tool more robustly handle events where mangled dmesg or
     ftrace outputs do not include all the requisite data.

 - Make the sleepgraph utility recognize "CPU killed" messages (Xueqin
   Luo).

* pm-core:
  PM: core: Remove unnecessary (void *) conversions

* pm-sleep:
  platform/x86/intel/pmc: core: Report duration of time in HW sleep state
  platform/x86/intel/pmc: core: Always capture counters on suspend
  platform/x86/amd: pmc: Report duration of time in hw sleep state
  PM: Add sysfs files to represent time spent in hardware sleep state

* pm-opp:
  OPP: Move required opps configuration to specialized callback
  OPP: Handle all genpd cases together in _set_required_opps()
  opp: Use of_property_present() for testing DT property presence

* pm-tools:
  PM: tools: sleepgraph: Recognize "CPU killed" messages
  pm-graph: Update to v5.11
2023-04-24 18:37:20 +02:00
Rafael J. Wysocki c90b29cede Merge branch 'acpica'
Merge ACPICA material for 6.4-rc1:

 - Delete bogus node_array array of pointers from AEST table (Jessica
   Clarke).

 - Add support for trace buffer extension in GICC to the ACPI MADT
   parser (Xiongfeng Wang).

 - Add missing macro ACPI_FUNCTION_TRACE() for acpi_ns_repair_HID()
   (Xiongfeng Wang).

 - Add missing tables to astable (Pedro Falcato).

 - Add support for 64 bit loong_arch compilation to ACPICA (Huacai
   Chen).

 - Add support for ASPT table in disassembler to ACPICA (Jeremi
   Piotrowski).

 - Add support for Arm's MPAM ACPI table version 2 (Hesham Almatary).

 - Update all copyrights/signons in ACPICA to 2023 (Bob Moore).

 - Add support for ClockInput resource (v6.5) (Niyas Sait).

 - Add RISC-V INTC interrupt controller definition to the list of
   supported interrupt controllers for MADT (Sunil V L).

 - Add structure definitions for the RISC-V RHCT ACPI table (Sunil V L).

 - Address several cases in which the ACPICA code might lead to
   undefined behavior (Tamir Duberstein).

 - Make ACPICA code support flexible arrays properly (Kees Cook).

 - Check null return of ACPI_ALLOCATE_ZEROED in
   acpi_db_display_objects() (void0red).

 - Add os specific support for Zephyr RTOS to ACPICA (Najumon).

 - Update version to 20230331 (Bob Moore).

* acpica: (32 commits)
  ACPICA: Update version to 20230331
  ACPICA: add os specific support for Zephyr RTOS
  ACPICA: ACPICA: check null return of ACPI_ALLOCATE_ZEROED in acpi_db_display_objects
  ACPICA: acpi_resource_irq: Replace 1-element arrays with flexible array
  ACPICA: acpi_madt_oem_data: Fix flexible array member definition
  ACPICA: acpi_dmar_andd: Replace 1-element array with flexible array
  ACPICA: acpi_pci_routing_table: Replace fixed-size array with flex array member
  ACPICA: struct acpi_resource_dma: Replace 1-element array with flexible array
  ACPICA: Introduce ACPI_FLEX_ARRAY
  ACPICA: struct acpi_nfit_interleave: Replace 1-element array with flexible array
  ACPICA: actbl2: Replace 1-element arrays with flexible arrays
  ACPICA: actbl1: Replace 1-element arrays with flexible arrays
  ACPICA: struct acpi_resource_vendor: Replace 1-element array with flexible array
  ACPICA: Avoid undefined behavior: load of misaligned address
  ACPICA: Avoid undefined behavior: member access within misaligned address
  ACPICA: Avoid undefined behavior: member access within misaligned address
  ACPICA: Avoid undefined behavior: member access within misaligned address
  ACPICA: Avoid undefined behavior: member access within misaligned address
  ACPICA: Avoid undefined behavior: member access within null pointer
  ACPICA: Avoid undefined behavior: applying zero offset to null pointer
  ...
2023-04-24 17:31:41 +02:00
Takashi Iwai baa6584a24 ASoC: Updates for v6.4
The bulk of the commits here are for the conversion of drivers to use
 void remove callbacks but there's a reasonable amount of other stuff
 going on, the pace of development with the SOF code continues to be high
 and there's a bunch of new drivers too:
 
  - More core cleanups from Morimto-san.
  - Update drivers to have remove() callbacks returning void, mostly
    mechanical with some substantial changes.
  - Continued feature and simplification work on SOF, including addition
    of a no-DSP mode for bringup, HDA MLink and extensions to the IPC4
    protocol.
  - Hibernation support for CS35L45.
  - More DT binding conversions.
  - Support for Cirrus Logic CS35L56, Freescale QMC, Maxim MAX98363,
    nVidia systems with MAX9809x and RT5631, Realtek RT712, Renesas R-Car
    Gen4, Rockchip RK3588 and TI TAS5733.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmRGdEsACgkQJNaLcl1U
 h9BNVAf+Ijupg3dhAl6847v1PRYXkYK6YjAayhNd0xRoRePKnnv1zkrsXSBzxZUM
 8KHpQDUJyfQbPnE2JRmr1WfHSoNDl/NXdJl+lefPBgol5bzxRji68hDPGjA0645o
 R6vzrqLNw8mh3jwptXfEpQfKH8tIvwOeMeLkncDvsm0ZCUR2GhYnjB1g82ih0ssx
 lvh36PdCRF0e/ruHxkiVn9b/riID65oTRkN6IxJqoPnqJZVyCiqmiJcfWePpaPir
 4R9Dyk+REos/aCLdne1g6H21Tgi0td+blv6empqwdEXG41VSdRMTrOZb1ZISKmpF
 ggPbKsk9BjJFBCewllHXJ0YEcBp9/g==
 =TVxt
 -----END PGP SIGNATURE-----

Merge tag 'asoc-v6.4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next

ASoC: Updates for v6.4

The bulk of the commits here are for the conversion of drivers to use
void remove callbacks but there's a reasonable amount of other stuff
going on, the pace of development with the SOF code continues to be high
and there's a bunch of new drivers too:

 - More core cleanups from Morimto-san.
 - Update drivers to have remove() callbacks returning void, mostly
   mechanical with some substantial changes.
 - Continued feature and simplification work on SOF, including addition
   of a no-DSP mode for bringup, HDA MLink and extensions to the IPC4
   protocol.
 - Hibernation support for CS35L45.
 - More DT binding conversions.
 - Support for Cirrus Logic CS35L56, Freescale QMC, Maxim MAX98363,
   nVidia systems with MAX9809x and RT5631, Realtek RT712, Renesas R-Car
   Gen4, Rockchip RK3588 and TI TAS5733.
2023-04-24 15:15:31 +02:00
Alison Schofield 30a8a105f0 tools/testing/cxl: Require CONFIG_DEBUG_FS
The cxl_mem driver uses debugfs to support poison inject and clear.
Add debugfs to the list of required symbols so that cxl_test can
emulate those poison operations.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/4f3aab57fbf1cc3ccde2eb887c5d90566c8d0e90.1681874357.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 12:08:39 -07:00
Alison Schofield 98980d76c3 tools/testing/cxl: Add a sysfs attr to test poison inject limits
CXL devices may report a maximum number of addresses that a device
allows to be poisoned using poison injection. When cxl_test creates
mock CXL memory devices, it defaults to MOCK_INJECT_DEV_MAX==88 for
all mocked memdevs.

Add a sysfs attribute, poison_inject_max to module cxl_mock_mem so
that users can set a custom device injection limit. Fail, and return
-EBUSY, if the mock poison list is not empty.

/sys/bus/platform/drivers/cxl_mock_mem/poison_inject_max

A simple usage model is to set the attribute before running a test in
order to emulate a device's poison handling.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/0f25b2862b90013545450222d2199e435c6cc11a.1681874357.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 12:08:39 -07:00
Alison Schofield 8eac7ea725 tools/testing/cxl: Use injected poison for get poison list
Prior to poison inject support, the mock of 'Get Poison List'
returned a poison list containing a single mocked error record.

Following the addition of poison inject and clear support to the
mock driver, use the mock_poison_list[], rather than faking an
error record. Mock_poison_list[] list tracks the actual poison
inject and clear requests issued by userspace.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/0f4242c81821f4982b02cb1009c22783ef66b2f1.1681874357.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 12:08:39 -07:00
Alison Schofield 6ec4b6d23e tools/testing/cxl: Mock the Clear Poison mailbox command
Mock the clear of poison by deleting the device:address entry from
the mock_poison_list[]. Behave like a real CXL device and do not fail
if the address is not in the poison list, but offer a dev_dbg()
message.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/ecf19743c6572e60971bbd078f67d520cf5bca5d.1681874357.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 12:08:39 -07:00
Alison Schofield 371c16101e tools/testing/cxl: Mock the Inject Poison mailbox command
Mock the injection of poison by storing the device:address entries in
mock_poison_list[]. Enforce a limit of 8 poison injections per memdev
device and 128 total entries for the cxl_test mock driver.

Introducing the mock_poison[] list here, makes it available for use in
the mock of Clear Poison, and the mock of Get Poison List.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/f6b7f03541eaa8c2260d3eafadd04afe3f0d7962.1681874357.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 12:08:39 -07:00
Alison Schofield f8d22bf50c tools/testing/cxl: Mock support for Get Poison List
Make mock memdevs support the Get Poison List mailbox command.
Return a fake poison error record when the get poison list command
is issued.

This supports testing the kernel tracing and cxl list capabilities
for media errors.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/14d661ce3e3a32b7d8e76b8ecc5eb88343b3d09c.1681838292.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-23 11:46:22 -07:00
Pedro Tammela 7eb060a51a selftests: tc-testing: add more tests for sch_qfq
The QFQ qdisc class has parameter bounds that are not being
checked for correctness.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23 18:47:09 +01:00
Eduard Zingerman 35150203e3 selftests/bpf: verifier/prevent_map_lookup converted to inline assembly
Test verifier/prevent_map_lookup automatically converted to use inline assembly.

This was a part of a series [1] but could not be applied becuase
another patch from a series had to be witheld.

[1] https://lore.kernel.org/bpf/20230421174234.2391278-1-eddyz87@gmail.com/

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421204514.2450907-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-22 08:26:58 -07:00
Jakub Kicinski 9a82cdc28f bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZELn8wAKCRDbK58LschI
 g1khAQC1nmXPuKjM4EAfFK8Ysb3KoF8ADmpE97n+/HEDydCagwD/bX0+NABR75Nh
 ueGcoU1TcfcbshDzrH0s+C95owZDZw4=
 =BeZM
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-04-21

We've added 71 non-merge commits during the last 8 day(s) which contain
a total of 116 files changed, 13397 insertions(+), 8896 deletions(-).

The main changes are:

1) Add a new BPF netfilter program type and minimal support to hook
   BPF programs to netfilter hooks such as prerouting or forward,
   from Florian Westphal.

2) Fix race between btf_put and btf_idr walk which caused a deadlock,
   from Alexei Starovoitov.

3) Second big batch to migrate test_verifier unit tests into test_progs
   for ease of readability and debugging, from Eduard Zingerman.

4) Add support for refcounted local kptrs to the verifier for allowing
   shared ownership, useful for adding a node to both the BPF list and
   rbtree, from Dave Marchevsky.

5) Migrate bpf_for(), bpf_for_each() and bpf_repeat() macros from BPF
  selftests into libbpf-provided bpf_helpers.h header and improve
  kfunc handling, from Andrii Nakryiko.

6) Support 64-bit pointers to kfuncs needed for archs like s390x,
   from Ilya Leoshkevich.

7) Support BPF progs under getsockopt with a NULL optval,
   from Stanislav Fomichev.

8) Improve verifier u32 scalar equality checking in order to enable
   LLVM transformations which earlier had to be disabled specifically
   for BPF backend, from Yonghong Song.

9) Extend bpftool's struct_ops object loading to support links,
   from Kui-Feng Lee.

10) Add xsk selftest follow-up fixes for hugepage allocated umem,
    from Magnus Karlsson.

11) Support BPF redirects from tc BPF to ifb devices,
    from Daniel Borkmann.

12) Add BPF support for integer type when accessing variable length
    arrays, from Feng Zhou.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (71 commits)
  selftests/bpf: verifier/value_ptr_arith converted to inline assembly
  selftests/bpf: verifier/value_illegal_alu converted to inline assembly
  selftests/bpf: verifier/unpriv converted to inline assembly
  selftests/bpf: verifier/subreg converted to inline assembly
  selftests/bpf: verifier/spin_lock converted to inline assembly
  selftests/bpf: verifier/sock converted to inline assembly
  selftests/bpf: verifier/search_pruning converted to inline assembly
  selftests/bpf: verifier/runtime_jit converted to inline assembly
  selftests/bpf: verifier/regalloc converted to inline assembly
  selftests/bpf: verifier/ref_tracking converted to inline assembly
  selftests/bpf: verifier/map_ptr_mixing converted to inline assembly
  selftests/bpf: verifier/map_in_map converted to inline assembly
  selftests/bpf: verifier/lwt converted to inline assembly
  selftests/bpf: verifier/loops1 converted to inline assembly
  selftests/bpf: verifier/jeq_infer_not_null converted to inline assembly
  selftests/bpf: verifier/direct_packet_access converted to inline assembly
  selftests/bpf: verifier/d_path converted to inline assembly
  selftests/bpf: verifier/ctx converted to inline assembly
  selftests/bpf: verifier/btf_ctx_access converted to inline assembly
  selftests/bpf: verifier/bpf_get_stack converted to inline assembly
  ...
====================

Link: https://lore.kernel.org/r/20230421211035.9111-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-21 20:32:37 -07:00
Davide Caratti 7041101ff6 net/sched: sch_fq: fix integer overflow of "credit"
if sch_fq is configured with "initial quantum" having values greater than
INT_MAX, the first assignment of "credit" does signed integer overflow to
a very negative value.
In this situation, the syzkaller script provided by Cristoph triggers the
CPU soft-lockup warning even with few sockets. It's not an infinite loop,
but "credit" wasn't probably meant to be minus 2Gb for each new flow.
Capping "initial quantum" to INT_MAX proved to fix the issue.

v2: validation of "initial quantum" is done in fq_policy, instead of open
    coding in fq_change() _ suggested by Jakub Kicinski

Reported-by: Christoph Paasch <cpaasch@apple.com>
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/377
Fixes: afe4fd0624 ("pkt_sched: fq: Fair Queue packet scheduler")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/7b3a3c7e36d03068707a021760a194a8eb5ad41a.1682002300.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-21 20:24:29 -07:00
Stefan Roesch 07115fcc15 selftests/mm: add new selftests for KSM
This adds three new tests to the selftests for KSM.  These tests use the
new prctl API's to enable and disable KSM.

1) add new prctl flags to prctl header file in tools dir

   This adds the new prctl flags to the include file prct.h in the
   tools directory.  This makes sure they are available for testing.

2) add KSM prctl merge test to ksm_tests

   This adds the -t option to the ksm_tests program.  The -t flag
   allows to specify if it should use madvise or prctl ksm merging.

3) add two functions for debugging merge outcome for ksm_tests

   This adds two functions to report the metrics in /proc/self/ksm_stat
   and /sys/kernel/debug/mm/ksm. The debug output is enabled with the
   -d option.

4) add KSM prctl test to ksm_functional_tests

   This adds a test to the ksm_functional_test that verifies that the
   prctl system call to enable / disable KSM works.

5) add KSM fork test to ksm_functional_test

   Add fork test to verify that the MMF_VM_MERGE_ANY flag is inherited
   by the child process.

Link: https://lkml.kernel.org/r/20230418051342.1919757-4-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:03 -07:00
Peter Xu 760aee0b71 selftests/mm: add tests for RO pinning vs fork()
Add a test suite (with 10 more sub-tests) to cover RO pinning against
fork() over uffd-wp.  It covers both:

  (1) Early CoW test in fork() when page pinned,
  (2) page unshare due to RO longterm pin.

They are:

  Testing wp-fork-pin on anon... done
  Testing wp-fork-pin on shmem... done
  Testing wp-fork-pin on shmem-private... done
  Testing wp-fork-pin on hugetlb... done
  Testing wp-fork-pin on hugetlb-private... done
  Testing wp-fork-pin-with-event on anon... done
  Testing wp-fork-pin-with-event on shmem... done
  Testing wp-fork-pin-with-event on shmem-private... done
  Testing wp-fork-pin-with-event on hugetlb... done
  Testing wp-fork-pin-with-event on hugetlb-private... done

CONFIG_GUP_TEST needed or they'll be skipped.

  Testing wp-fork-pin on anon... skipped [reason: Possibly CONFIG_GUP_TEST missing or unprivileged]

Note that the major test goal is on private memory, but no hurt to also run
all of them over shared because shared memory should work the same.

Link: https://lkml.kernel.org/r/20230417195317.898696-7-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:00 -07:00
Peter Xu 71fc41eb98 selftests/mm: rename COW_EXTRA_LIBS to IOURING_EXTRA_LIBS
The macro and facility can be reused in other tests too.  Make it general.

Link: https://lkml.kernel.org/r/20230417195317.898696-6-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:00 -07:00
Peter Xu cff2945827 selftests/mm: extend and rename uffd pagemap test
Extend it to all types of mem, meanwhile add one parallel test when
EVENT_FORK is enabled, where uffd-wp bits should be persisted rather than
dropped.

Since at it, rename the test to "wp-fork" to better show what it means. 
Making the new test called "wp-fork-with-event".

Before:

        Testing pagemap on anon... done

After:

        Testing wp-fork on anon... done
        Testing wp-fork on shmem... done
        Testing wp-fork on shmem-private... done
        Testing wp-fork on hugetlb... done
        Testing wp-fork on hugetlb-private... done
        Testing wp-fork-with-event on anon... done
        Testing wp-fork-with-event on shmem... done
        Testing wp-fork-with-event on shmem-private... done
        Testing wp-fork-with-event on hugetlb... done
        Testing wp-fork-with-event on hugetlb-private... done

Link: https://lkml.kernel.org/r/20230417195317.898696-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:00 -07:00
Peter Xu 21337f2af1 selftests/mm: add a few options for uffd-unit-test
Namely:

  "-f": add a wildcard filter for tests to run
  "-l": list tests rather than running any
  "-h": help msg

Link: https://lkml.kernel.org/r/20230417195317.898696-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-21 14:52:00 -07:00
Eduard Zingerman 4db10a8243 selftests/bpf: verifier/value_ptr_arith converted to inline assembly
Test verifier/value_ptr_arith automatically converted to use inline assembly.

Test cases "sanitation: alu with different scalars 2" and
"sanitation: alu with different scalars 3" are updated to
avoid -ENOENT as return value, as __retval() annotation
only supports numeric literals.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-25-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:27:19 -07:00
Eduard Zingerman efe25a330b selftests/bpf: verifier/value_illegal_alu converted to inline assembly
Test verifier/value_illegal_alu automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-24-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:27:07 -07:00
Eduard Zingerman 82887c2568 selftests/bpf: verifier/unpriv converted to inline assembly
Test verifier/unpriv semi-automatically converted to use inline assembly.

The verifier/unpriv.c had to be split in two parts:
- the bulk of the tests is in the progs/verifier_unpriv.c;
- the single test that needs `struct bpf_perf_event_data`
  definition is in the progs/verifier_unpriv_perf.c.

The tests above can't be in a single file because:
- first requires inclusion of the filter.h header
  (to get access to BPF_ST_MEM macro, inline assembler does
   not support this isntruction);
- the second requires vmlinux.h, which contains definitions
  conflicting with filter.h.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-23-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:26:52 -07:00
Eduard Zingerman 81d1d6dd40 selftests/bpf: verifier/subreg converted to inline assembly
Test verifier/subreg automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-22-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:25:45 -07:00
Eduard Zingerman f323a81806 selftests/bpf: verifier/spin_lock converted to inline assembly
Test verifier/spin_lock automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-21-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:25:31 -07:00
Eduard Zingerman 426fc0e3fc selftests/bpf: verifier/sock converted to inline assembly
Test verifier/sock automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-20-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:25:19 -07:00
Eduard Zingerman 034d9ad25d selftests/bpf: verifier/search_pruning converted to inline assembly
Test verifier/search_pruning automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-19-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:25:07 -07:00
Eduard Zingerman 65222842ca selftests/bpf: verifier/runtime_jit converted to inline assembly
Test verifier/runtime_jit automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-18-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:24:41 -07:00
Eduard Zingerman 16a42573c2 selftests/bpf: verifier/regalloc converted to inline assembly
Test verifier/regalloc automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-17-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:23:40 -07:00
Eduard Zingerman 8be6327959 selftests/bpf: verifier/ref_tracking converted to inline assembly
Test verifier/ref_tracking automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-16-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:23:13 -07:00
Eduard Zingerman aee1779f0d selftests/bpf: verifier/map_ptr_mixing converted to inline assembly
Test verifier/map_ptr_mixing automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-13-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:20:38 -07:00
Eduard Zingerman 4a400ef9ba selftests/bpf: verifier/map_in_map converted to inline assembly
Test verifier/map_in_map automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-12-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:20:26 -07:00
Eduard Zingerman b427ca576f selftests/bpf: verifier/lwt converted to inline assembly
Test verifier/lwt automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-11-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:19:20 -07:00
Eduard Zingerman a6fc14dc5e selftests/bpf: verifier/loops1 converted to inline assembly
Test verifier/loops1 automatically converted to use inline assembly.

There are a few modifications for the converted tests.
"tracepoint" programs do not support test execution, change program
type to "xdp" (which supports test execution) for the following tests
that have __retval tags:
- bounded loop, count to 4
- bonded loop containing forward jump

Also, remove the __retval tag for test:
- bounded loop, count from positive unknown to 4

As it's return value is a random number.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-10-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:19:07 -07:00
Eduard Zingerman a5828e3154 selftests/bpf: verifier/jeq_infer_not_null converted to inline assembly
Test verifier/jeq_infer_not_null automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-9-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:18:55 -07:00
Eduard Zingerman 0a372c9c08 selftests/bpf: verifier/direct_packet_access converted to inline assembly
Test verifier/direct_packet_access automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-8-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:18:44 -07:00
Eduard Zingerman 6080280243 selftests/bpf: verifier/d_path converted to inline assembly
Test verifier/d_path automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-7-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:18:16 -07:00
Eduard Zingerman fcd36964f2 selftests/bpf: verifier/ctx converted to inline assembly
Test verifier/ctx automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-6-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:18:03 -07:00
Eduard Zingerman 37467c79e1 selftests/bpf: verifier/btf_ctx_access converted to inline assembly
Test verifier/btf_ctx_access automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:17:51 -07:00
Eduard Zingerman 965a3f913e selftests/bpf: verifier/bpf_get_stack converted to inline assembly
Test verifier/bpf_get_stack automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:17:39 -07:00
Eduard Zingerman c92336559a selftests/bpf: verifier/bounds converted to inline assembly
Test verifier/bounds automatically converted to use inline assembly.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:17:14 -07:00
Eduard Zingerman 63bb645b9d selftests/bpf: Add notion of auxiliary programs for test_loader
In order to express test cases that use bpf_tail_call() intrinsic it
is necessary to have several programs to be loaded at a time.
This commit adds __auxiliary annotation to the set of annotations
supported by test_loader.c. Programs marked as auxiliary are always
loaded but are not treated as a separate test.

For example:

    void dummy_prog1(void);

    struct {
            __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
            __uint(max_entries, 4);
            __uint(key_size, sizeof(int));
            __array(values, void (void));
    } prog_map SEC(".maps") = {
            .values = {
                    [0] = (void *) &dummy_prog1,
            },
    };

    SEC("tc")
    __auxiliary
    __naked void dummy_prog1(void) {
            asm volatile ("r0 = 42; exit;");
    }

    SEC("tc")
    __description("reference tracking: check reference or tail call")
    __success __retval(0)
    __naked void check_reference_or_tail_call(void)
    {
            asm volatile (
            "r2 = %[prog_map] ll;"
            "r3 = 0;"
            "call %[bpf_tail_call];"
            "r0 = 0;"
            "exit;"
            :: __imm(bpf_tail_call),
            :  __clobber_all);
    }

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230421174234.2391278-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 12:16:56 -07:00
Florian Westphal 006c0e44ed selftests/bpf: add missing netfilter return value and ctx access tests
Extend prog_tests with two test cases:

 # ./test_progs --allow=verifier_netfilter_retcode
 #278/1   verifier_netfilter_retcode/bpf_exit with invalid return code. test1:OK
 #278/2   verifier_netfilter_retcode/bpf_exit with valid return code. test2:OK
 #278/3   verifier_netfilter_retcode/bpf_exit with valid return code. test3:OK
 #278/4   verifier_netfilter_retcode/bpf_exit with invalid return code. test4:OK
 #278     verifier_netfilter_retcode:OK

This checks that only accept and drop (0,1) are permitted.

NF_QUEUE could be implemented later if we can guarantee that attachment
of such programs can be rejected if they get attached to a pf/hook that
doesn't support async reinjection.

NF_STOLEN could be implemented via trusted helpers that can guarantee
that the skb will eventually be free'd.

v4: test case for bpf_nf_ctx access checks, requested by Alexei Starovoitov.
v5: also check ctx->{state,skb} can be dereferenced (Alexei).

 # ./test_progs --allow=verifier_netfilter_ctx
 #281/1   verifier_netfilter_ctx/netfilter invalid context access, size too short:OK
 #281/2   verifier_netfilter_ctx/netfilter invalid context access, size too short:OK
 #281/3   verifier_netfilter_ctx/netfilter invalid context access, past end of ctx:OK
 #281/4   verifier_netfilter_ctx/netfilter invalid context, write:OK
 #281/5   verifier_netfilter_ctx/netfilter valid context read and invalid write:OK
 #281/6   verifier_netfilter_ctx/netfilter test prog with skb and state read access:OK
 #281/7   verifier_netfilter_ctx/netfilter test prog with skb and state read access @unpriv:OK
 #281     verifier_netfilter_ctx:OK
Summary: 1/7 PASSED, 0 SKIPPED, 0 FAILED

This checks:
1/2: partial reads of ctx->{skb,state} are rejected
3. read access past sizeof(ctx) is rejected
4. write to ctx content, e.g. 'ctx->skb = NULL;' is rejected
5. ctx->state content cannot be altered
6. ctx->state and ctx->skb can be dereferenced
7. ... same program fails for unpriv (CAP_NET_ADMIN needed).

Link: https://lore.kernel.org/bpf/20230419021152.sjq4gttphzzy6b5f@dhcp-172-26-102-232.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20230420201655.77kkgi3dh7fesoll@MacBook-Pro-6.local/
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20230421170300.24115-8-fw@strlen.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 11:34:50 -07:00
Florian Westphal d0fe92fb5e tools: bpftool: print netfilter link info
Dump protocol family, hook and priority value:
$ bpftool link
2: netfilter  prog 14
        ip input prio -128
        pids install(3264)
5: netfilter  prog 14
        ip6 forward prio 21
        pids a.out(3387)
9: netfilter  prog 14
        ip prerouting prio 123
        pids a.out(5700)
10: netfilter  prog 14
        ip input prio 21
        pids test2(5701)

v2: Quentin Monnet suggested to also add 'bpftool net' support:

$ bpftool net
xdp:

tc:

flow_dissector:

netfilter:

        ip prerouting prio 21 prog_id 14
        ip input prio -128 prog_id 14
        ip input prio 21 prog_id 14
        ip forward prio 21 prog_id 14
        ip output prio 21 prog_id 14
        ip postrouting prio 21 prog_id 14

'bpftool net' only dumps netfilter link type, links are sorted by protocol
family, hook and priority.

v5: fix bpf ci failure: libbpf needs small update to prog_type_name[]
    and probe_prog_load helper.
v4: don't fail with -EOPNOTSUPP in libbpf probe_prog_load, update
    prog_type_name[] with "netfilter" entry (bpf ci)
v3: fix bpf.h copy, 'reserved' member was removed (Alexei)
    use p_err, not fprintf (Quentin)

Suggested-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/eeeaac99-9053-90c2-aa33-cc1ecb1ae9ca@isovalent.com/
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20230421170300.24115-6-fw@strlen.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 11:34:49 -07:00
Kui-Feng Lee 45cea721ea bpftool: Update doc to explain struct_ops register subcommand.
The "struct_ops register" subcommand now allows for an optional *LINK_DIR*
to be included. This specifies the directory path where bpftool will pin
struct_ops links with the same name as their corresponding map names.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230420002822.345222-2-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 11:10:10 -07:00
Kui-Feng Lee 0232b78897 bpftool: Register struct_ops with a link.
You can include an optional path after specifying the object name for the
'struct_ops register' subcommand.

Since the commit 226bc6ae64 ("Merge branch 'Transit between BPF TCP
congestion controls.'") has been accepted, it is now possible to create a
link for a struct_ops. This can be done by defining a struct_ops in
SEC(".struct_ops.link") to make libbpf returns a real link. If we don't pin
the links before leaving bpftool, they will disappear. To instruct bpftool
to pin the links in a directory with the names of the maps, we need to
provide the path of that directory.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20230420002822.345222-1-kuifeng@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-21 11:10:10 -07:00
Stanislav Fomichev 833d67ecdc selftests/bpf: Verify optval=NULL case
Make sure we get optlen exported instead of getting EFAULT.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230418225343.553806-3-sdf@google.com
2023-04-21 17:10:34 +02:00
Magnus Karlsson 02e93e0475 selftests/xsk: Put MAP_HUGE_2MB in correct argument
Put the flag MAP_HUGE_2MB in the correct flags argument instead of the
wrong offset argument.

Fixes: 2ddade3229 ("selftests/xsk: Fix munmap for hugepage allocated umem")
Reported-by: Kal Cutter Conley <kal.conley@dectris.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230421062208.3772-1-magnus.karlsson@gmail.com
2023-04-21 16:35:10 +02:00
Dave Marchevsky 4ab07209d5 bpf: Fix bpf_refcount_acquire's refcount_t address calculation
When calculating the address of the refcount_t struct within a local
kptr, bpf_refcount_acquire_impl should add refcount_off bytes to the
address of the local kptr. Due to some missing parens, the function is
incorrectly adding sizeof(refcount_t) * refcount_off bytes. This patch
fixes the calculation.

Due to the incorrect calculation, bpf_refcount_acquire_impl was trying
to refcount_inc some memory well past the end of local kptrs, resulting
in kasan and refcount complaints, as reported in [0]. In that thread,
Florian and Eduard discovered that bpf selftests written in the new
style - with __success and an expected __retval, specifically - were
not actually being run. As a result, selftests added in bpf_refcount
series weren't really exercising this behavior, and thus didn't unearth
the bug.

With this fixed behavior it's safe to revert commit 7c4b96c000
("selftests/bpf: disable program test run for progs/refcounted_kptr.c"),
this patch does so.

  [0] https://lore.kernel.org/bpf/ZEEp+j22imoN6rn9@strlen.de/

Fixes: 7c50b1cb76 ("bpf: Add bpf_refcount_acquire kfunc")
Reported-by: Florian Westphal <fw@strlen.de>
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20230421074431.3548349-1-davemarchevsky@fb.com
2023-04-21 16:31:37 +02:00
Marc Zyngier 6dcf7316e0 Merge branch kvm-arm64/smccc-filtering into kvmarm-master/next
* kvm-arm64/smccc-filtering:
  : .
  : SMCCC call filtering and forwarding to userspace, courtesy of
  : Oliver Upton. From the cover letter:
  :
  : "The Arm SMCCC is rather prescriptive in regards to the allocation of
  : SMCCC function ID ranges. Many of the hypercall ranges have an
  : associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
  : room for vendor-specific implementations.
  :
  : The ever-expanding SMCCC surface leaves a lot of work within KVM for
  : providing new features. Furthermore, KVM implements its own
  : vendor-specific ABI, with little room for other implementations (like
  : Hyper-V, for example). Rather than cramming it all into the kernel we
  : should provide a way for userspace to handle hypercalls."
  : .
  KVM: selftests: Fix spelling mistake "KVM_HYPERCAL_EXIT_SMC" -> "KVM_HYPERCALL_EXIT_SMC"
  KVM: arm64: Test that SMC64 arch calls are reserved
  KVM: arm64: Prevent userspace from handling SMC64 arch range
  KVM: arm64: Expose SMC/HVC width to userspace
  KVM: selftests: Add test for SMCCC filter
  KVM: selftests: Add a helper for SMCCC calls with SMC instruction
  KVM: arm64: Let errors from SMCCC emulation to reach userspace
  KVM: arm64: Return NOT_SUPPORTED to guest for unknown PSCI version
  KVM: arm64: Introduce support for userspace SMCCC filtering
  KVM: arm64: Add support for KVM_EXIT_HYPERCALL
  KVM: arm64: Use a maple tree to represent the SMCCC filter
  KVM: arm64: Refactor hvc filtering to support different actions
  KVM: arm64: Start handling SMCs from EL1
  KVM: arm64: Rename SMC/HVC call handler to reflect reality
  KVM: arm64: Add vm fd device attribute accessors
  KVM: arm64: Add a helper to check if a VM has ran once
  KVM: x86: Redefine 'longmode' as a flag for KVM_EXIT_HYPERCALL

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-21 09:44:32 +01:00
Marc Zyngier 367eb095b8 Merge branch kvm-arm64/selftest/misc-6.4 into kvmarm-master/next
* kvm-arm64/selftest/misc-6.4:
  : .
  : Misc selftest updates for 6.4
  :
  : - Add comments for recently added ID registers
  : .
  KVM: selftests: Comment newly defined aarch64 ID registers

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-21 09:39:07 +01:00
Marc Zyngier e2e321a7d6 Merge branch kvm-arm64/selftest/lpa into kvmarm-master/next
* kvm-arm64/selftest/lpa:
  : .
  : Selftest fixes addressing PTE and TTBR0_EL1 encodings for
  : 52bit PAs
  : .
  KVM: selftests: arm64: Fix ttbr0_el1 encoding for PA bits > 48
  KVM: selftests: arm64: Fix pte encode/decode for PA bits > 48
  KVM: selftests: Fixup config fragment for access_tracking_perf_test

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-21 09:37:36 +01:00
Marc Zyngier b22498c484 Merge branch kvm-arm64/timer-vm-offsets into kvmarm-master/next
* kvm-arm64/timer-vm-offsets: (21 commits)
  : .
  : This series aims at satisfying multiple goals:
  :
  : - allow a VMM to atomically restore a timer offset for a whole VM
  :   instead of updating the offset each time a vcpu get its counter
  :   written
  :
  : - allow a VMM to save/restore the physical timer context, something
  :   that we cannot do at the moment due to the lack of offsetting
  :
  : - provide a framework that is suitable for NV support, where we get
  :   both global and per timer, per vcpu offsetting, and manage
  :   interrupts in a less braindead way.
  :
  : Conflict resolution involves using the new per-vcpu config lock instead
  : of the home-grown timer lock.
  : .
  KVM: arm64: Handle 32bit CNTPCTSS traps
  KVM: arm64: selftests: Augment existing timer test to handle variable offset
  KVM: arm64: selftests: Deal with spurious timer interrupts
  KVM: arm64: selftests: Add physical timer registers to the sysreg list
  KVM: arm64: nv: timers: Support hyp timer emulation
  KVM: arm64: nv: timers: Add a per-timer, per-vcpu offset
  KVM: arm64: Document KVM_ARM_SET_CNT_OFFSETS and co
  KVM: arm64: timers: Abstract the number of valid timers per vcpu
  KVM: arm64: timers: Fast-track CNTPCT_EL0 trap handling
  KVM: arm64: Elide kern_hyp_va() in VHE-specific parts of the hypervisor
  KVM: arm64: timers: Move the timer IRQs into arch_timer_vm_data
  KVM: arm64: timers: Abstract per-timer IRQ access
  KVM: arm64: timers: Rationalise per-vcpu timer init
  KVM: arm64: timers: Allow save/restoring of the physical timer
  KVM: arm64: timers: Allow userspace to set the global counter offset
  KVM: arm64: Expose {un,}lock_all_vcpus() to the rest of KVM
  KVM: arm64: timers: Allow physical offset without CNTPOFF_EL2
  KVM: arm64: timers: Use CNTPOFF_EL2 to offset the physical timer
  arm64: Add HAS_ECV_CNTPOFF capability
  arm64: Add CNTPOFF_EL2 register definition
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-04-21 09:36:40 +01:00
Ido Schimmel 7648ac72dc selftests: net: Add bridge neighbor suppression test
Add test cases for bridge neighbor suppression, testing both per-port
and per-{Port, VLAN} neighbor suppression with both ARP and NS packets.

Example truncated output:

 # ./test_bridge_neigh_suppress.sh
 [...]
 Tests passed: 148
 Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-21 08:25:50 +01:00
Shunsuke Mie e9c4962c5d tools/virtio: fix build caused by virtio_ring changes
Fix the build dependency for virtio_test. The virtio_ring that is used from
the test requires container_of_const(). Change to use container_of.h kernel
header directly and adapt related codes.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230417022037.917668-2-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:02:35 -04:00
Rong Tao 6b27cd84a7 tools/virtio: virtio_test -h,--help should return directly
When we get help information, we should return directly, and we should not
execute test cases. Move the exit() directly into the help() function and
remove it from case '?'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_822CEBEB925205EA1573541CD1C2604F4805@qq.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:02:30 -04:00
Rong Tao 9b2b3de63c tools/virtio: virtio_test: Fix indentation
Replace eight spaces with Tab.

Signed-off-by: Rong Tao <rtoax@foxmail.com>
Message-Id: <tencent_89579C514BC4020324A1A4ACA44B5B95BB07@qq.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:02:30 -04:00
Vladimir Oltean e6991384ac selftests: forwarding: add a test for MAC Merge layer
The MAC Merge layer (IEEE 802.3-2018 clause 99) does all the heavy
lifting for Frame Preemption (IEEE 802.1Q-2018 clause 6.7.2), a TSN
feature for minimizing latency.

Preemptible traffic is different on the wire from normal traffic in
incompatible ways. If we send a preemptible packet and the link partner
doesn't support preemption, it will drop it as an error frame and we
will never know. The MAC Merge layer has a control plane of its own,
which can be manipulated (using ethtool) in order to negotiate this
capability with the link partner (through LLDP).

Actually the TLV format for LLDP solves this problem only partly,
because both partners only advertise:
- if they support preemption (RX and TX)
- if they have enabled preemption (TX)
so we cannot tell the link partner what to do - we cannot force it to
enable reception of our preemptible packets.

That is fully solved by the verification feature, where the local device
generates some small probe frames which look like preemptible frames
with no useful content, and the link partner is obliged to respond to
them if it supports the standard. If the verification times out, we know
that preemption isn't active in our TX direction on the link.

Having clarified the definition, this selftest exercises the manual
(ethtool) configuration path of 2 link partners (with and without
verification), and the LLDP code path, using the openlldp project.

The test also verifies the TX activity of the MAC Merge layer by
sending traffic through a traffic class configured as preemptible
(using mqprio). There isn't a good way to make this really portable
(user space cannot find out how many traffic classes there are for
a device), but I chose num_tc 4 here, that should work reasonably well.
I also know that some devices (stmmac) only permit TXQ0 to be
preemptible, so this is why PREEMPTIBLE_PRIO was strategically chosen
as 0. Even if other hardware is more configurable, this test should
cover the baseline.

This is not really a "forwarding" selftest, but I put it near the other
"ethtool" selftests.

$ ./ethtool_mm.sh eno0 swp0
TEST: Manual configuration with verification: eno0 to swp0          [ OK ]
TEST: Manual configuration with verification: swp0 to eno0          [ OK ]
TEST: Manual configuration without verification: eno0 to swp0       [ OK ]
TEST: Manual configuration without verification: swp0 to eno0       [ OK ]
TEST: Manual configuration with failed verification: eno0 to swp0   [ OK ]
TEST: Manual configuration with failed verification: swp0 to eno0   [ OK ]
TEST: LLDP                                                          [ OK ]

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 20:03:21 -07:00
Vladimir Oltean b5bf7126a6 selftests: forwarding: introduce helper for standard ethtool counters
Counters for the MAC Merge layer and preemptible MAC have standardized
so far on using structured ethtool stats as opposed to the driver
specific names and meanings.

Benefit from that rare opportunity and introduce a helper to lib.sh for
querying standardized counters, in the hope that these will take off for
other uses as well.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 20:03:21 -07:00
Petr Machata 8fcac79270 selftests: forwarding: generalize bail_on_lldpad from mlxsw
mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD
is not running, to prevent conflicts between the QoS configuration applied
through TC or DCB command line tool, and the DCB configuration that LLDPAD
might apply. This helper might be useful to others. Move the function to
lib.sh, and parameterize to make reusable in other contexts.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 20:03:21 -07:00
Petr Machata 54e906f163 selftests: forwarding: sch_tbf_*: Add a pre-run hook
The driver-specific wrappers of these selftests invoke bail_on_lldpad to
make sure that LLDPAD doesn't trample the configuration. The function
bail_on_lldpad is going to move to lib.sh in the next patch. With that, it
won't be visible for the wrappers before sourcing the framework script. And
after sourcing it, it is too late: the selftest will have run by then.

One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but
even if that worked (it might, it might not), that seems cumbersome. lib.sh
is doing fair amount of stuff, and even if it works today, it does not look
particularly solid as a solution.

Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets
invoked. Move the bail to the hook.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 20:03:21 -07:00
Eduard Zingerman cbb110bc66 selftests/bpf: populate map_array_ro map for verifier_array_access test
Two test cases:
- "valid read map access into a read-only array 1" and
- "valid read map access into a read-only array 2"

Expect that map_array_ro map is filled with mock data. This logic was
not taken into acount during initial test conversion.

This commit modifies prog_tests/verifier.c entry point for this test
to fill the map.

Fixes: a3c830ae02 ("selftests/bpf: verifier/array_access.c converted to inline assembly")
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230420232317.2181776-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20 16:49:16 -07:00
Eduard Zingerman 5b22f4d143 selftests/bpf: add pre bpf_prog_test_run_opts() callback for test_loader
When a test case is annotated with __retval tag the test_loader engine
would use libbpf's bpf_prog_test_run_opts() to do a test run of the
program and compare retvals.

This commit allows to perform arbitrary actions on bpf object right
before test loader invokes bpf_prog_test_run_opts(). This could be
used to setup some state for program execution, e.g. fill some maps.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230420232317.2181776-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20 16:49:16 -07:00
Eduard Zingerman 7cdddb99e4 selftests/bpf: fix __retval() being always ignored
Florian Westphal found a bug in and suggested a fix for test_loader.c
processing of __retval tag. Because of this bug the function
test_loader.c:do_prog_test_run() never executed and all __retval test
tags were ignored.

If this bug is fixed a number of test cases from
progs/verifier_array_access.c fail with retval not matching the
expected value. This test was recently converted to use test_loader.c
and inline assembly in [1]. When doing the conversion I missed the
important detail of test_verifier.c operation: when it creates
fixup_map_array_ro, fixup_map_array_wo and fixup_map_array_small it
populates these maps with a dummy record.

Disabling the __retval checks for the affected verifier_array_access
in this commit to avoid false-postivies in any potential bisects.
The issue is addressed in the next patch.

I verified that the __retval tags are now respected by changing
expected return values for all tests annotated with __retval, and
checking that these tests started to fail.

[1] https://lore.kernel.org/bpf/20230325025524.144043-1-eddyz87@gmail.com/

Fixes: 19a8e06f5f ("selftests/bpf: Tests execution support for test_loader.c")
Reported-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230420232317.2181776-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20 16:49:16 -07:00
Eduard Zingerman 7c4b96c000 selftests/bpf: disable program test run for progs/refcounted_kptr.c
Florian Westphal found a bug in test_loader.c processing of __retval
tag. Because of this bug the function test_loader.c:do_prog_test_run()
never executed and all __retval test tags were ignored. This hid an
issue with progs/refcounted_kptr.c tests.

When __retval tag bug is fixed and refcounted_kptr.c tests are run
kernel reports various issues and eventually hangs. Shortest reproducer
is the following command run a few times:

  $ for i in $(seq 1 4); do (./test_progs --allow=refcounted_kptr &); done

Commenting out __retval tags for these tests until this issue is resolved.

Reported-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230420232317.2181776-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20 16:49:16 -07:00
Quentin Monnet 4b7ef71ac9 bpftool: Replace "__fallthrough" by a comment to address merge conflict
The recent support for inline annotations in control flow graphs
generated by bpftool introduced the usage of the "__fallthrough" macro
in a switch/case block in btf_dumper.c. This change went through the
bpf-next tree, but resulted in a merge conflict in linux-next, because
this macro has been renamed "fallthrough" (no underscores) in the
meantime.

To address the conflict, we temporarily switch to a simple comment
instead of a macro.

Related: commit f7a858bffc ("tools: Rename __fallthrough to fallthrough")

Fixes: 9fd496848b ("bpftool: Support inline annotations when dumping the CFG of a program")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/all/yt9dttxlwal7.fsf@linux.ibm.com/
Link: https://lore.kernel.org/bpf/20230412123636.2358949-1-tmricht@linux.ibm.com/
Link: https://lore.kernel.org/bpf/20230420003333.90901-1-quentin@isovalent.com
2023-04-20 16:38:10 -07:00
Jakub Kicinski 681c5b51dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Adjacent changes:

net/mptcp/protocol.h
  63740448a3 ("mptcp: fix accept vs worker race")
  2a6a870e44 ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f8 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 16:29:51 -07:00
Ian Rogers 9be6ab181b libperf rc_check: Enable implicitly with sanitizers
If using leak sanitizer then implicitly enable reference count checking.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230420171812.561603-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-20 15:12:02 -03:00
Ian Rogers edd4cab2d4 perf test: Fix maps use after put
Fix a use after put reference count issue. maps is copied from leader,
but the leader is put on line 79 and then maps is used to read the
reference count below - so a use after put, with the put of maps
happening within thread__put. Fix by reversing the order of puts so
that the leader is put last.

To explain the reference count checker, I wrote this up as a little
example here:
https://perf.wiki.kernel.org/index.php/Reference_Count_Checking

Note, the bug was introduced by the committer and wasn't present in
the original reference count patch set.

Committer notes:

Yes, the bug predated your patch and is detected by the reference count
checking you contributed.

This was just part of splitting up your series into smaller chunks, in
this case either we fix the problem detected while developing this
reference counting infrastructure before the patch introducing REFCNT_CHECKING
or fix it later after the merged infrastructure, when built with
EXTRA_CFLAGS="-DREFCNT_CHECKING=1" detects it when running 'perf test', which
is what this patch does.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230420030430.489243-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-20 08:23:52 -03:00
Feng Zhou 5ff54dedf3 selftests/bpf: Add test to access integer type of variable array
Add prog test for accessing integer type of variable array in tracing
program.
In addition, hook load_balance function to access sd->span[0], only
to confirm whether the load is successful. Because there is no direct
way to trigger load_balance call.

Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Link: https://lore.kernel.org/r/20230420032735.27760-3-zhoufeng.zf@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-19 21:29:39 -07:00
Benjamin Gray ae7312c090 selftests/powerpc/dscr: Restore timeout to DSCR selftests
Reducing the time taken by dscr_sysfs_test.c allows restoring the
default timeout, which was removed in
commit 850507f30c ("selftests/powerpc: Turn off timeout setting for
benchmarks, dscr, signal, tm") because that test took too long.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-8-bgray@linux.ibm.com
2023-04-20 13:21:46 +10:00
Benjamin Gray c14a9d0a79 selftests/powerpc/dscr: Speed up DSCR sysfs tests
This test case is extremely slow, taking around a minute compared to
most of the other DSCR tests taking a second at most. Perf shows most
time is spent by the kernel switching to each CPU it reads in
/sys/devices/system/cpu. This switching is an unavoidable consequnce
of reading all the .../cpuN/dscr values.

Remove the outer iteration loop from this test case, reducing the reads
from 1600 to 16. This still updates the DSCR 16 times and verifies on
every CPU each time, so I do not expect the lower coverage to be
meaningful. The speedup is significant: back down to ~1 second like the
other tests.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-7-bgray@linux.ibm.com
2023-04-20 13:21:46 +10:00
Benjamin Gray 3067b89ab6 selftests/powerpc/dscr: Improve DSCR explicit random test case
The tests currently have a single writer thread updating the system
DSCR with a 1/1000 chance looped only 100 times. So only around one in
10 runs actually do anything.

* Add multiple threads to the dscr_explicit_random_test case.
* Use a barrier to make all the threads start work as simultaneously as
  possible.
* Use a rwlock and make all threads have a reasonable chance to write to
  the DSCR on each iteration.
  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP is used to prevent
  writers from starving while all the other threads keep reading.
  Logging the reads/writes shows a decent mix across the whole test.
* Allow all threads a chance to write.
* Make the chance of writing more likely.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-6-bgray@linux.ibm.com
2023-04-20 13:21:45 +10:00
Benjamin Gray fda8158870 selftests/powerpc/dscr: Add lockstep test cases to DSCR explicit tests
Add new cases to the relevant tests that use explicitly synchronized
threads to test the behaviour across context switches with less
randomness. By locking the participants to the same CPU we guarantee a
context switch occurs each time they make progress, which is a likely
failure point if the kernel is not tracking the thread local DSCR
correctly.

The random case is left in to keep exercising potential edge cases.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-5-bgray@linux.ibm.com
2023-04-20 13:21:45 +10:00
Benjamin Gray 6ff4dc2548 selftests/powerpc: Allow bind_to_cpu() to automatically pick CPU
All current users of bind_to_cpu() don't care _which_ CPU they get, just
that they are bound to a single free one. So alter the interface to

	1. Accept a BIND_CPU_ANY value that tells it to automatically
	   pick a CPU
	2. Return the picked CPU

And convert all these users to bind_to_cpu(BIND_CPU_ANY).

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-4-bgray@linux.ibm.com
2023-04-20 13:21:45 +10:00
Benjamin Gray c97b2fc662 selftests/powerpc: Move bind_to_cpu() to utils.h
This function will be useful in the DSCR test patches later in this
series, so promote it to be shared by all powerpc selftests.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-3-bgray@linux.ibm.com
2023-04-20 13:21:45 +10:00
Benjamin Gray 15f0c2601e selftests/powerpc/dscr: Correct typos
Correct a couple of typos while working on other improvements to the
DSCR tests.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230406043320.125138-2-bgray@linux.ibm.com
2023-04-20 13:21:45 +10:00
Nicholas Piggin 4e991e3c16 powerpc: add CFUNC assembly label annotation
This macro is to be used in assembly where C functions are called.
pcrel addressing mode requires branches to functions with a
localentry value of 1 to have either a trailing nop or @notoc.
This macro permits the latter without changing callers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add dummy definitions to fix selftests build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230408021752.862660-5-npiggin@gmail.com
2023-04-20 12:54:24 +10:00
Linus Torvalds cb0856346a 22 hotfixes.
19 are cc:stable and the remainder address issues which were introduced
 during this merge cycle, or aren't considered suitable for -stable
 backporting.
 
 19 are for MM and the remainder are for other subsystems.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEB7GgAKCRDdBJ7gKXxA
 jl4zAP9LxKisY8L29qrZG/SKoYbMMSM33ASOGZJRAuRRaOYL6QEAvS14pg/c22rL
 4GCZbzvENY4xPRbz/6kc/s2Jnuww4wA=
 =Kh/V
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "22 hotfixes.

  19 are cc:stable and the remainder address issues which were
  introduced during this merge cycle, or aren't considered suitable for
  -stable backporting.

  19 are for MM and the remainder are for other subsystems"

* tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  nilfs2: initialize unused bytes in segment summary blocks
  mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages
  mm/mmap: regression fix for unmapped_area{_topdown}
  maple_tree: fix mas_empty_area() search
  maple_tree: make maple state reusable after mas_empty_area_rev()
  mm: kmsan: handle alloc failures in kmsan_ioremap_page_range()
  mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush()
  tools/Makefile: do missed s/vm/mm/
  mm: fix memory leak on mm_init error handling
  mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock
  kernel/sys.c: fix and improve control flow in __sys_setres[ug]id()
  Revert "userfaultfd: don't fail on unrecognized features"
  writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs
  maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug
  tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used
  mailmap: update jtoppins' entry to reference correct email
  mm/mempolicy: fix use-after-free of VMA iterator
  mm/huge_memory.c: warn with pr_warn_ratelimited instead of VM_WARN_ON_ONCE_FOLIO
  mm/mprotect: fix do_mprotect_pkey() return on error
  mm/khugepaged: check again on anon uffd-wp during isolation
  ...
2023-04-19 17:55:45 -07:00
Arnaldo Carvalho de Melo 265b0de2f0 perf probe: Add missing 0x prefix for addresses printed in hexadecimal
To fix this confusing warning:

  # perf probe -l
  Failed to find debug information for address 798240
    probe_main:prometheus_new_counter__return (on github.com/prometheus/client_golang/prometheus.NewCounter%return in /home/acme/git/prometheus-uprobes/main with counter)
  #

As that 798240 is printed with PRIx64 but has no letters, better print
the 0x prefix to disambiguate.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZEBCyFu2GjTw6qOi@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 16:38:43 -03:00
Arnaldo Carvalho de Melo 686c511866 perf build: Test the refcnt check build
Make sure we test build the currently added REFCNT_CHECKING
infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 13:47:03 -03:00
Ian Rogers 2832ef81d4 perf map: Add reference count checking
There's no strict get/put policy with map that leads to leaks or use
after free. Reference count checking identifies correct pairing of gets
and puts.

Committer notes:

Extracted from a larger patch removing bits that were covered by the use
of pre-existing map__ accessors (e.g. maps__nr_maps()) and new ones
added (map__refcnt() and the maps__set_ ones) to reduce
RC_CHK_ACCESS(maps)-> source code pollution.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/lkml/20230407230405.2931830-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 12:57:53 -03:00
Arnaldo Carvalho de Melo e6a9efcee5 perf map: Add set_ methods for map->{start,end,pgoff,pgoff,reloc,erange_warned,dso,map_ip,unmap_ip,priv}
To have a way to intercept usage of the reference counted struct map.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 12:54:41 -03:00
Arnaldo Carvalho de Melo e1805aae1e perf map: Add missing conversions to map__refcnt()
Some conversions weren't performed in 4e8db2d752 ("perf map: Add
map__refcnt() accessor to use in the maps test"), fix it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 12:33:53 -03:00
Magnus Karlsson 2ddade3229 selftests/xsk: Fix munmap for hugepage allocated umem
Fix the unmapping of hugepage allocated umems so that they are
properly unmapped. The new test referred to in the fixes label,
introduced a test that allocated a umem that is not a multiple of a 2M
hugepage size. This is fine for mmap() that rounds the size up the
nearest multiple of 2M. But munmap() requires the size to be a
multiple of the hugepage size in order for it to unmap the region. The
current behaviour of not properly unmapping the umem, was discovered
when further additions of tests that require hugepages (unaligned mode
tests only) started failing as the system was running out of
hugepages.

Fixes: c0801598e5 ("selftests: xsk: Add test UNALIGNED_INV_DESC_4K1_FRAME_SIZE")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230418143617.27762-1-magnus.karlsson@gmail.com
2023-04-19 16:33:53 +02:00
Ian Rogers 8f12692b7e perf maps: Add reference count checking
Add reference count checking to make sure of good use of get and put.
Add and use accessors to reduce RC_CHK clutter.

The only significant issue was in tests/thread-maps-share.c where
reference counts were released in the reverse order to acquisition,
leading to a use after put. This was fixed by reversing the put order.

Committer notes:

Extracted from a larger patch removing bits that were covered by the use
of pre-existing maps__ accessors (e.g. maps__nr_maps()) and new ones
added (maps__refcnt()) to reduce RC_CHK_ACCESS(maps)-> source code
pollution.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/lkml/20230407230405.2931830-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 10:53:01 -03:00
Arnaldo Carvalho de Melo a07dacad8a perf maps: Use maps__nr_maps() instead of open coded maps->nr_maps
To use the existing accessor and be consistent.

Signef-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 10:52:54 -03:00
Arnaldo Carvalho de Melo fe693d951e perf maps: Add maps__refcnt() accessor to allow checking maps pointer
To remove one more direct access to 'struct maps' so that we can
intercept accesses to its instantiations and refcount check it to catch
use after free, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 10:52:54 -03:00
Arnaldo Carvalho de Melo 3ad1be6fae perf dso: Fix use before NULL check introduced by map__dso() introduction
James Clark noticed that the recent 63df0e4bc3 ("perf map: Add
accessor for dso") patch accessed map->dso before the 'map' variable was
NULL checked, which is a change in logic that leads to segmentation
faults, so comb thru that patch to fix similar cases.

Fixes: 63df0e4bc3 ("perf map: Add accessor for dso")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/lkml/ZD68RYCVT8hqPuxr@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-19 10:51:48 -03:00
Tiezhu Yang b5533e990d tools/loongarch: Use __SIZEOF_LONG__ to define __BITS_PER_LONG
Although __SIZEOF_POINTER__ is equal to _SIZEOF_LONG__ on LoongArch,
it is better to use __SIZEOF_LONG__ to define __BITS_PER_LONG to keep
consistent between arch/loongarch/include/uapi/asm/bitsperlong.h and
tools/arch/loongarch/include/uapi/asm/bitsperlong.h.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-04-19 12:07:34 +08:00
Linus Torvalds 427fda2c8a x86: improve on the non-rep 'copy_user' function
The old 'copy_user_generic_unrolled' function was oddly implemented for
largely historical reasons: it had been largely based on the uncached
copy case, which has some other concerns.

For example, the __copy_user_nocache() function uses 'movnti' for the
destination stores, and those want the destination to be aligned.  In
contrast, the regular copy function doesn't really care, and trying to
align things only complicates matters.

Also, like the clear_user function, the copy function had some odd
handling of the repeat counts, complicating the exception handling for
no really good reason.  So as with clear_user, just write it to keep all
the byte counts in the %rcx register, exactly like the 'rep movs'
functionality that this replaces.

Unlike a real 'rep movs', we do allow for this to trash a few temporary
registers to not have to unnecessarily save/restore registers on the
stack.

And like the clearing case, rename this to what it now clearly is:
'rep_movs_alternative', and make it one coherent function, so that it
shows up as such in profiles (instead of the odd split between
"copy_user_generic_unrolled" and "copy_user_short_string", the latter of
which was not about strings at all, and which was shared with the
uncached case).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds 8c9b6a88b7 x86: improve on the non-rep 'clear_user' function
The old version was oddly written to have the repeat count in multiple
registers.  So instead of taking advantage of %rax being zero, it had
some sub-counts in it.  All just for a "single word clearing" loop,
which isn't even efficient to begin with.

So get rid of those games, and just keep all the state in the same
registers we got it in (and that we should return things in).  That not
only makes this act much more like 'rep stos' (which this function is
replacing), but makes it much easier to actually do the obvious loop
unrolling.

Also rename the function from the now nonsensical 'clear_user_original'
to what it now clearly is: 'rep_stos_alternative'.

End result: if we don't have a fast 'rep stosb', at least we can have a
fast fallback for it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds 577e6a7fd5 x86: inline the 'rep movs' in user copies for the FSRM case
This does the same thing for the user copies as commit 0db7058e8e
("x86/clear_user: Make it faster") did for clear_user().  In other
words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set,
avoiding the function call entirely.

In order to do that, it makes the calling convention for the out-of-line
case ("copy_user_generic_unrolled") match the 'rep movs' calling
convention, although it does also end up clobbering a number of
additional registers.

Also, to simplify code sharing in the low-level assembly with the
__copy_user_nocache() function (that uses the normal C calling
convention), we end up with a kind of mixed return value for the
low-level asm code: it will return the result in both %rcx (to work as
an alternative for the 'rep movs' case), _and_ in %rax (for the nocache
case).

We could avoid this by wrapping __copy_user_nocache() callers in an
inline asm, but since the cost is just an extra register copy, it's
probably not worth it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds 3639a53558 x86: move stac/clac from user copy routines into callers
This is preparatory work for inlining the 'rep movs' case, but also a
cleanup.  The __copy_user_nocache() function was mis-used by the rdma
code to do uncached kernel copies that don't actually want user copies
at all, and as a result doesn't want the stac/clac either.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Linus Torvalds d2c95f9d68 x86: don't use REP_GOOD or ERMS for user memory clearing
The modern target to use is FSRS (Fast Short REP STOS), and the other
cases should only be used for bigger areas (ie mainly things like page
clearing).

Note! This changes the conditional for the inlining from FSRM ("fast
short rep movs") to FSRS ("fast short rep stos").

We'll have a separate fixup for AMD microarchitectures that have a good
'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM
was there first).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-18 17:05:28 -07:00
Jeff Xu 3cc0c3738c selftests/memfd: fix test_sysctl
sysctl memfd_noexec is pid-namespaced, non-reservable, and inherent to the
child process.

Move the inherence test from init ns to child ns, so init ns can keep the
default value.

Link: https://lkml.kernel.org/r/20230414022801.2545257-1-jeffxu@google.com
Signed-off-by: Jeff Xu <jeffxu@google.com>
Reported-by: kernel test robot <yujie.liu@intel.com>
  Link: https://lore.kernel.org/oe-lkp/202303312259.441e35db-yujie.liu@intel.com
Tested-by: Yujie Liu <yujie.liu@intel.com>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:52 -07:00
Chaitanya S Prakash c025da0f14 selftests/mm: run hugetlb testcases of va switch
The va_high_addr_switch selftest is used to test mmap across 128TB
boundary.  It divides the selftest cases into two main categories on the
basis of size.  One set is used to create mappings that are multiples of
PAGE_SIZE while the other creates mappings that are multiples of
HUGETLB_SIZE.

In order to run the hugetlb testcases the binary must be appended with
"--run-hugetlb" but the file that used to run the test only invokes the
binary, thereby completely skipping the hugetlb testcases.  Hence, the
required statement has been added.

Link: https://lkml.kernel.org/r/20230323105243.2807166-6-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:52 -07:00
Chaitanya S Prakash 2f489e2e69 selftests/mm: configure nr_hugepages for arm64
Arm64 has a default hugepage size of 512MB when CONFIG_ARM64_64K_PAGES=y
is enabled.  While testing on arm64 platforms having up to 4PB of virtual
address space, a minimum of 6 hugepages were required for all test cases
to pass.  Support for this requirement has been added.

Link: https://lkml.kernel.org/r/20230323105243.2807166-5-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:51 -07:00
Chaitanya S Prakash c2af2a4190 selftests/mm: add platform independent in code comments
The in code comments for the selftest were made on the basis of 128TB
switch, an architecture feature specific to PowerPc and x86 platforms. 
Keeping in mind the support added for arm64 platforms which implements a
256TB switch, a more generic explanation has been provided.

Link: https://lkml.kernel.org/r/20230323105243.2807166-4-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:51 -07:00
Chaitanya S Prakash bbe168729d selftests/mm: rename va_128TBswitch to va_high_addr_switch
As the initial selftest only took into consideration PowperPC and x86
architectures, on adding support for arm64, a platform independent naming
convention is chosen.

Link: https://lkml.kernel.org/r/20230323105243.2807166-3-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:51 -07:00
Chaitanya S Prakash cd834afa8e selftests/mm: add support for arm64 platform on va switch
Patch series "selftests/mm: Implement support for arm64 on va".

The va_128TBswitch selftest is designed and implemented for PowerPC and
x86 architectures which support a 128TB switch, up to 256TB of virtual
address space and hugepage sizes of 16MB and 2MB respectively.  Arm64
platforms on the other hand support a 256Tb switch, up to 4PB of virtual
address space and a default hugepage size of 512MB when 64k pagesize is
enabled.

These architectural differences require introducing support for arm64
platforms, after which a more generic naming convention is suggested.  The
in code comments are amended to provide a more platform independent
explanation of the working of the code and nr_hugepages are configured as
required.  Finally, the file running the testcase is modified in order to
prevent skipping of hugetlb testcases of va_high_addr_switch.


This patch (of 5):

Arm64 platforms have the ability to support 64kb pagesize, 512MB default
hugepage size and up to 4PB of virtual address space.  The address switch
occurs at 256TB as opposed to 128TB.  Hence, the necessary support has
been added.

Link: https://lkml.kernel.org/r/20230323105243.2807166-1-chaitanyas.prakash@arm.com
Link: https://lkml.kernel.org/r/20230323105243.2807166-2-chaitanyas.prakash@arm.com
Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:53:51 -07:00
Yang Yang a3b2aeac9d delayacct: track delays from IRQ/SOFTIRQ
Delay accounting does not track the delay of IRQ/SOFTIRQ.  While
IRQ/SOFTIRQ could have obvious impact on some workloads productivity, such
as when workloads are running on system which is busy handling network
IRQ/SOFTIRQ.

Get the delay of IRQ/SOFTIRQ could help users to reduce such delay.  Such
as setting interrupt affinity or task affinity, using kernel thread for
NAPI etc.  This is inspired by "sched/psi: Add PSI_IRQ to track
IRQ/SOFTIRQ pressure"[1].  Also fix some code indent problems of older
code.

And update tools/accounting/getdelays.c:
    / # ./getdelays -p 156 -di
    print delayacct stats ON
    printing IO accounting
    PID     156

    CPU             count     real total  virtual total    delay total  delay average
                       15       15836008       16218149      275700790         18.380ms
    IO              count    delay total  delay average
                        0              0          0.000ms
    SWAP            count    delay total  delay average
                        0              0          0.000ms
    RECLAIM         count    delay total  delay average
                        0              0          0.000ms
    THRASHING       count    delay total  delay average
                        0              0          0.000ms
    COMPACT         count    delay total  delay average
                        0              0          0.000ms
    WPCOPY          count    delay total  delay average
                       36        7586118          0.211ms
    IRQ             count    delay total  delay average
                       42         929161          0.022ms

[1] commit 52b1364ba0b1("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure")

Link: https://lkml.kernel.org/r/202304081728353557233@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Cc: wangyong <wang.yong12@zte.com.cn>
Cc: junhua huang <huang.junhua@zte.com.cn>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:39:34 -07:00
Peter Xu 43759d44dc selftests/mm: add uffdio register ioctls test
This new test tests against the returned ioctls from UFFDIO_REGISTER,
where put into uffdio_register.ioctls.

This also tests the expected failure cases of UFFDIO_REGISTER, aka:

  - Register with empty mode should fail with -EINVAL
  - Register minor without page cache (anon) should fail with -EINVAL

Link: https://lkml.kernel.org/r/20230412164548.329376-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:08 -07:00
Peter Xu 5aec236fdd selftests/mm: add shmem-private test to uffd-stress
The userfaultfd stress test never tested private shmem, which I think was
overlooked long due.  Add it so it matches with uffd unit test and it'll
cover all memory supported with the three memory types.

Meanwhile, rename the memory types a bit.  Considering shared mem is the
major use case for both shmem / hugetlbfs, changing from:

  anon, hugetlb, hugetlb_shared, shmem

To (with shmem-private added):

  anon, hugetlb, hugetlb-private, shmem, shmem-private

Add the shmem-private to run_vmtests.sh too.

Link: https://lkml.kernel.org/r/20230412164546.329355-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:08 -07:00
Peter Xu 111fd29b2a selftests/mm: drop sys/dev test in uffd-stress test
With the new uffd unit test covering the /dev/userfaultfd path and syscall
path of uffd initializations, we can safely drop the devnode test in the
old stress test.

One thing is to avoid duplication of running the stress test twice which is
an overkill to only test the /dev/ interface in run_vmtests.sh.

The other benefit is now all uffd tests (that uses userfaultfd_open) can
run automatically as long as any type of interface is enabled (either
syscall or dev), so it's more likely to succeed rather than fail due to
unprivilege.

With this patch lands, we can drop all the "mem_type:XXX" handlings too.

Link: https://lkml.kernel.org/r/20230412164525.329176-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:08 -07:00
Peter Xu f9da24263d selftests/mm: allow uffd test to skip properly with no privilege
Allow skip a unit test properly due to no privilege (e.g.  sigbus and
events tests).

[colin.i.king@gmail.com: fix spelling mistake "priviledge" -> "privilege"]
  Link: https://lkml.kernel.org/r/20230414081506.1678998-1-colin.i.king@gmail.com
Link: https://lkml.kernel.org/r/20230412164520.329163-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:08 -07:00
Peter Xu 4df9cefa94 selftests/mm: workaround no way to detect uffd-minor + wp
Userfaultfd minor+wp mode was very recently added.  The test will fail on
the old kernels at ioctl(UFFDIO_CONTINUE) which is misterious. 
Unfortunately there's no feature bit to detect for this support.

Add a hack to leverage WP_UNPOPULATED to detect whether that feature
existed, since WP_UNPOPULATED was merged right after minor+wp.

Link: https://lkml.kernel.org/r/20230412164517.329152-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:07 -07:00
Peter Xu c3315502c9 selftests/mm: move zeropage test into uffd unit tests
Simplifies it a bit along the way, e.g., drop the never used offset field
(which was always the 1st page so offset=0).

Introduce uffd_register_with_ioctls() out of uffd_register() to detect
uffdio_register.ioctls got returned.  Check that automatically when testing
UFFDIO_ZEROPAGE on different types of memory (and kernel).

Link: https://lkml.kernel.org/r/20230412164404.328815-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:07 -07:00
Peter Xu 73c1ea939b selftests/mm: move uffd sig/events tests into uffd unit tests
Move the two tests into the unit test, and convert it into 20 standalone
tests:

  - events test on all 5 mem types, with wp on/off
  - signal test on all 5 mem types, with wp on/off

  Testing sigbus on anon... done
  Testing sigbus on shmem... done
  Testing sigbus on shmem-private... done
  Testing sigbus on hugetlb... done
  Testing sigbus on hugetlb-private... done
  Testing sigbus-wp on anon... done
  Testing sigbus-wp on shmem... done
  Testing sigbus-wp on shmem-private... done
  Testing sigbus-wp on hugetlb... done
  Testing sigbus-wp on hugetlb-private... done
  Testing events on anon... done
  Testing events on shmem... done
  Testing events on shmem-private... done
  Testing events on hugetlb... done
  Testing events on hugetlb-private... done
  Testing events-wp on anon... done
  Testing events-wp on shmem... done
  Testing events-wp on shmem-private... done
  Testing events-wp on hugetlb... done
  Testing events-wp on hugetlb-private... done

It'll also remove a lot of global references along the way,
e.g. test_uffdio_wp will be replaced with the wp value passed over.

Link: https://lkml.kernel.org/r/20230412164400.328798-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:07 -07:00
Peter Xu 62515b5f9f selftests/mm: move uffd minor test to unit test
This moves the minor test to the new unit test.

Rewrite the content check with char* opeartions to avoid fiddling with
my_bcmp().

Drop global vars test_uffdio_minor and test_collapse, just assume test them
always in common code for now.

OTOH make this single test into five tests:

  - minor test on [shmem, hugetlb] with wp=false
  - minor test on [shmem, hugetlb] with wp=true
  - minor test + collapse on shmem only

One thing to mention that we used to test COLLAPSE+WP but that doesn't
sound right at all.  It's possible it's silently broken but unnoticed
because COLLAPSE is not part of the default test suite.

Make the MADV_COLLAPSE test fail-able (by skip it when failing), because
it's not guaranteed to success anyway.

Drop a bunch of useless code after the move, because the unit test always
use aligned num of pages and has nothing to do with n_cpus.

Link: https://lkml.kernel.org/r/20230412164357.328779-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:07 -07:00
Peter Xu 8bda424fca selftests/mm: move uffd pagemap test to unit test
Move it over and make it split into two tests, one for pagemap and one for
the new WP_UNPOPULATED (to be a separate one).

The thp pagemap test wasn't really working (with MADV_HUGEPAGE).  Let's
just drop it (since it never really worked anyway..) and leave that for
later.

Link: https://lkml.kernel.org/r/20230412164352.328733-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:07 -07:00
Peter Xu 16a45b57cb selftests/mm: add framework for uffd-unit-test
Add a framework to be prepared to move unit tests from uffd-stress.c into
uffd-unit-tests.c.  The goal is to allow detection of uffd features for
each test, and also loop over specified types of memory that a test
support.

Link: https://lkml.kernel.org/r/20230412164348.328710-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:06 -07:00
Peter Xu be39fec4f9 selftests/mm: allow allocate_area() to fail properly
Mostly to detect hugetlb allocation errors and skip hugetlb tests when
pages are not allocated.

Link: https://lkml.kernel.org/r/20230412164345.328659-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:06 -07:00
Peter Xu 0210c43ef6 selftests/mm: let uffd_handle_page_fault() take wp parameter
Make the handler optionally apply WP bit when resolving page faults for
either missing or minor page faults.  This moves towards removing global
test_uffdio_wp outside of the common code.

Link: https://lkml.kernel.org/r/20230412164341.328618-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:06 -07:00
Peter Xu 508340845d selftests/mm: rename uffd_stats to uffd_args
Prepare for adding more fields into the struct.

Link: https://lkml.kernel.org/r/20230412164337.328607-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Suggested-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:06 -07:00
Peter Xu 265818ef98 selftests/mm: drop global hpage_size in uffd tests
hpage_size was wrongly used.  Sometimes it means hugetlb default size,
sometimes it was used as thp size.

Remove the global variable and use the right one at each place.

Link: https://lkml.kernel.org/r/20230412164333.328596-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:06 -07:00
Peter Xu c5cb903646 selftests/mm: drop global mem_fd in uffd tests
Drop it by creating the memfd dynamically in the tests.

Link: https://lkml.kernel.org/r/20230412164331.328584-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:05 -07:00
Peter Xu d5433ce84d selftests/mm: UFFDIO_API test
Add one simple test for UFFDIO_API.  With that, I also added a bunch of
small but handy helpers along the way.

Link: https://lkml.kernel.org/r/20230412164257.328375-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:05 -07:00
Peter Xu 78391f6460 selftests/mm: uffd_open_{dev|sys}()
Provide two helpers to open an uffd handle.  Drop the error checks around
SKIPs because it's inside an errexit() anyway, which IMHO doesn't really
help much if the test will not continue.

Link: https://lkml.kernel.org/r/20230412164254.328335-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:05 -07:00
Peter Xu c4277cb6c8 selftests/mm: uffd_[un]register()
Add two helpers to register/unregister to an uffd.  Use them to drop
duplicate codes.

This patch also drops assert_expected_ioctls_present() and
get_expected_ioctls().  Reasons:

  - It'll need a lot of effort to pass test_type==HUGETLB into it from
    the upper, so it's the simplest way to get rid of another global var

  - The ioctls returned in UFFDIO_REGISTER is hardly useful at all,
    because any app can already detect kernel support on any ioctl via its
    corresponding UFFD_FEATURE_*.  The check here is for sanity mostly but
    it's probably destined no user app will even use it.

  - It's not friendly to one future goal of uffd to run on old
    kernels, the problem is get_expected_ioctls() compiles against
    UFFD_API_RANGE_IOCTLS, which is a value that can change depending on
    where the test is compiled, rather than reflecting what the kernel
    underneath has.  It means it'll report false negatives on old kernels
    so it's against our will.

So let's make our lives easier.

[peterx@redhat.com; tools/testing/selftests/mm/hugepage-mremap.c: add headers]
  Link: https://lkml.kernel.org/r/ZDxrvZh/cw357D8P@x1n
Link: https://lkml.kernel.org/r/20230412164247.328293-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:05 -07:00
Peter Xu 686a8bb723 selftests/mm: split uffd tests into uffd-stress and uffd-unit-tests
In many ways it's weird and unwanted to keep all the tests in the same
userfaultfd.c at least when still in the current way.

For example, it doesn't make much sense to run the stress test for each
method we can create an userfaultfd handle (either via syscall or /dev/
node).  It's a waste of time running this twice for the whole stress as
the stress paths are the same, only the open path is different.

It's also just weird to need to manually specify different types of memory
to run all unit tests for the userfaultfd interface.  We should be able to
just run a single program and that should go through all functional uffd
tests without running the stress test at all.  The stress test was more
for torturing and finding race conditions.  We don't want to wait for
stress to finish just to regress test a functional test.

When we start to pile up more things on top of the same file and same
functions, things start to go a bit chaos and the code is just harder to
maintain too with tons of global variables.

This patch creates a new test uffd-unit-tests to keep userfaultfd unit
tests in the future, currently empty.

Meanwhile rename the old userfaultfd.c test to uffd-stress.c.

Link: https://lkml.kernel.org/r/20230412164244.328270-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:04 -07:00
Peter Xu 33be4e8928 selftests/mm: create uffd-common.[ch]
Move common utility functions into uffd-common.[ch] files from the
original userfaultfd.c.  This prepares for a split of userfaultfd.c into
two tests: one to only cover the old but powerful stress test, the other
one covers all the functional tests.

This movement is kind of a brute-force effort for now, with light
touch-ups but nothing should really change.  There's chances to optimize
more, but let's leave that for later.

Link: https://lkml.kernel.org/r/20230412164241.328259-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:04 -07:00
Peter Xu 618aeb5d62 selftests/mm: drop test_uffdio_zeropage_eexist
The idea was trying to flip this var in the alarm handler from time to
time to test -EEXIST of UFFDIO_ZEROPAGE, but firstly it's only used in the
zeropage test so probably only used once, meanwhile we passed
"retry==false" so it'll never got tested anyway.

Drop both sides so we always test UFFDIO_ZEROPAGE retries if has_zeropage
is set (!hugetlb).

One more thing to do is doing UFFDIO_REGISTER for the alias buffer too,
because otherwise the test won't even pass!  We were just lucky that this
test never really got ran at all.

Link: https://lkml.kernel.org/r/20230412164238.328238-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:04 -07:00
Peter Xu 4af9ff2981 selftests/mm: test UFFDIO_ZEROPAGE only when !hugetlb
Make the check as simple as "test_type == TEST_HUGETLB" because that's the
only mem that doesn't support ZEROPAGE.

Link: https://lkml.kernel.org/r/20230412164234.328168-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:04 -07:00
Peter Xu 366e93c465 selftests/mm: reuse pagemap_get_entry() in vm_util.h
Meanwhile drop pagemap_read_vaddr().

Link: https://lkml.kernel.org/r/20230412164231.328157-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:04 -07:00
Peter Xu 9f74696bd2 selftests/mm: use PM_* macros in vm_utils.h
We've got the macros in uffd-stress.c, move it over and use it in
vm_util.h.

Link: https://lkml.kernel.org/r/20230412164227.328145-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:03 -07:00
Peter Xu bd4d67e76f selftests/mm: merge default_huge_page_size() into one
There're already 3 same definitions of the three functions.  Move it into
vm_util.[ch].

Link: https://lkml.kernel.org/r/20230412164223.328134-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-04-18 16:30:03 -07:00