Commit Graph

37194 Commits

Author SHA1 Message Date
Rafael J. Wysocki 885c429e06 Merge branches 'pm-devfreq' and 'pm-tools'
Merge devfreq changes and power management tools changes for 6.6-rc1:

 - Fix memory leak in devfreq_dev_release() (Boris Brezillon).

 - Rewrite devfreq_monitor_start() kerneldoc comment (Manivannan
   Sadhasivam).

 - Explicitly include correct DT includes in devfreq (Rob Herring).

 - Add turbo-boost support to cpupower (Wyes Karny).

 - Add support for amd_pstate mode change to cpupower (Wyes Karny).

 - Fix 'cpupower idle_set' command to accept only numeric values of
   arguments (Likhitha Korrapati).

* pm-devfreq:
  PM / devfreq: Fix leak in devfreq_dev_release()
  PM / devfreq: Reword the kernel-doc comment for devfreq_monitor_start() API
  PM / devfreq: Explicitly include correct DT includes

* pm-tools:
  cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.
  cpupower: Add turbo-boost support in cpupower
  cpupower: Add support for amd_pstate mode change
  cpupower: Add EPP value change support
  cpupower: Add is_valid_path API
  cpupower: Recognise amd-pstate active mode driver
  cpupower: Bump soname version
2023-08-25 21:29:59 +02:00
Linus Torvalds 6f0edbb833 18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4 issues
or aren't considered suitable for a -stable backport.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZOjuGgAKCRDdBJ7gKXxA
 jkLlAQDY9sYxhQZp1PFLirUIPeOBjEyifVy6L6gCfk9j0snLggEA2iK+EtuJt2Dc
 SlMfoTq29zyU/YgfKKwZEVKtPJZOHQU=
 =oTcj
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
  issues or aren't considered suitable for a -stable backport"

* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  shmem: fix smaps BUG sleeping while atomic
  selftests: cachestat: catch failing fsync test on tmpfs
  selftests: cachestat: test for cachestat availability
  maple_tree: disable mas_wr_append() when other readers are possible
  madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
  madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
  madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
  mm: multi-gen LRU: don't spin during memcg release
  mm: memory-failure: fix unexpected return value in soft_offline_page()
  radix tree: remove unused variable
  mm: add a call to flush_cache_vmap() in vmap_pfn()
  selftests/mm: FOLL_LONGTERM need to be updated to 0x100
  nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
  mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
  selftests: cgroup: fix test_kmem_basic less than error
  mm: enable page walking API to lock vmas during the walk
  smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
  mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25 11:44:43 -07:00
Dave Marchevsky 312aa5bde8 selftests/bpf: Add tests for rbtree API interaction in sleepable progs
Confirm that the following sleepable prog states fail verification:
  * bpf_rcu_read_unlock before bpf_spin_unlock
     * RCU CS will last at least as long as spin_lock CS

Also confirm that correct usage passes verification, specifically:
  * Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
  * Implied RCU CS due to spin_lock CS

None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-25 09:23:17 -07:00
Dave Marchevsky ba2464c86f bpf: Reenable bpf_refcount_acquire
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.

This a revert of:
 * commit f3514a5d67 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
 * commit 7deca5eae8 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-25 09:23:16 -07:00
Will Deacon e1df272139 Merge branch 'for-next/selftests' into for-next/core
* for-next/selftests: (22 commits)
  kselftest/arm64: Fix hwcaps selftest build
  kselftest/arm64: add jscvt feature to hwcap test
  kselftest/arm64: add pmull feature to hwcap test
  kselftest/arm64: add AES feature check to hwcap test
  kselftest/arm64: add SHA1 and related features to hwcap test
  kselftest/arm64: build BTI tests in output directory
  kselftest/arm64: fix a memleak in zt_regs_run()
  kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
  kselftest/arm64: add lse and lse2 features to hwcap test
  kselftest/arm64: add test item that support to capturing the SIGBUS signal
  kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers
  kselftest/arm64: add crc32 feature to hwcap test
  kselftest/arm64: add float-point feature to hwcap test
  kselftest/arm64: Use the tools/include compiler.h rather than our own
  kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
  kselftest/arm64: Make the tools/include headers available
  tools include: Add some common function attributes
  tools compiler.h: Add OPTIMIZER_HIDE_VAR()
  kselftest/arm64: Exit streaming mode after collecting signal context
  kselftest/arm64: add RCpc load-acquire to hwcap test
  ...
2023-08-25 12:36:57 +01:00
Linus Torvalds 4f9e7fabf8 Tracing fixes for 6.5:
- Fix ring buffer being permanently disabled due to missed record_disabled()
   Changing the trace cpu mask will disable the ring buffers for the CPUs no
   longer in the mask. But it fails to update the snapshot buffer. If a snapshot
   takes place, the accounting for the ring buffer being disabled is corrupted
   and this can lead to the ring buffer being permanently disabled.
 
 - Add test case for snapshot and cpu mask working together
 
 - Fix memleak by the function graph tracer not getting closed properly.
   The iterator is used to read the ring buffer. When it opens, it calls
   the open function of a tracer, and when it is closed, it calls the close
   iteration. While a trace is being read, it is still possible to change
   the tracer. If this happens between the function graph tracer and the
   wakeup tracer (which uses function graph tracing), the tracers are not
   closed properly during when the iterator sees the switch, and the wakeup
   function did not initialize its private pointer to NULL, which is used
   to know if the function graph tracer was the last tracer. It could be
   fooled in thinking it is, but then on exit it does not call the close
   function of the function graph tracer to clean up its data.
 
 - Fix synthetic events on big endian machines, by introducing a union
   that does the conversions properly.
 
 - Fix synthetic events from printing out the number of elements in the
   stacktrace when it shouldn't.
 
 - Fix synthetic events stacktrace to not print a bogus value at the end.
 
 - Introduce a pipe_cpumask that prevents the trace_pipe files from being
   opened by more than one task (file descriptor). There was a race found
   where if splice is called, the iter->ent could become stale and events
   could be missed. There's no point reading a producer/consumer file by
   more than one task as they will corrupt each other anyway. Add a cpumask
   that keeps track of the per_cpu trace_pipe files as well as the global
   trace_pipe file that prevents more than one open of a trace_pipe file
   that represents the same ring buffer. This prevents the race from
   happening.
 
 - Fix ftrace samples for arm64 to work with older compilers.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEXtmkj8VMCiLR0IBM68Js21pW3nMFAmTn7hsUHHJvc3RlZHRA
 Z29vZG1pcy5vcmcACgkQ68Js21pW3nOSVg/9HVFTUX52yqvT9YLv8b1QZYMo2I5n
 to1nSbF9UUYIAMqkyNHuXU2TBj1I77JVkbsPEFsjYTN97CzJ/Zc8jNa6p7HgVure
 1xUuaCgtOFPDjU6OpTa6wFRt4usU1UM8Noc/ii0WDUGsA+RKBAKmGUsNmuDHx1Ae
 a3uJTLR4VHMeAkbtth/8f6RHBVocDVSYPQbKnC0PksGW1wg9e9PU2G6+o3069I1Q
 qbOZJqvGDpeaapjyG2LYrDwkVGOPSvUJuJNfcjcNcyKoHSqxZzBb+feY16NhWDm9
 idpyHLE4qF1nvlh1SIpErFl8Bu5MG9CN8a+xrx3ufd6i1jO4bcHyRD9XmjG4p72+
 zVshoS86DI4KCK9wHxJ/5/SU6XuL6JoNTP7NhmDIX83QCuZwgTOa8C2xzHKHu58F
 In13IhJqS5ob6Jy25a/bAy0CbdTl0cjQvMfXrrYK0ZWuEBWgBUDqwB/eKY6oq79D
 oTKmFNOZsuiLAhPywAoY5cAqQtftpy5Ul8o6ed3RAw3th0WC4EeXIk5eUu5g/ABI
 1ZfnpeY5al/JROFGXWxyn964RRjoVpbC1M6NVTz33e7No+r2KSRlLb5ZEWZVVIcZ
 My96QiZXamBZ/EOR7x72yFWxXBSuACu9nvbSSZRnbEGujoKwDb89xtgWzSNuR/wi
 GexDcj6qWfODrfc=
 =CI9f
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix ring buffer being permanently disabled due to missed
   record_disabled()

   Changing the trace cpu mask will disable the ring buffers for the
   CPUs no longer in the mask. But it fails to update the snapshot
   buffer. If a snapshot takes place, the accounting for the ring buffer
   being disabled is corrupted and this can lead to the ring buffer
   being permanently disabled.

 - Add test case for snapshot and cpu mask working together

 - Fix memleak by the function graph tracer not getting closed properly.

   The iterator is used to read the ring buffer. When it opens, it calls
   the open function of a tracer, and when it is closed, it calls the
   close iteration. While a trace is being read, it is still possible to
   change the tracer.

   If this happens between the function graph tracer and the wakeup
   tracer (which uses function graph tracing), the tracers are not
   closed properly during when the iterator sees the switch, and the
   wakeup function did not initialize its private pointer to NULL, which
   is used to know if the function graph tracer was the last tracer. It
   could be fooled in thinking it is, but then on exit it does not call
   the close function of the function graph tracer to clean up its data.

 - Fix synthetic events on big endian machines, by introducing a union
   that does the conversions properly.

 - Fix synthetic events from printing out the number of elements in the
   stacktrace when it shouldn't.

 - Fix synthetic events stacktrace to not print a bogus value at the
   end.

 - Introduce a pipe_cpumask that prevents the trace_pipe files from
   being opened by more than one task (file descriptor).

   There was a race found where if splice is called, the iter->ent could
   become stale and events could be missed. There's no point reading a
   producer/consumer file by more than one task as they will corrupt
   each other anyway. Add a cpumask that keeps track of the per_cpu
   trace_pipe files as well as the global trace_pipe file that prevents
   more than one open of a trace_pipe file that represents the same ring
   buffer. This prevents the race from happening.

 - Fix ftrace samples for arm64 to work with older compilers.

* tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  samples: ftrace: Replace bti assembly with hint for older compiler
  tracing: Introduce pipe_cpumask to avoid race on trace_pipes
  tracing: Fix memleak due to race between current_tracer and trace
  tracing/synthetic: Allocate one additional element for size
  tracing/synthetic: Skip first entry for stack traces
  tracing/synthetic: Use union instead of casts
  selftests/ftrace: Add a basic testcase for snapshot
  tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
2023-08-24 19:39:20 -07:00
Jakub Kicinski 4c8c24e801 tools: ynl-gen: support empty attribute lists
Differentiate between empty list and None for member lists.
New families may want to create request responses with no attribute.
If we treat those the same as None we end up rendering
a full parsing policy in user space, instead of an empty one.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24 19:04:20 -07:00
Jakub Kicinski dc2ef94d89 tools: ynl-gen: fix collecting global policy attrs
We look for attributes inside do.request, but there's another
layer of nesting in the spec, look inside do.request.attributes.

This bug had no effect as all global policies we generate (fou)
seem to be full, anyway, and we treat full and empty the same.

Next patch will change the treatment of empty policies.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24 19:04:20 -07:00
Jakub Kicinski a149a3a13b tools: ynl-gen: set length of binary fields
Remember to set the length field in the request setters.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24 19:04:20 -07:00
Jakub Kicinski 649bde9004 tools: ynl: allow passing binary data
Recent changes made us assume that input for binary data is in hex.
When using YNL as a Python library it's possible to pass in raw bytes.
Bring the ability to do that back.

Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24 19:04:20 -07:00
Anh Tuan Phan bad5a3a42a selftests/mm: fix WARNING comparing pointer to 0
Remove comparing pointer to 0 to avoid this warning from coccinelle:

./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0, suggest !E
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0

Link: https://lkml.kernel.org/r/20230817160033.90079-1-tuananhlfc@gmail.com
Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:27 -07:00
Lucas Karpinski 7131fd7e30 selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
Currently, not all kernel memory usage is being accounted for. This
commit switches to using the kernel entry within memory.stat which
already includes kernel_stack, pagetables, and slab. The kernel entry
also includes vmalloc and other additional kernel memory use cases which
were missing.

Link: https://lkml.kernel.org/r/bvrhe2tpsts2azaroq4ubp2slawmop6orndsswrewuscw3ugvk@kmemmrttsnc7
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:27 -07:00
Matthew Wilcox (Oracle) f9bff0e318 minmax: add in_range() macro
Patch series "New page table range API", v6.

This patchset changes the API used by the MM to set up page table entries.
The four APIs are:

    set_ptes(mm, addr, ptep, pte, nr)
    update_mmu_cache_range(vma, addr, ptep, nr)
    flush_dcache_folio(folio) 
    flush_icache_pages(vma, page, nr)

flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them.  The old APIs remain around
but are mostly implemented by calling the new interfaces.

The new APIs are based around setting up N page table entries at once. 
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you. 
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.

One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking.  This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.

The point of all this is better performance, and Fengwei Yin has measured
improvement on x86.  I suspect you'll see improvement on your architecture
too.  Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.

This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.


This patch (of 38):

Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND).  It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type.  Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.

Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:18 -07:00
Andrew Morton fcbc329fa3 merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-24 15:25:56 -07:00
Andre Przywara f84f62e699 selftests: cachestat: catch failing fsync test on tmpfs
The cachestat kselftest runs a test on a normal file, which is created
temporarily in the current directory.  Among the tests it runs there is a
call to fsync(), which is expected to clean all dirty pages used by the
file.

However the tmpfs filesystem implements fsync() as noop_fsync(), so the
call will not even attempt to clean anything when this test file happens
to live on a tmpfs instance.  This happens in an initramfs, or when the
current directory is in /dev/shm or sometimes /tmp.

To avoid this test failing wrongly, use statfs() to check which filesystem
the test file lives on.  If that is "tmpfs", we skip the fsync() test.

Since the fsync test is only one part of the "normal file" test, we now
execute this twice, skipping the fsync part on the first call.  This way
only the second test, including the fsync part, would be skipped.

Link: https://lkml.kernel.org/r/20230821160534.3414911-3-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Andre Przywara 5e56982dd0 selftests: cachestat: test for cachestat availability
Patch series "selftests: cachestat: fix run on older kernels", v2.

I ran all kernel selftests on some test machine, and stumbled upon
cachestat failing (among others).  These patches fix the run on older
kernels and when the current directory is on a tmpfs instance.


This patch (of 2):

As cachestat is a new syscall, it won't be available on older kernels, for
instance those running on a development machine.  At the moment the test
reports all tests as "not ok" in this case.

Test for the cachestat syscall availability first, before doing further
tests, and bail out early with a TAP SKIP comment.

This also uses the opportunity to add the proper TAP headers, and add one
check for proper error handling (illegal file descriptor).

Link: https://lkml.kernel.org/r/20230821160534.3414911-1-andre.przywara@arm.com
Link: https://lkml.kernel.org/r/20230821160534.3414911-2-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 14:59:47 -07:00
Jakub Kicinski 57ce6427e0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/net/inet_sock.h
  f866fbc842 ("ipv4: fix data-races around inet->inet_id")
  c274af2242 ("inet: introduce inet->inet_flags")
https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/

Adjacent changes:

drivers/net/bonding/bond_alb.c
  e74216b8de ("bonding: fix macvlan over alb bond support")
  f11e5bd159 ("bonding: support balance-alb with openvswitch")

drivers/net/ethernet/broadcom/bgmac.c
  d6499f0b7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()")
  23a14488ea ("net: bgmac: Fix return value check for fixed_phy_register()")

drivers/net/ethernet/broadcom/genet/bcmmii.c
  32bbe64a13 ("net: bcmgenet: Fix return value check for fixed_phy_register()")
  acf50d1adb ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()")

net/sctp/socket.c
  f866fbc842 ("ipv4: fix data-races around inet->inet_id")
  b09bde5c35 ("inet: move inet->mc_loop to inet->inet_frags")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-24 10:51:39 -07:00
Pu Lehui 0209fd511f selftests/bpf: Enable cpu v4 tests for RV64
Enable cpu v4 tests for RV64, and the relevant tests have passed.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-8-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 09:13:08 -07:00
Linus Torvalds b5cc3833f1 Networking fixes for 6.5-rc8, including fixes from wifi, can
and netfilter
 
 Fixes to fixes:
 
   - nf_tables:
     - GC transaction race with abort path
     - defer gc run if previous batch is still pending
 
 Previous releases - regressions:
 
   - ipv4: fix data-races around inet->inet_id
 
   - phy: fix deadlocking in phy_error() invocation
 
   - mdio: fix C45 read/write protocol
 
   - ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
 
   - ice: fix NULL pointer deref during VF reset
 
   - i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
 
   - tg3: use slab_build_skb() when needed
 
   - mtk_eth_soc: fix NULL pointer on hw reset
 
 Previous releases - always broken:
 
   - core: validate veth and vxcan peer ifindexes
 
   - sched: fix a qdisc modification with ambiguous command request
 
   - devlink: add missing unregister linecard notification
 
   - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
 
   - batman:
     - do not get eth header before batadv_check_management_packet
     - fix batadv_v_ogm_aggr_send memory leak
 
   - bonding: fix macvlan over alb bond support
 
   - mlxsw: set time stamp fields also when its type is MIRROR_UTC
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTnJIQSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkt7kP/jy6HOMwSOMFbtxQD2m89EImr6ZlLUPg
 H09seQzC5nwRbgZrdzukmM27HDKEkYe1sPyxhpS8E4iAslFaefEvnWqOY0oiQSpH
 OuF4mP/cS9QKb62NwKVrau3SCARS9arLmOF0mcJNdDOWwucE+SoFaebxSMitAU/w
 k8hHVsLwc5dwZAYznOl2/qsmPBnIUsxfymNJE/RuFqj1nHccGybh9mJKpAxc0knj
 QEjqno//PgAXPV/X3mH/wG0fcsXs0OlAnBS9yA95GNzuR2yWrh7bD/et99En/elS
 8paUio+O3P6Y6WaewgDYFm44pf/x+hFb18Irtab82BkdRw+lgFyF23g8IH7ToJAE
 mEaxwdS7AQ4XEunNyJsjwiffWUG1nFaoIhaGb0Lo1qmgLHDo+rrNhkrBWvZxSf0Q
 8QlMnCXopJ1c5Qltz5QNVaWPErpCcanxV3cpNlG+lTpfamWBrUpuv/EhHCUF/fr3
 hlgJEm+WoFTvexO+QC3CyJDz2JYLLMaaYaoUZ1aJS2dtTTc3tfUjEL8VcopfXI87
 2FXJ3qEtCkvfdtfFjhofw97qHDvGrTXa9r2JSh1Pp8v15pKdM2P/lMYxd4B0cSEw
 9udW/3bWkvHZayzBWvqDEiz3UTID1+uX0/qpBWY40QzTdIXo6sBrCCk93tjJUdcA
 kXjw9HkSqW6H
 =WKil
 -----END PGP SIGNATURE-----

Merge tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from wifi, can and netfilter.

  Fixes to fixes:

   - nf_tables:
       - GC transaction race with abort path
       - defer gc run if previous batch is still pending

  Previous releases - regressions:

   - ipv4: fix data-races around inet->inet_id

   - phy: fix deadlocking in phy_error() invocation

   - mdio: fix C45 read/write protocol

   - ipvlan: fix a reference count leak warning in ipvlan_ns_exit()

   - ice: fix NULL pointer deref during VF reset

   - i40e: fix potential NULL pointer dereferencing of pf->vf in
     i40e_sync_vsi_filters()

   - tg3: use slab_build_skb() when needed

   - mtk_eth_soc: fix NULL pointer on hw reset

  Previous releases - always broken:

   - core: validate veth and vxcan peer ifindexes

   - sched: fix a qdisc modification with ambiguous command request

   - devlink: add missing unregister linecard notification

   - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning

   - batman:
      - do not get eth header before batadv_check_management_packet
      - fix batadv_v_ogm_aggr_send memory leak

   - bonding: fix macvlan over alb bond support

   - mlxsw: set time stamp fields also when its type is MIRROR_UTC"

* tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
  selftests: bonding: add macvlan over bond testing
  selftest: bond: add new topo bond_topo_2d1c.sh
  bonding: fix macvlan over alb bond support
  rtnetlink: Reject negative ifindexes in RTM_NEWLINK
  netfilter: nf_tables: defer gc run if previous batch is still pending
  netfilter: nf_tables: fix out of memory error handling
  netfilter: nf_tables: use correct lock to protect gc_list
  netfilter: nf_tables: GC transaction race with abort path
  netfilter: nf_tables: flush pending destroy work before netlink notifier
  netfilter: nf_tables: validate all pending tables
  ibmveth: Use dcbf rather than dcbfl
  i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
  net/sched: fix a qdisc modification with ambiguous command request
  igc: Fix the typo in the PTM Control macro
  batman-adv: Hold rtnl lock during MTU update via netlink
  igb: Avoid starting unnecessary workqueues
  can: raw: add missing refcount for memory leak fix
  can: isotp: fix support for transmission of SF without flow control
  bnx2x: new flag for track HW resource allocation
  sfc: allocate a big enough SKB for loopback selftest packet
  ...
2023-08-24 08:23:13 -07:00
Yonghong Song 001fedacc9 selftests/bpf: Add a local kptr test with no special fields
Add a local kptr test with no special fields in the struct. Without the
previous patch, the following warning will hit:

  [   44.683877] WARNING: CPU: 3 PID: 485 at kernel/bpf/syscall.c:660 bpf_obj_free_fields+0x220/0x240
  [   44.684640] Modules linked in: bpf_testmod(OE)
  [   44.685044] CPU: 3 PID: 485 Comm: kworker/u8:5 Tainted: G           OE      6.5.0-rc5-01703-g260d855e9b90 #248
  [   44.685827] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [   44.686693] Workqueue: events_unbound bpf_map_free_deferred
  [   44.687297] RIP: 0010:bpf_obj_free_fields+0x220/0x240
  [   44.687775] Code: e8 55 17 1f 00 49 8b 74 24 08 4c 89 ef e8 e8 14 05 00 e8 a3 da e2 ff e9 55 fe ff ff 0f 0b e9 4e fe ff
                       ff 0f 0b e9 47 fe ff ff <0f> 0b e8 d9 d9 e2 ff 31 f6 eb d5 48 83 c4 10 5b 41 5c e
  [   44.689353] RSP: 0018:ffff888106467cb8 EFLAGS: 00010246
  [   44.689806] RAX: 0000000000000000 RBX: ffff888112b3a200 RCX: 0000000000000001
  [   44.690433] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8881128ad988
  [   44.691094] RBP: 0000000000000002 R08: ffffffff81370bd0 R09: 1ffff110216231a5
  [   44.691643] R10: dffffc0000000000 R11: ffffed10216231a6 R12: ffff88810d68a488
  [   44.692245] R13: ffff88810767c288 R14: ffff88810d68a400 R15: ffff88810d68a418
  [   44.692829] FS:  0000000000000000(0000) GS:ffff8881f7580000(0000) knlGS:0000000000000000
  [   44.693484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   44.693964] CR2: 000055c7f2afce28 CR3: 000000010fee4002 CR4: 0000000000370ee0
  [   44.694513] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   44.695102] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   44.695747] Call Trace:
  [   44.696001]  <TASK>
  [   44.696183]  ? __warn+0xfe/0x270
  [   44.696447]  ? bpf_obj_free_fields+0x220/0x240
  [   44.696817]  ? report_bug+0x220/0x2d0
  [   44.697180]  ? handle_bug+0x3d/0x70
  [   44.697507]  ? exc_invalid_op+0x1a/0x50
  [   44.697887]  ? asm_exc_invalid_op+0x1a/0x20
  [   44.698282]  ? btf_find_struct_meta+0xd0/0xd0
  [   44.698634]  ? bpf_obj_free_fields+0x220/0x240
  [   44.699027]  ? bpf_obj_free_fields+0x1e2/0x240
  [   44.699414]  array_map_free+0x1a3/0x260
  [   44.699763]  bpf_map_free_deferred+0x7b/0xe0
  [   44.700154]  process_one_work+0x46d/0x750
  [   44.700523]  worker_thread+0x49e/0x900
  [   44.700892]  ? pr_cont_work+0x270/0x270
  [   44.701224]  kthread+0x1ae/0x1d0
  [   44.701516]  ? kthread_blkcg+0x50/0x50
  [   44.701860]  ret_from_fork+0x34/0x50
  [   44.702178]  ? kthread_blkcg+0x50/0x50
  [   44.702508]  ret_from_fork_asm+0x11/0x20
  [   44.702880]  </TASK>

With the previous patch, there is no warnings.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824063422.203097-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-24 08:15:16 -07:00
Stas Sergeev bfe2e8f569 selftests: add OFD lock tests
Test the basic locking stuff on 2 fds: multiple read locks,
conflicts between read and write locks, use of len==0 for queries.
Also tests for F_UNLCK F_OFD_GETLK extension.

[ jlayton: fix unlink() pathname in selftest ]

Cc: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2023-08-24 10:41:47 -04:00
Michael Ellerman c040c7488b powerpc/pseries: Move VPHN constants into vphn.h
These don't have any particularly good reason to belong in lppaca.h,
move them into their own header.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230823055317.751786-1-mpe@ellerman.id.au
2023-08-24 22:33:16 +10:00
Hangbin Liu 246af950b9 selftests: bonding: add macvlan over bond testing
Add a macvlan over bonding test with mode active-backup, balance-tlb
and balance-alb.

]# ./bond_macvlan.sh
TEST: active-backup: IPv4: client->server                           [ OK ]
TEST: active-backup: IPv6: client->server                           [ OK ]
TEST: active-backup: IPv4: client->macvlan_1                        [ OK ]
TEST: active-backup: IPv6: client->macvlan_1                        [ OK ]
TEST: active-backup: IPv4: client->macvlan_2                        [ OK ]
TEST: active-backup: IPv6: client->macvlan_2                        [ OK ]
TEST: active-backup: IPv4: macvlan_1->macvlan_2                     [ OK ]
TEST: active-backup: IPv6: macvlan_1->macvlan_2                     [ OK ]
TEST: active-backup: IPv4: server->client                           [ OK ]
TEST: active-backup: IPv6: server->client                           [ OK ]
TEST: active-backup: IPv4: macvlan_1->client                        [ OK ]
TEST: active-backup: IPv6: macvlan_1->client                        [ OK ]
TEST: active-backup: IPv4: macvlan_2->client                        [ OK ]
TEST: active-backup: IPv6: macvlan_2->client                        [ OK ]
TEST: active-backup: IPv4: macvlan_2->macvlan_2                     [ OK ]
TEST: active-backup: IPv6: macvlan_2->macvlan_2                     [ OK ]
[...]
TEST: balance-alb: IPv4: client->server                             [ OK ]
TEST: balance-alb: IPv6: client->server                             [ OK ]
TEST: balance-alb: IPv4: client->macvlan_1                          [ OK ]
TEST: balance-alb: IPv6: client->macvlan_1                          [ OK ]
TEST: balance-alb: IPv4: client->macvlan_2                          [ OK ]
TEST: balance-alb: IPv6: client->macvlan_2                          [ OK ]
TEST: balance-alb: IPv4: macvlan_1->macvlan_2                       [ OK ]
TEST: balance-alb: IPv6: macvlan_1->macvlan_2                       [ OK ]
TEST: balance-alb: IPv4: server->client                             [ OK ]
TEST: balance-alb: IPv6: server->client                             [ OK ]
TEST: balance-alb: IPv4: macvlan_1->client                          [ OK ]
TEST: balance-alb: IPv6: macvlan_1->client                          [ OK ]
TEST: balance-alb: IPv4: macvlan_2->client                          [ OK ]
TEST: balance-alb: IPv6: macvlan_2->client                          [ OK ]
TEST: balance-alb: IPv4: macvlan_2->macvlan_2                       [ OK ]
TEST: balance-alb: IPv6: macvlan_2->macvlan_2                       [ OK ]

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-24 10:07:13 +02:00
Hangbin Liu 27aa43f83c selftest: bond: add new topo bond_topo_2d1c.sh
Add a new testing topo bond_topo_2d1c.sh which is used more commonly.
Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and add the
extra link.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-08-24 10:07:13 +02:00
Takashi Iwai a057efde80 Merge branch 'for-linus' into for-next
Back-merge the 6.5-devel branch for the clean patch application for
6.6 and resolving merge conflicts.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-08-24 09:27:21 +02:00
Andrii Nakryiko f3bdb54f09 libbpf: fix signedness determination in CO-RE relo handling logic
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we
need to check that first before inferring signedness.

Closes: https://github.com/libbpf/libbpf/issues/704
Reported-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-23 21:13:48 -07:00
Andrii Nakryiko a182e64147 selftests/bpf: add uprobe_multi test binary to .gitignore
It seems like it was forgotten to add uprobe_multi binary to .gitignore.
Fix this trivial omission.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-23 21:13:48 -07:00
Daniel Xu 068ca522d5 libbpf: Add bpf_object__unpin()
For bpf_object__pin_programs() there is bpf_object__unpin_programs().
Likewise bpf_object__unpin_maps() for bpf_object__pin_maps().

But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds
symmetry to the API.

It's also convenient for cleanup in application code. It's an API I
would've used if it was available for a repro I was writing earlier.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
2023-08-23 17:10:09 -07:00
Charlie Jenkins 4d0c04eac0
RISC-V: mm: Add tests for RISC-V mm
Add tests that enforce mmap hint address behavior. mmap should default
to sv48. mmap will provide an address at the highest address space that
can fit into the hint address, unless the hint address is less than sv39
and not 0, then it will return a sv39 address.

These tests are split into two files: mmap_default.c and mmap_bottomup.c
because a new process must be exec'd in order to change the mmap layout.
The run_mmap.sh script sets the stack to be unlimited for the
mmap_bottomup.c test which triggers a bottomup layout.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20230809232218.849726-3-charlie@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-08-23 14:54:13 -07:00
Yafang Shao 0072e3624b selftests/bpf: Add selftest for allow_ptr_leaks
- Without prev commit

  $ tools/testing/selftests/bpf/test_progs --name=tc_bpf
  #232/1   tc_bpf/tc_bpf_root:OK
  test_tc_bpf_non_root:PASS:set_cap_bpf_cap_net_admin 0 nsec
  test_tc_bpf_non_root:PASS:disable_cap_sys_admin 0 nsec
  0: R1=ctx(off=0,imm=0) R10=fp0
  ; if ((long)(iph + 1) > (long)skb->data_end)
  0: (61) r2 = *(u32 *)(r1 +80)         ; R1=ctx(off=0,imm=0) R2_w=pkt_end(off=0,imm=0)
  ; struct iphdr *iph = (void *)(long)skb->data + sizeof(struct ethhdr);
  1: (61) r1 = *(u32 *)(r1 +76)         ; R1_w=pkt(off=0,r=0,imm=0)
  ; if ((long)(iph + 1) > (long)skb->data_end)
  2: (07) r1 += 34                      ; R1_w=pkt(off=34,r=0,imm=0)
  3: (b4) w0 = 1                        ; R0_w=1
  4: (2d) if r1 > r2 goto pc+1
  R2 pointer comparison prohibited
  processed 5 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
  test_tc_bpf_non_root:FAIL:test_tc_bpf__open_and_load unexpected error: -13
  #233/2   tc_bpf_non_root:FAIL

- With prev commit

  $ tools/testing/selftests/bpf/test_progs --name=tc_bpf
  #232/1   tc_bpf/tc_bpf_root:OK
  #232/2   tc_bpf/tc_bpf_non_root:OK
  #232     tc_bpf:OK
  Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20230823020703.3790-3-laoar.shao@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-23 09:37:29 -07:00
Willy Tarreau 556fb7131e tools/nolibc: avoid undesired casts in the __sysret() macro
Having __sysret() as an inline function has the unfortunate effect of
adding casts and large constants comparisons after the syscall returns
that significantly inflate some light code that's otherwise syscall-
heavy. Even nolibc-test grew by ~1%.

Let's switch back to a macro for this, and use it only with signed
arguments. Note that it is also possible to design a slightly more
complex macro covering unsigned and pointers but we only have 3 such
syscalls so it is pointless, and these were just addressed not to use
this macro anymore. Now for the argument (the local variable containing
the syscall return value), any negative value is an error, that results
in -1 being returned and errno to be assigned the opposite value.

This may be revisited again in the future if really needed but for now
let's get back to something sane.

Fixes: 428905da6e ("tools/nolibc: sys.h: add a syscall return helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Willy Tarreau fb01ff635e tools/nolibc: keep brk(), sbrk(), mmap() away from __sysret()
The __sysret() function causes some undesirable casts so we'll revert
it. In order to keep it simple it will now only support integer return
values like in the past, so we must basically revert the changes that
were made to these 3 syscalls which return a pointer so that they
simply rely on their own test and the SET_ERRNO() macro.

Fixes: 4201cfce15 ("tools/nolibc: clean up sbrk() routine")
Fixes: 924e9539ae ("tools/nolibc: clean up mmap() routine")
Fixes: d27447bc2e ("tools/nolibc: sys.h: apply __sysret() helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:19:22 +02:00
Zhangjin Wu 872dbfa032 tools/nolibc: silence ppc64 compile warnings
Silence the following warnings reported by the new -Wall -Wextra options
with pure assembly code.

    In file included from sysroot/powerpc/include/stdio.h:13,
                     from nolibc-test.c:13:
    sysroot/powerpc/include/arch.h: In function '_start':
    sysroot/powerpc/include/arch.h:192:32: warning: unused variable 'r2' [-Wunused-variable]
      192 |         register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
          |                                ^~
    sysroot/powerpc/include/arch.h:187:97: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
      187 | void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
          |                                                                                                 ^~~~~~

Since only elfv2 ABI requires to save the TOC/GOT pointer to r2
register, when using elfv1 ABI, the old C code is simply ignored by the
compiler, but the compiler can not ignore the inline assembly code and
will introduce build failure or running segfaults. So, let's further
only add the new assembly code for elfv2 ABI with the checking of
_CALL_ELF == 2.

Link: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
Link: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu 418c846821 selftests/nolibc: libc-test: use HOSTCC instead of CC
libc-test is mainly added to compare the behavior of nolibc to the
system libc, it is meaningless and error-prone with cross compiling.

Let's use HOSTCC instead of CC to avoid wrongly use cross compiler when
CROSS_COMPILE is passed or customized.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Fixes: cfb672f94f ("selftests/nolibc: add run-libc-test target")
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:18:50 +02:00
Zhangjin Wu dcb677c3d3 tools/nolibc: stackprotector.h: make __stack_chk_init static
This allows to generate smaller text/data/dec size.

As the _start_c() function added by crt.h, __stack_chk_init() is called
from _start_c() instead of the assembly _start. So, it is able to mark
it with static now.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu ce1bb82b1c selftests/nolibc: allow report with existing test log
After the tests finish, it is valuable to report and summarize with
existing test log.

This avoid rerun or run the tests again when not necessary.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu faeb4e09fe selftests/nolibc: add test support for ppc64
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64 variant for big endian 64-bit PowerPC, users can pass XARCH=ppc64
to test it.

The powernv machine of qemu-system-ppc64 is used with
powernv_be_defconfig.

As the document [1] shows:

  PowerNV (as Non-Virtualized) is the “bare metal” platform using the
  OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
  used as an hypervisor OS, running KVM guests, or simply as a host OS.

Notes,

- differs from little endian 64-bit PowerPC, vmlinux is used instead of
  zImage, because big endian zImage [2] only boot on qemu with x-vof=on
  (added from qemu v7.0) and a fixup patch [3] for qemu v7.0.51:

- since the VSX support may be disabled in kernel side, to avoid
  "illegal instruction" errors due to missing VSX kernel support, let's
  simply let compiler not generate vector/scalar (VSX) instructions via
  the '-mno-vsx' option.

- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
  the load multiple word instructions and the store multiple word
  instructions. Those instructions do not work when the processor is in
  little-endian mode (except PPC740/PPC750), so, we only enable it
  for big endian powerpc.

- for big endian ppc64, as the help message from arch/powerpc/Kconfig
  shows, the V2 ABI is standard for 64-bit little-endian, but for
  big-endian it is less well tested by kernel and toolchain, so, use
  elfv1 as-is, no need to explicitly ask toolchain to use elfv2 here.

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
[2]: https://github.com/linuxppc/issues/issues/402
[3]: https://lore.kernel.org/qemu-devel/20220504065536.3534488-1-aik@ozlabs.ru/

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722121019.GD17311@1wt.eu/
Link: https://lore.kernel.org/lkml/20230719043353.GC5331@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu 8a5040cb3f selftests/nolibc: add test support for ppc64le
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64le variant for little endian 64-bit PowerPC, users can pass
XARCH=ppc64le to test it.

The powernv machine of qemu-system-ppc64le is used for there is just a
working powernv_defconfig.

As the document [1] shows:

  PowerNV (as Non-Virtualized) is the “bare metal” platform using the
  OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
  used as an hypervisor OS, running KVM guests, or simply as a host OS.

Notes,

- since the VSX support may be disabled in kernel side, to avoid
  "illegal instruction" errors due to missing VSX kernel support, let's
  simply let compiler not generate vector/scalar (VSX) instructions via
  the '-mno-vsx' option.

- little endian ppc64 prefers elfv2 to elfv1 if the toolchain (e.g. gcc
  13.1.0) supports it, let's align with kernel, otherwise, our elfv1
  binary will not run on kernel with elfv2 ABI.

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722120747.GC17311@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu 587e984591 selftests/nolibc: add test support for ppc
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc variant for 32-bit PowerPC and uses it as the default variant of
powerpc architecture.

Users can pass XARCH=ppc (or ARCH=powerpc) to test 32-bit PowerPC.

The default qemu-system-ppc g3beige machine [1] is used to run 32-bit
powerpc kernel with pmac32_defconfig. The missing PMACZILOG serial tty
and console are enabled in another patch [2].

Note,

- zImage doesn't boot due to "qemu-system-ppc: Some ROM regions are
  overlapping" error, so, vmlinux is used instead.

- since the VSX support may be disabled in kernel side, to avoid
  "illegal instruction" errors due to missing VSX kernel support, let's
  simply let compiler not generate vector/scalar (VSX) instructions via
  the '-mno-vsx' option.

- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
  the load multiple word instructions and the store multiple word
  instructions. Those instructions do not work when the processor is in
  little-endian mode (except PPC740/PPC750), so, we only enable it
  for big endian powerpc.

[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powermac.html
[2]: https://lore.kernel.org/lkml/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.1690982937.git.tanyuan@tinylab.org/

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZL9leVOI25S2+0+g@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu c6c3734fb6 selftests/nolibc: add XARCH and ARCH mapping support
Most of the CPU architectures have different variants, but kernel
usually only accepts parts of them via the ARCH variable, the others
should be customized via kernel config files.

To simplify testing, a new XARCH variable is added to extend the
kernel's ARCH with a few variants of the same architecture, and it is
used to customize variant specific variables, at last XARCH is converted
to the kernel's ARCH:

  e.g. make run XARCH=<one of the supported variants>
                | \
                |  `-> variant specific variables:
                |      IMAGE, DEFCONFIG, QEMU_ARCH, QEMU_ARGS, CFLAGS ...
                \
                 `---> kernel's ARCH

XARCH and ARCH are carefully mapped to allow users to pass architecture
variants via XARCH or pass architecture via ARCH from cmdline.

PowerPC is the first user and also a very good reference architecture of
this mapping, it has variants with different combinations of
32-bit/64-bit and bit endian/little endian.

To use this mapping, the other architectures can refer to PowerPC, If
the target architecture only has one variant, XARCH is simply an alias
of ARCH, no additional mapping required.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702171715.GD16233@1wt.eu/
Link: https://lore.kernel.org/lkml/20230730061801.GA7690@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu e45ce88e65 tools/nolibc: add support for powerpc64
This follows the 64-bit PowerPC ABI [1], refers to the slides: "A new
ABI for little-endian PowerPC64 Design & Implementation" [2] and the
musl code in arch/powerpc64/crt_arch.h.

First, stdu and clrrdi are used instead of stwu and clrrwi for
powerpc64.

Second, the stack frame size is increased to 32 bytes for powerpc64, 32
bytes is the minimal stack frame size supported described in [2].

Besides, the TOC pointer (GOT pointer) must be saved to r2.

This works on both little endian and big endian 64-bit PowerPC.

[1]: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
[2]: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu 0cb0675ec3 tools/nolibc: add support for powerpc
Both syscall declarations and _start code definition are added for
powerpc to nolibc.

Like mips, powerpc uses a register (exactly, the summary overflow bit)
to record the error occurred, and uses another register to return the
value [1]. So, the return value of every syscall declaration must be
normalized to match the __sysret() helper, return -value when there is
an error, otheriwse, return value directly.

Glibc and musl use different methods to check the summary overflow bit,
glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register
to r0 at first, and then check the summary overflow bit in cr0:

    mfcr r0
    r0 & (1 << 28) ? -r3 : r3

    -->

    10003c14:       7c 00 00 26     mfcr    r0
    10003c18:       74 09 10 00     andis.  r9,r0,4096
    10003c1c:       41 82 00 08     beq     0x10003c24
    10003c20:       7c 63 00 d0     neg     r3,r3

Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow
bit with the 'bns' instruction, it is smaller:

    /* no summary overflow bit means no error, return value directly */
    bns+ 1f
    /* otherwise, return negated value */
    neg r3, r3
    1:

    -->

    10000418:       40 a3 00 08     bns     0x10000420
    1000041c:       7c 63 00 d0     neg     r3,r3

Like musl, Linux (arch/powerpc/include/asm/vdso/gettimeofday.h) uses the
same method for do_syscall_2() too.

Here applies the second method to get smaller size.

[1]: https://man7.org/linux/man-pages/man2/syscall.2.html

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 45f65f8d04 selftests/nolibc: enable compiler warnings
It will help the developers to avoid cruft and detect some bugs.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 711edef8f7 selftests/nolibc: don't strip nolibc-test
Binary size is not important for nolibc-test and some debugging
information is nice to have, so don't strip the binary during linking.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 9c5e490093 selftests/nolibc: prevent out of bounds access in expect_vfprintf
If read() fails and returns -1 (or returns garbage for some other
reason) buf would be accessed out of bounds.
Only use the return value of read() after it has been validated.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 37266a9ec7 selftests/nolibc: use correct return type for read() and write()
Avoid truncating values before comparing them.

As printf in nolibc doesn't support ssize_t add casts to int for
printing.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 711f91fdec selftests/nolibc: avoid sign-compare warnings
These warnings will be enabled later so avoid triggering them.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh c8d078153f selftests/nolibc: avoid unused parameter warnings
This warning will be enabled later so avoid triggering it.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 17e66f235e selftests/nolibc: make functions static if possible
This allows the compiler to generate warnings if they go unused.

Functions that are supposed to be used as breakpoints should not be
static, so un-statify those if necessary.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 10874f20ee selftests/nolibc: mark test helpers as potentially unused
When warning about unused functions these would be reported by we want
to keep them for future use.

Suggested-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/lkml/20230731064826.16584-1-falcon@tinylab.org/
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230731224929.GA18296@1wt.eu/
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 79df81aaea selftests/nolibc: drop unused variables
These got copied around as new testcases where created.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Willy Tarreau ca283457b3 selftests/nolibc: avoid warnings during intptr tests
Recent fix ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
had the annoying side effect of always returning skipped tests, which
are normally supposed to happen only when certain features are missing
to run the test (missing kernel options, toolchain not supporting
stack-protector etc). As such there are now always warnings. Let's
modify the test to not use the condition and instead use a ternary
expression to check the result.

Fixes: ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
Cc: Thomas WeiÃ<9F>schuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:57 +02:00
Thomas Weißschuh 202a0bd12f tools/nolibc: stdint: use __SIZE_TYPE__ for size_t
Otherwise both gcc and clang may generate warnings about type
mismatches:

sysroot/mips/include/string.h:12:14: warning: mismatch in argument 1 type of built-in function 'malloc'; expected 'unsigned int' [-Wbuiltin-declaration-mismatch]
   12 | static void *malloc(size_t len);
      |              ^~~~~~

The compiler provides __SIZE_TYPE__ as the type that corresponds to size_t
(typically "long unsigned int" or "unsigned int"). It was verified to be
available at least since gcc-3.4 and clang-3.8, so from now on we'll use
this definition for size_t.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/20230805161929.GA15284@1wt.eu/
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 04694658ad tools/nolibc: sys: avoid implicit sign cast
getauxval() returns an unsigned long but the overall type of the ternary
operator needs to be signed.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 809145f842 tools/nolibc: setvbuf: avoid unused parameter warnings
This warning will be enabled later so avoid triggering it.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 6407750225 tools/nolibc: fix return type of getpagesize()
It's documented as returning int which is also implemented by glibc and
musl, so adopt that return type.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh f2f5eaefa1 tools/nolibc: drop unused variables
Nobody needs it, get rid of it.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Yuan Tan 5c01259b12 selftests/nolibc: add testcase for pipe
Add a test case of pipe that sends and receives message in a single
process.

Suggested-by: Thomas Weißschuh <thomas@t-8ch.de>
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/all/c5de2d13-3752-4e1b-90d9-f58cca99c702@t-8ch.de/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
[wt: fixed the "len" type to size_t to address a sign-compare warning
 with upcoming patches]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Yuan Tan 3ec38af6ee tools/nolibc: add pipe() and pipe2() support
According to manual page [1], posix spec [2] and source code like
arch/mips/kernel/syscall.c, for historic reasons, the sys_pipe() syscall
on some architectures has an unusual calling convention.  It returns
results in two registers which means there is no need for it to do
verify the validity of a userspace pointer argument.  Historically that
used to be expensive in Linux.  These days the performance advantage is
negligible.

Nolibc doesn't support the unusual calling convention above, luckily
Linux provides a generic sys_pipe2() with an additional flags argument
from 2.6.27. If flags is 0, then pipe2() is the same as pipe(). So here
we use sys_pipe2() to implement the pipe().

pipe2() is also provided to allow users to use flags argument on demand.

[1]: https://man7.org/linux/man-pages/man2/pipe.2.html
[2]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pipe.html

Suggested-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/all/20230729100401.GA4577@1wt.eu/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Zhangjin Wu e7d0129df6 selftests/nolibc: mmap_munmap_good: fix up return value
The other tests use 1 as failure, mmap_munmap_good uses -1 as failure,
let's fix up this.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Thomas Weißschuh 447e56023f selftests/nolibc: avoid buffer underrun in space printing
If the test description is longer than the status alignment the
parameter 'n' to putcharn() would lead to a signed underflow that then
gets converted to a very large unsigned value.
This in turn leads out-of-bound writes in memset() crashing the
application.

The failure case of EXPECT_PTRER() used in "mmap_bad" exhibits this
exact behavior.

Fixes: 29f5540be3 ("selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 05:17:07 +02:00
Ryan Roberts 4893c22eb2 tools/nolibc/stdio: add setvbuf() to set buffering mode
Add a minimal implementation of setvbuf(), which error checks the mode
argument (as required by spec) and returns. Since nolibc never buffers
output, nothing needs to be done.

The kselftest framework recently added a call to setvbuf(). As a result,
any tests that use the kselftest framework and nolibc cause a compiler
error due to missing function. This provides an urgent fix for the
problem which is preventing arm64 testing on linux-next.

Example:

clang --target=aarch64-linux-gnu -fintegrated-as
-Werror=unknown-warning-option -Werror=ignored-optimization-argument
-Werror=option-ignored -Werror=unused-command-line-argument
--target=aarch64-linux-gnu -fintegrated-as
-fno-asynchronous-unwind-tables -fno-ident -s -Os -nostdlib \
-include ../../../../include/nolibc/nolibc.h -I../..\
-static -ffreestanding -Wall za-fork.c
build/kselftest/arm64/fp/za-fork-asm.o
-o build/kselftest/arm64/fp/za-fork
In file included from <built-in>:1:
In file included from ./../../../../include/nolibc/nolibc.h:97:
In file included from ./../../../../include/nolibc/arch.h:25:
./../../../../include/nolibc/arch-aarch64.h:178:35: warning: unknown
attribute 'optimize' ignored [-Wunknown-attributes]
void __attribute__((weak,noreturn,optimize("omit-frame-pointer")))
__no_stack_protector _start(void)
                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from za-fork.c:12:
../../kselftest.h:123:2: error: call to undeclared function 'setvbuf';
ISO C99 and later do not support implicit function declarations
[-Wimplicit-function-declaration]
        setvbuf(stdout, NULL, _IOLBF, 0);
        ^
../../kselftest.h:123:24: error: use of undeclared identifier '_IOLBF'
        setvbuf(stdout, NULL, _IOLBF, 0);
                              ^
1 warning and 2 errors generated.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/linux-kselftest/CA+G9fYus3Z8r2cg3zLv8uH8MRrzLFVWdnor02SNr=rCz+_WGVg@mail.gmail.com/
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 850fad7de8 selftests/nolibc: allow test -include /path/to/nolibc.h
As the head comment of nolibc-test.c shows, it can be built in 3 ways:

    $(CC) -nostdlib -include /path/to/nolibc.h => NOLIBC already defined
    $(CC) -nostdlib -I/path/to/nolibc/sysroot  => _NOLIBC_* guards are present
    $(CC) with default libc                    => NOLIBC* never defined

Only last two of them are tested currently, let's allow test the first one too.

This may help to find issues about using nolibc.h to build programs. it
derives from this change:

    commit 3a8039e289 ("tools/nolibc: Fix build of stdio.h due to header ordering")

Usage:

    // test with sysroot by default
    $ make run-user

    // test without sysroot, using nolibc.h directly
    $ make run-user NOLIBC_SYSROOT=0

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu b81434073b selftests/nolibc: allow run nolibc-test locally
It is able to run nolibc-test directly without qemu-user when the target
machine is the same as the host machine.

Sometimes, the result running locally may help a lot when the qemu-user
package is too old.

When the target machine differs from the host machine, it is also able
to run nolibc-test directly with qemu-user-static + binfmt_misc.

Link: https://lore.kernel.org/lkml/ZKutZwIOfy5MqedG@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 48967b73f8 selftests/nolibc: add testcases for startup code
The startup code is critical to get the right argc, argv, envp/environ
and _auxv, let's add a startup test group and the corresponding
testcases.

The "environ" test case is also moved from the stdlib test group to this
new startup test group and it is renamed to "environ_envp".

Since argv0 has been used by many other test cases, let's add testcases
to gurantee it too.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu fd3a9efde8 selftests/nolibc: add EXPECT_PTRGE, EXPECT_PTRGT, EXPECT_PTRLE, EXPECT_PTRLT
4 new pointer compare macros are added, they are similar to the integer
compare macros.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu c48d8af2fa tools/nolibc: s390: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu eea70cdac6 tools/nolibc: riscv: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 61bd4621c0 tools/nolibc: loongarch: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 431b806b9b tools/nolibc: mips: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Also clean up the instructions in delayed slots.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 539287d751 tools/nolibc: x86_64: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 2ab446336b tools/nolibc: i386: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu ded8af47c2 tools/nolibc: aarch64: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 61f9880721 tools/nolibc: arm: shrink _start with _start_c
move most of the _start operations to _start_c(), include the
stackprotector initialization.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 06f2a62c81 tools/nolibc: crt.h: initialize stack protector
As suggested by Thomas, It is able to move the stackprotector
initialization from the assembly _start to the beginning of the new
_start_c(). Let's call __stack_chk_init() in _start_c() as a
preparation.

Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/a00284a6-54b1-498c-92aa-44997fa78403@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu d7f16723d3 tools/nolibc: stackprotector.h: add empty __stack_chk_init for !_NOLIBC_STACKPROTECTOR
Let's define an empty __stack_chk_init for the !_NOLIBC_STACKPROTECTOR
branch.

This allows to remove #ifdef around every call of __stack_chk_init().

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 1733675515 tools/nolibc: add new crt.h with _start_c
As the environ and _auxv support added for nolibc, the assembly _start
function becomes more and more complex and therefore makes the porting
of nolibc to new architectures harder and harder.

To simplify portability, this C version of _start_c() is added to do
most of the assembly start operations in C, which reduces the complexity
a lot and will eventually simplify the porting of nolibc to the new
architectures.

The new _start_c() only requires a stack pointer argument, it will find
argc, argv, envp/environ and _auxv for us, and then call main(),
finally, it exit() with main's return status. With this new _start_c(),
the future new architectures only require to add very few assembly
instructions.

As suggested by Thomas, users may use a different signature of main
(e.g. void main(void)), a _nolibc_main alias is added for main to
silence the warning about potential conflicting types.

As suggested by Willy, the code is carefully polished for both smaller
size and better readability with local variables and the right types.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230715095729.GC24086@1wt.eu/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu af93807eae tools/nolibc: remove the old sys_stat support
The statx manpage [1] shows that it has been supported from Linux 4.11
and glibc 2.28, the Linux support can be checked for all of the
architectures with this command:

    $ git grep -r statx v4.11 arch/ include/uapi/asm-generic/unistd.h \
      | grep -E "aarch64|arm|mips|s390|x86|:include/uapi"

Besides riscv and loongarch, all of the nolibc supported architectures
have added sys_statx from Linux v4.11. riscv is mainlined to v4.15,
loongarch is mainlined to v5.19, both of them use the generic unistd.h,
so, they have added sys_statx from their first mainline versions.

The current oldest stable branch is v4.14, only reserving sys_statx
still preserves compatibility with all of the supported stable branches,
So, let's remove the old arch related and dependent sys_stat support
completely.

This is friendly to the future new architecture porting.

[1]: https://man7.org/linux/man-pages/man2/statx.2.html

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu bff60150f7 tools/nolibc: fix up startup failures for -O0 under gcc < 11.1.0
As gcc doc [1] shows:

  Most optimizations are completely disabled at -O0 or if an -O level is
  not set on the command line, even if individual optimization flags are
  specified.

Test result [2] shows, gcc>=11.1.0 deviates from the above description,
but before gcc 11.1.0, "-O0" still forcely uses frame pointer in the
_start function even if the individual optimize("omit-frame-pointer")
flag is specified.

The frame pointer related operations will change the stack pointer (e.g.
In x86_64, an extra "push %rbp" will be inserted at the beginning of
_start) and make it differs from the one we expected, as a result, break
the whole startup function.

To fix up this issue, as suggested by Thomas, the individual "Os" and
"omit-frame-pointer" optimize flags are used together on _start function
to disable frame pointer completely even if the -O0 is set on the
command line.

[1]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2]: https://lore.kernel.org/lkml/20230714094723.140603-1-falcon@tinylab.org/

Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/34b21ba5-7b59-4b3b-9ed6-ef9a3a5e06f7@t-8ch.de/
Fixes: 7f85485896 ("tools/nolibc: make compiler and assembler agree on the section around _start")
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 2023349835 tools/nolibc: arch-*.h: add missing space after ','
Fix up such errors reported by scripts/checkpatch.pl:

    ERROR: space required after that ',' (ctx:VxV)
    #148: FILE: tools/include/nolibc/arch-aarch64.h:148:
    +void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
                             ^

    ERROR: space required after that ',' (ctx:VxV)
    #148: FILE: tools/include/nolibc/arch-aarch64.h:148:
    +void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
                                      ^

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Thomas Weißschuh ceb528feb7 selftests/nolibc: avoid gaps in test numbers
As the test numbers are based on line numbers gaps without testcases are
to be avoided.
Instead use the already existing test condition logic to implement
conditional execution.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Thomas Weißschuh b184a261e5 selftests/nolibc: simplify status printing
pad_spc() is only ever used to print the status message of testcases.
The line size is always constant, the return value is never used and the
format string is never used as such.

Remove all the unneeded logic and simplify the API and its users.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Thomas Weißschuh 3097783ecf selftests/nolibc: make evaluation of test conditions
If "cond" is a multi-token statement the behavior of the preprocessor
will lead to the negation "!" to be only applied to the first token.
Although currently no test uses such multi-token conditions but it can
happen at any time.

Put braces around "cond" to ensure the negation works as expected.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Thomas Weißschuh 67d108e2a2 tools/nolibc: completely remove optional environ support
In commit 52e423f5b9 ("tools/nolibc: export environ as a weak symbol on i386")
and friends the asm startup logic was extended to directly populate the
"environ" array.

This makes it impossible for "environ" to be dropped by the linker.
Therefore also drop the other logic to handle non-present "environ".

Also add a testcase to validate the initialization of environ.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 4beb9be811 selftests/nolibc: report: add newline before test failures
a newline is inserted just before the test failures to avoid mixing the
test failures with the raw test log.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 7d92e89363 selftests/nolibc: report: extrude the test status line
two newlines are added around the test summary line to extrude the test
status.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 0ac908e304 selftests/nolibc: report: align passed, skipped and failed
align the test values for different runs and different architectures.

Since the total number of tests is not bigger than 1000 currently, let's
align them with "%3d".

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
[wt: s/%03d/%3d/ as discussed with Zhangjin]
Link: https://lore.kernel.org/lkml/20230709185112.97236-1-falcon@tinylab.org/
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu c0faa0dace selftests/nolibc: report: print total tests
Let's count and print the total number of tests, now, the data of
passed, skipped and failed have the same format.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu c0315c79aa selftests/nolibc: report: print a summarized test status
one of the test status: success, warning and failure is printed to
summarize the passed, skipped and failed values.

- "success" means no skipped and no failed.
- "warning" means has at least one skipped and no failed.
- "failure" means all tests are failed.

Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702164358.GB16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:22 +02:00
Zhangjin Wu 148e9718e2 selftests/nolibc: add chmod_argv0 test
argv0 is readable and chmodable, let's use it for chmod test, but a safe
umask should be used, the readable and executable modes should be
reserved.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:40:15 +02:00
Zhangjin Wu 135b622e48 selftests/nolibc: chroot_exe: remove procfs dependency
Since argv0 also works for CONFIG_PROC_FS=n, let's use it instead of
'/proc/self/exe'.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu f576d3c075 selftests/nolibc: stat_timestamps: remove procfs dependency
'/proc/self/' is a good path which doesn't have stale time info but it
is only available for CONFIG_PROC_FS=y.

When CONFIG_PROC_FS=n, use argv0 instead of '/proc/self', use '/' for the
worst case.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu 38fc0a3553 selftests/nolibc: chdir_root: restore current path after test
The PWD environment variable has the path of the nolibc-test program,
the current path must be the same as it, otherwise, the test cases will
fail with relative path (e.g. ./nolibc-test).

Since only chdir_root really changes the current path, let's restore it
with the PWD environment variable.

Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu 6861b1a339 selftests/nolibc: vfprintf: remove MEMFD_CREATE dependency
The vfprintf test case require to open a temporary file to write, the
old memfd_create() method is perfect but has strong dependency on
MEMFD_CREATE and also TMPFS or HUGETLBFS (see fs/Kconfig):

    config MEMFD_CREATE
	def_bool TMPFS || HUGETLBFS

And from v6.2, MFD_NOEXEC_SEAL must be passed for the non-executable
memfd, otherwise, The kernel warning will be output to the test result
like this:

        Running test 'vfprintf'
        0 emptymemfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'init'
         "" = ""                                                  [OK]

To avoid such warning and also to remove the MEMFD_CREATE dependency,
let's open a file from tmpfs directly.

The /tmp directory is used to detect the existing of tmpfs, if not
there, skip instead of fail.

And further, for pid == 1, the initramfs is loaded as ramfs, which can
be used as tmpfs, so, it is able to further remove TMPFS dependency too.

Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/9ad51430-b7c0-47dc-80af-20c86539498d@t-8ch.de
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu bbb14546bd selftests/nolibc: prepare /tmp for tests that need to write
create a /tmp directory. If it succeeds, the directory is writable,
which is normally the case when booted from an initramfs anyway.

This will be used instead of procfs for some tests.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/lkml/20230710050600.9697-1-falcon@tinylab.org/
[wt: removed the unneeded mount() call]
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu b8b26108e4 selftests/nolibc: fix up failures when CONFIG_PROC_FS=n
For CONFIG_PROC_FS=n, the /proc is not mountable, but the /proc
directory has been created in the prepare() stage whenever /proc is
there or not.

so, the checking of /proc in the run_syscall() stage will be always true
and at last it will fail all of the procfs dependent test cases, which
deviates from the 'cond' check design of the EXPECT_xx macros, without
procfs, these test cases should be skipped instead of failed.

To solve this issue, one method is checking /proc/self instead of /proc,
another method is removing the /proc directory completely for
CONFIG_PROC_FS=n, we apply the second method to avoid misleading the
users.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu 4e14e84442 selftests/nolibc: add a new rmdir() test case
A new rmdir_blah test case is added to remove a non-existing /blah,
which expects failure with ENOENT errno.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu f4191f3d52 tools/nolibc: add rmdir() support
a reverse operation of mkdir() is meaningful, add rmdir() here.

required by nolibc-test to remove /proc while CONFIG_PROC_FS is not
enabled.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu f7a419e35b selftests/nolibc: link_cross: use /proc/self/cmdline
For CONFIG_NET=n, there would be no /proc/self/net, so, use
/proc/self/cmdline instead.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00
Zhangjin Wu c388c9920d selftests/nolibc: fix up kernel parameters support
kernel parameters allow pass two types of strings, one type is like
'noapic', another type is like 'panic=5', the first type is passed as
arguments of the init program, the second type is passed as environment
variables of the init program.

when users pass kernel parameters like this:

    noapic NOLIBC_TEST=syscall

our nolibc-test program will use the test setting from argv[1] and
ignore the one from NOLIBC_TEST environment variable, and at last, it
will print the following line and ignore the whole test setting.

    Ignoring unknown test name 'noapic'

reversing the parsing order does solve the above issue:

    test = getenv("NOLIBC_TEST");
    if (test)
        test = argv[1];

but it still doesn't work with such kernel parameters (without
NOLIBC_TEST environment variable):

    noapic FOO=bar

To support all of the potential kernel parameters, let's verify the test
setting from both of argv[1] and NOLIBC_TEST environment variable.

Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2023-08-23 04:38:02 +02:00