Merge devfreq changes and power management tools changes for 6.6-rc1:
- Fix memory leak in devfreq_dev_release() (Boris Brezillon).
- Rewrite devfreq_monitor_start() kerneldoc comment (Manivannan
Sadhasivam).
- Explicitly include correct DT includes in devfreq (Rob Herring).
- Add turbo-boost support to cpupower (Wyes Karny).
- Add support for amd_pstate mode change to cpupower (Wyes Karny).
- Fix 'cpupower idle_set' command to accept only numeric values of
arguments (Likhitha Korrapati).
* pm-devfreq:
PM / devfreq: Fix leak in devfreq_dev_release()
PM / devfreq: Reword the kernel-doc comment for devfreq_monitor_start() API
PM / devfreq: Explicitly include correct DT includes
* pm-tools:
cpupower: Fix cpuidle_set to accept only numeric values for idle-set operation.
cpupower: Add turbo-boost support in cpupower
cpupower: Add support for amd_pstate mode change
cpupower: Add EPP value change support
cpupower: Add is_valid_path API
cpupower: Recognise amd-pstate active mode driver
cpupower: Bump soname version
or aren't considered suitable for a -stable backport.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZOjuGgAKCRDdBJ7gKXxA
jkLlAQDY9sYxhQZp1PFLirUIPeOBjEyifVy6L6gCfk9j0snLggEA2iK+EtuJt2Dc
SlMfoTq29zyU/YgfKKwZEVKtPJZOHQU=
=oTcj
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
issues or aren't considered suitable for a -stable backport"
* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
shmem: fix smaps BUG sleeping while atomic
selftests: cachestat: catch failing fsync test on tmpfs
selftests: cachestat: test for cachestat availability
maple_tree: disable mas_wr_append() when other readers are possible
madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
mm: multi-gen LRU: don't spin during memcg release
mm: memory-failure: fix unexpected return value in soft_offline_page()
radix tree: remove unused variable
mm: add a call to flush_cache_vmap() in vmap_pfn()
selftests/mm: FOLL_LONGTERM need to be updated to 0x100
nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
selftests: cgroup: fix test_kmem_basic less than error
mm: enable page walking API to lock vmas during the walk
smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
Confirm that the following sleepable prog states fail verification:
* bpf_rcu_read_unlock before bpf_spin_unlock
* RCU CS will last at least as long as spin_lock CS
Also confirm that correct usage passes verification, specifically:
* Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
* Implied RCU CS due to spin_lock CS
None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.
This a revert of:
* commit f3514a5d67 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
* commit 7deca5eae8 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* for-next/selftests: (22 commits)
kselftest/arm64: Fix hwcaps selftest build
kselftest/arm64: add jscvt feature to hwcap test
kselftest/arm64: add pmull feature to hwcap test
kselftest/arm64: add AES feature check to hwcap test
kselftest/arm64: add SHA1 and related features to hwcap test
kselftest/arm64: build BTI tests in output directory
kselftest/arm64: fix a memleak in zt_regs_run()
kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
kselftest/arm64: add lse and lse2 features to hwcap test
kselftest/arm64: add test item that support to capturing the SIGBUS signal
kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers
kselftest/arm64: add crc32 feature to hwcap test
kselftest/arm64: add float-point feature to hwcap test
kselftest/arm64: Use the tools/include compiler.h rather than our own
kselftest/arm64: Use shared OPTIMZER_HIDE_VAR() definiton
kselftest/arm64: Make the tools/include headers available
tools include: Add some common function attributes
tools compiler.h: Add OPTIMIZER_HIDE_VAR()
kselftest/arm64: Exit streaming mode after collecting signal context
kselftest/arm64: add RCpc load-acquire to hwcap test
...
- Fix ring buffer being permanently disabled due to missed record_disabled()
Changing the trace cpu mask will disable the ring buffers for the CPUs no
longer in the mask. But it fails to update the snapshot buffer. If a snapshot
takes place, the accounting for the ring buffer being disabled is corrupted
and this can lead to the ring buffer being permanently disabled.
- Add test case for snapshot and cpu mask working together
- Fix memleak by the function graph tracer not getting closed properly.
The iterator is used to read the ring buffer. When it opens, it calls
the open function of a tracer, and when it is closed, it calls the close
iteration. While a trace is being read, it is still possible to change
the tracer. If this happens between the function graph tracer and the
wakeup tracer (which uses function graph tracing), the tracers are not
closed properly during when the iterator sees the switch, and the wakeup
function did not initialize its private pointer to NULL, which is used
to know if the function graph tracer was the last tracer. It could be
fooled in thinking it is, but then on exit it does not call the close
function of the function graph tracer to clean up its data.
- Fix synthetic events on big endian machines, by introducing a union
that does the conversions properly.
- Fix synthetic events from printing out the number of elements in the
stacktrace when it shouldn't.
- Fix synthetic events stacktrace to not print a bogus value at the end.
- Introduce a pipe_cpumask that prevents the trace_pipe files from being
opened by more than one task (file descriptor). There was a race found
where if splice is called, the iter->ent could become stale and events
could be missed. There's no point reading a producer/consumer file by
more than one task as they will corrupt each other anyway. Add a cpumask
that keeps track of the per_cpu trace_pipe files as well as the global
trace_pipe file that prevents more than one open of a trace_pipe file
that represents the same ring buffer. This prevents the race from
happening.
- Fix ftrace samples for arm64 to work with older compilers.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEXtmkj8VMCiLR0IBM68Js21pW3nMFAmTn7hsUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ68Js21pW3nOSVg/9HVFTUX52yqvT9YLv8b1QZYMo2I5n
to1nSbF9UUYIAMqkyNHuXU2TBj1I77JVkbsPEFsjYTN97CzJ/Zc8jNa6p7HgVure
1xUuaCgtOFPDjU6OpTa6wFRt4usU1UM8Noc/ii0WDUGsA+RKBAKmGUsNmuDHx1Ae
a3uJTLR4VHMeAkbtth/8f6RHBVocDVSYPQbKnC0PksGW1wg9e9PU2G6+o3069I1Q
qbOZJqvGDpeaapjyG2LYrDwkVGOPSvUJuJNfcjcNcyKoHSqxZzBb+feY16NhWDm9
idpyHLE4qF1nvlh1SIpErFl8Bu5MG9CN8a+xrx3ufd6i1jO4bcHyRD9XmjG4p72+
zVshoS86DI4KCK9wHxJ/5/SU6XuL6JoNTP7NhmDIX83QCuZwgTOa8C2xzHKHu58F
In13IhJqS5ob6Jy25a/bAy0CbdTl0cjQvMfXrrYK0ZWuEBWgBUDqwB/eKY6oq79D
oTKmFNOZsuiLAhPywAoY5cAqQtftpy5Ul8o6ed3RAw3th0WC4EeXIk5eUu5g/ABI
1ZfnpeY5al/JROFGXWxyn964RRjoVpbC1M6NVTz33e7No+r2KSRlLb5ZEWZVVIcZ
My96QiZXamBZ/EOR7x72yFWxXBSuACu9nvbSSZRnbEGujoKwDb89xtgWzSNuR/wi
GexDcj6qWfODrfc=
=CI9f
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix ring buffer being permanently disabled due to missed
record_disabled()
Changing the trace cpu mask will disable the ring buffers for the
CPUs no longer in the mask. But it fails to update the snapshot
buffer. If a snapshot takes place, the accounting for the ring buffer
being disabled is corrupted and this can lead to the ring buffer
being permanently disabled.
- Add test case for snapshot and cpu mask working together
- Fix memleak by the function graph tracer not getting closed properly.
The iterator is used to read the ring buffer. When it opens, it calls
the open function of a tracer, and when it is closed, it calls the
close iteration. While a trace is being read, it is still possible to
change the tracer.
If this happens between the function graph tracer and the wakeup
tracer (which uses function graph tracing), the tracers are not
closed properly during when the iterator sees the switch, and the
wakeup function did not initialize its private pointer to NULL, which
is used to know if the function graph tracer was the last tracer. It
could be fooled in thinking it is, but then on exit it does not call
the close function of the function graph tracer to clean up its data.
- Fix synthetic events on big endian machines, by introducing a union
that does the conversions properly.
- Fix synthetic events from printing out the number of elements in the
stacktrace when it shouldn't.
- Fix synthetic events stacktrace to not print a bogus value at the
end.
- Introduce a pipe_cpumask that prevents the trace_pipe files from
being opened by more than one task (file descriptor).
There was a race found where if splice is called, the iter->ent could
become stale and events could be missed. There's no point reading a
producer/consumer file by more than one task as they will corrupt
each other anyway. Add a cpumask that keeps track of the per_cpu
trace_pipe files as well as the global trace_pipe file that prevents
more than one open of a trace_pipe file that represents the same ring
buffer. This prevents the race from happening.
- Fix ftrace samples for arm64 to work with older compilers.
* tag 'trace-v6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
samples: ftrace: Replace bti assembly with hint for older compiler
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
tracing: Fix memleak due to race between current_tracer and trace
tracing/synthetic: Allocate one additional element for size
tracing/synthetic: Skip first entry for stack traces
tracing/synthetic: Use union instead of casts
selftests/ftrace: Add a basic testcase for snapshot
tracing: Fix cpu buffers unavailable due to 'record_disabled' missed
Differentiate between empty list and None for member lists.
New families may want to create request responses with no attribute.
If we treat those the same as None we end up rendering
a full parsing policy in user space, instead of an empty one.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We look for attributes inside do.request, but there's another
layer of nesting in the spec, look inside do.request.attributes.
This bug had no effect as all global policies we generate (fou)
seem to be full, anyway, and we treat full and empty the same.
Next patch will change the treatment of empty policies.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Recent changes made us assume that input for binary data is in hex.
When using YNL as a Python library it's possible to pass in raw bytes.
Bring the ability to do that back.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824003056.1436637-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove comparing pointer to 0 to avoid this warning from coccinelle:
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0, suggest !E
./tools/testing/selftests/mm/map_populate.c:80:16-17: WARNING comparing pointer to 0
Link: https://lkml.kernel.org/r/20230817160033.90079-1-tuananhlfc@gmail.com
Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently, not all kernel memory usage is being accounted for. This
commit switches to using the kernel entry within memory.stat which
already includes kernel_stack, pagetables, and slab. The kernel entry
also includes vmalloc and other additional kernel memory use cases which
were missing.
Link: https://lkml.kernel.org/r/bvrhe2tpsts2azaroq4ubp2slawmop6orndsswrewuscw3ugvk@kmemmrttsnc7
Signed-off-by: Lucas Karpinski <lkarpins@redhat.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "New page table range API", v6.
This patchset changes the API used by the MM to set up page table entries.
The four APIs are:
set_ptes(mm, addr, ptep, pte, nr)
update_mmu_cache_range(vma, addr, ptep, nr)
flush_dcache_folio(folio)
flush_icache_pages(vma, page, nr)
flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you.
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.
One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking. This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.
The point of all this is better performance, and Fengwei Yin has measured
improvement on x86. I suspect you'll see improvement on your architecture
too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.
This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.
This patch (of 38):
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type. Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.
Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The cachestat kselftest runs a test on a normal file, which is created
temporarily in the current directory. Among the tests it runs there is a
call to fsync(), which is expected to clean all dirty pages used by the
file.
However the tmpfs filesystem implements fsync() as noop_fsync(), so the
call will not even attempt to clean anything when this test file happens
to live on a tmpfs instance. This happens in an initramfs, or when the
current directory is in /dev/shm or sometimes /tmp.
To avoid this test failing wrongly, use statfs() to check which filesystem
the test file lives on. If that is "tmpfs", we skip the fsync() test.
Since the fsync test is only one part of the "normal file" test, we now
execute this twice, skipping the fsync part on the first call. This way
only the second test, including the fsync part, would be skipped.
Link: https://lkml.kernel.org/r/20230821160534.3414911-3-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "selftests: cachestat: fix run on older kernels", v2.
I ran all kernel selftests on some test machine, and stumbled upon
cachestat failing (among others). These patches fix the run on older
kernels and when the current directory is on a tmpfs instance.
This patch (of 2):
As cachestat is a new syscall, it won't be available on older kernels, for
instance those running on a development machine. At the moment the test
reports all tests as "not ok" in this case.
Test for the cachestat syscall availability first, before doing further
tests, and bail out early with a TAP SKIP comment.
This also uses the opportunity to add the proper TAP headers, and add one
check for proper error handling (illegal file descriptor).
Link: https://lkml.kernel.org/r/20230821160534.3414911-1-andre.przywara@arm.com
Link: https://lkml.kernel.org/r/20230821160534.3414911-2-andre.przywara@arm.com
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Enable cpu v4 tests for RV64, and the relevant tests have passed.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20230824095001.3408573-8-pulehui@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
and netfilter
Fixes to fixes:
- nf_tables:
- GC transaction race with abort path
- defer gc run if previous batch is still pending
Previous releases - regressions:
- ipv4: fix data-races around inet->inet_id
- phy: fix deadlocking in phy_error() invocation
- mdio: fix C45 read/write protocol
- ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
- ice: fix NULL pointer deref during VF reset
- i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
- tg3: use slab_build_skb() when needed
- mtk_eth_soc: fix NULL pointer on hw reset
Previous releases - always broken:
- core: validate veth and vxcan peer ifindexes
- sched: fix a qdisc modification with ambiguous command request
- devlink: add missing unregister linecard notification
- wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
- batman:
- do not get eth header before batadv_check_management_packet
- fix batadv_v_ogm_aggr_send memory leak
- bonding: fix macvlan over alb bond support
- mlxsw: set time stamp fields also when its type is MIRROR_UTC
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTnJIQSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkt7kP/jy6HOMwSOMFbtxQD2m89EImr6ZlLUPg
H09seQzC5nwRbgZrdzukmM27HDKEkYe1sPyxhpS8E4iAslFaefEvnWqOY0oiQSpH
OuF4mP/cS9QKb62NwKVrau3SCARS9arLmOF0mcJNdDOWwucE+SoFaebxSMitAU/w
k8hHVsLwc5dwZAYznOl2/qsmPBnIUsxfymNJE/RuFqj1nHccGybh9mJKpAxc0knj
QEjqno//PgAXPV/X3mH/wG0fcsXs0OlAnBS9yA95GNzuR2yWrh7bD/et99En/elS
8paUio+O3P6Y6WaewgDYFm44pf/x+hFb18Irtab82BkdRw+lgFyF23g8IH7ToJAE
mEaxwdS7AQ4XEunNyJsjwiffWUG1nFaoIhaGb0Lo1qmgLHDo+rrNhkrBWvZxSf0Q
8QlMnCXopJ1c5Qltz5QNVaWPErpCcanxV3cpNlG+lTpfamWBrUpuv/EhHCUF/fr3
hlgJEm+WoFTvexO+QC3CyJDz2JYLLMaaYaoUZ1aJS2dtTTc3tfUjEL8VcopfXI87
2FXJ3qEtCkvfdtfFjhofw97qHDvGrTXa9r2JSh1Pp8v15pKdM2P/lMYxd4B0cSEw
9udW/3bWkvHZayzBWvqDEiz3UTID1+uX0/qpBWY40QzTdIXo6sBrCCk93tjJUdcA
kXjw9HkSqW6H
=WKil
-----END PGP SIGNATURE-----
Merge tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from wifi, can and netfilter.
Fixes to fixes:
- nf_tables:
- GC transaction race with abort path
- defer gc run if previous batch is still pending
Previous releases - regressions:
- ipv4: fix data-races around inet->inet_id
- phy: fix deadlocking in phy_error() invocation
- mdio: fix C45 read/write protocol
- ipvlan: fix a reference count leak warning in ipvlan_ns_exit()
- ice: fix NULL pointer deref during VF reset
- i40e: fix potential NULL pointer dereferencing of pf->vf in
i40e_sync_vsi_filters()
- tg3: use slab_build_skb() when needed
- mtk_eth_soc: fix NULL pointer on hw reset
Previous releases - always broken:
- core: validate veth and vxcan peer ifindexes
- sched: fix a qdisc modification with ambiguous command request
- devlink: add missing unregister linecard notification
- wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning
- batman:
- do not get eth header before batadv_check_management_packet
- fix batadv_v_ogm_aggr_send memory leak
- bonding: fix macvlan over alb bond support
- mlxsw: set time stamp fields also when its type is MIRROR_UTC"
* tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
selftests: bonding: add macvlan over bond testing
selftest: bond: add new topo bond_topo_2d1c.sh
bonding: fix macvlan over alb bond support
rtnetlink: Reject negative ifindexes in RTM_NEWLINK
netfilter: nf_tables: defer gc run if previous batch is still pending
netfilter: nf_tables: fix out of memory error handling
netfilter: nf_tables: use correct lock to protect gc_list
netfilter: nf_tables: GC transaction race with abort path
netfilter: nf_tables: flush pending destroy work before netlink notifier
netfilter: nf_tables: validate all pending tables
ibmveth: Use dcbf rather than dcbfl
i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()
net/sched: fix a qdisc modification with ambiguous command request
igc: Fix the typo in the PTM Control macro
batman-adv: Hold rtnl lock during MTU update via netlink
igb: Avoid starting unnecessary workqueues
can: raw: add missing refcount for memory leak fix
can: isotp: fix support for transmission of SF without flow control
bnx2x: new flag for track HW resource allocation
sfc: allocate a big enough SKB for loopback selftest packet
...
Test the basic locking stuff on 2 fds: multiple read locks,
conflicts between read and write locks, use of len==0 for queries.
Also tests for F_UNLCK F_OFD_GETLK extension.
[ jlayton: fix unlink() pathname in selftest ]
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-api@vger.kernel.org
Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Add a new testing topo bond_topo_2d1c.sh which is used more commonly.
Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and add the
extra link.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we
need to check that first before inferring signedness.
Closes: https://github.com/libbpf/libbpf/issues/704
Reported-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
It seems like it was forgotten to add uprobe_multi binary to .gitignore.
Fix this trivial omission.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230824000016.2658017-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
For bpf_object__pin_programs() there is bpf_object__unpin_programs().
Likewise bpf_object__unpin_maps() for bpf_object__pin_maps().
But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds
symmetry to the API.
It's also convenient for cleanup in application code. It's an API I
would've used if it was available for a repro I was writing earlier.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz
Add tests that enforce mmap hint address behavior. mmap should default
to sv48. mmap will provide an address at the highest address space that
can fit into the hint address, unless the hint address is less than sv39
and not 0, then it will return a sv39 address.
These tests are split into two files: mmap_default.c and mmap_bottomup.c
because a new process must be exec'd in order to change the mmap layout.
The run_mmap.sh script sets the stack to be unlimited for the
mmap_bottomup.c test which triggers a bottomup layout.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20230809232218.849726-3-charlie@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Having __sysret() as an inline function has the unfortunate effect of
adding casts and large constants comparisons after the syscall returns
that significantly inflate some light code that's otherwise syscall-
heavy. Even nolibc-test grew by ~1%.
Let's switch back to a macro for this, and use it only with signed
arguments. Note that it is also possible to design a slightly more
complex macro covering unsigned and pointers but we only have 3 such
syscalls so it is pointless, and these were just addressed not to use
this macro anymore. Now for the argument (the local variable containing
the syscall return value), any negative value is an error, that results
in -1 being returned and errno to be assigned the opposite value.
This may be revisited again in the future if really needed but for now
let's get back to something sane.
Fixes: 428905da6e ("tools/nolibc: sys.h: add a syscall return helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The __sysret() function causes some undesirable casts so we'll revert
it. In order to keep it simple it will now only support integer return
values like in the past, so we must basically revert the changes that
were made to these 3 syscalls which return a pointer so that they
simply rely on their own test and the SET_ERRNO() macro.
Fixes: 4201cfce15 ("tools/nolibc: clean up sbrk() routine")
Fixes: 924e9539ae ("tools/nolibc: clean up mmap() routine")
Fixes: d27447bc2e ("tools/nolibc: sys.h: apply __sysret() helper")
Link: https://lore.kernel.org/lkml/20230806095846.GB10627@1wt.eu/
Link: https://lore.kernel.org/lkml/ZNKOJY+g66nkIyvv@1wt.eu/
Cc: Zhangjin Wu <falcon@tinylab.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Thomas Weißschuh <thomas@t-8ch.de>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Silence the following warnings reported by the new -Wall -Wextra options
with pure assembly code.
In file included from sysroot/powerpc/include/stdio.h:13,
from nolibc-test.c:13:
sysroot/powerpc/include/arch.h: In function '_start':
sysroot/powerpc/include/arch.h:192:32: warning: unused variable 'r2' [-Wunused-variable]
192 | register volatile long r2 __asm__ ("r2") = (void *)&TOC - (void *)_start;
| ^~
sysroot/powerpc/include/arch.h:187:97: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
187 | void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_stack_protector _start(void)
| ^~~~~~
Since only elfv2 ABI requires to save the TOC/GOT pointer to r2
register, when using elfv1 ABI, the old C code is simply ignored by the
compiler, but the compiler can not ignore the inline assembly code and
will introduce build failure or running segfaults. So, let's further
only add the new assembly code for elfv2 ABI with the checking of
_CALL_ELF == 2.
Link: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
Link: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
libc-test is mainly added to compare the behavior of nolibc to the
system libc, it is meaningless and error-prone with cross compiling.
Let's use HOSTCC instead of CC to avoid wrongly use cross compiler when
CROSS_COMPILE is passed or customized.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Fixes: cfb672f94f ("selftests/nolibc: add run-libc-test target")
Signed-off-by: Willy Tarreau <w@1wt.eu>
This allows to generate smaller text/data/dec size.
As the _start_c() function added by crt.h, __stack_chk_init() is called
from _start_c() instead of the assembly _start. So, it is able to mark
it with static now.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
After the tests finish, it is valuable to report and summarize with
existing test log.
This avoid rerun or run the tests again when not necessary.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64 variant for big endian 64-bit PowerPC, users can pass XARCH=ppc64
to test it.
The powernv machine of qemu-system-ppc64 is used with
powernv_be_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
used as an hypervisor OS, running KVM guests, or simply as a host OS.
Notes,
- differs from little endian 64-bit PowerPC, vmlinux is used instead of
zImage, because big endian zImage [2] only boot on qemu with x-vof=on
(added from qemu v7.0) and a fixup patch [3] for qemu v7.0.51:
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
the load multiple word instructions and the store multiple word
instructions. Those instructions do not work when the processor is in
little-endian mode (except PPC740/PPC750), so, we only enable it
for big endian powerpc.
- for big endian ppc64, as the help message from arch/powerpc/Kconfig
shows, the V2 ABI is standard for 64-bit little-endian, but for
big-endian it is less well tested by kernel and toolchain, so, use
elfv1 as-is, no need to explicitly ask toolchain to use elfv2 here.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
[2]: https://github.com/linuxppc/issues/issues/402
[3]: https://lore.kernel.org/qemu-devel/20220504065536.3534488-1-aik@ozlabs.ru/
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722121019.GD17311@1wt.eu/
Link: https://lore.kernel.org/lkml/20230719043353.GC5331@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc64le variant for little endian 64-bit PowerPC, users can pass
XARCH=ppc64le to test it.
The powernv machine of qemu-system-ppc64le is used for there is just a
working powernv_defconfig.
As the document [1] shows:
PowerNV (as Non-Virtualized) is the “bare metal” platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can be
used as an hypervisor OS, running KVM guests, or simply as a host OS.
Notes,
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- little endian ppc64 prefers elfv2 to elfv1 if the toolchain (e.g. gcc
13.1.0) supports it, let's align with kernel, otherwise, our elfv1
binary will not run on kernel with elfv2 ABI.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powernv.html
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230722120747.GC17311@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Kernel uses ARCH=powerpc for both 32-bit and 64-bit PowerPC, here adds a
ppc variant for 32-bit PowerPC and uses it as the default variant of
powerpc architecture.
Users can pass XARCH=ppc (or ARCH=powerpc) to test 32-bit PowerPC.
The default qemu-system-ppc g3beige machine [1] is used to run 32-bit
powerpc kernel with pmac32_defconfig. The missing PMACZILOG serial tty
and console are enabled in another patch [2].
Note,
- zImage doesn't boot due to "qemu-system-ppc: Some ROM regions are
overlapping" error, so, vmlinux is used instead.
- since the VSX support may be disabled in kernel side, to avoid
"illegal instruction" errors due to missing VSX kernel support, let's
simply let compiler not generate vector/scalar (VSX) instructions via
the '-mno-vsx' option.
- as 'man gcc' shows, '-mmultiple' is used to generate code that uses
the load multiple word instructions and the store multiple word
instructions. Those instructions do not work when the processor is in
little-endian mode (except PPC740/PPC750), so, we only enable it
for big endian powerpc.
[1]: https://qemu.readthedocs.io/en/latest/system/ppc/powermac.html
[2]: https://lore.kernel.org/lkml/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.1690982937.git.tanyuan@tinylab.org/
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/ZL9leVOI25S2+0+g@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Most of the CPU architectures have different variants, but kernel
usually only accepts parts of them via the ARCH variable, the others
should be customized via kernel config files.
To simplify testing, a new XARCH variable is added to extend the
kernel's ARCH with a few variants of the same architecture, and it is
used to customize variant specific variables, at last XARCH is converted
to the kernel's ARCH:
e.g. make run XARCH=<one of the supported variants>
| \
| `-> variant specific variables:
| IMAGE, DEFCONFIG, QEMU_ARCH, QEMU_ARGS, CFLAGS ...
\
`---> kernel's ARCH
XARCH and ARCH are carefully mapped to allow users to pass architecture
variants via XARCH or pass architecture via ARCH from cmdline.
PowerPC is the first user and also a very good reference architecture of
this mapping, it has variants with different combinations of
32-bit/64-bit and bit endian/little endian.
To use this mapping, the other architectures can refer to PowerPC, If
the target architecture only has one variant, XARCH is simply an alias
of ARCH, no additional mapping required.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702171715.GD16233@1wt.eu/
Link: https://lore.kernel.org/lkml/20230730061801.GA7690@1wt.eu/
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This follows the 64-bit PowerPC ABI [1], refers to the slides: "A new
ABI for little-endian PowerPC64 Design & Implementation" [2] and the
musl code in arch/powerpc64/crt_arch.h.
First, stdu and clrrdi are used instead of stwu and clrrwi for
powerpc64.
Second, the stack frame size is increased to 32 bytes for powerpc64, 32
bytes is the minimal stack frame size supported described in [2].
Besides, the TOC pointer (GOT pointer) must be saved to r2.
This works on both little endian and big endian 64-bit PowerPC.
[1]: https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.pdf
[2]: https://www.llvm.org/devmtg/2014-04/PDFs/Talks/Euro-LLVM-2014-Weigand.pdf
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Both syscall declarations and _start code definition are added for
powerpc to nolibc.
Like mips, powerpc uses a register (exactly, the summary overflow bit)
to record the error occurred, and uses another register to return the
value [1]. So, the return value of every syscall declaration must be
normalized to match the __sysret() helper, return -value when there is
an error, otheriwse, return value directly.
Glibc and musl use different methods to check the summary overflow bit,
glibc (sysdeps/unix/sysv/linux/powerpc/sysdep.h) saves the cr register
to r0 at first, and then check the summary overflow bit in cr0:
mfcr r0
r0 & (1 << 28) ? -r3 : r3
-->
10003c14: 7c 00 00 26 mfcr r0
10003c18: 74 09 10 00 andis. r9,r0,4096
10003c1c: 41 82 00 08 beq 0x10003c24
10003c20: 7c 63 00 d0 neg r3,r3
Musl (arch/powerpc/syscall_arch.h) directly checks the summary overflow
bit with the 'bns' instruction, it is smaller:
/* no summary overflow bit means no error, return value directly */
bns+ 1f
/* otherwise, return negated value */
neg r3, r3
1:
-->
10000418: 40 a3 00 08 bns 0x10000420
1000041c: 7c 63 00 d0 neg r3,r3
Like musl, Linux (arch/powerpc/include/asm/vdso/gettimeofday.h) uses the
same method for do_syscall_2() too.
Here applies the second method to get smaller size.
[1]: https://man7.org/linux/man-pages/man2/syscall.2.html
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It will help the developers to avoid cruft and detect some bugs.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Binary size is not important for nolibc-test and some debugging
information is nice to have, so don't strip the binary during linking.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If read() fails and returns -1 (or returns garbage for some other
reason) buf would be accessed out of bounds.
Only use the return value of read() after it has been validated.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Avoid truncating values before comparing them.
As printf in nolibc doesn't support ssize_t add casts to int for
printing.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
These warnings will be enabled later so avoid triggering them.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This warning will be enabled later so avoid triggering it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This allows the compiler to generate warnings if they go unused.
Functions that are supposed to be used as breakpoints should not be
static, so un-statify those if necessary.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Recent fix ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
had the annoying side effect of always returning skipped tests, which
are normally supposed to happen only when certain features are missing
to run the test (missing kernel options, toolchain not supporting
stack-protector etc). As such there are now always warnings. Let's
modify the test to not use the condition and instead use a ternary
expression to check the result.
Fixes: ceb528feb7 ("selftests/nolibc: avoid gaps in test numbers")
Cc: Thomas WeiÃ<9F>schuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Otherwise both gcc and clang may generate warnings about type
mismatches:
sysroot/mips/include/string.h:12:14: warning: mismatch in argument 1 type of built-in function 'malloc'; expected 'unsigned int' [-Wbuiltin-declaration-mismatch]
12 | static void *malloc(size_t len);
| ^~~~~~
The compiler provides __SIZE_TYPE__ as the type that corresponds to size_t
(typically "long unsigned int" or "unsigned int"). It was verified to be
available at least since gcc-3.4 and clang-3.8, so from now on we'll use
this definition for size_t.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/20230805161929.GA15284@1wt.eu/
Signed-off-by: Willy Tarreau <w@1wt.eu>
getauxval() returns an unsigned long but the overall type of the ternary
operator needs to be signed.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
This warning will be enabled later so avoid triggering it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It's documented as returning int which is also implemented by glibc and
musl, so adopt that return type.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add a test case of pipe that sends and receives message in a single
process.
Suggested-by: Thomas Weißschuh <thomas@t-8ch.de>
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/all/c5de2d13-3752-4e1b-90d9-f58cca99c702@t-8ch.de/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
[wt: fixed the "len" type to size_t to address a sign-compare warning
with upcoming patches]
Signed-off-by: Willy Tarreau <w@1wt.eu>
According to manual page [1], posix spec [2] and source code like
arch/mips/kernel/syscall.c, for historic reasons, the sys_pipe() syscall
on some architectures has an unusual calling convention. It returns
results in two registers which means there is no need for it to do
verify the validity of a userspace pointer argument. Historically that
used to be expensive in Linux. These days the performance advantage is
negligible.
Nolibc doesn't support the unusual calling convention above, luckily
Linux provides a generic sys_pipe2() with an additional flags argument
from 2.6.27. If flags is 0, then pipe2() is the same as pipe(). So here
we use sys_pipe2() to implement the pipe().
pipe2() is also provided to allow users to use flags argument on demand.
[1]: https://man7.org/linux/man-pages/man2/pipe.2.html
[2]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/pipe.html
Suggested-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/all/20230729100401.GA4577@1wt.eu/
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The other tests use 1 as failure, mmap_munmap_good uses -1 as failure,
let's fix up this.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If the test description is longer than the status alignment the
parameter 'n' to putcharn() would lead to a signed underflow that then
gets converted to a very large unsigned value.
This in turn leads out-of-bound writes in memset() crashing the
application.
The failure case of EXPECT_PTRER() used in "mmap_bad" exhibits this
exact behavior.
Fixes: 29f5540be3 ("selftests/nolibc: add EXPECT_PTREQ, EXPECT_PTRNE and EXPECT_PTRER")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Add a minimal implementation of setvbuf(), which error checks the mode
argument (as required by spec) and returns. Since nolibc never buffers
output, nothing needs to be done.
The kselftest framework recently added a call to setvbuf(). As a result,
any tests that use the kselftest framework and nolibc cause a compiler
error due to missing function. This provides an urgent fix for the
problem which is preventing arm64 testing on linux-next.
Example:
clang --target=aarch64-linux-gnu -fintegrated-as
-Werror=unknown-warning-option -Werror=ignored-optimization-argument
-Werror=option-ignored -Werror=unused-command-line-argument
--target=aarch64-linux-gnu -fintegrated-as
-fno-asynchronous-unwind-tables -fno-ident -s -Os -nostdlib \
-include ../../../../include/nolibc/nolibc.h -I../..\
-static -ffreestanding -Wall za-fork.c
build/kselftest/arm64/fp/za-fork-asm.o
-o build/kselftest/arm64/fp/za-fork
In file included from <built-in>:1:
In file included from ./../../../../include/nolibc/nolibc.h:97:
In file included from ./../../../../include/nolibc/arch.h:25:
./../../../../include/nolibc/arch-aarch64.h:178:35: warning: unknown
attribute 'optimize' ignored [-Wunknown-attributes]
void __attribute__((weak,noreturn,optimize("omit-frame-pointer")))
__no_stack_protector _start(void)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from za-fork.c:12:
../../kselftest.h:123:2: error: call to undeclared function 'setvbuf';
ISO C99 and later do not support implicit function declarations
[-Wimplicit-function-declaration]
setvbuf(stdout, NULL, _IOLBF, 0);
^
../../kselftest.h:123:24: error: use of undeclared identifier '_IOLBF'
setvbuf(stdout, NULL, _IOLBF, 0);
^
1 warning and 2 errors generated.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Link: https://lore.kernel.org/linux-kselftest/CA+G9fYus3Z8r2cg3zLv8uH8MRrzLFVWdnor02SNr=rCz+_WGVg@mail.gmail.com/
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the head comment of nolibc-test.c shows, it can be built in 3 ways:
$(CC) -nostdlib -include /path/to/nolibc.h => NOLIBC already defined
$(CC) -nostdlib -I/path/to/nolibc/sysroot => _NOLIBC_* guards are present
$(CC) with default libc => NOLIBC* never defined
Only last two of them are tested currently, let's allow test the first one too.
This may help to find issues about using nolibc.h to build programs. it
derives from this change:
commit 3a8039e289 ("tools/nolibc: Fix build of stdio.h due to header ordering")
Usage:
// test with sysroot by default
$ make run-user
// test without sysroot, using nolibc.h directly
$ make run-user NOLIBC_SYSROOT=0
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
It is able to run nolibc-test directly without qemu-user when the target
machine is the same as the host machine.
Sometimes, the result running locally may help a lot when the qemu-user
package is too old.
When the target machine differs from the host machine, it is also able
to run nolibc-test directly with qemu-user-static + binfmt_misc.
Link: https://lore.kernel.org/lkml/ZKutZwIOfy5MqedG@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The startup code is critical to get the right argc, argv, envp/environ
and _auxv, let's add a startup test group and the corresponding
testcases.
The "environ" test case is also moved from the stdlib test group to this
new startup test group and it is renamed to "environ_envp".
Since argv0 has been used by many other test cases, let's add testcases
to gurantee it too.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
4 new pointer compare macros are added, they are similar to the integer
compare macros.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Also clean up the instructions in delayed slots.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
move most of the _start operations to _start_c(), include the
stackprotector initialization.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As suggested by Thomas, It is able to move the stackprotector
initialization from the assembly _start to the beginning of the new
_start_c(). Let's call __stack_chk_init() in _start_c() as a
preparation.
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/a00284a6-54b1-498c-92aa-44997fa78403@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Let's define an empty __stack_chk_init for the !_NOLIBC_STACKPROTECTOR
branch.
This allows to remove #ifdef around every call of __stack_chk_init().
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the environ and _auxv support added for nolibc, the assembly _start
function becomes more and more complex and therefore makes the porting
of nolibc to new architectures harder and harder.
To simplify portability, this C version of _start_c() is added to do
most of the assembly start operations in C, which reduces the complexity
a lot and will eventually simplify the porting of nolibc to the new
architectures.
The new _start_c() only requires a stack pointer argument, it will find
argc, argv, envp/environ and _auxv for us, and then call main(),
finally, it exit() with main's return status. With this new _start_c(),
the future new architectures only require to add very few assembly
instructions.
As suggested by Thomas, users may use a different signature of main
(e.g. void main(void)), a _nolibc_main alias is added for main to
silence the warning about potential conflicting types.
As suggested by Willy, the code is carefully polished for both smaller
size and better readability with local variables and the right types.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230715095729.GC24086@1wt.eu/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@t-8ch.de/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The statx manpage [1] shows that it has been supported from Linux 4.11
and glibc 2.28, the Linux support can be checked for all of the
architectures with this command:
$ git grep -r statx v4.11 arch/ include/uapi/asm-generic/unistd.h \
| grep -E "aarch64|arm|mips|s390|x86|:include/uapi"
Besides riscv and loongarch, all of the nolibc supported architectures
have added sys_statx from Linux v4.11. riscv is mainlined to v4.15,
loongarch is mainlined to v5.19, both of them use the generic unistd.h,
so, they have added sys_statx from their first mainline versions.
The current oldest stable branch is v4.14, only reserving sys_statx
still preserves compatibility with all of the supported stable branches,
So, let's remove the old arch related and dependent sys_stat support
completely.
This is friendly to the future new architecture porting.
[1]: https://man7.org/linux/man-pages/man2/statx.2.html
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As gcc doc [1] shows:
Most optimizations are completely disabled at -O0 or if an -O level is
not set on the command line, even if individual optimization flags are
specified.
Test result [2] shows, gcc>=11.1.0 deviates from the above description,
but before gcc 11.1.0, "-O0" still forcely uses frame pointer in the
_start function even if the individual optimize("omit-frame-pointer")
flag is specified.
The frame pointer related operations will change the stack pointer (e.g.
In x86_64, an extra "push %rbp" will be inserted at the beginning of
_start) and make it differs from the one we expected, as a result, break
the whole startup function.
To fix up this issue, as suggested by Thomas, the individual "Os" and
"omit-frame-pointer" optimize flags are used together on _start function
to disable frame pointer completely even if the -O0 is set on the
command line.
[1]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2]: https://lore.kernel.org/lkml/20230714094723.140603-1-falcon@tinylab.org/
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/34b21ba5-7b59-4b3b-9ed6-ef9a3a5e06f7@t-8ch.de/
Fixes: 7f85485896 ("tools/nolibc: make compiler and assembler agree on the section around _start")
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Fix up such errors reported by scripts/checkpatch.pl:
ERROR: space required after that ',' (ctx:VxV)
#148: FILE: tools/include/nolibc/arch-aarch64.h:148:
+void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
^
ERROR: space required after that ',' (ctx:VxV)
#148: FILE: tools/include/nolibc/arch-aarch64.h:148:
+void __attribute__((weak,noreturn,optimize("omit-frame-pointer"))) __no_stack_protector _start(void)
^
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
As the test numbers are based on line numbers gaps without testcases are
to be avoided.
Instead use the already existing test condition logic to implement
conditional execution.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
pad_spc() is only ever used to print the status message of testcases.
The line size is always constant, the return value is never used and the
format string is never used as such.
Remove all the unneeded logic and simplify the API and its users.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
If "cond" is a multi-token statement the behavior of the preprocessor
will lead to the negation "!" to be only applied to the first token.
Although currently no test uses such multi-token conditions but it can
happen at any time.
Put braces around "cond" to ensure the negation works as expected.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
In commit 52e423f5b9 ("tools/nolibc: export environ as a weak symbol on i386")
and friends the asm startup logic was extended to directly populate the
"environ" array.
This makes it impossible for "environ" to be dropped by the linker.
Therefore also drop the other logic to handle non-present "environ".
Also add a testcase to validate the initialization of environ.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
a newline is inserted just before the test failures to avoid mixing the
test failures with the raw test log.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
two newlines are added around the test summary line to extrude the test
status.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
align the test values for different runs and different architectures.
Since the total number of tests is not bigger than 1000 currently, let's
align them with "%3d".
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
[wt: s/%03d/%3d/ as discussed with Zhangjin]
Link: https://lore.kernel.org/lkml/20230709185112.97236-1-falcon@tinylab.org/
Signed-off-by: Willy Tarreau <w@1wt.eu>
Let's count and print the total number of tests, now, the data of
passed, skipped and failed have the same format.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
one of the test status: success, warning and failure is printed to
summarize the passed, skipped and failed values.
- "success" means no skipped and no failed.
- "warning" means has at least one skipped and no failed.
- "failure" means all tests are failed.
Suggested-by: Willy Tarreau <w@1wt.eu>
Link: https://lore.kernel.org/lkml/20230702164358.GB16233@1wt.eu/
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
argv0 is readable and chmodable, let's use it for chmod test, but a safe
umask should be used, the readable and executable modes should be
reserved.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Since argv0 also works for CONFIG_PROC_FS=n, let's use it instead of
'/proc/self/exe'.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
'/proc/self/' is a good path which doesn't have stale time info but it
is only available for CONFIG_PROC_FS=y.
When CONFIG_PROC_FS=n, use argv0 instead of '/proc/self', use '/' for the
worst case.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The PWD environment variable has the path of the nolibc-test program,
the current path must be the same as it, otherwise, the test cases will
fail with relative path (e.g. ./nolibc-test).
Since only chdir_root really changes the current path, let's restore it
with the PWD environment variable.
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
The vfprintf test case require to open a temporary file to write, the
old memfd_create() method is perfect but has strong dependency on
MEMFD_CREATE and also TMPFS or HUGETLBFS (see fs/Kconfig):
config MEMFD_CREATE
def_bool TMPFS || HUGETLBFS
And from v6.2, MFD_NOEXEC_SEAL must be passed for the non-executable
memfd, otherwise, The kernel warning will be output to the test result
like this:
Running test 'vfprintf'
0 emptymemfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'init'
"" = "" [OK]
To avoid such warning and also to remove the MEMFD_CREATE dependency,
let's open a file from tmpfs directly.
The /tmp directory is used to detect the existing of tmpfs, if not
there, skip instead of fail.
And further, for pid == 1, the initramfs is loaded as ramfs, which can
be used as tmpfs, so, it is able to further remove TMPFS dependency too.
Suggested-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/9ad51430-b7c0-47dc-80af-20c86539498d@t-8ch.de
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
create a /tmp directory. If it succeeds, the directory is writable,
which is normally the case when booted from an initramfs anyway.
This will be used instead of procfs for some tests.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Link: https://lore.kernel.org/lkml/20230710050600.9697-1-falcon@tinylab.org/
[wt: removed the unneeded mount() call]
Signed-off-by: Willy Tarreau <w@1wt.eu>
For CONFIG_PROC_FS=n, the /proc is not mountable, but the /proc
directory has been created in the prepare() stage whenever /proc is
there or not.
so, the checking of /proc in the run_syscall() stage will be always true
and at last it will fail all of the procfs dependent test cases, which
deviates from the 'cond' check design of the EXPECT_xx macros, without
procfs, these test cases should be skipped instead of failed.
To solve this issue, one method is checking /proc/self instead of /proc,
another method is removing the /proc directory completely for
CONFIG_PROC_FS=n, we apply the second method to avoid misleading the
users.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
A new rmdir_blah test case is added to remove a non-existing /blah,
which expects failure with ENOENT errno.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
a reverse operation of mkdir() is meaningful, add rmdir() here.
required by nolibc-test to remove /proc while CONFIG_PROC_FS is not
enabled.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
For CONFIG_NET=n, there would be no /proc/self/net, so, use
/proc/self/cmdline instead.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
kernel parameters allow pass two types of strings, one type is like
'noapic', another type is like 'panic=5', the first type is passed as
arguments of the init program, the second type is passed as environment
variables of the init program.
when users pass kernel parameters like this:
noapic NOLIBC_TEST=syscall
our nolibc-test program will use the test setting from argv[1] and
ignore the one from NOLIBC_TEST environment variable, and at last, it
will print the following line and ignore the whole test setting.
Ignoring unknown test name 'noapic'
reversing the parsing order does solve the above issue:
test = getenv("NOLIBC_TEST");
if (test)
test = argv[1];
but it still doesn't work with such kernel parameters (without
NOLIBC_TEST environment variable):
noapic FOO=bar
To support all of the potential kernel parameters, let's verify the test
setting from both of argv[1] and NOLIBC_TEST environment variable.
Reviewed-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>