Commit Graph

27242 Commits

Author SHA1 Message Date
Masami Hiramatsu bdac5c2b24 bootconfig: Allocate xbc_data inside xbc_init()
Allocate 'xbc_data' in the xbc_init() so that it does
not need to care about the ownership of the copied
data.

Link: https://lkml.kernel.org/r/163177339986.682366.898762699429769117.stgit@devnote2

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-10 20:43:42 -04:00
Linus Torvalds 75cd9b0152 - Remove an extra section.len member in favour of section.sh_size
- Align .altinstructions section creation with the kernel's by creating
 them with entry size of 0
 
 - Fix objtool to convert a reloc symbol to a section offset and not to
 not warn about not knowing how
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFirC4ACgkQEsHwGGHe
 VUqkNg/8Dc8CUTz950A0aGmqrTvy4JWlbh+vHMF3o45oEA7JI34awSAGR2I6USTf
 Z9EQCjaSf/i6IddZKJBqO8ajBXcOfZi6w6lsAtN2mKK4zgx5JLEP7PKTBjaCBWZC
 8+nTKhCd1ZBsh5ndjntO7ippMZYpiMzneMtHn84qtCarMxX6HEkm1zjB7f6y1S5v
 C3I1V0EdxNfyJQIFiOoGVpMZTmPLp9akvJAe5tUElAsMJ+fQgvlc2jgCRX3JeXUr
 buaZRgAONbIhnxC9xgCt7s7nn0ICgp5r6P5aA8QAzfDBPbeL/g2ZggozTufYECxM
 KE2FVPibwejKpUEtUyBNjA+yz46Ce89xryoTKN8uwTKcDQIheo0mE03eWtbObZW1
 vDPFyGAEl6a3g7WOotL5tDYUEHtGPKSmXrz8UL8F+E4nEKtlNySZspkrOx+aJSg7
 fHbbc4L1GllDaC8spF8hH1EDYva2QZyc1xt9rFdEAOBluwX0unVo250+Ue5D6DE8
 aOQtv/RJgkPxlk8UEyPzP1HHQ/VXQ07F8jRRgkuQbEsvY3mxAvt+BcG+ZtLF0FKk
 e7jO3I3wCHclS4EnP/4kY/Diy9A9htqr30vJayiy2+G80hbFYoE7nuiLU75rukWp
 fQcsHGv8UjkOO9qkNecDn0znc2sqhgxF+3RyaEfNaeEsVmsUASE=
 =0JCB
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Borislav Petkov:

 - Remove an extra section.len member in favour of section.sh_size

 - Align .altinstructions section creation with the kernel's by creating
   them with entry size of 0

 - Fix objtool to convert a reloc symbol to a section offset and not to
   not warn about not knowing how

* tag 'objtool_urgent_for_v5.15_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Remove redundant 'len' field from struct section
  objtool: Make .altinstructions section entry size consistent
  objtool: Remove reloc symbol type checks in get_alt_entry()
2021-10-10 10:05:39 -07:00
Ilya Leoshkevich 5319255b8d selftests/bpf: Skip verifier tests that fail to load with ENOTSUPP
The verifier tests added in commit c48e51c8b0 ("bpf: selftests: Add
selftests for module kfunc support") fail on s390, since the JIT does
not support calling kernel functions. This is most likely an issue for
all the other non-Intel arches, as well as on Intel with
!CONFIG_DEBUG_INFO_BTF or !CONFIG_BPF_JIT.

Trying to check for messages from all the possible add_kfunc_call()
failure cases in test_verifier looks pointless, so do a much simpler
thing instead: just like it's already done in do_prog_test_run(), skip
the tests that fail to load with ENOTSUPP.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211007173329.381754-1-iii@linux.ibm.com
2021-10-08 20:07:05 -07:00
Tianjia Zhang e506342a03 selftests/tls: add SM4 GCM/CCM to tls selftests
Add new cipher as a variant of standard tls selftests.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211008091745.42917-1-tianjia.zhang@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-08 16:58:46 -07:00
Yucong Sun d3f7b1664d selfetest/bpf: Make some tests serial
Change tests that often fails in parallel execution mode to serial.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-15-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun 5db02dd7f0 selftests/bpf: Fix pid check in fexit_sleep test
bpf_get_current_pid_tgid() returns u64, whose upper 32 bits are the same
as userspace getpid() return value.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-13-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun 0f4feacc91 selftests/bpf: Adding pid filtering for atomics test
This make atomics test able to run in parallel with other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-11-fallentree@fb.com
2021-10-08 15:17:00 -07:00
Yucong Sun 445e72c782 selftests/bpf: Make cgroup_v1v2 use its own port
This patch change cgroup_v1v2 use a different port, avoid conflict with
other tests.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-8-fallentree@fb.com
2021-10-08 15:10:43 -07:00
Yucong Sun d719de0d2f selftests/bpf: Fix race condition in enable_stats
In parallel execution mode, this test now need to use atomic operation
to avoid race condition.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-7-fallentree@fb.com
2021-10-08 15:10:43 -07:00
Yucong Sun e87c3434f8 selftests/bpf: Add per worker cgroup suffix
This patch make each worker use a unique cgroup base directory, thus
allowing tests that uses cgroups to run concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-5-fallentree@fb.com
2021-10-08 14:45:18 -07:00
Yucong Sun 6587ff58ce selftests/bpf: Allow some tests to be executed in sequence
This patch allows tests to define serial_test_name() instead of
test_name(), and this will make test_progs execute those in sequence
after all other tests finished executing concurrently.

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-3-fallentree@fb.com
2021-10-08 14:38:02 -07:00
Yucong Sun 91b2c0afd0 selftests/bpf: Add parallelism to test_progs
This patch adds "-j" mode to test_progs, executing tests in multiple
process.  "-j" mode is optional, and works with all existing test
selection mechanism, as well as "-v", "-l" etc.

In "-j" mode, main process use UDS/SEQPACKET to communicate to each forked
worker, commanding it to run tests and collect logs. After all tests are
finished, a summary is printed. main process use multiple competing
threads to dispatch work to worker, trying to keep them all busy.

The test status will be printed as soon as it is finished, if there are
error logs, it will be printed after the final summary line.

By specifying "--debug", additional debug information on server/worker
communication will be printed.

Example output:
  > ./test_progs -n 15-20 -j
  [   12.801730] bpf_testmod: loading out-of-tree module taints kernel.
  Launching 8 workers.
  #20 btf_split:OK
  #16 btf_endian:OK
  #18 btf_module:OK
  #17 btf_map_in_map:OK
  #19 btf_skc_cls_ingress:OK
  #15 btf_dump:OK
  Summary: 6/20 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006185619.364369-2-fallentree@fb.com
2021-10-08 14:37:56 -07:00
Hou Tao fa7f17d066 bpf/selftests: Add test for writable bare tracepoint
Add a writable bare tracepoint in bpf_testmod module, and
trigger its calling when reading /sys/kernel/bpf_testmod
with a specific buffer length. The reading will return
the value in writable context if the early return flag
is enabled in writable context.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-4-hotforest@gmail.com
2021-10-08 13:22:57 -07:00
Hou Tao ccaf12d621 libbpf: Support detecting and attaching of writable tracepoint program
Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com
2021-10-08 13:22:57 -07:00
Quentin Monnet d7db0a4e8d bpftool: Add install-bin target to install binary only
With "make install", bpftool installs its binary and its bash completion
file. Usually, this is what we want. But a few components in the kernel
repository (namely, BPF iterators and selftests) also install bpftool
locally before using it. In such a case, bash completion is not
necessary and is just a useless build artifact.

Let's add an "install-bin" target to bpftool, to offer a way to install
the binary only.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-13-quentin@isovalent.com
2021-10-08 12:02:40 -07:00
Quentin Monnet 87ee33bfdd selftests/bpf: Better clean up for runqslower in test_bpftool_build.sh
The script test_bpftool_build.sh attempts to build bpftool in the
various supported ways, to make sure nothing breaks.

One of those ways is to run "make tools/bpf" from the root of the kernel
repository. This command builds bpftool, along with the other tools
under tools/bpf, and runqslower in particular. After running the
command and upon a successful bpftool build, the script attempts to
cleanup the generated objects. However, after building with this target
and in the case of runqslower, the files are not cleaned up as expected.

This is because the "tools/bpf" target sets $(OUTPUT) to
.../tools/bpf/runqslower/ when building the tool, causing the object
files to be placed directly under the runqslower directory. But when
running "cd tools/bpf; make clean", the value for $(OUTPUT) is set to
".output" (relative to the runqslower directory) by runqslower's
Makefile, and this is where the Makefile looks for files to clean up.

We cannot easily fix in the root Makefile (where "tools/bpf" is defined)
or in tools/scripts/Makefile.include (setting $(OUTPUT)), where changing
the way the output variables are passed would likely have consequences
elsewhere. We could change runqslower's Makefile to build in the
repository instead of in a dedicated ".output/", but doing so just to
accommodate a test script doesn't sound great. Instead, let's just make
sure that we clean up runqslower properly by adding the correct command
to the script.

This will attempt to clean runqslower twice: the first try with command
"cd tools/bpf; make clean" will search for tools/bpf/runqslower/.output
and fail to clean it (but will still clean the other tools, in
particular bpftool), the second one (added in this commit) sets the
$(OUTPUT) variable like for building with the "tool/bpf" target and
should succeed.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-12-quentin@isovalent.com
2021-10-08 12:02:38 -07:00
Quentin Monnet be79505caf tools/runqslower: Install libbpf headers when building
API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that runqslower installs the
headers properly when building.

We use a libbpf_hdrs target to mark the logical dependency on libbpf's
headers export for a number of object files, even though the headers
should have been exported at this time (since bpftool needs them, and is
required to generate the skeleton or the vmlinux.h).

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
BPFOBJ_OUTPUT and BPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests. We pass a number of
variables to the "make" invocation, because we want to point runqslower
to the (target) libbpf shared with other tools, instead of building its
own version. In addition, runqslower relies on (target) bpftool, and we
also want to pass the proper variables to its Makefile so that bpftool
itself reuses the same libbpf.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-6-quentin@isovalent.com
2021-10-08 11:54:15 -07:00
Quentin Monnet 1478994aad tools/resolve_btfids: Install libbpf headers when building
API headers from libbpf should not be accessed directly from the
library's source directory. Instead, they should be exported with "make
install_headers". Let's make sure that resolve_btfids installs the
headers properly when building.

When descending from a parent Makefile, the specific output directories
for building the library and exporting the headers are configurable with
LIBBPF_OUT and LIBBPF_DESTDIR, respectively. This is in addition to
OUTPUT, on top of which those variables are constructed by default.

Also adjust the Makefile for the BPF selftests in order to point to the
(target) libbpf shared with other tools, instead of building a version
specific to resolve_btfids. Remove libbpf's order-only dependencies on
the include directories (they are created by libbpf and don't need to
exist beforehand).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-5-quentin@isovalent.com
2021-10-08 11:54:11 -07:00
Quentin Monnet f012ade10b bpftool: Install libbpf headers instead of including the dir
Bpftool relies on libbpf, therefore it relies on a number of headers
from the library and must be linked against the library. The Makefile
for bpftool exposes these objects by adding tools/lib as an include
directory ("-I$(srctree)/tools/lib"). This is a working solution, but
this is not the cleanest one. The risk is to involuntarily include
objects that are not intended to be exposed by the libbpf.

The headers needed to compile bpftool should in fact be "installed" from
libbpf, with its "install_headers" Makefile target. In addition, there
is one header which is internal to the library and not supposed to be
used by external applications, but that bpftool uses anyway.

Adjust the Makefile in order to install the header files properly before
compiling bpftool. Also copy the additional internal header file
(nlattr.h), but call it out explicitly. Build (and install headers) in a
subdirectory under bpftool/ instead of tools/lib/bpf/. When descending
from a parent Makefile, this is configurable by setting the OUTPUT,
LIBBPF_OUTPUT and LIBBPF_DESTDIR variables.

Also adjust the Makefile for BPF selftests, so as to reuse the (host)
libbpf compiled earlier and to avoid compiling a separate version of the
library just for bpftool.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-4-quentin@isovalent.com
2021-10-08 11:48:43 -07:00
Quentin Monnet c66a248f19 bpftool: Remove unused includes to <bpf/bpf_gen_internal.h>
It seems that the header file was never necessary to compile bpftool,
and it is not part of the headers exported from libbpf. Let's remove the
includes from prog.c and gen.c.

Fixes: d510296d33 ("bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-3-quentin@isovalent.com
2021-10-08 11:47:40 -07:00
Quentin Monnet b79c2ce3ba libbpf: Skip re-installing headers file if source is older than target
The "install_headers" target in libbpf's Makefile would unconditionally
export all API headers to the target directory. When those headers are
installed to compile another application, this means that make always
finds newer dependencies for the source files relying on those headers,
and deduces that the targets should be rebuilt.

Avoid that by making "install_headers" depend on the source header
files, and (re-)install them only when necessary.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007194438.34443-2-quentin@isovalent.com
2021-10-08 11:47:34 -07:00
Yucong Sun 7e3cbd3405 selftests/bpf: Fix btf_dump test under new clang
New clang version changed ([0]) type name in dwarf from "long int" to "long",
this is causing btf_dump tests to fail.

  [0] f6a561c4d6

Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211008173139.1457407-1-fallentree@fb.com
2021-10-08 11:08:11 -07:00
Amit Cohen 7f63cdde50 selftests: mlxsw: devlink_trap_tunnel_ipip: Send a full-length key
As part of adding same test for GRE tunnel with IPv6 underlay, missing
bytes for key were found.

mausezahn does not fill zeros between two colons, so send them
explicitly. For example, use "00:00:00:E9:" instead of ":E9:"

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen 8bb0ebd522 selftests: mlxsw: devlink_trap_tunnel_ipip: Remove code duplication
As part of adding same test for GRE tunnel with IPv6 underlay, an
optional improvement was found - call ipip_payload_get from
ecn_payload_get, so do not duplicate the code which creates the payload.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen c473f723f9 selftests: mlxsw: devlink_trap_tunnel_ipip: Align topology drawing correctly
As part of adding same test for GRE tunnel with IPv6 underlay, wrong
alignments were found, fix them.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:59 +01:00
Amit Cohen 4bb6cce00a selftests: mlxsw: devlink_trap_tunnel_ipip6: Add test case for IPv6 decap_error
IPv6 underlay support was added, add test to check that "decap_error" trap
is triggered under the right conditions and that devlink counters increase.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen 4b3d967b5c selftests: forwarding: Add IPv6 GRE hierarchical tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use hierarchical model - the tunnel is bound to a device
in a different VRF.

These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen 7df29960fa selftests: forwarding: Add IPv6 GRE flat tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use flat model - overlay and underlay share the same VRF.

These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen c08d227290 testing: selftests: tc_common: Add tc_check_at_least_x_packets()
Add function that checks that at least X packets hit the tc rule.
There are cases that it is not possible to catch only the interesting
packets, so then, it is possible to send many packets and verify that at
least this amount of packets hit the rule.

This function will be used in the next patch for general tc rule that
can be used to test both software and hardware.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:58 +01:00
Amit Cohen 45d45e5323 testing: selftests: forwarding.config.sample: Add tc flag
Add TC_FLAG value to tests topology.
This flag supposed to be skip_sw/skip_hw which means do not filter by
software/hardware.

This can be useful for adding tests to forwarding directory, and be able
to verify that packets go through the hardware.

When the flag is not set or set to 'skip_hw', tests can still be executed
with veth pairs.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 16:40:57 +01:00
Dave Marchevsky dd65acf72d selftests/bpf: Remove SEC("version") from test progs
Since commit 6c4fc209fc ("bpf: remove useless version check for prog
load") these "version" sections, which result in bpf_attr.kern_version
being set, have been unnecessary.

Remove them so that it's obvious to folks using selftests as a guide that
"modern" BPF progs don't need this section.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007231234.2223081-1-davemarchevsky@fb.com
2021-10-07 22:01:56 -07:00
Song Liu aa67fdb464 selftests/bpf: Skip the second half of get_branch_snapshot in vm
VMs running on upstream 5.12+ kernel support LBR. However,
bpf_get_branch_snapshot couldn't stop the LBR before too many entries
are flushed. Skip the hit/waste test for VMs before we find a proper fix
for LBR in VM.

Fixes: 025bd7c753 ("selftests/bpf: Add test for bpf_get_branch_snapshot")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211007050231.728496-1-songliubraving@fb.com
2021-10-07 21:51:04 -07:00
Jakub Kicinski 9fe1155233 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07 15:24:06 -07:00
Linus Torvalds 14df9235aa perf tools fixes for v5.15: 3rd batch
- Fix plugin static linking with libopencsd on ARM and ARM64.
 
 - Add missing -lstdc++ when linking with libopencsd.
 
 - Add missing topdown metrics events to 'perf test attr'.
 
 - Plug leak sys_event_tables list after processing JSON vendor events entries.
 
 - Sync sound/asound.h copy with the kernel sources.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYV8V0gAKCRCyPKLppCJ+
 JwEKAQCqrmoaUcX3hsrb1GqRNVgVpxztL+GmfK8dL9IEWZxSfgEA6ekSZzSx8bn/
 i7t3ccKYYuDKA0nqtB5hcaXFArqVTQA=
 =AIE3
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix plugin static linking with libopencsd on ARM and ARM64

 - Add missing -lstdc++ when linking with libopencsd

 - Add missing topdown metrics events to 'perf test attr'

 - Plug leak sys_event_tables list after processing JSON vendor events
   entries

 - Sync sound/asound.h copy with the kernel sources

* tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf tests attr: Add missing topdown metrics events
  tools include UAPI: Sync sound/asound.h copy with the kernel sources
  perf build: Fix plugin static linking with libopencsd on ARM and ARM64
  perf build: Add missing -lstdc++ when linking with libopencsd
  perf jevents: Free the sys_event_tables list after processing entries
2021-10-07 10:58:42 -07:00
Linus Torvalds 4a16df549d Networking fixes for 5.15-rc5, including fixes from xfrm, bpf,
netfilter, and wireless.
 
 Current release - regressions:
 
  - xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting
    a new value in the middle of an enum
 
  - unix: fix an issue in unix_shutdown causing the other end
    read/write failures
 
  - phy: mdio: fix memory leak
 
 Current release - new code bugs:
 
  - mlx5e: improve MQPRIO resiliency against bad configs
 
 Previous releases - regressions:
 
  - bpf: fix integer overflow leading to OOB access in map element
    pre-allocation
 
  - stmmac: dwmac-rk: fix ethernet on rk3399 based devices
 
  - netfilter: conntrack: fix boot failure with nf_conntrack.enable_hooks=1
 
  - brcmfmac: revert using ISO3166 country code and 0 rev as fallback
 
  - i40e: fix freeing of uninitialized misc IRQ vector
 
  - iavf: fix double unlock of crit_lock
 
 Previous releases - always broken:
 
  - bpf, arm: fix register clobbering in div/mod implementation
 
  - netfilter: nf_tables: correct issues in netlink rule change
    event notifications
 
  - dsa: tag_dsa: fix mask for trunked packets
 
  - usb: r8152: don't resubmit rx immediately to avoid soft lockup
    on device unplug
 
  - i40e: fix endless loop under rtnl if FW fails to correctly
    respond to capability query
 
  - mlx5e: fix rx checksum offload coexistence with ipsec offload
 
  - mlx5: force round second at 1PPS out start time and allow it
    only in supported clock modes
 
  - phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable
    sequence
 
 Misc:
 
  - xfrm: slightly rejig the new policy uAPI to make it less cryptic
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFfEpUACgkQMUZtbf5S
 Iru3vQ//fgm+pdDE860BXmLEgrbJTHU4rq/YD1vwZBcWw/i5wI1vnLr6BzZsdPNX
 DkhcKFGOZUTj+8ctuRDuqrkqoDjva6uRjwM0vcPWh5i9sGqJpKjxB3dFksyxJELR
 SnXM3Jmlk7YiGAw9Bi+66OuIwt2ouRLR/bNIwg8/qCnFI1efIF7IPeCpuvKCw/yd
 SOiSBIfuSPD1qcs5Sy4aHZqA8Xr9qbwDNwWQfFLXgNDKEiY7XOSdo3CoCddSxdR+
 2nmpOiz4w68wspapdZn3GSZHYrQh5kjz7b0Aru0Jvw86M79mKp3b9AfJ9uXTcJhp
 4cQBralLnQvLKanvKi1z5CI6NjXx+r6rXI43N6NjHOtjRUPoFMqxZEH0d7o11aT1
 sN3UDgtFtuE9Pfrhnc5ZHuHqFCCyxA6NWD6nt1dUoSEo0oWt9mOHfeoFT4+45fO0
 5no5+1q3EkYdH4jiJlavnM2DMvVzMd6FbxDzWpXJ2j4W1vM6TLkexEJIK4MLGxPV
 76lxeXzcvbM9a0vq5BabR4QbPIAv+A9qYPmXJwPVrvjo+zynwuWc3gMO5hc4EaOf
 FXF2Ka5Jn97jW8/JS7i7Gj6M8GKdyIxaFHgS4MLtJNs6pt3h7m6bSgcIOQZ5psBZ
 dKRYjM2lxeVkvDDmy5Gztkw2asbofYQP004tgagc+jwXP7DwXaY=
 =xzZg
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from xfrm, bpf, netfilter, and wireless.

  Current release - regressions:

   - xfrm: fix XFRM_MSG_MAPPING ABI breakage caused by inserting a new
     value in the middle of an enum

   - unix: fix an issue in unix_shutdown causing the other end
     read/write failures

   - phy: mdio: fix memory leak

  Current release - new code bugs:

   - mlx5e: improve MQPRIO resiliency against bad configs

  Previous releases - regressions:

   - bpf: fix integer overflow leading to OOB access in map element
     pre-allocation

   - stmmac: dwmac-rk: fix ethernet on rk3399 based devices

   - netfilter: conntrack: fix boot failure with
     nf_conntrack.enable_hooks=1

   - brcmfmac: revert using ISO3166 country code and 0 rev as fallback

   - i40e: fix freeing of uninitialized misc IRQ vector

   - iavf: fix double unlock of crit_lock

  Previous releases - always broken:

   - bpf, arm: fix register clobbering in div/mod implementation

   - netfilter: nf_tables: correct issues in netlink rule change event
     notifications

   - dsa: tag_dsa: fix mask for trunked packets

   - usb: r8152: don't resubmit rx immediately to avoid soft lockup on
     device unplug

   - i40e: fix endless loop under rtnl if FW fails to correctly respond
     to capability query

   - mlx5e: fix rx checksum offload coexistence with ipsec offload

   - mlx5: force round second at 1PPS out start time and allow it only
     in supported clock modes

   - phy: pcs: xpcs: fix incorrect CL37 AN sequence, EEE disable
     sequence

  Misc:

   - xfrm: slightly rejig the new policy uAPI to make it less cryptic"

* tag 'net-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
  net: prefer socket bound to interface when not in VRF
  iavf: fix double unlock of crit_lock
  i40e: Fix freeing of uninitialized misc IRQ vector
  i40e: fix endless loop under rtnl
  dt-bindings: net: dsa: marvell: fix compatible in example
  ionic: move filter sync_needed bit set
  gve: report 64bit tx_bytes counter from gve_handle_report_stats()
  gve: fix gve_get_stats()
  rtnetlink: fix if_nlmsg_stats_size() under estimation
  gve: Properly handle errors in gve_assign_qpl
  gve: Avoid freeing NULL pointer
  gve: Correct available tx qpl check
  unix: Fix an issue in unix_shutdown causing the other end read/write failures
  net: stmmac: trigger PCS EEE to turn off on link down
  net: pcs: xpcs: fix incorrect steps on disable EEE
  netlink: annotate data races around nlk->bound
  net: pcs: xpcs: fix incorrect CL37 AN sequence
  net: sfp: Fix typo in state machine debug string
  net/sched: sch_taprio: properly cancel timer from taprio_destroy()
  net: bridge: fix under estimation in br_get_linkxstats_size()
  ...
2021-10-07 09:50:31 -07:00
Jakub Kicinski 7671b026bb Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-10-07

We've added 7 non-merge commits during the last 8 day(s) which contain
a total of 8 files changed, 38 insertions(+), 21 deletions(-).

The main changes are:

1) Fix ARM BPF JIT to preserve caller-saved regs for DIV/MOD JIT-internal
   helper call, from Johan Almbladh.

2) Fix integer overflow in BPF stack map element size calculation when
   used with preallocation, from Tatsuhiko Yasumatsu.

3) Fix an AF_UNIX regression due to added BPF sockmap support related
   to shutdown handling, from Jiang Wang.

4) Fix a segfault in libbpf when generating light skeletons from objects
   without BTF, from Kumar Kartikeya Dwivedi.

5) Fix a libbpf memory leak in strset to free the actual struct strset
   itself, from Andrii Nakryiko.

6) Dual-license bpf_insn.h similarly as we did for libbpf and bpftool,
   with ACKs from all contributors, from Luca Boccassi.
====================

Link: https://lore.kernel.org/r/20211007135010.21143-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-07 07:11:33 -07:00
André Almeida 9d57f7c797 selftests: futex: Test sys_futex_waitv() wouldblock
Test if futex_waitv() returns -EWOULDBLOCK correctly when the expected
value is different from the actual value for a waiter.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-22-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
André Almeida 02e56ccbae selftests: futex: Test sys_futex_waitv() timeout
Test if the futex_waitv timeout is working as expected, using the
supported clockid options.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-21-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
André Almeida 5e59c1d1c7 selftests: futex: Add sys_futex_waitv() test
Create a new file to test the waitv mechanism. Test both private and
shared futexes. Wake the last futex in the array, and check if the
return value from futex_waitv() is the right index.

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210923171111.300673-20-andrealmeid@collabora.com
2021-10-07 13:51:12 +02:00
Mark Brown 0ba1ce1e86 selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance
Add a test that covers enabling and disabling of SVE vector length
inheritance via the ptrace interface.

Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20211005123537.976795-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-10-07 09:20:52 +01:00
Michael Forney 86e1e054e0 objtool: Update section header before relocations
The libelf implementation from elftoolchain has a safety check in
gelf_update_rel[a] to check that the data corresponds to a section
that has type SHT_REL[A] [0]. If the relocation is updated before
the section header is updated with the proper type, this check
fails.

To fix this, update the section header first, before the relocations.
Previously, the section size was calculated in elf_rebuild_reloc_section
by counting the number of entries in the reloc_list. However, we
now need the size during elf_write so instead keep a running total
and add to it for every new relocation.

[0] https://sourceforge.net/p/elftoolchain/mailman/elftoolchain-developers/thread/CAGw6cBtkZro-8wZMD2ULkwJ39J+tHtTtAWXufMjnd3cQ7XG54g@mail.gmail.com/

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-2-mforney@mforney.org
2021-10-06 20:11:57 -07:00
Michael Forney b46179d6bb objtool: Check for gelf_update_rel[a] failures
Otherwise, if these fail we end up with garbage data in the
.rela.orc_unwind_ip section, leading to errors like

  ld: fs/squashfs/namei.o: bad reloc symbol index (0x7f16 >= 0x12) for offset 0x7f16d5c82cc8 in section `.orc_unwind_ip'

Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210509000103.11008-1-mforney@mforney.org
2021-10-06 20:11:53 -07:00
Peter Zijlstra b08cadbd3b Merge branch 'objtool/urgent'
Fixup conflicts.

# Conflicts:
#	tools/objtool/check.c
2021-10-07 00:40:17 +02:00
Hengqi Chen 6f2b219b62 selftests/bpf: Switch to new bpf_object__next_{map,program} APIs
Replace deprecated bpf_{map,program}__next APIs with newly added
bpf_object__next_{map,program} APIs, so that no compilation warnings
emit.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-3-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Hengqi Chen 2088a3a71d libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7
Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with
a new set of APIs named bpf_object__{prev,next}_{program,map} which
follow the libbpf API naming convention ([0]). No functionality changes.

  [0] Closes: https://github.com/libbpf/libbpf/issues/296

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Hengqi Chen 4a404a7e8a libbpf: Deprecate bpf_object__unload() API since v0.6
BPF objects are not reloadable after unload. Users are expected to use
bpf_object__close() to unload and free up resources in one operation.
No need to expose bpf_object__unload() as a public API, deprecate it
([0]).  Add bpf_object__unload() as an alias to internal
bpf_object_unload() and replace all bpf_object__unload() uses to avoid
compilation errors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/290

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com
2021-10-06 12:34:02 -07:00
Jiri Olsa 189c83bdde selftest/bpf: Switch recursion test to use htab_map_delete_elem
Currently the recursion test is hooking __htab_map_lookup_elem
function, which is invoked both from bpf_prog and bpf syscall.

But in our kernel build, the __htab_map_lookup_elem gets inlined
within the htab_map_lookup_elem, so it's not trigered and the
test fails.

Fixing this by using htab_map_delete_elem, which is not inlined
for bpf_prog calls (like htab_map_lookup_elem is) and is used
directly as pointer for map_delete_elem, so it won't disappear
by inlining.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/YVnfFTL/3T6jOwHI@krava
2021-10-06 12:34:02 -07:00
Quentin Monnet 929bef4677 bpf: Use $(pound) instead of \# in Makefiles
Recent-ish versions of make do no longer consider number signs ("#") as
comment symbols when they are inserted inside of a macro reference or in
a function invocation. In such cases, the symbols should not be escaped.

There are a few occurrences of "\#" in libbpf's and samples' Makefiles.
In the former, the backslash is harmless, because grep associates no
particular meaning to the escaped symbol and reads it as a regular "#".
In samples' Makefile, recent versions of make will pass the backslash
down to the compiler, making the probe fail all the time and resulting
in the display of a warning about "make headers_install" being required,
even after headers have been installed.

A similar issue has been addressed at some other locations by commit
9564a8cf42 ("Kbuild: fix # escaping in .cmd files for future Make").
Let's address it for libbpf's and samples' Makefiles in the same
fashion, by using a "$(pound)" variable (pulled from
tools/scripts/Makefile.include for libbpf, or re-defined for the
samples).

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Fixes: 2f38304127 ("libbpf: Make libbpf_version.h non-auto-generated")
Fixes: 07c3bbdb1a ("samples: bpf: print a warning about headers_install")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211006111049.20708-1-quentin@isovalent.com
2021-10-06 12:34:02 -07:00
Andrii Nakryiko 9d05787223 selftests/bpf: Test new btf__add_btf() API
Add a test that validates that btf__add_btf() API is correctly copying
all the types from the source BTF into destination BTF object and
adjusts type IDs and string offsets properly.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211006051107.17921-4-andrii@kernel.org
2021-10-06 15:36:30 +02:00
Andrii Nakryiko c65eb8082d selftests/bpf: Refactor btf_write selftest to reuse BTF generation logic
Next patch will need to reuse BTF generation logic, which tests every
supported BTF kind, for testing btf__add_btf() APIs. So restructure
existing selftests and make it as a single subtest that uses bulk
VALIDATE_RAW_BTF() macro for raw BTF dump checking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211006051107.17921-3-andrii@kernel.org
2021-10-06 15:36:29 +02:00
Andrii Nakryiko 7ca6112159 libbpf: Add API that copies all BTF types from one BTF object to another
Add a bulk copying api, btf__add_btf(), that speeds up and simplifies
appending entire contents of one BTF object to another one, taking care
of copying BTF type data, adjusting resulting BTF type IDs according to
their new locations in the destination BTF object, as well as copying
and deduplicating all the referenced strings and updating all the string
offsets in new BTF types as appropriate.

This API is intended to be used from tools that are generating and
otherwise manipulating BTFs generically, such as pahole. In pahole's
case, this API is useful for speeding up parallelized BTF encoding, as
it allows pahole to offload all the intricacies of BTF type copying to
libbpf and handle the parallelization aspects of the process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
2021-10-06 15:35:46 +02:00
Jie Meng 57a610f1c5 bpf, x64: Save bytes for DIV by reducing reg copies
Instead of unconditionally performing push/pop on %rax/%rdx in case of
division/modulo, we can save a few bytes in case of destination register
being either BPF r0 (%rax) or r3 (%rdx) since the result is written in
there anyway.

Also, we do not need to copy the source to %r11 unless the source is either
%rax, %rdx or an immediate.

For example, before the patch:

  22:   push   %rax
  23:   push   %rdx
  24:   mov    %rsi,%r11
  27:   xor    %edx,%edx
  29:   div    %r11
  2c:   mov    %rax,%r11
  2f:   pop    %rdx
  30:   pop    %rax
  31:   mov    %r11,%rax

After:

  22:   push   %rdx
  23:   xor    %edx,%edx
  25:   div    %rsi
  28:   pop    %rdx

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com
2021-10-06 15:24:36 +02:00
Borislav Petkov f96b467583 x86/insn: Use get_unaligned() instead of memcpy()
Use get_unaligned() instead of memcpy() to access potentially unaligned
memory, which, when accessed through a pointer, leads to undefined
behavior. get_unaligned() describes much better what is happening there
anyway even if memcpy() does the job.

In addition, since perf tool builds with -Werror, it would fire with:

  util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix':
  tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed]
     10 |  const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \

because -Werror=packed would complain if the packed attribute would have
no effect on the layout of the structure.

In this case, that is intentional so disable the warning only for that
compilation unit.

That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>

No functional changes.

Fixes: 5ba1071f75 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
2021-10-06 11:56:37 +02:00
Kumar Kartikeya Dwivedi c48e51c8b0 bpf: selftests: Add selftests for module kfunc support
This adds selftests that tests the success and failure path for modules
kfuncs (in presence of invalid kfunc calls) for both libbpf and
gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
add module BTF ID set from bpf_testmod.

This also introduces  a couple of test cases to verifier selftests for
validating whether we get an error or not depending on if invalid kfunc
call remains after elimination of unreachable instructions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi 18f4fccbf3 libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
relocations, with support for weak kfunc relocations. The general idea
is to move map_fds to loader map, and also use the data for storing
kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
kept together.

For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
chances of being <= INT16_MAX than treating data map as a sparse array
and adding fd as needed.

When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
array model, so that as long as it does remain <= INT16_MAX, we pass an
index relative to the start of fd_array.

We store all ksyms in an array where we try to avoid calling the
bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
already stored. This also speeds up the loading process compared to
emitting calls in all cases, in later tests.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi 466b2e1397 libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
Preserve these calls as it allows verifier to succeed in loading the
program if they are determined to be unreachable after dead code
elimination during program load. If not, the verifier will fail at
runtime. This is done for ext->is_weak symbols similar to the case for
variable ksyms.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi 9dbe601563 libbpf: Support kernel module function calls
This patch adds libbpf support for kernel module function call support.
The fd_array parameter is used during BPF program load to pass module
BTFs referenced by the program. insn->off is set to index into this
array, but starts from 1, because insn->off as 0 is reserved for
btf_vmlinux.

We try to use existing insn->off for a module, since the kernel limits
the maximum distinct module BTFs for kfuncs to 256, and also because
index must never exceed the maximum allowed value that can fit in
insn->off (INT16_MAX). In the future, if kernel interprets signed offset
as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.

Also introduce a btf__find_by_name_kind_own helper to start searching
from module BTF's start id when we know that the BTF ID is not present
in vmlinux BTF (in find_ksym_btf_id).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
2021-10-05 17:07:42 -07:00
Kumar Kartikeya Dwivedi f614f2c755 tools: Allow specifying base BTF file in resolve_btfids
This commit allows specifying the base BTF for resolving btf id
lists/sets during link time in the resolve_btfids tool. The base BTF is
set to NULL if no path is passed. This allows resolving BTF ids for
module kernel objects.

Also, drop the --no-fail option, as it is only used in case .BTF_ids
section is not present, instead make no-fail the default mode. The long
option name is same as that of pahole.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-5-memxor@gmail.com
2021-10-05 17:07:41 -07:00
Joe Lawrence fe255fe6ad objtool: Remove redundant 'len' field from struct section
The section structure already contains sh_size, so just remove the extra
'len' member that requires extra mirroring and potential confusion.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-3-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:21 -07:00
Joe Lawrence dc02368164 objtool: Make .altinstructions section entry size consistent
Commit e31694e0a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.

While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).

Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section.  An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Josh Poimboeuf 4d8b35968b objtool: Remove reloc symbol type checks in get_alt_entry()
Converting a special section's relocation reference to a symbol is
straightforward.  No need for objtool to complain that it doesn't know
how to handle it.  Just handle it.

This fixes the following warning:

  arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/feadbc3dfb3440d973580fad8d3db873cbfe1694.1633367242.git.jpoimboe@redhat.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: linux-kernel@vger.kernel.org
2021-10-05 12:03:20 -07:00
Kan Liang 0b6c5371c0 perf tests attr: Add missing topdown metrics events
The Topdown metrics events were added as 'perf stat' default events
since commit 42641d6f4d ("perf stat: Add Topdown metrics events as
default events").

However, the perf attr tests were not updated
accordingly.

The perf attr test fails on the platform which supports Topdown metrics.

  # perf test 17

    17: Setup struct perf_event_attr        :FAILED!

Add Topdown metrics events into perf attr test cases. Make them optional
since they are only available on newer platforms.

Fixes: 42641d6f4d ("perf stat: Add Topdown metrics events as default events")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1633031566-176517-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:55:38 -03:00
Arnaldo Carvalho de Melo 9fce636e5c tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:

  09d2317440 ("ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION")

Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:51:33 -03:00
Branislav Rankov 35c46bf545 perf build: Fix plugin static linking with libopencsd on ARM and ARM64
Filter out -static flag when building plugins as they are always built
as dynamic libraries and -static and -dynamic don't work well together
on arm and arm64.

Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Branislav Rankov 573cf5c9a1 perf build: Add missing -lstdc++ when linking with libopencsd
Add -lstdc++ to perf when linking libopencsd as it is a dependency. It
does not hurt to add it when dynamic linking.

Signed-off-by: Branislav Rankov <branislav.rankov@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Brown <Mark.Brown@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: coresight@lists.linaro.org
Cc: nd@arm.com
Link: https://lore.kernel.org/r/e88952b3-2470-da96-dee9-e247a1759cd0@arm.com
Signed-off-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Like Xu b94729919d perf jevents: Free the sys_event_tables list after processing entries
The compiler reports that free_sys_event_tables() is dead code.

But according to the semantics, the "LIST_HEAD(sys_event_tables)" should
also be released, just like we do with 'arch_std_events' in main().

Fixes: e9d32c1bf0 ("perf vendor events: Add support for arch standard events")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210928102938.69681-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-05 14:48:10 -03:00
Li Zhijian 1c36432b27 kselftests/sched: cleanup the child processes
Previously, 'make -C sched run_tests' will block forever when it occurs
something wrong where the *selftests framework* is waiting for its child
processes to exit.

[root@iaas-rpma sched]# ./cs_prctl_test

 ## Create a thread/process/process group hiearchy
Not a core sched system
tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff
Not a core sched system
    tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
    tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
        tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff

Not a core sched system
(268) FAILED: get_cs_cookie(0) == 0

 ## Set a cookie on entire process group
-1 = prctl(62, 1, 0, 2, 0)
core_sched create failed -- PGID: Invalid argument
(cs_prctl_test.c:272) -
[root@iaas-rpma sched]# ps
    PID TTY          TIME CMD
   4605 pts/2    00:00:00 bash
  74986 pts/2    00:00:00 cs_prctl_test
  74987 pts/2    00:00:00 cs_prctl_test
  74999 pts/2    00:00:00 ps

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.com
2021-10-05 15:51:43 +02:00
Linus Torvalds f6274b06e3 linux-kselftest-fixes-5.15-rc5
This Kselftest fixes update for Linux 5.15-rc5 consists of a fix
 to implicit declaration warns in drivers/dma-buf test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFbaykACgkQCwJExA0N
 QxwXRhAAwGntF/dFNcKspMKTepdmKLFHr5Mopd1+OMbVLgjQAtXZs+wCoR0tm6RK
 EZf5D7y8IpGedFnbjWJ1t18XCRJtBW7HladC6otTmR0+1SEFdHR1n4H/pqrmSagV
 5rYqwh1BgTb5Eb/9GPWKlZ0zagXGujR5UVdFPqCEMWpF6be9J989NdKGzpD4Qbkq
 eux3x5iaz5dsVf7iTGhzy+iSxKqIqSw5LCtSrE3PoTRbM5PS+K4I7ImafysnsYK9
 kZ8zkmEzixziwoWUEJLRqFveS7cY/5l7Nd2kD9YH1WE2xBYxYQZFdj4HVWdlQePd
 99UthTSfcyO6C2fplorlKkoNKuX1tlGCJyZbNjgQSHPYuVrLoau136yoSk5F+ZPg
 OUFsIs52d09e7vN70nh7UD9rRqihFRWwgI4EQt9nUP1mAmYm8y7bXKSFDF5hp3ay
 4VbvYl50QEkllqU0sDcEuFAodGLz72T2PZNWP5DQxg5q70Hni/CWqFm+EizztRCi
 mv749twroD61b8Jziu8Mqhp5BqlO1tY8uu9z4GuRC4UIMhdARY2J4mD+8wm03JRO
 4S28o+mH4StrrhyjngrCM9k8YYwa/n+XDrswiNJIpd/PmxVXD1KUP6z80sQE6Tcb
 o34yFVeABzdovryC/8VwDC9n3RDYIN6SM5XqOxvX5jckdtfwCbI=
 =t85S
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "A fix to implicit declaration warns in drivers/dma-buf test"

* tag 'linux-kselftest-fixes-5.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: drivers/dma-buf: Fix implicit declaration warns
2021-10-04 14:33:30 -07:00
Justin Iurman bf77b1400a selftests: net: Test for the IOAM encapsulation with IPv6
This patch adds support for testing the encap (ip6ip6) mode of IOAM.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04 12:53:36 +01:00
Linus Torvalds 7fab1c12bd objtool: print out the symbol type when complaining about it
The objtool warning that the kvm instruction emulation code triggered
wasn't very useful:

    arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception

in that it helpfully tells you which symbol name it had trouble figuring
out the relocation for, but it doesn't actually say what the unknown
symbol type was that triggered it all.

In this case it was because of missing type information (type 0, aka
STT_NOTYPE), but on the whole it really should just have printed that
out as part of the message.

Because if this warning triggers, that's very much the first thing you
want to know - why did reloc2sec_off() return failure for that symbol?

So rather than just saying you can't handle some type of symbol without
saying what the type _was_, just print out the type number too.

Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-10-03 13:45:48 -07:00
Linus Torvalds 52c3c17062 - Handle symbol relocations properly due to changes in the toolchains
which remove section symbols now
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmFZesIACgkQEsHwGGHe
 VUp3PhAAg6pe71fCCoO3Z30wwTpqEI9n4CV93Nj+L6xdNw2LhvKWuPVYee0eAYnW
 oMyMv82kWngAMDa55U4o4isvB3CK4yOcCLOk+Odtok/2xmjfL1NB4wGMsk0Zz7Pr
 Rf3gNcyc4/5d8njxPvxEx4jj4thXm//vIKYpoeMXYdMoSGOT1p3w/DVBB2LhneZf
 7ygEH0zTmX75Nxfc9YFZSwXweg1sB6vz0sK0fXPb/BF547y0qyplPRKhK8BOLLCe
 LAN15cxdV0/Su3Wcu59aguzuwDGa3N7GGIa1Lbsg6rxXgMwOzVY6fzk1edCvFnOo
 ro+19jm+riDJGBJUCfi0O8JzTXY2gEBxDACvMDkUtKkONxpeet+mWS3CmFOMqQM4
 0z6C052//vJd6grqDzzy1nb151bW4lU/RrDNrcihA99rVtmfe249xVzniIgR5zAy
 Tzezu+DgCZnN5Os/dLFoMOKTrRqGfqJ/yGNjs9KSgN2Lwe6SmKkVmrNL/i0KIwgd
 4sIh+nXhMerZaR/GAPcazEBR3yUvCtKDMmtNYpzxfVo+JRtNUolvlglPpzxE3e0t
 gOjnqrf1/wutYzJ65xpa4h7JjTtXCwey/83JdY2f463kid2+t94mYpp9wirooY23
 yfv+wTyECcMrlP+p0tTdrMD8mD9m0Cxo96vMVqfT5ER8g4CwiTo=
 =Buve
 -----END PGP SIGNATURE-----

Merge tag 'objtool_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Borislav Petkov:

 - Handle symbol relocations properly due to changes in the toolchains
   which remove section symbols now

* tag 'objtool_urgent_for_v5.15_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Teach get_alt_entry() about more relocation types
2021-10-03 10:23:54 -07:00
Vladimir Oltean 434ef35095 selftests: net: mscc: ocelot: add a test for egress VLAN modification
For this test we are exercising the VCAP ES0 block's ability to match on
a packet with a given VLAN ID, and push an ES0 TAG A with a VID derived
from VID_A_VAL plus the classified VLAN.

$eth3.200 is the generator port
$eth0 is the bridged DUT port that receives
$eth1 is the bridged DUT port that forwards and rewrites VID 200 to 300
      on egress via VCAP ES0
$eth2 is the port that receives from the DUT port $eth1

Since the egress rewriting happens outside the bridging service, VID 300
does not need to be in the bridge VLAN table of $eth1.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Vladimir Oltean 4a907f6594 selftests: net: mscc: ocelot: rename the VLAN modification test to ingress
There will be one more VLAN modification selftest added, this time for
egress. Rename the one that exists right now to be more specific.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Vladimir Oltean 239f163cea selftests: net: mscc: ocelot: bring up the ports automatically
Looks like when I wrote the selftests I was using a network manager that
brought up the ports automatically. In order to not rely on that, let
the script open them up.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-02 14:15:57 +01:00
Jakub Kicinski 6b7b0c3091 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
bpf-next 2021-10-02

We've added 85 non-merge commits during the last 15 day(s) which contain
a total of 132 files changed, 13779 insertions(+), 6724 deletions(-).

The main changes are:

1) Massive update on test_bpf.ko coverage for JITs as preparatory work for
   an upcoming MIPS eBPF JIT, from Johan Almbladh.

2) Add a batched interface for RX buffer allocation in AF_XDP buffer pool,
   with driver support for i40e and ice from Magnus Karlsson.

3) Add legacy uprobe support to libbpf to complement recently merged legacy
   kprobe support, from Andrii Nakryiko.

4) Add bpf_trace_vprintk() as variadic printk helper, from Dave Marchevsky.

5) Support saving the register state in verifier when spilling <8byte bounded
   scalar to the stack, from Martin Lau.

6) Add libbpf opt-in for stricter BPF program section name handling as part
   of libbpf 1.0 effort, from Andrii Nakryiko.

7) Add a document to help clarifying BPF licensing, from Alexei Starovoitov.

8) Fix skel_internal.h to propagate errno if the loader indicates an internal
   error, from Kumar Kartikeya Dwivedi.

9) Fix build warnings with -Wcast-function-type so that the option can later
   be enabled by default for the kernel, from Kees Cook.

10) Fix libbpf to ignore STT_SECTION symbols in legacy map definitions as it
    otherwise errors out when encountering them, from Toke Høiland-Jørgensen.

11) Teach libbpf to recognize specialized maps (such as for perf RB) and
    internally remove BTF type IDs when creating them, from Hengqi Chen.

12) Various fixes and improvements to BPF selftests.
====================

Link: https://lore.kernel.org/r/20211002001327.15169-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-01 19:58:02 -07:00
Hengqi Chen bd368cb554 selftests/bpf: Use BTF-defined key/value for map definitions
Change map definitions in BPF selftests to use BTF-defined
key/value types. This unifies the map definitions and ensures
libbpf won't emit warning about retrying map creation.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-3-hengqi.chen@gmail.com
2021-10-01 15:31:51 -07:00
Hengqi Chen f731052325 libbpf: Support uniform BTF-defined key/value specification across all BPF maps
A bunch of BPF maps do not support specifying BTF types for key and value.
This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry
logic which removes BTF type IDs when BPF map creation failed. Instead
of retrying, this commit recognizes those specialized maps and removes
BTF type IDs when creating BPF map.

  [0] Closes: https://github.com/libbpf/libbpf/issues/355

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com
2021-10-01 15:31:50 -07:00
Andrii Nakryiko b0e875bac0 libbpf: Fix memory leak in strset
Free struct strset itself, not just its internal parts.

Fixes: 90d76d3ece ("libbpf: Extract internal set-of-strings datastructure APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org
2021-10-01 22:54:38 +02:00
Daniel Latypov d8c23ead70 kunit: tool: better handling of quasi-bool args (--json, --raw_output)
Problem:

What does this do?
$ kunit.py run --json
Well, it runs all the tests and prints test results out as JSON.

And next is
$ kunit.py run my-test-suite --json
This runs just `my-test-suite` and prints results out as JSON.

But what about?
$ kunit.py run --json my-test-suite
This runs all the tests and stores the json results in a "my-test-suite"
file.

Why:
--json, and now --raw_output are actually string flags. They just have a
default value. --json in particular takes the name of an output file.

It was intended that you'd do
$ kunit.py run --json=my_output_file my-test-suite
if you ever wanted to specify the value.

Workaround:
It doesn't seem like there's a way to make
https://docs.python.org/3/library/argparse.html only accept arg values
after a '='.

I believe that `--json` should "just work" regardless of where it is.
So this patch automatically rewrites a bare `--json` to `--json=stdout`.

That makes the examples above work the same way.
Add a regression test that can catch this for --raw_output.

Fixes: 6a499c9c42 ("kunit: tool: make --raw_output support only showing kunit output")
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Tested-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-10-01 13:45:25 -06:00
Linus Torvalds b2626f1e32 Small x86 fixes.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFXQUoUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMglgf/egh3zb9/+BUQWe0xWfhcINNzpsVk
 PJtiBmJc3nQLbZbTSLp63rouy1lNgR0s2DiMwP7G1u39OwW8W3LHMrBUSqF1F01+
 gntb4GGiRTiTPJI64K4z6ytORd3tuRarHq8TUIa2zvki9ZW5Obgkm1i1RsNMOo+s
 AOA7whhpS8e/a5fBbtbS9bTZb30PKTZmbW4oMjvO9Sw4Eb76IauqPSEtRPSuCAc7
 r7z62RTlm10Qk0JR3tW1iXMxTJHZk+tYPJ8pclUAWVX5bZqWa/9k8R0Z5i/miFiZ
 glW/y3R4+aUwIQV2v7V3Jx9MOKDhZxniMtnqZG/Hp9NVDtWIz37V/U37vw==
 =zQQ1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm fixes from Paolo Bonzini:
 "Small x86 fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Ensure all migrations are performed when test is affined
  KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks
  ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm
  x86/kvmclock: Move this_cpu_pvti into kvmclock.h
  selftests: KVM: Don't clobber XMM register when read
  KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue
2021-10-01 11:08:07 -07:00
Peter Zijlstra 24ff652573 objtool: Teach get_alt_entry() about more relocation types
Occasionally objtool encounters symbol (as opposed to section)
relocations in .altinstructions. Typically they are the alternatives
written by elf_add_alternative() as encountered on a noinstr
validation run on vmlinux after having already ran objtool on the
individual .o files.

Basically this is the counterpart of commit 44f6a7c075 ("objtool:
Fix seg fault with Clang non-section symbols"), because when these new
assemblers (binutils now also does this) strip the section symbols,
elf_add_reloc_to_insn() is forced to emit symbol based relocations.

As such, teach get_alt_entry() about different relocation types.

Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net
2021-10-01 13:57:47 +02:00
Josh Poimboeuf 5b284b1933 objtool: Ignore unwind hints for ignored functions
If a function is ignored, also ignore its hints.  This is useful for the
case where the function ignore is conditional on frame pointers, e.g.
STACK_FRAME_NON_STANDARD_FP().

Link: https://lkml.kernel.org/r/163163048317.489837.10988954983369863209.stgit@devnote2

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Josh Poimboeuf e028c4f7ac objtool: Add frame-pointer-specific function ignore
Add a CONFIG_FRAME_POINTER-specific version of
STACK_FRAME_NON_STANDARD() for the case where a function is
intentionally missing frame pointer setup, but otherwise needs
objtool/ORC coverage when frame pointers are disabled.

Link: https://lkml.kernel.org/r/163163047364.489837.17377799909553689661.stgit@devnote2

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Jakub Kicinski dd9a887b35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/phy/bcm7xxx.c
  d88fd1b546 ("net: phy: bcm7xxx: Fixed indirect MMD operations")
  f68d08c437 ("net: phy: bcm7xxx: Add EPHY entry for 72165")

net/sched/sch_api.c
  b193e15ac6 ("net: prevent user from passing illegal stab size")
  69508d4333 ("net_sched: Use struct_size() and flex_array_size() helpers")

Both cases trivial - adjacent code additions.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-30 14:49:21 -07:00
Linus Torvalds 4de593fb96 Networking fixes for 5.15-rc4, including fixes from mac80211, netfilter
and bpf.
 
 Current release - regressions:
 
  - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
    interrupt
 
  - mdio: revert mechanical patches which broke handling of optional
    resources
 
  - dev_addr_list: prevent address duplication
 
 Previous releases - regressions:
 
  - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
    (NULL deref)
 
  - Revert "mac80211: do not use low data rates for data frames with no
    ack flag", fixing broadcast transmissions
 
  - mac80211: fix use-after-free in CCMP/GCMP RX
 
  - netfilter: include zone id in tuple hash again, minimize collisions
 
  - netfilter: nf_tables: unlink table before deleting it (race -> UAF)
 
  - netfilter: log: work around missing softdep backend module
 
  - mptcp: don't return sockets in foreign netns
 
  - sched: flower: protect fl_walk() with rcu (race -> UAF)
 
  - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup
 
  - smsc95xx: fix stalled rx after link change
 
  - enetc: fix the incorrect clearing of IF_MODE bits
 
  - ipv4: fix rtnexthop len when RTA_FLOW is present
 
  - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this SKU
 
  - e100: fix length calculation & buffer overrun in ethtool::get_regs
 
 Previous releases - always broken:
 
  - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx
 
  - mac80211: drop frames from invalid MAC address in ad-hoc mode
 
  - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
    (race -> UAF)
 
  - bpf, x86: Fix bpf mapping of atomic fetch implementation
 
  - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
 
  - netfilter: ip6_tables: zero-initialize fragment offset
 
  - mhi: fix error path in mhi_net_newlink
 
  - af_unix: return errno instead of NULL in unix_create1() when
    over the fs.file-max limit
 
 Misc:
 
  - bpf: exempt CAP_BPF from checks against bpf_jit_limit
 
  - netfilter: conntrack: make max chain length random, prevent guessing
    buckets by attackers
 
  - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
    generic, defer conntrack walk to work queue (prevent hogging RTNL lock)
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFV5KYACgkQMUZtbf5S
 Irs3cRAAqNsgaQXSVOXcPKsndeDDKHIv1Ktes8aOP9tgEXZw5rpsbct7g9Yxc0os
 Oolyt6HThjWr3u1/e3HHJO9I5Klr/J8eQReoRZnKW+6TYZflmmzfuf8u1nx6SLP/
 tliz5y8wKbp8BqqNuTMdRpm+R1QQcNkXTeruUoR1PgREcY4J7bC2BRrqeZhBGHSR
 Z5yPOIietFN3nITxNwbe4AYJXlesMc6QCWhBtXjMPGQ4Zc4/sjDNfqi7eHJi2H2y
 kW2dHeXG86gnlgFllOBBWP85ptxynyxoNQJuhrxgC9T+/FpSVST7cwKbtmkwDI3M
 5WGmeE6B3yfF8iOQuR8fbKQmsnLgQlYhjpbbhgN0GxzkyI7RpGYOFroX0Pht4IVZ
 mwprDOtvoLs4UeDjULRMB0JZfRN75PCtVlhfUkhhJxXGCCmnhGYaxG/pE+6OQWlr
 +n8RXYYMoOzPaHIYTS9NGSGqT0r32IUy/W5Yfv3rEzSeehy2/fxzGr2fOyBGs+q7
 xrnqpsOnM8cODDwGMy3TclCI4Dd72WoHNCHPhA/bk/ZMjHpBd4CSEZPm8IROY3Ja
 g1t68cncgL8fB7TSD9WLFgYu67Lg5j0gC/BHOzUQDQMX5IbhOq/fj1xRy5Lc6SYp
 mqW1f7LdnixBe4W61VjDAYq5jJRqrwEWedx+rvV/ONLvr77KULk=
 =rSti
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from mac80211, netfilter and bpf.

  Current release - regressions:

   - bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
     interrupt

   - mdio: revert mechanical patches which broke handling of optional
     resources

   - dev_addr_list: prevent address duplication

  Previous releases - regressions:

   - sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
     (NULL deref)

   - Revert "mac80211: do not use low data rates for data frames with no
     ack flag", fixing broadcast transmissions

   - mac80211: fix use-after-free in CCMP/GCMP RX

   - netfilter: include zone id in tuple hash again, minimize collisions

   - netfilter: nf_tables: unlink table before deleting it (race -> UAF)

   - netfilter: log: work around missing softdep backend module

   - mptcp: don't return sockets in foreign netns

   - sched: flower: protect fl_walk() with rcu (race -> UAF)

   - ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup

   - smsc95xx: fix stalled rx after link change

   - enetc: fix the incorrect clearing of IF_MODE bits

   - ipv4: fix rtnexthop len when RTA_FLOW is present

   - dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this
     SKU

   - e100: fix length calculation & buffer overrun in ethtool::get_regs

  Previous releases - always broken:

   - mac80211: fix using stale frag_tail skb pointer in A-MSDU tx

   - mac80211: drop frames from invalid MAC address in ad-hoc mode

   - af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race
     -> UAF)

   - bpf, x86: Fix bpf mapping of atomic fetch implementation

   - bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog

   - netfilter: ip6_tables: zero-initialize fragment offset

   - mhi: fix error path in mhi_net_newlink

   - af_unix: return errno instead of NULL in unix_create1() when over
     the fs.file-max limit

  Misc:

   - bpf: exempt CAP_BPF from checks against bpf_jit_limit

   - netfilter: conntrack: make max chain length random, prevent
     guessing buckets by attackers

   - netfilter: nf_nat_masquerade: make async masq_inet6_event handling
     generic, defer conntrack walk to work queue (prevent hogging RTNL
     lock)"

* tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
  af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
  net: stmmac: fix EEE init issue when paired with EEE capable PHYs
  net: dev_addr_list: handle first address in __hw_addr_add_ex
  net: sched: flower: protect fl_walk() with rcu
  net: introduce and use lock_sock_fast_nested()
  net: phy: bcm7xxx: Fixed indirect MMD operations
  net: hns3: disable firmware compatible features when uninstall PF
  net: hns3: fix always enable rx vlan filter problem after selftest
  net: hns3: PF enable promisc for VF when mac table is overflow
  net: hns3: fix show wrong state when add existing uc mac address
  net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
  net: hns3: don't rollback when destroy mqprio fail
  net: hns3: remove tc enable checking
  net: hns3: do not allow call hns3_nic_net_open repeatedly
  ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
  net: bridge: mcast: Associate the seqcount with its protecting lock.
  net: mdio-ipq4019: Fix the error for an optional regs resource
  net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
  net: mdio: mscc-miim: Fix the mdio controller
  af_unix: Return errno instead of NULL in unix_create1().
  ...
2021-09-30 14:28:05 -07:00
Kumar Kartikeya Dwivedi 4729445b47 libbpf: Fix segfault in light skeleton for objects without BTF
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf, however for
the sequence below clang adds no .BTF section (hence no BTF).

Reproducer:

  $ touch a.bpf.c
  $ clang -O2 -g -target bpf -c a.bpf.c
  $ bpftool gen skeleton -L a.bpf.o
  /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
  /* THIS FILE IS AUTOGENERATED! */

  struct a_bpf {
	struct bpf_loader_ctx ctx;
  Segmentation fault (core dumped)

The same occurs for files compiled without BTF info, i.e. without
clang's -g flag.

Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com
2021-09-30 23:19:58 +02:00
Po-Hsu Lin d4b6f87e8d selftests/bpf: Use kselftest skip code for skipped tests
There are several test cases in the bpf directory are still using
exit 0 when they need to be skipped. Use kselftest framework skip
code instead so it can help us to distinguish the return status.

Criterion to filter out what should be fixed in bpf directory:
  grep -r "exit 0" -B1 | grep -i skip

This change might cause some false-positives if people are running
these test scripts directly and only checking their return codes,
which will change from 0 to 4. However I think the impact should be
small as most of our scripts here are already using this skip code.
And there will be no such issue if running them with the kselftest
framework.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210929051250.13831-1-po-hsu.lin@canonical.com
2021-09-30 23:09:17 +02:00
Sean Christopherson 7b0035eaa7 KVM: selftests: Ensure all migrations are performed when test is affined
Rework the CPU selection in the migration worker to ensure the specified
number of migrations are performed when the test iteslf is affined to a
subset of CPUs.  The existing logic skips iterations if the target CPU is
not in the original set of possible CPUs, which causes the test to fail
if too many iterations are skipped.

  ==== Test Assertion Failure ====
  rseq_test.c:228: i > (NR_TASK_MIGRATIONS / 2)
  pid=10127 tid=10127 errno=4 - Interrupted system call
     1  0x00000000004018e5: main at rseq_test.c:227
     2  0x00007fcc8fc66bf6: ?? ??:0
     3  0x0000000000401959: _start at ??:?
  Only performed 4 KVM_RUNs, task stalled too much?

Calculate the min/max possible CPUs as a cheap "best effort" to avoid
high runtimes when the test is affined to a small percentage of CPUs.
Alternatively, a list or xarray of the possible CPUs could be used, but
even in a horrendously inefficient setup, such optimizations are not
needed because the runtime is completely dominated by the cost of
migrating the task, and the absolute runtime is well under a minute in
even truly absurd setups, e.g. running on a subset of vCPUs in a VM that
is heavily overcommited (16 vCPUs per pCPU).

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Reported-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210929234112.1862848-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 04:25:57 -04:00
Kumar Kartikeya Dwivedi e68ac00827 libbpf: Fix skel_internal.h to set errno on loader retval < 0
When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.

In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.

Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
2021-09-29 20:42:32 -07:00
Toke Høiland-Jørgensen 161ecd5379 libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
The previous patch to ignore STT_SECTION symbols only added the ignore
condition in one of them. This fails if there's more than one map
definition in the 'maps' section, because the subsequent modulus check will
fail, resulting in error messages like:

libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o

Fix this by also ignoring STT_SECTION in the first loop.

Fixes: c3e8c44a90 ("libbpf: Ignore STT_SECTION symbols in 'maps' section")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com
2021-09-29 15:50:32 -07:00
Alexei Starovoitov 66fe332417 libbpf: Make gen_loader data aligned.
Align gen_loader data to 8 byte boundary to make sure union bpf_attr,
bpf_insns and other structs are aligned.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com
2021-09-29 13:27:19 -07:00
Kumar Kartikeya Dwivedi e31eec77e4 bpf: selftests: Fix fd cleanup in get_branch_snapshot
Cleanup code uses while (cpu++ < cpu_cnt) for closing fds, which means
it starts iterating from 1 for closing fds. If the first fd is -1, it
skips over it and closes garbage fds (typically zero) in the remaining
array. This leads to test failures for future tests when they end up
storing fd 0 (as the slot becomes free due to close(0)) in ldimm64's BTF
fd, ending up trying to match module BTF id with vmlinux.

This was observed as spurious CI failure for the ksym_module_libbpf and
module_attach tests. The test ends up closing fd 0 and breaking libbpf's
assumption that module BTF fd will always be > 0, which leads to the
kernel thinking that we are pointing to a BTF ID in vmlinux BTF.

Fixes: 025bd7c753 (selftests/bpf: Add test for bpf_get_branch_snapshot)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-12-memxor@gmail.com
2021-09-29 13:25:09 -07:00
Mark Brown 8694e5e638 selftests: arm64: Verify that all possible vector lengths are handled
As part of the enumeration interface for setting vector lengths it is valid
to set vector lengths not supported in the system, these will be rounded to
a supported vector length and returned from the prctl(). Add a test which
exercises this for every valid vector length and makes sure that the return
value is as expected and that this is reflected in the actual system state.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Link: https://lore.kernel.org/r/20210929151925.9601-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown e42391150e selftests: arm64: Fix and enable test for setting current VL in vec-syscfg
We had some test code for verifying that we can write the current VL via
the prctl() interface but the condition for the test was inverted which
wasn't noticed as it was never actually hooked up to the array of tests
we execute. Fix this.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown 4caf339c03 selftests: arm64: Remove bogus error check on writing to files
Due to some refactoring with the error handling we ended up mangling things
so we never actually set ret and therefore shouldn't be checking it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown ff944c44b7 selftests: arm64: Fix printf() format mismatch in vec-syscfg
The format for this error message calls for the plain text version of the
error but we weren't supply it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210929151925.9601-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 16:33:04 +01:00
Mark Brown 34785030dc selftests: arm64: Move FPSIMD in SVE ptrace test into a function
Now that all the other tests are in functions rather than inline in the
main parent process function also move the test for accessing the FPSIMD
registers via the SVE regset out into their own function.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown a1d7111257 selftests: arm64: More comprehensively test the SVE ptrace interface
Currently the selftest for the SVE register set is not quite as thorough
as is desirable - it only validates that the value of a single Z register
is not modified by a partial write to a lower numbered Z register after
having previously been set through the FPSIMD regset.

Make this more thorough:
 - Test the ability to set vector lengths and enumerate those supported in
   the system.
 - Validate data in all Z and P registers, plus FPSR and FPCR.
 - Test reads via the FPSIMD regset after set via the SVE regset.

There's still some oversights, the main one being that due to the need to
generate a pattern in FFR and the fact that this rewrite is primarily
motivated by SME's streaming SVE which doesn't have FFR we don't currently
test FFR. Update the TODO to reflect those that occurred to me (and fix an
adjacent typo in there).

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-8-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown 9f7d03a2c5 selftests: arm64: Verify interoperation of SVE and FPSIMD register sets
After setting the FPSIMD registers via the SVE register set read them back
via the FPSIMD register set, validating that the two register sets are
interoperating and that the values we thought we set made it into the
child process.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-7-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown 8c9eece0bf selftests: arm64: Clarify output when verifying SVE register set
When verifying setting a Z register via ptrace we check each byte by hand,
iterating over the buffer using a pointer called p and treating each
register value written as a test. This creates output referring to "p[X]"
which is confusing since SVE also has predicate registers Pn. Tweak the
output to avoid confusion here.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown 736e6d5a54 selftests: arm64: Document what the SVE ptrace test is doing
Before we go modifying it further let's add some comments and output
clarifications explaining what this test is actually doing.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:32 +01:00
Mark Brown eab281e3af selftests: arm64: Remove extraneous register setting code
For some reason the SVE ptrace test code starts off by setting values in
some of the SVE vector registers in the parent process which it then never
interacts with when verifying the ptrace interfaces. This is not especially
relevant to what's being tested and somewhat confusing when reading the
code so let's remove it.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Mark Brown 09121ad718 selftests: arm64: Don't log child creation as a test in SVE ptrace test
Currently we log the creation of the child process as a test but it's not
really relevant to what we're trying to test and can make the output a
little confusing so don't do that.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Mark Brown 78d2d816c4 selftests: arm64: Use a define for the number of SVE ptrace tests to be run
Partly in preparation for future refactoring move from hard coding the
number of tests in main() to putting #define at the top of the source
instead.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210913125505.52619-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2021-09-29 14:40:31 +01:00
Yonghong Song 38261f369f selftests/bpf: Fix probe_user test failure with clang build kernel
clang build kernel failed the selftest probe_user.
  $ ./test_progs -t probe_user
  $ ...
  $ test_probe_user:PASS:get_kprobe_res 0 nsec
  $ test_probe_user:FAIL:check_kprobe_res wrong kprobe res from probe read: 0.0.0.0:0
  $ #94 probe_user:FAIL

The test attached to kernel function __sys_connect(). In net/socket.c, we have
  int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen)
  {
        ......
  }
  ...
  SYSCALL_DEFINE3(connect, int, fd, struct sockaddr __user *, uservaddr,
                  int, addrlen)
  {
        return __sys_connect(fd, uservaddr, addrlen);
  }

The gcc compiler (8.5.0) does not inline __sys_connect() in syscall entry
function. But latest clang trunk did the inlining. So the bpf program
is not triggered.

To make the test more reliable, let us kprobe the syscall entry function
instead. Note that x86_64, arm64 and s390 have syscall wrappers and they have
to be handled specially.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929033000.3711921-1-yhs@fb.com
2021-09-28 23:17:22 -07:00
Yucong Sun 09710d82c0 bpftool: Avoid using "?: " in generated code
"?:" is a GNU C extension, some environment has warning flags for its
use, or even prohibit it directly.  This patch avoid triggering these
problems by simply expand it to its full form, no functionality change.

Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210928184221.1545079-1-fallentree@fb.com
2021-09-28 15:19:22 -07:00
Andrii Nakryiko 7c80c87ad5 selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use
Update "sk_lookup/" definition to be a stand-alone type specifier,
with backwards-compatible prefix match logic in non-libbpf-1.0 mode.

Currently in selftests all the "sk_lookup/<whatever>" uses just use
<whatever> for duplicated unique name encoding, which is redundant as
BPF program's name (C function name) uniquely and descriptively
identifies the intended use for such BPF programs.

With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing
sk_lookup programs to use "unqualified" SEC("sk_lookup") section names,
with no random text after it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org
2021-09-28 13:51:20 -07:00
Andrii Nakryiko dd94d45cf0 libbpf: Add opt-in strict BPF program section name handling logic
Implement strict ELF section name handling for BPF programs. It utilizes
`libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME.

If this flag is set, libbpf will enforce exact section name matching for
a lot of program types that previously allowed just partial prefix
match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now
in strict mode only SEC("xdp") will be accepted, which makes SEC("")
definitions cleaner and more structured. SEC() now won't be used as yet
another way to uniquely encode BPF program identifier (for that
C function name is better and is guaranteed to be unique within
bpf_object). Now SEC() is strictly BPF program type and, depending on
program type, extra load/attach parameter specification.

Libbpf completely supports multiple BPF programs in the same ELF
section, so multiple BPF programs of the same type/specification easily
co-exist together within the same bpf_object scope.

Additionally, a new (for now internal) convention is introduced: section
name that can be a stand-alone exact BPF program type specificator, but
also could have extra parameters after '/' delimiter. An example of such
section is "struct_ops", which can be specified by itself, but also
allows to specify the intended operation to be attached to, e.g.,
"struct_ops/dctcp_init". Note, that "struct_ops_some_op" is not allowed.
Such section definition is specified as "struct_ops+".

This change is part of libbpf 1.0 effort ([0], [1]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/271
  [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko d41ea045a6 libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC
Complete SEC() table refactoring towards unified form by rewriting
BPF_APROG_SEC and BPF_EAPROG_SEC definitions with
SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and
SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively.
Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after
that, leaving SEC_DEF() macro as the only one used.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 15ea31fadd libbpf: Refactor ELF section handler definitions
Refactor ELF section handler definitions table to use a set of flags and
unified SEC_DEF() macro. This allows for more succinct and table-like
set of definitions, and allows to more easily extend the logic without
adding more verbosity (this is utilized in later patches in the series).

This approach is also making libbpf-internal program pre-load callback
not rely on bpf_sec_def definition, which demonstrates that future
pluggable ELF section handlers will be able to achieve similar level of
integration without libbpf having to expose extra types and APIs.

For starters, update SEC_DEF() definitions and make them more succinct.
Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to
a common SEC_DEF() use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 13d35a0cf1 libbpf: Reduce reliance of attach_fns on sec_def internals
Move closer to not relying on bpf_sec_def internals that won't be part
of public API, when pluggable SEC() handlers will be allowed. Drop
pre-calculated prefix length, and in various helpers don't rely on this
prefix length availability. Also minimize reliance on knowing
bpf_sec_def's prefix for few places where section prefix shortcuts are
supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint).

Given checking some string for having a given string-constant prefix is
such a common operation and so annoying to be done with pure C code, add
a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c
where prefix comparison is performed. With __builtin_constant_p() it's
possible to have a convenient helper that checks some string for having
a given prefix, where prefix is either string literal (or compile-time
known string due to compiler optimization) or just a runtime string
pointer, which is quite convenient and saves a lot of typing and string
literal duplication.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 12d9466d8b libbpf: Refactor internal sec_def handling to enable pluggability
Refactor internals of libbpf to allow adding custom SEC() handling logic
easily from outside of libbpf. To that effect, each SEC()-handling
registration sets mandatory program type/expected attach type for
a given prefix and can provide three callbacks called at different
points of BPF program lifetime:

  - init callback for right after bpf_program is initialized and
  prog_type/expected_attach_type is set. This happens during
  bpf_object__open() step, close to the very end of constructing
  bpf_object, so all the libbpf APIs for querying and updating
  bpf_program properties should be available;

  - pre-load callback is called right before BPF_PROG_LOAD command is
  called in the kernel. This callbacks has ability to set both
  bpf_program properties, as well as program load attributes, overriding
  and augmenting the standard libbpf handling of them;

  - optional auto-attach callback, which makes a given SEC() handler
  support auto-attachment of a BPF program through bpf_program__attach()
  API and/or BPF skeletons <skel>__attach() method.

Each callbacks gets a `long cookie` parameter passed in, which is
specified during SEC() handling. This can be used by callbacks to lookup
whatever additional information is necessary.

This is not yet completely ready to be exposed to the outside world,
mainly due to non-public nature of struct bpf_prog_load_params. Instead
of making it part of public API, we'll wait until the planned low-level
libbpf API improvements for BPF_PROG_LOAD and other typical bpf()
syscall APIs, at which point we'll have a public, probably OPTS-based,
way to fully specify BPF program load parameters, which will be used as
an interface for custom pre-load callbacks.

But this change itself is already a good first step to unify the BPF
program hanling logic even within the libbpf itself. As one example, all
the extra per-program type handling (sleepable bit, attach_btf_id
resolution, unsetting optional expected attach type) is now more obvious
and is gathered in one place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 15669e1dcd selftests/bpf: Normalize all the rest SEC() uses
Normalize all the other non-conforming SEC() usages across all
selftests. This is in preparation for libbpf to start to enforce
stricter SEC() rules in libbpf 1.0 mode.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-5-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko c22bdd2825 selftests/bpf: Switch SEC("classifier*") usage to a strict SEC("tc")
Convert all SEC("classifier*") uses to a new and strict SEC("tc")
section name. In reference_tracking selftests switch from ambiguous
searching by program title (section name) to non-ambiguous searching by
name in some selftests, getting closer to completely removing
bpf_object__find_program_by_title().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-4-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 8fffa0e345 selftests/bpf: Normalize XDP section names in selftests
Convert almost all SEC("xdp_blah") uses to strict SEC("xdp") to comply
with strict libbpf 1.0 logic of exact section name match for XDP program
types. There is only one exception, which is only tested through
iproute2 and defines multiple XDP programs within the same BPF object.
Given iproute2 still works in non-strict libbpf mode and it doesn't have
means to specify XDP programs by its name (not section name/title),
leave that single file alone for now until iproute2 gains lookup by
function/program name.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-3-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Andrii Nakryiko 9673268f03 libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF
program type. "classifier" is a misleading terminology and should be
migrated away from.

  [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org
2021-09-28 13:51:19 -07:00
Oliver Upton e02c16b9cd selftests: KVM: Don't clobber XMM register when read
There is no need to clobber a register that is only being read from.
Oops. Drop the XMM register from the clobbers list.

Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210927223621.50178-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-28 11:31:29 -04:00
David S. Miller 4ccb9f03fe Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-28

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 14 day(s) which contain
a total of 11 files changed, 139 insertions(+), 53 deletions(-).

The main changes are:

1) Fix MIPS JIT jump code emission for too large offsets, from Piotr Krysiuk.

2) Fix x86 JIT atomic/fetch emission when dst reg maps to rax, from Johan Almbladh.

3) Fix cgroup_sk_alloc corner case when called from interrupt, from Daniel Borkmann.

4) Fix segfault in libbpf's linker for objects without BTF, from Kumar Kartikeya Dwivedi.

5) Fix bpf_jit_charge_modmem for applications with CAP_BPF, from Lorenz Bauer.

6) Fix return value handling for struct_ops BPF programs, from Hou Tao.

7) Various fixes to BPF selftests, from Jiri Benc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
,
2021-09-28 13:52:46 +01:00
Jiri Benc 79e2c30666 selftests, bpf: test_lwt_ip_encap: Really disable rp_filter
It's not enough to set net.ipv4.conf.all.rp_filter=0, that does not override
a greater rp_filter value on the individual interfaces. We also need to set
net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way,
they'll also get their own rp_filter value of zero.

Fixes: 0fde56e438 ("selftests: bpf: add test_lwt_ip_encap selftest")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/b1cdd9d469f09ea6e01e9c89a6071c79b7380f89.1632386362.git.jbenc@redhat.com
2021-09-28 09:30:38 +02:00
Jiri Benc d888eaac4f selftests, bpf: Fix makefile dependencies on libbpf
When building bpf selftest with make -j, I'm randomly getting build failures
such as this one:

  In file included from progs/bpf_flow.c:19:
  [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
  #include "bpf_helper_defs.h"
           ^~~~~~~~~~~~~~~~~~~

The file that fails the build varies between runs but it's always in the
progs/ subdir.

The reason is a missing make dependency on libbpf for the .o files in
progs/. There was a dependency before commit 3ac2e20fba but that commit
removed it to prevent unneeded rebuilds. However, that only works if libbpf
has been built already; the 'wildcard' prerequisite does not trigger when
there's no bpf_helper_defs.h generated yet.

Keep the libbpf as an order-only prerequisite to satisfy both goals. It is
always built before the progs/ objects but it does not trigger unnecessary
rebuilds by itself.

Fixes: 3ac2e20fba ("selftests/bpf: BPF object files should depend only on libbpf headers")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758415.git.jbenc@redhat.com
2021-09-28 09:30:14 +02:00
Kumar Kartikeya Dwivedi bcfd367c28 libbpf: Fix segfault in static linker for objects without BTF
When a BPF object is compiled without BTF info (without -g),
trying to link such objects using bpftool causes a SIGSEGV due to
btf__get_nr_types accessing obj->btf which is NULL. Fix this by
checking for the NULL pointer, and return error.

Reproducer:
$ cat a.bpf.c
extern int foo(void);
int bar(void) { return foo(); }
$ cat b.bpf.c
int foo(void) { return 0; }
$ clang -O2 -target bpf -c a.bpf.c
$ clang -O2 -target bpf -c b.bpf.c
$ bpftool gen obj out a.bpf.o b.bpf.o
Segmentation fault (core dumped)

After fix:
$ bpftool gen obj out a.bpf.o b.bpf.o
libbpf: failed to find BTF info for object 'a.bpf.o'
Error: failed to link 'a.bpf.o': Unknown error -22 (-22)

Fixes: a46349227c (libbpf: Add linker extern resolution support for functions and global variables)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
2021-09-28 09:29:03 +02:00
Toke Høiland-Jørgensen c3e8c44a90 libbpf: Ignore STT_SECTION symbols in 'maps' section
When parsing legacy map definitions, libbpf would error out when
encountering an STT_SECTION symbol. This becomes a problem because some
versions of binutils will produce SECTION symbols for every section when
processing an ELF file, so BPF files run through 'strip' will end up with
such symbols, making libbpf refuse to load them.

There's not really any reason why erroring out is strictly necessary, so
change libbpf to just ignore SECTION symbols when parsing the ELF.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com
2021-09-27 21:29:37 -07:00
Magnus Karlsson e34087fc00 selftests: xsk: Add frame_headroom test
Add a test for the frame_headroom feature that can be set on the
umem. The logic added validates that all offsets in all tests and
packets are valid, not just the ones that have a specifically
configured frame_headroom.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-14-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson e4e9baf06a selftests: xsk: Change interleaving of packets in unaligned mode
Change the interleaving of packets in unaligned mode. With the current
buffer addresses in the packet stream, the last buffer in the umem
could not be used as a large packet could potentially write over the
end of the umem. The kernel correctly threw this buffer address away
and refused to use it. This is perfectly fine for all regular packet
streams, but the ones used for unaligned mode have every other packet
being at some different offset. As we will add checks for correct
offsets in the next patch, this needs to be fixed. Just start these
page-boundary straddling buffers one page earlier so that the last
one is not on the last page of the umem, making all buffers valid.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-13-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson 96a40678ce selftests: xsk: Add single packet test
Add a test where a single packet is sent and received. This might
sound like a silly test, but since many of the interfaces in xsk are
batched, it is important to be able to validate that we did not break
something as fundamental as just receiving single packets, instead of
batches of packets at high speed.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-12-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson 1bf3649688 selftests: xsk: Introduce pacing of traffic
Introduce pacing of traffic so that the Tx thread can never send more
packets than the receiver has processed plus the number of packets it
can have in its umem. So at any point in time, the number of in flight
packets (not processed by the Rx thread) are less than or equal to the
number of packets that can be held in the Rx thread's umem.

The batch size is also increased to improve running time.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-11-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson 89013b8a29 selftests: xsk: Fix socket creation retry
The socket creation retry unnecessarily registered the umem once for
every retry. No reason to do this. It wastes memory and it might lead
to too many pages being locked at some point and the failure of a
test.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-10-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson 872a1184db selftests: xsk: Put the same buffer only once in the fill ring
Fix a problem where the fill ring was populated with too many
entries. If number of buffers in the umem was smaller than the fill
ring size, the code used to loop over from the beginning of the umem
and start putting the same buffers in again. This is racy indeed as a
later packet can be received overwriting an earlier one before the Rx
thread manages to validate it.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-9-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Magnus Karlsson 5b13205612 selftests: xsk: Fix missing initialization
Fix missing initialization of the member rx_pkt_nb in the packet
stream. This leads to some tests declaring success too early as the
test thought all packets had already been received.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922075613.12186-8-magnus.karlsson@gmail.com
2021-09-28 00:18:35 +02:00
Linus Torvalds 0513e464f9 perf tools fixes for v5.15: 2nd batch
- Fix 'perf test' DWARF unwind for optimized builds.
 
 - Fix 'perf test' 'Object code reading' when dealing with samples in @plt
   symbols.
 
 - Fix off-by-one directory paths in the ARM support code.
 
 - Fix error message to eliminate confusion in 'perf config' when first creating
   a config file.
 
 - 'perf iostat' fix for system wide operation.
 
 - Fix printing of metrics when 'perf iostat' is used with one or more
   iio_root_ports and unconnected cpus (using -C).
 
 - Fix several typos in the documentation files.
 
 - Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor files.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYVIO3wAKCRCyPKLppCJ+
 J9piAP4jmxYEnimD6qvVHjOLio2LvwGI0u7MakZCHWVKQZKHbgEArb8l3+D2+YXw
 U7RxDmXoSE+0EjTV8o13sQlerRTU3wM=
 =oVI7
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix 'perf test' DWARF unwind for optimized builds.

 - Fix 'perf test' 'Object code reading' when dealing with samples in
   @plt symbols.

 - Fix off-by-one directory paths in the ARM support code.

 - Fix error message to eliminate confusion in 'perf config' when first
   creating a config file.

 - 'perf iostat' fix for system wide operation.

 - Fix printing of metrics when 'perf iostat' is used with one or more
   iio_root_ports and unconnected cpus (using -C).

 - Fix several typos in the documentation files.

 - Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor
   files.

* tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
  perf iostat: Use system-wide mode if the target cpu_list is unspecified
  perf config: Refine error message to eliminate confusion
  perf doc: Fix typos all over the place
  perf arm: Fix off-by-one directory paths.
  perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
  perf tests: Fix flaky test 'Object code reading'
  perf test: Fix DWARF unwind for optimized builds.
2021-09-27 14:06:42 -07:00
Linus Torvalds 9cccec2bf3 x86:
- missing TLB flush
 
 - nested virtualization fixes for SMM (secure boot on nested hypervisor)
   and other nested SVM fixes
 
 - syscall fuzzing fixes
 
 - live migration fix for AMD SEV
 
 - mirror VMs now work for SEV-ES too
 
 - fixes for reset
 
 - possible out-of-bounds access in IOAPIC emulation
 
 - fix enlightened VMCS on Windows 2022
 
 ARM:
 
 - Add missing FORCE target when building the EL2 object
 
 - Fix a PMU probe regression on some platforms
 
 Generic:
 
 - KCSAN fixes
 
 selftests:
 
 - random fixes, mostly for clang compilation
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFN0EwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNqaQf/Vx7ePFTqwWpo+8wKapnc6JN9SLjC
 hM4jipxfc1WyQWcfCt8ZuPhCnhF7o8mG/mrqTm+JB+oGqIsydHW19DiUT8ekv09F
 dQ+XYSiR4B547wUH5XLQc4xG9imwYlXGEOHqrE7eJvGH3LOqVFX2fLRBnFefZbO8
 GKhRJrGXwG3/JSAP6A0c22iVU+pLbfV9gpKwrAj0V7o8nzT2b3Wmh74WBNb47BzE
 a4+AwKpWO4rqJGOwdYwy67pdFHh1YmrlZ59cFZc7fzlXE+o0D0bitaJyioZALpOl
 4mRGdzoYkNB++ZjDzVFnAClCYQV/oNxCNGFaFF2mh/gzXG1TLmN7B8zGDg==
 =7oVh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit late... I got sidetracked by back-from-vacation routines and
  conferences. But most of these patches are already a few weeks old and
  things look more calm on the mailing list than what this pull request
  would suggest.

  x86:

   - missing TLB flush

   - nested virtualization fixes for SMM (secure boot on nested
     hypervisor) and other nested SVM fixes

   - syscall fuzzing fixes

   - live migration fix for AMD SEV

   - mirror VMs now work for SEV-ES too

   - fixes for reset

   - possible out-of-bounds access in IOAPIC emulation

   - fix enlightened VMCS on Windows 2022

  ARM:

   - Add missing FORCE target when building the EL2 object

   - Fix a PMU probe regression on some platforms

  Generic:

   - KCSAN fixes

  selftests:

   - random fixes, mostly for clang compilation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  selftests: KVM: Explicitly use movq to read xmm registers
  selftests: KVM: Call ucall_init when setting up in rseq_test
  KVM: Remove tlbs_dirty
  KVM: X86: Synchronize the shadow pagetable before link it
  KVM: X86: Fix missed remote tlb flush in rmap_write_protect()
  KVM: x86: nSVM: don't copy virt_ext from vmcb12
  KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround
  KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
  KVM: x86: nSVM: restore int_vector in svm_clear_vintr
  kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]
  KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit
  KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry
  KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state
  KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm
  KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode
  KVM: x86: reset pdptrs_from_userspace when exiting smm
  KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on SMM exit
  KVM: nVMX: Filter out all unsupported controls when eVMCS was activated
  KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  ...
2021-09-27 13:58:23 -07:00
Shuah Khan 2f96028708 selftests: drivers/dma-buf: Fix implicit declaration warns
udmabuf has the following implicit declaration warns:

udmabuf.c:30:10: warning: implicit declaration of function 'open';
udmabuf.c:42:8: warning: implicit declaration of function 'fcntl'

These are caused due to not including fcntl.h and including just
linux/fcntl.h. Fix it to include fcntl.h which will bring in the
linux/fcntl.h. In addition, define __EXPORTED_HEADERS__ to bring in
F_ADD_SEALS and F_SEAL_SHRINK defines and fix the following error
that show up when just fcntl.h is included.

udmabuf.c:45:21: error: 'F_ADD_SEALS' undeclared
   45 |  ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
      |                     ^~~~~~~~~~~
udmabuf.c:45:34: error: 'F_SEAL_SHRINK' undeclared
   45 |  ret = fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK);
      |                                  ^~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-27 09:52:29 -06:00
Like Xu 4da8b12188 perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
If the 'perf iostat' user specifies two or more iio_root_ports and also
specifies the cpu(s) by -C which is not *connected to all* the above iio
ports, the iostat_print_metric() will run into trouble:

For example:

  $ perf iostat list
  S0-uncore_iio_0<0000:16>
  S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0

  $ perf iostat 0000:16,0000:97 -C 1 -- ls
  port 	Inbound Read(MB)	Inbound Write(MB)	Outbound Read(MB)	Outbound
  Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault
  (core dumped) perf stat --iostat$DELIMITER$*

The core-dump stack says, in the above corner case, the returned
(struct perf_counts_values *) count will be NULL, and the caller
iostat_print_metric() apparently doesn't not handle this case.

  433	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
  434
  435	if (count->run && count->ena) {
  (gdb) p count
  $1 = (struct perf_counts_values *) 0x0

The deeper reason is that there are actually no statistics from the user
specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with
minimum cost by adding a NULL check in the user space.

Fixes: f9ed693e8b ("perf stat: Enable iostat mode for x86 platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:41:07 -03:00
Like Xu e4fe5d7349 perf iostat: Use system-wide mode if the target cpu_list is unspecified
An iostate use case like "perf iostat 0000:16,0000:97 -- ls" should be
implemented to work in system-wide mode to ensure that the output from
print_header() is consistent with the user documentation perf-iostat.txt,
rather than incorrectly assuming that the kernel does not support it:

 Error:
 The sys_perf_event_open() syscall returned with 22 (Invalid argument) \
 for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
 /bin/dmesg | grep -i perf may provide additional information.

This error is easily fixed by assigning system-wide mode by default
for IOSTAT_RUN only when the target cpu_list is unspecified.

Fixes: f07952b179 ("perf stat: Basic support for iostat in perf")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:39:30 -03:00
Like Xu a827c007c7 perf config: Refine error message to eliminate confusion
If there is no configuration file at first, the user can write any pair
of "key.subkey=value" to the newly created configuration file, while
value validation against a valid configurable key is *deferred* until
the next execution or the implied execution of "perf config ... ".

For example:

  $ rm ~/.perfconfig
  $ perf config call-graph.dump-size=65529
  $ cat ~/.perfconfig
  # this file is auto-generated.
  [call-graph]
 	dump-size = 65529
  $ perf config call-graph.dump-size=2048
  callchain: Incorrect stack dump size (max 65528): 65529
  Error: wrong config key-value pair call-graph.dump-size=65529

The user might expect that the second value 2048 is valid and can be
updated to the configuration file, but the error message is very
confusing because the first value 65529 is not reported as an error
during the last configuration.

It is recommended not to change the current behavior of delayed
validation (as more effort is needed), but to refine the original error
message to *clearly indicate* that the cause of the error is the
configuration file.

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210924115817.58689-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Like Xu 4da6552c5d perf doc: Fix typos all over the place
Considering that perf and its subcommands have so many parameters, the
documentation is always the first stop for perf beginners. Fixing some
spelling errors will relax the eyes of some readers a little bit.

 s/specicfication/specification/
 s/caheline/cacheline/
 s/tranasaction/transaction/
 s/complan/complain/
 s/sched_wakep/sched_wakeup/
 s/possble/possible/
 s/methology/methodology/

Signed-off-by: Like Xu <likexu@tencent.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210924081942.38368-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers c6613bd4a5 perf arm: Fix off-by-one directory paths.
Relative path include works in the regular build due to -I paths but may
fail in other situations.

v2. Rebase. Comments on v1 were that we should handle include paths
    differently and it is agreed that can be a sensible refactor but
    beyond the scope of this change.
https://lore.kernel.org/lkml/20210504191227.793712-1-irogers@google.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210923154254.737657-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Colin Ian King 774f2c0890 perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
There is a spelling mistake in the description text, fix it.

Signed-off-by: Colin King <colin.king@canonical.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210916081314.41751-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
James Clark 0f892fd1bd perf tests: Fix flaky test 'Object code reading'
This test occasionally fails on aarch64 when a sample is taken in
free@plt and it fails with "Bytes read differ from those read by
objdump".

This is because that symbol is near a section boundary in the elf file.
Despite the -z option to always output zeros, objdump uses
bfd_map_over_sections() to iterate through the elf file so it doesn't
see outside of the sections where these zeros are and can't print them.

For example this boundary proceeds free@plt in libc with a gap of 48
bytes between .plt and .text:

  objdump -d -z --start-address=0x23cc8 --stop-address=0x23d08 libc-2.30.so

  libc-2.30.so:     file format elf64-littleaarch64

  Disassembly of section .plt:

  0000000000023cc8 <*ABS*+0x7fd00@plt+0x8>:
     23cc8:	91018210 	add	x16, x16, #0x60
     23ccc:	d61f0220 	br	x17

  Disassembly of section .text:

  0000000000023d00 <abort@@GLIBC_2.17-0x98>:
     23d00:	a9bf7bfd 	stp	x29, x30, [sp, #-16]!
     23d04:	910003fd 	mov	x29, sp

Taking a sample in free@plt is very rare because it is so small, but the
test can be forced to fail almost every time on any platform by linking
the test with a shared library that has a single empty function and
calling it in a loop.

The fix is to zero the buffers so that when there is a jump in the
addresses output by objdump, zeros are already filled in between.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210906152238.3415467-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Ian Rogers 5c34aea341 perf test: Fix DWARF unwind for optimized builds.
To ensure the stack frames are on the stack tail calls optimizations
need to be inhibited. If your compiler supports an attribute use it,
otherwise use an asm volatile barrier.

The barrier fix was suggested here:
https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/
Tested with an optimized clang build and by forcing the asm barrier
route with an optimized clang build.

A GCC bug tracking a proper disable_tail_calls is:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97831

Fixes: 9ae1e990f1 ("perf tools: Remove broken __no_tail_call
       attribute")

v2. is a rebase. The original fix patch generated quite a lot of
    discussion over the right place for the fix:
    https://lore.kernel.org/lkml/20201114000803.909530-1-irogers@google.com/
    The patch reflects my preference of it being near the use, so that
    future code cleanups don't break this somewhat special usage.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210922173812.456348-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 09:32:28 -03:00
Petr Machata b69c99463d selftests: net: fib_nexthops: Wait before checking reported idle time
The purpose of this test is to verify that after a short activity passes,
the reported time is reasonable: not zero (which could be reported by
mistake), and not something outrageous (which would be indicative of an
issue in used units).

However, the idle time is reported in units of clock_t, or hundredths of
second. If the initial sequence of commands is very quick, it is possible
that the idle time is reported as just flat-out zero. When this test was
recently enabled in our nightly regression, we started seeing spurious
failures for exactly this reason.

Therefore buffer the delay leading up to the test with a sleep, to make
sure there is no legitimate way of reporting 0.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-27 12:15:36 +01:00
Martin KaFai Lau ef979017b8 bpf: selftest: Add verifier tests for <8-byte scalar spill and refill
This patch adds a few verifier tests for <8-byte spill and refill.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004953.627183-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Martin KaFai Lau 54ea6079b7 bpf: selftest: A bpf prog that has a 32bit scalar spill
It is a simplified example that can trigger a 32bit scalar spill.
The const scalar is refilled and added to a skb->data later.
Since the reg state of the 32bit scalar spill is not saved now,
adding the refilled reg to skb->data and then comparing it with
skb->data_end cannot verify the skb->data access.

With the earlier verifier patch and the llvm patch [1].  The verifier
can correctly verify the bpf prog.

Here is the snippet of the verifier log that leads to verifier conclusion
that the packet data is unsafe to read.  The log is from the kerne
without the previous verifier patch to save the <8-byte scalar spill.
67: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0
67: (63) *(u32 *)(r10 -12) = r5
68: R0=inv1 R1=inv17 R2=invP2 R3=inv1 R4=pkt(id=0,off=68,r=102,imm=0) R5=inv102 R6=pkt(id=0,off=62,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmm????
...
101: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
101: (61) r1 = *(u32 *)(r10 -12)
102: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
102: (bc) w1 = w1
103: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=0,off=0,r=102,imm=0) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
103: (0f) r7 += r1
last_idx 103 first_idx 67
regs=2 stack=0 before 102: (bc) w1 = w1
regs=2 stack=0 before 101: (61) r1 = *(u32 *)(r10 -12)
104: R0_w=map_value_or_null(id=2,off=0,ks=16,vs=1,imm=0) R1_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6_w=pkt(id=0,off=70,r=102,imm=0) R7_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=inv17 R10=fp0 fp-16=mmmmmmmm
...
127: R0_w=inv1 R1=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
127: (bf) r1 = r7
128: R0_w=inv1 R1_w=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
128: (07) r1 += 8
129: R0_w=inv1 R1_w=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9_w=invP17 R10=fp0 fp-16=mmmmmmmm
129: (b4) w0 = 1
130: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
130: (2d) if r1 > r8 goto pc-66
 R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: R0=inv1 R1=pkt(id=3,off=8,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R6=pkt(id=0,off=70,r=102,imm=0) R7=pkt(id=3,off=0,r=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R8=pkt_end(id=0,off=0,imm=0) R9=invP17 R10=fp0 fp-16=mmmmmmmm
131: (69) r6 = *(u16 *)(r7 +0)
invalid access to packet, off=0 size=2, R7(id=3,off=0,r=0)
R7 offset is outside of the packet

[1]: https://reviews.llvm.org/D109073

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210922004947.626286-1-kafai@fb.com
2021-09-26 13:07:28 -07:00
Linus Torvalds 5bb7b2107f A set of fixes for X86:
- Prevent sending the wrong signal when protection keys are enabled and
    the kernel handles a fault in the vsyscall emulation.
 
  - Invoke early_reserve_memory() before invoking e820_memory_setup() which
    is required to make the Xen dom0 e820 hooks work correctly.
 
  - Use the correct data type for the SETZ operand in the EMQCMDS
    instruction wrapper.
 
  - Prevent undefined behaviour to the potential unaligned accesss in the
    instroction decoder library.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmFQQaITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaZjD/0TF0mE8QUhI4tyGELdNgwvje5iZ9vg
 Nd9KJpR4hUALHgfUD44NVl9JWawFY2d8FXyIPoAFEcvmy6o4f1w0ia8US3hQWA0Y
 EdLSigXi/eYSstkONaJUEBCxlLbwy7JDzaazA9DeKOEuRc7NWSyZURYvzTAkPK1Y
 mbE9kjKhjFa5NGnSB8HbSF2yEzFsKaTo4nreWP/OkzDjnEMshLR1/FUOUvZmlsgA
 CWjMxAVYFqeJN3QhDgR/vRKPoz1sOjDL1s4AsU+xdy63WyFJZ7Z1b8t6bOBoYh6w
 UztkuOkzZ6pIdzz4O1WGoFx4/FJ74qNx0vO/hOB+cKH6rgJs6AkHAvwlnjI/fE2C
 Y+IsuE4PBXMRpkaayTCsAq/enabwgKsmLSUu916APrhVvuUtb3GJgyhedLE3mEBw
 yZXezzRDhNpYop2yQSRXDeKebpoQgl+zqEP5g1O8pAFnud8FGHnz64eJV7Su7Y7C
 BCac0hmv+drlqb/jOSYqjsfo6QfhvR60WwDIgTplOMMLa3plEJFx/rIuU2xVg5g9
 w0m2QUsZboyT2yBnl8gRrqrcQmv2t4iX6TAj9Wm23Lx41h94JQMRtZyJT9bcNqY9
 jMJu27BcNSveciZA7W2DVUlFf/gTF3bwpF7ZDWRt/VSrHPtkI9WKlERhQaywo1L0
 rF8SGCEuNU2ktw==
 =h7v1
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of fixes for X86:

   - Prevent sending the wrong signal when protection keys are enabled
     and the kernel handles a fault in the vsyscall emulation.

   - Invoke early_reserve_memory() before invoking e820_memory_setup()
     which is required to make the Xen dom0 e820 hooks work correctly.

   - Use the correct data type for the SETZ operand in the EMQCMDS
     instruction wrapper.

   - Prevent undefined behaviour to the potential unaligned accesss in
     the instruction decoder library"

* tag 'x86-urgent-2021-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
  x86/asm: Fix SETZ size enqcmds() build failure
  x86/setup: Call early_reserve_memory() earlier
  x86/fault: Fix wrong signal when vsyscall fails with pkey
2021-09-26 10:09:20 -07:00
Linus Torvalds a3b397b4ff Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "16 patches.

  Subsystems affected by this patch series: xtensa, sh, ocfs2, scripts,
  lib, and mm (memory-failure, kasan, damon, shmem, tools, pagecache,
  debug, and pagemap)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix uninitialized use in overcommit_policy_handler
  mm/memory_failure: fix the missing pte_unmap() call
  kasan: always respect CONFIG_KASAN_STACK
  sh: pgtable-3level: fix cast to pointer from integer of different size
  mm/debug: sync up latest migrate_reason to migrate_reason_names
  mm/debug: sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN
  mm: fs: invalidate bh_lrus for only cold path
  lib/zlib_inflate/inffast: check config in C to avoid unused function warning
  tools/vm/page-types: remove dependency on opt_file for idle page tracking
  scripts/sorttable: riscv: fix undeclared identifier 'EM_RISCV' error
  ocfs2: drop acl cache for directories too
  mm/shmem.c: fix judgment error in shmem_is_huge()
  xtensa: increase size of gcc stack frame check
  mm/damon: don't use strnlen() with known-bogus source length
  kasan: fix Kconfig check of CC_HAS_WORKING_NOSANITIZE_ADDRESS
  mm, hwpoison: add is_free_buddy_page() in HWPoisonHandlable()
2021-09-25 16:20:34 -07:00
Linus Torvalds 90316e6ea0 linux-kselftest-fixes-5.15-rc3
This Kselftest fixes update for Linux 5.15-rc3 consists of:
 
 - fix to Kselftest common framework header install to run before
   other targets for it work correctly in parallel build case.
 - fixes to kvm test to not ignore fscanf() returns which could
   result in inconsistent test behavior and failures.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFOL14ACgkQCwJExA0N
 Qxyz9hAA6LKcNpv7BGXWpTNB2z/p9KcbSWdA3eejvb+3kiovd2AyFPvf1U1fT5kA
 VlJWk4d3cbhPkCrXmkpsI2vBuSb+u5HcwoEy0WH4T4q2aDjoEOnyn3xr2ktnBdXl
 oeWqPusSRKTJn7gCCZN+H6s5JaUbGZxTWsL2n0v5Pzb6sW4hikc8nXpd+1IeoTKq
 0xg7FM6NcaWRzKb6D97wFpQEo6mnC9Zifv6TalxQn71d//n9MXGW2600ZSDDfBRG
 XfolWqVUHGI2Lyy0mIT788fmntA7xOka3Tajzk4WrfmcgczABJFRJr8ZG6iZ+J0j
 dT+KaTqEZHcL1L9Pusf2VehJZTUwKMenc2NmOZ+n6pK+PUL/KZNobcXoFW01I9jI
 z4UYeH9DUnTiaTP8b7OJ1RB7H5XU3CncuO2gELkera212XckdmiTldpox0ywwyh9
 x+X6mk4lzbDSFMPrSPJrVTasvLMEZv6A05I8Td8Gw7mgeYIxupyk0/k5jgjMlZGN
 6CHyQu4iGre3oM72snuthomtzqwArF3bhWAm7ooZYG4qLQ6z6naDdvcpWYmszIV0
 RuC74FZCHZWGXXojVJWPquf47C79QPFQyQcwXxfAaSMY3XezVsI7c/QaXLoT1VeA
 BZbsUoAPvLa8mLRiFU2jZmLbx57rxej5zM+UvKJjI1JdCL9+cUA=
 =Q9zx
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - fix to Kselftest common framework header install to run before other
   targets for it work correctly in parallel build case.

 - fixes to kvm test to not ignore fscanf() returns which could result
   in inconsistent test behavior and failures.

* tag 'linux-kselftest-fixes-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
  selftests: kvm: move get_run_delay() into lib/test_util
  selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
  selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
  selftests: be sure to make khdr before other targets
2021-09-25 15:30:29 -07:00
Linus Torvalds 2c4e969c38 USB driver fixes for 5.15-rc3
Here are some USB driver fixes and new device ids for 5.15-rc3.
 
 They include:
 	- usb-storage quirk additions
 	- usb-serial new device ids
 	- usb-serial driver fixes
 	- USB roothub registration bugfix to resolve a long-reported
 	  issue
 	- usb gadget driver fixes for a large number of small things
 	- dwc2 driver fixes
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYU8uKQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynjlgCeIE84XRGpjE/jCK/63Sjve9zyJjoAn2ZUFwLN
 lcLJxlHV3XHK8coC5/YZ
 =hNgV
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some USB driver fixes and new device ids for 5.15-rc3.

  They include:

   - usb-storage quirk additions

   - usb-serial new device ids

   - usb-serial driver fixes

   - USB roothub registration bugfix to resolve a long-reported issue

   - usb gadget driver fixes for a large number of small things

   - dwc2 driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
  USB: serial: option: add device id for Foxconn T99W265
  USB: serial: cp210x: add ID for GW Instek GDM-834x Digital Multimeter
  USB: serial: cp210x: add part-number debug printk
  USB: serial: cp210x: fix dropped characters with CP2102
  MAINTAINERS: usb, update Peter Korsgaard's entries
  usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
  usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
  Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
  USB: serial: option: remove duplicate USB device ID
  USB: serial: mos7840: remove duplicated 0xac24 device ID
  arm64: dts: qcom: ipq8074: remove USB tx-fifo-resize property
  usb: gadget: f_uac2: Populate SS descriptors' wBytesPerInterval
  usb: gadget: f_uac2: Add missing companion descriptor for feedback EP
  usb: dwc2: gadget: Fix ISOC transfer complete handling for DDMA
  usb: core: hcd: Modularize HCD stop configuration in usb_stop_hcd()
  xhci: Set HCD flag to defer primary roothub registration
  usb: core: hcd: Add support for deferring roothub registration
  usb: dwc2: gadget: Fix ISOC flow for BDMA and Slave
  usb: dwc3: core: balance phy init and exit
  Revert "USB: bcma: Add a check for devm_gpiod_get"
  ...
2021-09-25 10:10:38 -07:00
Jakub Kicinski 7fe7f3182a Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

1) ipset limits the max allocatable memory via kvmalloc() to MAX_INT,
   from Jozsef Kadlecsik.

2) Check ip_vs_conn_tab_bits value to be in the range specified
   in Kconfig, from Andrea Claudi.

3) Initialize fragment offset in ip6tables, from Jeremy Sowden.

4) Make conntrack hash chain length random, from Florian Westphal.

5) Add zone ID to conntrack and NAT hashtuple again, also from Florian.

6) Add selftests for bidirectional zone support and colliding tuples,
   from Florian Westphal.

7) Unlink table before synchronize_rcu when cleaning tables with
   owner, from Florian.

8) ipset limits the max allocatable memory via kvmalloc() to MAX_INT.

9) Release conntrack entries via workqueue in masquerade, from Florian.

10) Fix bogus net_init in iptables raw table definition, also from Florian.

11) Work around missing softdep in log extensions, from Florian Westphal.

12) Serialize hash resizes and cleanups with mutex, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
  netfilter: conntrack: serialize hash resizes and cleanups
  netfilter: log: work around missing softdep backend module
  netfilter: iptable_raw: drop bogus net_init annotation
  netfilter: nf_nat_masquerade: defer conntrack walk to work queue
  netfilter: nf_nat_masquerade: make async masq_inet6_event handling generic
  netfilter: nf_tables: Fix oversized kvmalloc() calls
  netfilter: nf_tables: unlink table before deleting it
  selftests: netfilter: add zone stress test with colliding tuples
  selftests: netfilter: add selftest for directional zone support
  netfilter: nat: include zone id in nat table hash again
  netfilter: conntrack: include zone id in tuple hash again
  netfilter: conntrack: make max chain length random
  netfilter: ip6_tables: zero-initialize fragment offset
  ipvs: check that ip_vs_conn_tab_bits is between 8 and 20
  netfilter: ipset: Fix oversized kvmalloc() calls
====================

Link: https://lore.kernel.org/r/20210924221113.348767-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-24 17:27:20 -07:00
Changbin Du ebaeab2fe8 tools/vm/page-types: remove dependency on opt_file for idle page tracking
Idle page tracking can also be used for process address space, not only
file mappings.

Without this change, using with '-i' option for process address space
encounters below errors reported.

  $ sudo ./page-types -p $(pidof bash) -i
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  mark page idle: Bad file descriptor
  ...

Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-24 16:13:35 -07:00
Yonghong Song 091037fb77 selftests/bpf: Fix btf_dump __int128 test failure with clang build kernel
With clang build kernel (adding LLVM=1 to kernel and selftests/bpf build
command line), I hit the following test failure:

  $ ./test_progs -t btf_dump
  ...
  btf_dump_data:PASS:ensure expected/actual match 0 nsec
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  btf_dump_data:FAIL:find type id unexpected find type id: actual -2 < expected 0
  test_btf_dump_int_data:FAIL:dump __int128 unexpected error: -2 (errno 2)
  #15/9 btf_dump/btf_dump: int_data:FAIL

Further analysis showed gcc build kernel has type "__int128" in dwarf/BTF
and it doesn't exist in clang build kernel. Code searching for kernel code
found the following:
  arch/s390/include/asm/types.h:  unsigned __int128 pair;
  crypto/ecc.c:   unsigned __int128 m = (unsigned __int128)left * right;
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  include/linux/math64.h: return (u64)(((unsigned __int128)a * mul) >> shift);
  lib/ubsan.h:typedef __int128 s_max;
  lib/ubsan.h:typedef unsigned __int128 u_max;

In my case, CONFIG_UBSAN is not enabled. Even if we only have "unsigned __int128"
in the code, somehow gcc still put "__int128" in dwarf while clang didn't.
Hence current test works fine for gcc but not for clang.

Enabling CONFIG_UBSAN is an option to provide __int128 type into dwarf
reliably for both gcc and clang, but not everybody enables CONFIG_UBSAN
in their kernel build. So the best choice is to use "unsigned __int128" type
which is available in both clang and gcc build kernels. But clang and gcc
dwarf encoded names for "unsigned __int128" are different:

  [$ ~] cat t.c
  unsigned __int128 a;
  [$ ~] gcc -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000031 "__int128 unsigned")
                  DW_AT_name      ("__int128 unsigned")
  [$ ~] clang -g -c t.c && llvm-dwarfdump t.o | grep __int128
                  DW_AT_type      (0x00000033 "unsigned __int128")
                  DW_AT_name      ("unsigned __int128")

The test change in this patch tries to test type name before
doing actual test.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210924025856.2192476-1-yhs@fb.com
2021-09-24 15:35:03 -07:00
Linus Torvalds 1b7eaf5701 arm64 fixes:
- It turns out that the optimised string routines merged in 5.14 are not
   safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond the
   end of a string (strcmp, strncmp). Such reading may go across a 16
   byte tag granule and cause a tag check fault. When KASAN_HW_TAGS is
   enabled, use the generic strcmp/strncmp C implementation.
 
 - An errata workaround for ThunderX relied on the CPU capabilities being
   enabled in a specific order. This disappeared with the automatic
   generation of the cpucaps.h file (sorted alphabetically). Fix it by
   checking the current CPU only rather than the system-wide capability.
 
 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.
 
 - kselftests: skip arm64 tests if the required features are missing.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmFOC7oACgkQa9axLQDI
 XvHGhBAAkTzrLtyveymoeD5bk0Lh4Co2i0PKqoOl82ltvagjcvUY5lDLfgKV8ac2
 0fZ6E7Jjt6Khq0VAWryRf/RkjxTXmuqDmyb7hZ/t0TNciRXKMfQyE8o8KhsWOvRF
 hD0COUOnJ4CrgQuAJwsW9uagcrix9NucK9WziHB65RACxnD+k7foffCNv1b9Tw9s
 oqAfZ144B+L1Z0WFoqG4uGkngIyG/lcSPJMhDRo45iRWtNUnlR/m/ApWKb+r+X1X
 /Aq2dWklUnrGBB1MdO3WAaRH/lvulH6t4KVhTJw2Gkq5ogy+mYLO15wGYmM+FAWT
 ExS8fy4/G2vX4RAxYRBZF7Xl3zN7F9n3RYKyYfxgGwSy30TnOLYUMoemYvA9Eh27
 O6cjaZacyPVfaB5LUwj6H84NbDbcJKCO775sLuIW8IEMklyZhorS8HG7nYnL6rBk
 jXM5AVPC2j8kXCgawW9u+s/pEQAvLOvM9y7Q0pjj3n2WNeV09bxzkgX5wStSaMYJ
 Ok95zCK1SbKBYdA7yQnFcQkM52XkqpML8vccBHUdxVbtAetuYxoyGKKs/9G6mPEH
 vWSxP4ymvRza62EofmUJzNIEMWwwNUgyNeY/bVJXYf1oc324Wx4T1ewQagrGVQve
 miEgOVEpn+ygHFkEO3mc0wnFfQ3m5CSBvOjeWlTbv7lgekRDyx4=
 =JIq+
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - It turns out that the optimised string routines merged in 5.14 are
   not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
   the end of a string (strcmp, strncmp). Such reading may go across a
   16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
   is enabled, use the generic strcmp/strncmp C implementation.

 - An errata workaround for ThunderX relied on the CPU capabilities
   being enabled in a specific order. This disappeared with the
   automatic generation of the cpucaps.h file (sorted alphabetically).
   Fix it by checking the current CPU only rather than the system-wide
   capability.

 - Add system_supports_mte() checks on the kernel entry/exit path and
   thread switching to avoid unnecessary barriers and function calls on
   systems where MTE is not supported.

 - kselftests: skip arm64 tests if the required features are missing.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Restore forced disabling of KPTI on ThunderX
  kselftest/arm64: signal: Skip tests if required features are missing
  arm64: Mitigate MTE issues with str{n}cmp()
  arm64: add MTE supported check to thread switching and syscall entry/exit
2021-09-24 11:12:17 -07:00
Numfor Mbiziwo-Tiapo 5ba1071f75 x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:

"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)

These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.

 [ bp: Massage commit message. ]

Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
2021-09-24 12:37:38 +02:00
Oliver Upton 386ca9d7fd selftests: KVM: Explicitly use movq to read xmm registers
Compiling the KVM selftests with clang emits the following warning:

>> include/x86_64/processor.h:297:25: error: variable 'xmm0' is uninitialized when used here [-Werror,-Wuninitialized]
>>                return (unsigned long)xmm0;

where xmm0 is accessed via an uninitialized register variable.

Indeed, this is a misuse of register variables, which really should only
be used for specifying register constraints on variables passed to
inline assembly. Rather than attempting to read xmm registers via
register variables, just explicitly perform the movq from the desired
xmm register.

Fixes: 783e9e5126 ("kvm: selftests: add API testing infrastructure")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210924005147.1122357-1-oupton@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:58 -04:00
Oliver Upton fbf094ce52 selftests: KVM: Call ucall_init when setting up in rseq_test
While x86 does not require any additional setup to use the ucall
infrastructure, arm64 needs to set up the MMIO address used to signal a
ucall to userspace. rseq_test does not initialize the MMIO address,
resulting in the test spinning indefinitely.

Fix the issue by calling ucall_init() during setup.

Fixes: 61e52f1630 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs")
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210923220033.4172362-1-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24 02:32:12 -04:00
Linus Torvalds f10f0481a5 A fix for a bug with restartable sequences and KVM. KVM's handling
of TIF_NOTIFY_RESUME, e.g. for task migration, clears the flag without
 informing rseq and leads to stale data in userspace's rseq struct.
 
 I'm sending this as a separate pull request since it's not code
 that I usually touch.  In particular, patch 2 ("entry: rseq: Call
 rseq_handle_notify_resume() in tracehook_notify_resume()") is just a
 cleanup to try and make future bugs less likely.  If you prefer this to
 be sent via Thomas and only in 5.16, please speak up.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmFLPYgUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNCowf6A9pTuDspCC2IRICfQnHj/Q/Xc7HH
 UmwYo26GctfYaq9AyIAGzcRx3iOEGjb9fZVJ6mzPUGIygio9ZyuiPHogf7lMAb+x
 39ts5uSOp+N+8e0fvX578WFfmG5hQa4Tp9W3T2Y5KsVgK2Nf8F08DckzIgD8cbkN
 NQKTRIi8AYgb20y3NFZjzsPRxF8850QK7xVCI+LBjryyWpEGT5ZsthrYUeexiJPz
 XN+VOYJen5GXVBCar2JbA7EVSrMZbKSy+M3fJ1vuW5dZHySaiu69JXJHop71jTnJ
 5BGue917MfH6RTDzIFFUcg7NmwcuXHpw4dsFeiyExYFNw1uWWQpk0efC1g==
 =/xlE
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull rseq fixes from Paolo Bonzini:
 "A fix for a bug with restartable sequences and KVM.

  KVM's handling of TIF_NOTIFY_RESUME, e.g. for task migration, clears
  the flag without informing rseq and leads to stale data in userspace's
  rseq struct"

* tag 'for-linus-rseq' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Remove __NR_userfaultfd syscall fallback
  KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
  tools: Move x86 syscall number fallbacks to .../uapi/
  entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()
  KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest
2021-09-23 11:24:12 -07:00
Jakub Kicinski 2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Linus Torvalds 9bc62afe03 Networking fixes for 5.15-rc3.
Current release - regressions:
 
  - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()
 
 Previous releases - regressions:
 
  - introduce a shutdown method to mdio device drivers, and make DSA
    switch drivers compatible with masters disappearing on shutdown;
    preventing infinite reference wait
 
  - fix issues in mdiobus users related to ->shutdown vs ->remove
 
  - virtio-net: fix pages leaking when building skb in big mode
 
  - xen-netback: correct success/error reporting for the SKB-with-fraglist
 
  - dsa: tear down devlink port regions when tearing down the devlink
         port on error
 
  - nexthop: fix division by zero while replacing a resilient group
 
  - hns3: check queue, vf, vlan ids range before using
 
 Previous releases - always broken:
 
  - napi: fix race against netpoll causing NAPI getting stuck
 
  - mlx4_en: ensure link operstate is updated even if link comes up
             before netdev registration
 
  - bnxt_en: fix TX timeout when TX ring size is set to the smallest
 
  - enetc: fix illegal access when reading affinity_hint;
           prevent oops on sysfs access
 
  - mtk_eth_soc: avoid creating duplicate offload entries
 
 Misc:
 
  - core: correct the sock::sk_lock.owned lockdep annotations
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFMr4MACgkQMUZtbf5S
 Irv6Ag/+Ml4q6/IVO0jBppztZO1RrSalb3YE9JjPQyMchauVcdcADNpYF+Jo/gcH
 4q+/Oikfp6gkQpJTFd0Y9X7UhwA4Jm4wWtEisqc6PJeHOagDZmVUn353WtgpnCNL
 CgYBfa5k3msGudkqgeXIyiP/2sekBevTy+fOptubLZClyBrEwNUUUZBlpT9aI9Sj
 ru1eMYklfcxP60AQgNhqq6ZwJnRELgN75fSR6ypVCGcRnTK4UGL/b6TvnPYn8uYY
 zeNuMZZzYZK5B73tC6rWpteHWZ7VW3Km0WvIKs+ORM8nYchz/EprKZ0HCLPYrWvf
 ib5Wi7HyL7/n9k9NUTCGrQY3tkOWNzXOepjpiBZPqCG9r2hc3JSR7Q2lFwL+gKv/
 sh2y+T2xfp0WFGmG2XiU2MgnkypMSKah1sC/XRE7YLw02vPAnWQxxl/KVNek4j7M
 CH/Tg9ErVKDRLN7KO/kKl3s8I8N4hdctms/YUt9QD5J9Rw/Jqwr/79bq1uLy6d4o
 //ipmCTHex57Nvy80PtgcuKJhoeqGwR/Av6BvBMRZ1SOYs/C6q45skHTlYyiNY3+
 Dyj9+nfrhsyE835GKPe8lqBFZONBXpXw+EUNXeYRiv0Pcd+JKek07bbajSQVSpd8
 8nqQwylpGII0iPGyOc9wKajzh7O5W2odFIdOwtY/5yVjrcFgBd0=
 =VcL3
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Current release - regressions:

   - dsa: bcm_sf2: fix array overrun in bcm_sf2_num_active_ports()

  Previous releases - regressions:

   - introduce a shutdown method to mdio device drivers, and make DSA
     switch drivers compatible with masters disappearing on shutdown;
     preventing infinite reference wait

   - fix issues in mdiobus users related to ->shutdown vs ->remove

   - virtio-net: fix pages leaking when building skb in big mode

   - xen-netback: correct success/error reporting for the
     SKB-with-fraglist

   - dsa: tear down devlink port regions when tearing down the devlink
     port on error

   - nexthop: fix division by zero while replacing a resilient group

   - hns3: check queue, vf, vlan ids range before using

  Previous releases - always broken:

   - napi: fix race against netpoll causing NAPI getting stuck

   - mlx4_en: ensure link operstate is updated even if link comes up
     before netdev registration

   - bnxt_en: fix TX timeout when TX ring size is set to the smallest

   - enetc: fix illegal access when reading affinity_hint; prevent oops
     on sysfs access

   - mtk_eth_soc: avoid creating duplicate offload entries

  Misc:

   - core: correct the sock::sk_lock.owned lockdep annotations"

* tag 'net-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
  atlantic: Fix issue in the pm resume flow.
  net/mlx4_en: Don't allow aRFS for encapsulated packets
  net: mscc: ocelot: fix forwarding from BLOCKING ports remaining enabled
  net: ethernet: mtk_eth_soc: avoid creating duplicate offload entries
  nfc: st-nci: Add SPI ID matching DT compatible
  MAINTAINERS: remove Guvenc Gulce as net/smc maintainer
  nexthop: Fix memory leaks in nexthop notification chain listeners
  mptcp: ensure tx skbs always have the MPTCP ext
  qed: rdma - don't wait for resources under hw error recovery flow
  s390/qeth: fix deadlock during failing recovery
  s390/qeth: Fix deadlock in remove_discipline
  s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
  net: dsa: realtek: register the MDIO bus under devres
  net: dsa: don't allocate the slave_mii_bus using devres
  Doc: networking: Fox a typo in ice.rst
  net: dsa: fix dsa_tree_setup error path
  net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
  net/smc: add missing error check in smc_clc_prfx_set()
  net: hns3: fix a return value error in hclge_get_reset_status()
  net: hns3: check vlan id before using it
  ...
2021-09-23 10:30:31 -07:00
Maxim Levitsky 1ad32105d7 KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0
Test that if:

* L1 disables virtual interrupt masking, and INTR intercept.

* L1 setups a virtual interrupt to be injected to L2 and enters L2 with
  interrupts disabled, thus the virtual interrupt is pending.

* Now an external interrupt arrives in L1 and since
  L1 doesn't intercept it, it should be delivered to L2 when
  it enables interrupts.

  to do this L0 (abuses) V_IRQ to setup an
  interrupt window, and returns to L2.

* L2 enables interrupts.
  This should trigger the interrupt window,
  injection of the external interrupt and delivery
  of the virtual interrupt that can now be done.

* Test that now L2 gets those interrupts.

This is the test that demonstrates the issue that was
fixed in the previous patch.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210914154825.104886-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23 10:05:07 -04:00
David Matlack 7c236b816e KVM: selftests: Create a separate dirty bitmap per slot
The calculation to get the per-slot dirty bitmap was incorrect leading
to a buffer overrun. Fix it by splitting out the dirty bitmap into a
separate bitmap per slot.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210917173657.44011-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack 9f2fc5554a KVM: selftests: Refactor help message for -s backing_src
All selftests that support the backing_src option were printing their
own description of the flag and then calling backing_src_help() to dump
the list of available backing sources. Consolidate the flag printing in
backing_src_help() to align indentation, reduce duplicated strings, and
improve consistency across tests.

Note: Passing "-s" to backing_src_help is unnecessary since every test
uses the same flag. However I decided to keep it for code readability
at the call sites.

While here this opportunistically fixes the incorrectly interleaved
printing -x help message and list of backing source types in
dirty_log_perf_test.

Fixes: 609e6202ea ("KVM: selftests: Support multiple slots in dirty_log_perf_test")
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:14 -04:00
David Matlack a1e638da1b KVM: selftests: Change backing_src flag to -s in demand_paging_test
Every other KVM selftest uses -s for the backing_src, so switch
demand_paging_test to match.

Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210917173657.44011-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:13 -04:00
Oliver Upton 01f91acb55 selftests: KVM: Align SMCCC call with the spec in steal_time
The SMC64 calling convention passes a function identifier in w0 and its
parameters in x1-x17. Given this, there are two deviations in the
SMC64 call performed by the steal_time test: the function identifier is
assigned to a 64 bit register and the parameter is only 32 bits wide.

Align the call with the SMCCC by using a 32 bit register to handle the
function identifier and increasing the parameter width to 64 bits.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210921171121.2148982-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Oliver Upton 90b54129e8 selftests: KVM: Fix check for !POLLIN in demand_paging_test
The logical not operator applies only to the left hand side of a bitwise
operator. As such, the check for POLLIN not being set in revents wrong.
Fix it by adding parentheses around the bitwise expression.

Fixes: 4f72180eb4 ("KVM: selftests: Add demand paging content to the demand paging test")
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20210921171121.2148982-2-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:08 -04:00
Sean Christopherson 2da4a23599 KVM: selftests: Remove __NR_userfaultfd syscall fallback
Revert the __NR_userfaultfd syscall fallback added for KVM selftests now
that x86's unistd_{32,63}.h overrides are under uapi/ and thus not in
KVM selftests' search path, i.e. now that KVM gets x86 syscall numbers
from the installed kernel headers.

No functional change intended.

Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson 61e52f1630 KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs
Add a test to verify an rseq's CPU ID is updated correctly if the task is
migrated while the kernel is handling KVM_RUN.  This is a regression test
for a bug introduced by commit 72c3c0fe54 ("x86/kvm: Use generic xfer
to guest work function"), where TIF_NOTIFY_RESUME would be cleared by KVM
without updating rseq, leading to a stale CPU ID and other badness.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Message-Id: <20210901203030.1292304-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:02 -04:00
Sean Christopherson de5f4213da tools: Move x86 syscall number fallbacks to .../uapi/
Move unistd_{32,64}.h from x86/include/asm to x86/include/uapi/asm so
that tools/selftests that install kernel headers, e.g. KVM selftests, can
include non-uapi tools headers, e.g. to get 'struct list_head', without
effectively overriding the installed non-tool uapi headers.

Swapping KVM's search order, e.g. to search the kernel headers before
tool headers, is not a viable option as doing results in linux/type.h and
other core headers getting pulled from the kernel headers, which do not
have the kernel-internal typedefs that are used through tools, including
many files outside of selftests/kvm's control.

Prior to commit cec07f53c3 ("perf tools: Move syscall number fallbacks
from perf-sys.h to tools/arch/x86/include/asm/"), the handcoded numbers
were actual fallbacks, i.e. overriding unistd_{32,64}.h from the kernel
headers was unintentional.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:24:01 -04:00
Jiri Benc 17b52c226a seltests: bpf: test_tunnel: Use ip neigh
The 'arp' command is deprecated and is another dependency of the selftest.
Just use 'ip neigh', the test depends on iproute2 already.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/40f24b9d3f0f53b5c44471b452f9a11f4d13b7af.1632236133.git.jbenc@redhat.com
2021-09-21 20:40:16 -07:00
Andrii Nakryiko cc10623c68 libbpf: Add legacy uprobe attaching support
Similarly to recently added legacy kprobe attach interface support
through tracefs, support attaching uprobes using the legacy interface if
host kernel doesn't support newer FD-based interface.

For uprobes event name consists of "libbpf_" prefix, PID, sanitized
binary path and offset within that binary. Structuraly the code is
aligned with kprobe logic refactoring in previous patch. struct
bpf_link_perf is re-used and all the same legacy_probe_name and
legacy_is_retprobe fields are used to ensure proper cleanup on
bpf_link__destroy().

Users should be aware, though, that on old kernels which don't support
FD-based interface for kprobe/uprobe attachment, if the application
crashes before bpf_link__destroy() is called, uprobe legacy
events will be left in tracefs. This is the same limitation as with
legacy kprobe interfaces.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko 46ed5fc33d libbpf: Refactor and simplify legacy kprobe code
Refactor legacy kprobe handling code to follow the same logic as uprobe
legacy logic added in the next patchs:
  - add append_to_file() helper that makes it simpler to work with
    tracefs file-based interface for creating and deleting probes;
  - move out probe/event name generation outside of the code that
    adds/removes it, which simplifies bookkeeping significantly;
  - change the probe name format to start with "libbpf_" prefix and
    include offset within kernel function;
  - switch 'unsigned long' to 'size_t' for specifying kprobe offsets,
    which is consistent with how uprobes define that, simplifies
    printf()-ing internally, and also avoids unnecessary complications on
    architectures where sizeof(long) != sizeof(void *).

This patch also implicitly fixes the problem with invalid open() error
handling present in poke_kprobe_events(), which (the function) this
patch removes.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko d3b0e3b03c selftests/bpf: Adopt attach_probe selftest to work on old kernels
Make sure to not use ref_ctr_off feature when running on old kernels
that don't support this feature. This allows to test libbpf's legacy
kprobe and uprobe logic on old kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-3-andrii@kernel.org
2021-09-21 19:40:09 -07:00
Andrii Nakryiko 303a257223 libbpf: Fix memory leak in legacy kprobe attach logic
In some error scenarios legacy_probe string won't be free()'d. Fix this.
This was reported by Coverity static analysis.

Fixes: ca304b40c2 ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org
2021-09-21 19:40:08 -07:00
Cristian Marussi 0e3dbf765f kselftest/arm64: signal: Skip tests if required features are missing
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.

Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.

Fixes: f96bf43403 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210920121228.35368-1-cristian.marussi@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-21 18:12:03 +01:00
Florian Westphal cb89f63ba6 selftests: netfilter: add zone stress test with colliding tuples
Add 20k entries to the connection tracking table, once from the
data plane, once via ctnetlink.

In both cases, each entry lives in a different conntrack zone
and addresses/ports are identical.

Expectation is that insertions work and occurs in constant time:

PASS: added 10000 entries in 1215 ms (now 10000 total, loop 1)
PASS: added 10000 entries in 1214 ms (now 20000 total, loop 2)
PASS: inserted 20000 entries from packet path in 2434 ms total
PASS: added 10000 entries in 57631 ms (now 10000 total)
PASS: added 10000 entries in 58572 ms (now 20000 total)
PASS: inserted 20000 entries via ctnetlink in 116205 ms

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Florian Westphal 0f1148abb2 selftests: netfilter: add selftest for directional zone support
Add a script to exercise NAT port clash resolution with directional zones.

Add net namespaces that use the same IP address and connect them to a
gateway.

Gateway uses policy routing based on iif/mark and conntrack zones to
isolate the client namespaces.  In server direction, same zone with NAT
to single address is used.

Then, connect to a server from each client netns, using identical
connection id, i.e.  saddr:sport -> daddr:dport.

Expectation is for all connections to succeeed: NAT gatway is
supposed to do port reallocation for each of the (clashing) connections.

This is based on the description/use case provided in the commit message of
deedb59039 ("netfilter: nf_conntrack: add direction support for zones").

Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-09-21 03:46:55 +02:00
Grant Seltzer 97c140d94e libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:
- bpf_object__find_map_by_name()
- bpf_map__fd()
- bpf_map__is_internal()
- libbpf_get_error()
- libbpf_num_possible_cpus()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com
2021-09-20 17:23:31 -07:00
Linus Torvalds 62453a460a powerpc fixes for 5.15 #2
Fix crashes when scv (System Call Vectored) is used to make a syscall when a transaction
 is active, on Power9 or later.
 
 Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend mode.
 
 Fix crashes when handling machine checks in LPARs using the Hash MMU.
 
 Partly revert a recent change to our XICS interrupt controller code, which broke the
 recently added Microwatt support.
 
 Thanks to: Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel Stanley,
 Nicholas Piggin.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmFHDfsTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgNlAD/oC/14jV2fCv54vH5Al1XAz2WuPhhEu
 UsN+MLlx7JshXfLeGzzEogvvakfLFuVkpIxsv+g8a9H6jejjCj/A4wjUy34Im0hS
 nhsMsa1vZPZY/Vz9v3WzdYmqjHVZDKo6ZdUmn0z4OjPnv7wQaEEbaXmoe0+CaXXQ
 jIY2TjVQxbgXeUI1QcPUWh36hiSaYn4O8fAD4VdjxdStl2z6q8zbo/5nzQ8dkBpP
 lldC0T/HRCKLe1ODUExFfbBFqFuLVdbALq0RIuh0GXd+dAqpK1iOkNsxUWChU6Q7
 gu0PrmhRydX3bmL9jmOsZfUk/hIR0fEgwSGITQAb8v0LCvZI0JRn+yvkl7uVZ9Ce
 OWY7B8HZQldvoQqA4/f7YCpAIS2ag7ugDrcAfuDVkDl2o/qJHoTi58IU24wI6+LL
 lwJZjNJoxZ8n1tl1l+gtCNp4E4YYOfuKZi0zH0oJoLiSWeK6PYOvwEGuEC2iQ9cJ
 BEdyYVXqSBuc2anCQ4k+y8gu/zJwP1MCpDNMPMU4P/JcaG9lnrLUhvctf4GKtOSy
 UiktHSTL52J5ldMxSMicpw9qCVgePsQX78Sk3UTHRilRmix6TtWOWYslm4mL+zeD
 u3wpnK+NKkju72LRS/AB1U6cWvModW+WBANPws9j6caoOorh397xsDGEYiKWvp0Z
 xSG/dzXzi2oscQ==
 =yC9U
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix crashes when scv (System Call Vectored) is used to make a syscall
   when a transaction is active, on Power9 or later.

 - Fix bad interactions between rfscv (Return-from scv) and Power9
   fake-suspend mode.

 - Fix crashes when handling machine checks in LPARs using the Hash MMU.

 - Partly revert a recent change to our XICS interrupt controller code,
   which broke the recently added Microwatt support.

Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero,
Joel Stanley, Nicholas Piggin.

* tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xics: Set the IRQ chip data for the ICS native backend
  powerpc/mce: Fix access error in mce handler
  KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers
  powerpc/64s: system call rfscv workaround for TM bugs
  selftests/powerpc: Add scv versions of the basic TM syscall tests
  powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
2021-09-19 13:00:23 -07:00
Shuah Khan e30cd812df selftests: net: af_unix: Fix makefile to use TEST_GEN_PROGS
Makefile uses TEST_PROGS instead of TEST_GEN_PROGS to define
executables. TEST_PROGS is for shell scripts that need to be
installed and run by the common lib.mk framework. The common
framework doesn't touch TEST_PROGS when it does build and clean.

As a result "make kselftest-clean" and "make clean" fail to remove
executables. Run and install work because the common framework runs
and installs TEST_PROGS. Build works because the Makefile defines
"all" rule which is unnecessary if TEST_GEN_PROGS is used.

Use TEST_GEN_PROGS so the common framework can handle build/run/
install/clean properly.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:22:18 +01:00
Shuah Khan 48514a2233 selftests: net: af_unix: Fix incorrect args in test result msg
Fix the args to fprintf(). Splitting the message ends up passing
incorrect arg for "sigurg %d" and an extra arg overall. The test
result message ends up incorrect.

test_unix_oob.c: In function ‘main’:
test_unix_oob.c:274:43: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘char *’ [-Wformat=]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",
      |                                          ~^
      |                                           |
      |                                           int
      |                                          %s
  275 |   "atmark %d\n", signal_recvd, len, oob, atmark);
      |   ~~~~~~~~~~~~~
      |   |
      |   char *
test_unix_oob.c:274:19: warning: too many arguments for format [-Wformat-extra-args]
  274 |   fprintf(stderr, "Test 3 failed, sigurg %d len %d OOB %c ",

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-19 13:15:19 +01:00
Andrii Nakryiko 219d720e6d perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id()
Perf code re-implements libbpf's btf__load_from_kernel_by_id() API as
a weak function, presumably to dynamically link against old version of
libbpf shared library. Unfortunately this causes compilation warning
when perf is compiled against libbpf v0.6+.

For now, just ignore deprecation warning, but there might be a better
solution, depending on perf's needs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: kernel-team@fb.com
LPU-Reference: 20210914170004.4185659-1-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:47:02 -03:00
Ian Rogers aba5daeb64 libperf evsel: Make use of FD robust.
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case of with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.

  $ sudo gdb --args perf stat --iostat  list
  ...
  Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
  50      {
  (gdb) bt
   #0  perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
   #1  0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
      threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
   #2  0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
      at util/evsel.c:2045
   #3  0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
      at util/evsel.c:2065
   #4  0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
      config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
   #5  0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:833
   #6  0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
      at builtin-stat.c:1048
   #7  0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
   #8  0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
      argv=0x7fffffffe4a0) at perf.c:313
   #9  0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
   #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
   #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
  ...
  (gdb) c
  Continuing.
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
  /bin/dmesg | grep -i perf may provide additional information.

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
  166                     if (FD(evsel, cpu, thread) >= 0)

v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
    backward.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:06 -03:00
Michael Petlan 57f0ff059e perf machine: Initialize srcline string member in add_location struct
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Adrian Hunter ff6f41fbce perf script: Fix ip display when type != attr->type
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to
output type is corrected.

Example on ADL using taskset to select an atom cpu:

 # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

 Before:

  # perf script | head
         taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
         taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
           uname   428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

 After:

  # perf script | head
         taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
         taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
           uname   428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Ravi Bangoria 7efbcc8c07 perf annotate: Fix fused instr logic for assembly functions
Some x86 microarchitectures fuse a subset of cmp/test/ALU instructions
with branch instructions, and thus perf annotate highlight such valid
pairs as fused.

When annotated with source, perf uses struct disasm_line to contain
either source or instruction line from objdump output. Usually, a C
statement generates multiple instructions which include such
cmp/test/ALU + branch instruction pairs. But in case of assembly
function, each individual assembly source line generate one
instruction.

The 'perf annotate' instruction fusion logic assumes the previous
disasm_line as the previous instruction line, which is wrong because,
for assembly function, previous disasm_line contains source line.  And
thus perf fails to highlight valid fused instruction pairs for assembly
functions.

Fix it by searching backward until we find an instruction line and
consider that disasm_line as fused with current branch instruction.

Before:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │      cmp    %rcx,0x88(%rsp)
         │    je      .Lerror_bad_iret      <--- Source line
    0.14 │   ┌──je     b4                   <--- Instruction line
         │   │movl    %ecx, %eax

After:
         │    cmpq    %rcx, RIP+8(%rsp)
    0.00 │   ┌──cmp    %rcx,0x88(%rsp)
         │   │je      .Lerror_bad_iret
    0.14 │   ├──je     b4
         │   │movl    %ecx, %eax

Reviewed-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 17:43:05 -03:00
Florian Westphal ce9979129a selftests: mptcp: add mptcp getsockopt test cases
Add a test program that retrieves the three info types:
1. mptcp meta information
2. tcp info for subflow
3. subflow endpoint addresses

For all three rudimentary checks are added.

1. Meta information checks that the logical mptcp
   sequence numbers advance as expected, based on the bytes read
   (init seq + bytes_received/sent) and the connection state
   (after close, we should exect 1 extra byte due to FIN).

2. TCP info checks the number of bytes sent/received vs.
   sums of read/write syscall return values.

3. Subflow endpoint addresses are checked vs. getsockname/getpeername
   result.

Tests for forward compatibility (0-initialisation of output-only
fields in mptcp_subflow_data structure) are added as well.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-18 14:20:01 +01:00
Dave Marchevsky a42effb0b2 bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments
Since the data_len in these two functions is a byte len of the preceding
u64 *data array, it must always be a multiple of 8. If this isn't the
case both helpers error out, so let's make the requirement explicit so
users don't need to infer it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky 7606729fe2 selftests/bpf: Add trace_vprintk test prog
This commit adds a test prog for vprintk which confirms that:
  * bpf_trace_vprintk is writing to /sys/kernel/debug/tracing/trace_pipe
  * __bpf_vprintk macro works as expected
  * >3 args are printed
  * bpf_printk w/ 0 format args compiles
  * bpf_trace_vprintk call w/ a fmt specifier but NULL fmt data fails

Approach and code are borrowed from trace_printk test.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-9-davemarchevsky@fb.com
2021-09-17 14:02:06 -07:00
Dave Marchevsky d313d45a22 selftests/bpf: Migrate prog_tests/trace_printk CHECKs to ASSERTs
Guidance for new tests is to use ASSERT macros instead of CHECK. Since
trace_vprintk test will borrow heavily from trace_printk's, migrate its
CHECKs so it remains obvious that the two are closely related.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-8-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 4190c299a4 bpftool: Only probe trace_vprintk feature in 'full' mode
Since commit 368cb0e7cd ("bpftool: Make probes which emit dmesg
warnings optional"), some helpers aren't probed by bpftool unless
`full` arg is added to `bpftool feature probe`.

bpf_trace_vprintk can emit dmesg warnings when probed, so include it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-7-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 6c66b0e7c9 libbpf: Use static const fmt string in __bpf_printk
The __bpf_printk convenience macro was using a 'char' fmt string holder
as it predates support for globals in libbpf. Move to more efficient
'static const char', but provide a fallback to the old way via
BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky c2758baa97 libbpf: Modify bpf_printk to choose helper based on arg count
Instead of being a thin wrapper which calls into bpf_trace_printk,
libbpf's bpf_printk convenience macro now chooses between
bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding
format string) is >3, use bpf_trace_vprintk, otherwise use the older
helper.

The motivation behind this added complexity - instead of migrating
entirely to bpf_trace_vprintk - is to maintain good developer experience
for users compiling against new libbpf but running on older kernels.
Users who are passing <=3 args to bpf_printk will see no change in their
bytecode.

__bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF
macros elsewhere in the file - it allows use of bpf_trace_vprintk
without manual conversion of varargs to u64 array. Previous
implementation of bpf_printk macro is moved to __bpf_printk for use by
the new implementation.

This does change behavior of bpf_printk calls with >3 args in the "new
libbpf, old kernels" scenario. Before this patch, attempting to use 4
args to bpf_printk results in a compile-time error. After this patch,
using bpf_printk with 4 args results in a trace_vprintk helper call
being emitted and a load-time failure on older kernels.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 10aceb629e bpf: Add bpf_trace_vprintk helper
This helper is meant to be "bpf_trace_printk, but with proper vararg
support". Follow bpf_snprintf's example and take a u64 pseudo-vararg
array. Write to /sys/kernel/debug/tracing/trace_pipe using the same
mechanism as bpf_trace_printk. The functionality of this helper was
requested in the libbpf issue tracker [0].

[0] Closes: https://github.com/libbpf/libbpf/issues/315

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Dave Marchevsky 84b4c52960 selftests/bpf: Stop using bpf_program__load
bpf_program__load is not supposed to be used directly. Replace it with
bpf_object__ APIs for the reference_tracking prog_test, which is the
last offender in bpf selftests.

Some additional complexity is added for this test, namely the use of one
bpf_object to iterate through progs, while a second bpf_object is
created and opened/closed to test actual loading of progs. This is
because the test was doing bpf_program__load then __unload to test
loading of individual progs and same semantics with
bpf_object__load/__unload result in failure to load an __unload-ed obj.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-3-davemarchevsky@fb.com
2021-09-17 14:02:05 -07:00
Jakub Kicinski af54faab84 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-09-17

We've added 63 non-merge commits during the last 12 day(s) which contain
a total of 65 files changed, 2653 insertions(+), 751 deletions(-).

The main changes are:

1) Streamline internal BPF program sections handling and
   bpf_program__set_attach_target() in libbpf, from Andrii.

2) Add support for new btf kind BTF_KIND_TAG, from Yonghong.

3) Introduce bpf_get_branch_snapshot() to capture LBR, from Song.

4) IMUL optimization for x86-64 JIT, from Jie.

5) xsk selftest improvements, from Magnus.

6) Introduce legacy kprobe events support in libbpf, from Rafael.

7) Access hw timestamp through BPF's __sk_buff, from Vadim.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
  selftests/bpf: Fix a few compiler warnings
  libbpf: Constify all high-level program attach APIs
  libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
  selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
  libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
  libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
  selftests/bpf: Stop using relaxed_core_relocs which has no effect
  libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
  bpf: Update bpf_get_smp_processor_id() documentation
  libbpf: Add sphinx code documentation comments
  selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
  docs/bpf: Add documentation for BTF_KIND_TAG
  selftests/bpf: Add a test with a bpf program with btf_tag attributes
  selftests/bpf: Test BTF_KIND_TAG for deduplication
  selftests/bpf: Add BTF_KIND_TAG unit tests
  selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
  selftests/bpf: Test libbpf API function btf__add_tag()
  bpftool: Add support for BTF_KIND_TAG
  libbpf: Add support for BTF_KIND_TAG
  libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  ...
====================

Link: https://lore.kernel.org/r/20210917173738.3397064-1-ast@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 12:40:21 -07:00
Yonghong Song ca21a3e5ed selftests/bpf: Fix a few compiler warnings
With clang building selftests/bpf, I hit a few warnings like below:

  .../bpf_iter.c:592:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../bpf_iter.c:688:48: warning: variable 'expected_key_c' set but not used [-Wunused-but-set-variable]
  __u32 expected_key_a = 0, expected_key_b = 0, expected_key_c = 0;
                                                ^

  .../tc_redirect.c:657:6: warning: variable 'target_fd' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
  if (!ASSERT_OK_PTR(nstoken, "setns " NS_FWD))
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  .../tc_redirect.c:743:6: note: uninitialized use occurs here
  if (target_fd >= 0)
      ^~~~~~~~~

Removing unused variables and initializing the previously-uninitialized variable
to ensure these warnings are gone.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210917043343.3711917-1-yhs@fb.com
2021-09-17 09:10:54 -07:00
Andrii Nakryiko 942025c9f3 libbpf: Constify all high-level program attach APIs
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so
change all struct bpf_program and struct bpf_map pointers to const
pointers. This is completely backwards compatible with no functional
change.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 91b555d73e libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption
that bpf_object contains either only single freplace BPF program or all
of BPF programs in BPF object are freplaces intended to replace
different subprograms of the same target BPF program. This seems both
a bit confusing, too assuming, and limiting.

We've had bpf_program__set_attach_target() API which allows more
fine-grained control over this, on a per-program level. As such, mark
open_opts.attach_prog_fd as deprecated starting from v0.7, so that we
have one more universal way of setting freplace targets. With previous
change to allow NULL attach_func_name argument, and especially combined
with BPF skeleton, arguable bpf_program__set_attach_target() is a more
convenient and explicit API as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 60aed22076 selftests/bpf: Switch fexit_bpf2bpf selftest to set_attach_target() API
Switch fexit_bpf2bpf selftest to bpf_program__set_attach_target()
instead of using bpf_object_open_opts.attach_prog_fd, which is going to
be deprecated. These changes also demonstrate the new mode of
set_attach_target() in which it allows NULL when the target is BPF
program (attach_prog_fd != 0).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-6-andrii@kernel.org
2021-09-17 09:05:41 -07:00
Andrii Nakryiko 2d5ec1c66e libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
Allow to use bpf_program__set_attach_target to only set target attach
program FD, while letting libbpf to use target attach function name from
SEC() definition. This might be useful for some scenarios where
bpf_object contains multiple related freplace BPF programs intended to
replace different sub-programs in target BPF program. In such case all
programs will have the same attach_prog_fd, but different
attach_func_name. It's convenient to specify such target function names
declaratively in SEC() definitions, but attach_prog_fd is a dynamic
runtime setting.

To simplify such scenario, allow bpf_program__set_attach_target() to
delay BTF ID resolution till the BPF program load time by providing NULL
attach_func_name. In that case the behavior will be similar to using
bpf_object_open_opts.attach_prog_fd (which is marked deprecated since
v0.7), but has the benefit of allowing more control by user in what is
attached to what. Such setup allows having BPF programs attached to
different target attach_prog_fd with target functions still declaratively
recorded in BPF source code in SEC() definitions.

Selftests changes in the next patch should make this more obvious.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org
2021-09-17 09:05:00 -07:00
Andrii Nakryiko 277641859e libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
It's relevant and hasn't been doing anything for a long while now.
Deprecated it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko 23a7baaa93 selftests/bpf: Stop using relaxed_core_relocs which has no effect
relaxed_core_relocs option hasn't had any effect for a while now, stop
specifying it. Next patch marks it as deprecated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-3-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Andrii Nakryiko f11f86a393 libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
Don't perform another search for sec_def inside
libbpf_find_attach_btf_id(), as each recognized bpf_program already has
prog->sec_def set.

Also remove unnecessary NULL check for prog->sec_name, as it can never
be NULL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org
2021-09-17 09:04:12 -07:00
Vladimir Oltean 3c9cfb5269 net: update NXP copyright text
NXP Legal insists that the following are not fine:

- Saying "NXP Semiconductors" instead of "NXP", since the company's
  registered name is "NXP"

- Putting a "(c)" sign in the copyright string

- Putting a comma in the copyright string

The only accepted copyright string format is "Copyright <year-range> NXP".

This patch changes the copyright headers in the networking files that
were sent by me, or derived from code sent by me.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-17 13:52:17 +01:00
Peter Zijlstra db2b0c5d7b objtool: Support pv_opsindirect calls for noinstr
Normally objtool will now follow indirect calls; there is no need.

However, this becomes a problem with noinstr validation; if there's an
indirect call from noinstr code, we very much need to know it is to
another noinstr function. Luckily there aren't many indirect calls in
entry code with the obvious exception of paravirt. As such, noinstr
validation didn't work with paravirt kernels.

In order to track pv_ops[] call targets, objtool reads the static
pv_ops[] tables as well as direct assignments to the pv_ops[] array,
provided the compiler makes them a single instruction like:

  bf87:       48 c7 05 00 00 00 00 00 00 00 00        movq   $0x0,0x0(%rip)
    bf92 <xen_init_spinlocks+0x5f>
    bf8a: R_X86_64_PC32     pv_ops+0x268

There are, as of yet, no warnings for when this goes wrong :/

Using the functions found with the above means, all pv_ops[] calls are
now subject to noinstr validation.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095149.118815755@infradead.org
2021-09-17 13:20:26 +02:00
Jakub Kicinski 561bed688b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts!

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-16 13:58:38 -07:00
Linus Torvalds fc0c0548c1 Networking fixes for 5.15-rc2, including fixes from bpf.
Current release - regressions:
 
  - vhost_net: fix OoB on sendmsg() failure
 
  - mlx5: bridge, fix uninitialized variable usage
 
  - bnxt_en: fix error recovery regression
 
 Current release - new code bugs:
 
  - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()
 
 Previous releases - regressions:
 
  - r6040: restore MDIO clock frequency after MAC reset
 
  - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
 
  - dsa: flush switchdev workqueue before tearing down CPU/DSA ports
 
 Previous releases - always broken:
 
  - ptp: dp83640: don't define PAGE0, avoid compiler warning
 
  - igc: fix tunnel segmentation offloads
 
  - phylink: update SFP selected interface on advertising changes
 
  - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume
 
  - mlx5e: fix mutual exclusion between CQE compression and HW TS
 
 Misc:
 
  - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode
 
  - sfc: fallback for lack of xdp tx queues
 
  - hns3: add option to turn off page pool feature
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmFDnXIACgkQMUZtbf5S
 Irs+pA//UvT0UqL+G/9He9eVMaX/7mnvoShbU/hHaymLuvJE4dx6B3qw8im5fQ0i
 b7ovy1dxjMjSQI4E+2FleajRBgsdXvWgMCopFZr7h35+qPXziLxTTztwsaOIjD8U
 42HBpdwFrVtdrOhSsiKsGKe8KXqIehHhFAp4vQrRlsvsz1Yk/93o6wYaXaXqR6pQ
 qlTVgI7nAuLAw4VfFZ3k7iAiQsyrXrn5kUXq5/Wy6kK7iFnIN7X9j+epTJP/EXHn
 +17SW/iGnjUfwXsRSC2ysgICtOPatu48Cke5hq5XOGqXAtRVhhLfMDiZlr8a6+z4
 gS+QETakBXLrAfvYK6neLjlMhg1FX+Y952xd3XIU4jz+DgIh03r2Nd1xBsp17Yhu
 RRBc96w2KxuHQ6USEXbi5dY1K6o9DK52m/WRmnqXnaNX4ADeesgcrxXHqZ6O6I5R
 +ddvVWylaHxvngCQSL4DvZdYNJVoKjdlo6fHrwD1qu0Uv0rGz6O+AEI9o8DkK5Af
 ZhFBETvjTKzjCykd3UlaRo/ibYChIPYwpGTK+TTI9rUm8OWNw1+sC/Z14bx0/3Vl
 RB53rkJ93/qi6Wb9ljUxM8CPGViuPvMkQJjpJOaktrj0GOgQz3EBdDkIQMaplZx5
 r2AeHW7X5VGxCWMRg4NS/x7Do1jh8m/ch8/EulOAB5fG7KQPfYs=
 =xd6p
 -----END PGP SIGNATURE-----

Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Shuah Khan f5013d412a selftests: kvm: fix get_run_delay() ignoring fscanf() return warn
Fix get_run_delay() to check fscanf() return value to get rid of the
following warning. When fscanf() fails return MIN_RUN_DELAY_NS from
get_run_delay(). Move MIN_RUN_DELAY_NS from steal_time.c to test_util.h
so get_run_delay() and steal_time.c can use it.

lib/test_util.c: In function ‘get_run_delay’:
lib/test_util.c:316:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  316 |  fscanf(fp, "%ld %ld ", &val[0], &val[1]);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:32 -06:00
Shuah Khan 20175d5eac selftests: kvm: move get_run_delay() into lib/test_util
get_run_delay() is defined static in xen_shinfo_test and steal_time test.
Move it to lib and remove code duplication.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:26 -06:00
Shuah Khan 3a4f0cc693 selftests:kvm: fix get_trans_hugepagesz() ignoring fscanf() return warn
Fix get_trans_hugepagesz() to check fscanf() return value to get rid
of the following warning:

lib/test_util.c: In function ‘get_trans_hugepagesz’:
lib/test_util.c:138:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
  138 |  fscanf(f, "%ld", &size);
      |  ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:11 -06:00
Shuah Khan 39a71f712d selftests:kvm: fix get_warnings_count() ignoring fscanf() return warn
Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:

x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
   85 |  fscanf(f, "%d", &warnings);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-16 12:57:02 -06:00
Paul E. McKenney faaaf2ac03 torture: Make kvm-remote.sh print size of downloaded tarball
This commit causes kvm-remote.sh to print the size of the tarball that
is downloaded to each of the remote systems.  This size can help with
performance projections and analysis.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:32:35 -07:00
Paul E. McKenney ae3357ac11 torture: Allot 1G of memory for scftorture runs
By default, torture.sh allots 512M of memory for each guest OS.  However,
when running scftorture with KASAN, 1G is needed.  This commit therefore
causes torture.sh to provide the required 1G.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:32:35 -07:00
Paul E. McKenney 2010776f8c tools/rcu: Add an extract-stall script
This commit adds a script that extracts RCU CPU stall warnings
from console output.  The user can optionally specify the number of
lines preceding the stall to output, and also the number of lines of
stall-warning text.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-16 10:31:26 -07:00
Xiang wangx 98dc68f8b0 selftests: nci: replace unsigned int with int
Should not use comparison of unsigned expressions < 0.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-16 13:55:51 +01:00
Matteo Croce 336562752a bpf: Update bpf_get_smp_processor_id() documentation
BPF programs run with migration disabled regardless of preemption, as
they are protected by migrate_disable(). Update the uapi documentation
accordingly.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com
2021-09-15 22:39:55 +02:00
Grant Seltzer 69cd823956 libbpf: Add sphinx code documentation comments
This adds comments above five functions in btf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com
2021-09-15 13:16:02 -07:00
Masami Hiramatsu 80be5998ad tools/bootconfig: Define memblock_free_ptr() to fix build error
The lib/bootconfig.c file is shared with the 'bootconfig' tooling, and
as a result, the changes incommit 77e02cf57b ("memblock: introduce
saner 'memblock_free_ptr()' interface") need to also be reflected in the
tooling header file.

So define the new memblock_free_ptr() wrapper, and remove unused __pa()
and memblock_free().

Fixes: 77e02cf57b ("memblock: introduce saner 'memblock_free_ptr()' interface")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-15 09:49:48 -07:00
Li Zhijian 8914a7a247 selftests: be sure to make khdr before other targets
LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?

Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.

Here we mark target khdr as NOTPARALLEL to make it be always built
first.

CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-09-15 10:34:21 -06:00
Yonghong Song 2220ecf55c selftests/bpf: Skip btf_tag test if btf_tag attribute not supported
Commit c240ba2878 ("selftests/bpf: Add a test with a bpf
program with btf_tag attributes") added btf_tag selftest
to test BTF_KIND_TAG generation from C source code, and to
test kernel validation of generated BTF types.
But if an old clang (clang 13 or earlier) is used, the
following compiler warning may be seen:
  progs/tag.c:23:20: warning: unknown attribute 'btf_tag' ignored
and the test itself is marked OK. The compiler warning is bad
and the test itself shouldn't be marked OK.

This patch added the check for btf_tag attribute support.
If btf_tag is not supported by the clang, the attribute will
not be used in the code and the test will be marked as skipped.
For example, with clang 13:
  ./test_progs -t btf_tag
  #21 btf_tag:SKIP
  Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED

The selftests/README.rst is updated to clarify when the btf_tag
test may be skipped.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210915061036.2577971-1-yhs@fb.com
2021-09-15 08:49:49 -07:00
Jakub Kicinski 2d6a58996e selftests: net: test ethtool -L vs mq
Add a selftest for checking mq children are visible after ethtool -L.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-15 15:46:02 +01:00
Peter Zijlstra f56dae88a8 objtool: Handle __sanitize_cov*() tail calls
Turns out the compilers also generate tail calls to __sanitize_cov*(),
make sure to also patch those out in noinstr code.

Fixes: 0f1441b44e ("objtool: Fix noinstr vs KCOV")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210624095147.818783799@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra 8b946cc38e objtool: Introduce CFI hash
Andi reported that objtool on vmlinux.o consumes more memory than his
system has, leading to horrific performance.

This is in part because we keep a struct instruction for every
instruction in the file in-memory. Shrink struct instruction by
removing the CFI state (which includes full register state) from it
and demand allocating it.

Given most instructions don't actually change CFI state, there's lots
of repetition there, so add a hash table to find previous CFI
instances.

Reduces memory consumption (and runtime) for processing an
x86_64-allyesconfig:

  pre:  4:40.84 real,   143.99 user,    44.18 sys,      30624988 mem
  post: 2:14.61 real,   108.58 user,    25.04 sys,      16396184 mem

Suggested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210624095147.756759107@infradead.org
2021-09-15 15:51:45 +02:00
Peter Zijlstra 9af9dcf11b x86/xen: Mark cpu_bringup_and_idle() as dead_end_function
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.

The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.

Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.

Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
2021-09-15 15:51:44 +02:00
Yonghong Song c240ba2878 selftests/bpf: Add a test with a bpf program with btf_tag attributes
Add a bpf program with btf_tag attributes. The program is
loaded successfully with the kernel. With the command
  bpftool btf dump file ./tag.o
the following dump shows that tags are properly encoded:
  [8] STRUCT 'key_t' size=12 vlen=3
          'a' type_id=2 bits_offset=0
          'b' type_id=2 bits_offset=32
          'c' type_id=2 bits_offset=64
  [9] TAG 'tag1' type_id=8 component_id=-1
  [10] TAG 'tag2' type_id=8 component_id=-1
  [11] TAG 'tag1' type_id=8 component_id=1
  [12] TAG 'tag2' type_id=8 component_id=1
  ...
  [21] FUNC_PROTO '(anon)' ret_type_id=2 vlen=1
          'x' type_id=2
  [22] FUNC 'foo' type_id=21 linkage=static
  [23] TAG 'tag1' type_id=22 component_id=0
  [24] TAG 'tag2' type_id=22 component_id=0
  [25] TAG 'tag1' type_id=22 component_id=-1
  [26] TAG 'tag2' type_id=22 component_id=-1
  ...
  [29] VAR 'total' type_id=27, linkage=global
  [30] TAG 'tag1' type_id=29 component_id=-1
  [31] TAG 'tag2' type_id=29 component_id=-1

If an old clang compiler, which does not support btf_tag attribute,
is used, these btf_tag attributes will be silently ignored.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223058.248949-1-yhs@fb.com
2021-09-14 18:45:53 -07:00
Yonghong Song ad526474ae selftests/bpf: Test BTF_KIND_TAG for deduplication
Add unit tests for BTF_KIND_TAG deduplication for
  - struct and struct member
  - variable
  - func and func argument

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223052.248535-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 35baba7a83 selftests/bpf: Add BTF_KIND_TAG unit tests
Test good and bad variants of BTF_KIND_TAG encoding.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223047.248223-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 3df3bd68d4 selftests/bpf: Change NAME_NTH/IS_NAME_NTH for BTF_KIND_TAG format
BTF_KIND_TAG ELF format has a component_idx which might have value -1.
test_btf may confuse it with common_type.name as NAME_NTH checkes
high 16bit to be 0xffff. Change NAME_NTH high 16bit check to be
0xfffe so it won't confuse with component_idx.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223041.248009-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 71d29c2d47 selftests/bpf: Test libbpf API function btf__add_tag()
Add btf_write tests with btf__add_tag() function.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223036.247560-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 5c07f2fec0 bpftool: Add support for BTF_KIND_TAG
Added bpftool support to dump BTF_KIND_TAG information.
The new bpftool will be used in later patches to dump
btf in the test bpf program object file.

Currently, the tags are not emitted with
  bpftool btf dump file <path> format c
and they are silently ignored.  The tag information is
mostly used in the kernel for verification purpose and the kernel
uses its own btf to check. With adding these tags
to vmlinux.h, tags will be encoded in program's btf but
they will not be used by the kernel, at least for now.
So let us delay adding these tags to format C header files
until there is a real need.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223031.246951-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 5b84bd1036 libbpf: Add support for BTF_KIND_TAG
Add BTF_KIND_TAG support for parsing and dedup.
Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not
supported in the kernel, sanitize it to INTs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 30025e8bd8 libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
This patch renames functions btf_{hash,equal}_int() to
btf_{hash,equal}_int_tag() so they can be reused for
BTF_KIND_TAG support. There is no functionality change for
this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song b5ea834dde bpf: Support for new btf kind BTF_KIND_TAG
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

For linux kernel, the btf_tag can be applied
in various places to specify user pointer,
function pre- or post- condition, function
allow/deny in certain context, etc. Such information
will be encoded in vmlinux BTF and can be used
by verifier.

The btf_tag can also be applied to bpf programs
to help global verifiable functions, e.g.,
specifying preconditions, etc.

This patch added basic parsing and checking support
in kernel for new BTF_KIND_TAG kind.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D106621
 [3] https://reviews.llvm.org/D106622
 [4] https://reviews.llvm.org/D109560

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Yonghong Song 41ced4cd88 btf: Change BTF_KIND_* macros to enums
Change BTF_KIND_* macros to enums so they are encoded in dwarf and
appear in vmlinux.h. This will make it easier for bpf programs
to use these constants without macro definitions.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com
2021-09-14 18:45:52 -07:00
Andrii Nakryiko 8987ede3ed selftests/bpf: Fix .gitignore to not ignore test_progs.c
List all possible test_progs flavors explicitly to avoid accidentally
ignoring valid source code files. In this case, test_progs.c was still
ignored after recent 809ed84de8 ("selftests/bpf: Whitelist test_progs.h
from .gitignore") fix that added exception only for test_progs.h.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914162228.3995740-1-andrii@kernel.org
2021-09-14 18:41:14 -07:00
Jie Meng c035407743 bpf,x64 Emit IMUL instead of MUL for x86-64
IMUL allows for multiple operands and saving and storing rax/rdx is no
longer needed. Signedness of the operands doesn't matter here because
the we only keep the lower 32/64 bit of the product for 32/64 bit
multiplications.

Signed-off-by: Jie Meng <jmeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913211337.1564014-1-jmeng@fb.com
2021-09-14 18:36:36 -07:00
Andrii Nakryiko b6291a6f30 libbpf: Minimize explicit iterator of section definition array
Remove almost all the code that explicitly iterated BPF program section
definitions in favor of using find_sec_def(). The only remaining user of
section_defs is libbpf_get_type_names that has to iterate all of them to
construct its result.

Having one internal API entry point for section definitions will
simplify further refactorings around libbpf's program section
definitions parsing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 5532dfd42e libbpf: Simplify BPF program auto-attach code
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF
programs, as it is already recorded at bpf_object__open() time for all
recognized type of BPF programs. This further reduces number of explicit
calls to find_sec_def(), simplifying further refactorings.

No functional changes are done by this patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 91b4d1d1d5 libbpf: Ensure BPF prog types are set before relocations
Refactor bpf_object__open() sequencing to perform BPF program type
detection based on SEC() definitions before we get to relocations
collection. This allows to have more information about BPF program by
the time we get to, say, struct_ops relocation gathering. This,
subsequently, simplifies struct_ops logic and removes the need to
perform extra find_sec_def() resolution.

With this patch libbpf will require all struct_ops BPF programs to be
marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations.
Real-world applications are already doing that through something like
selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's
internal handling of SEC() definitions and is in the sprit of
upcoming libbpf-1.0 section strictness changes ([0]).

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Andrii Nakryiko 53df63ccdc selftests/bpf: Update selftests to always provide "struct_ops" SEC
Update struct_ops selftests to always specify "struct_ops" section
prefix. Libbpf will require a proper BPF program type set in the next
patch, so this prevents tests breaking.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-2-andrii@kernel.org
2021-09-14 15:49:24 -07:00
Rafael David Tinoco ca304b40c2 libbpf: Introduce legacy kprobe events support
Allow kprobe tracepoint events creation through legacy interface, as the
kprobe dynamic PMUs support, used by default, was only created in v4.17.

Store legacy kprobe name in struct bpf_perf_link, instead of creating
a new "subclass" off of bpf_perf_link. This is ok as it's just two new
fields, which are also going to be reused for legacy uprobe support in
follow up patches.

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com
2021-09-14 14:44:45 -07:00
David S. Miller 2865ba8247 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-09-14

The following pull-request contains BPF updates for your *net* tree.

We've added 7 non-merge commits during the last 13 day(s) which contain
a total of 18 files changed, 334 insertions(+), 193 deletions(-).

The main changes are:

1) Fix mmap_lock lockdep splat in BPF stack map's build_id lookup, from Yonghong Song.

2) Fix BPF cgroup v2 program bypass upon net_cls/prio activation, from Daniel Borkmann.

3) Fix kvcalloc() BTF line info splat on oversized allocation attempts, from Bixuan Cui.

4) Fix BPF selftest build of task_pt_regs test for arm64/s390, from Jean-Philippe Brucker.

5) Fix BPF's disasm.{c,h} to dual-license so that it is aligned with bpftool given the former
   is a build dependency for the latter, from Daniel Borkmann with ACKs from contributors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 13:09:54 +01:00
Faizel K B f81c08f897 usb: testusb: Fix for showing the connection speed
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from  the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade.  The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.

sudo ./testusb -a
high speed      /dev/bus/usb/001/011    0
/dev/bus/usb/001/011 test 0,    0.000015 secs
/dev/bus/usb/001/011 test 1,    0.194208 secs
/dev/bus/usb/001/011 test 2,    0.077289 secs
/dev/bus/usb/001/011 test 3,    0.170604 secs
/dev/bus/usb/001/011 test 4,    0.108335 secs
/dev/bus/usb/001/011 test 5,    2.788076 secs
/dev/bus/usb/001/011 test 6,    2.594610 secs
/dev/bus/usb/001/011 test 7,    2.905459 secs
/dev/bus/usb/001/011 test 8,    2.795193 secs
/dev/bus/usb/001/011 test 9,    8.372651 secs
/dev/bus/usb/001/011 test 10,    6.919731 secs
/dev/bus/usb/001/011 test 11,   16.372687 secs
/dev/bus/usb/001/011 test 12,   16.375233 secs
/dev/bus/usb/001/011 test 13,    2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17,    0.148826 secs
/dev/bus/usb/001/011 test 18,    0.068718 secs
/dev/bus/usb/001/011 test 19,    0.125992 secs
/dev/bus/usb/001/011 test 20,    0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24,    4.133763 secs
/dev/bus/usb/001/011 test 27,    2.140066 secs
/dev/bus/usb/001/011 test 28,    2.120713 secs
/dev/bus/usb/001/011 test 29,    0.507762 secs

Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-14 10:31:41 +02:00
Paul E. McKenney b380b10b84 torture: Make torture.sh print the number of files to be compressed
Compressing gigabyte vmlinux files can take some time, and it can be a
bit annoying to not know many more batches of compression there will be.
This commit therefore makes torture.sh print the number of files to be
compressed just before starting compression and just after compression
completes.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-09-13 16:37:11 -07:00
Daniel Borkmann 43d2b88c29 bpf, selftests: Add test case for mixed cgroup v1/v2
Minimal selftest which implements a small BPF policy program to the
connect(2) hook which rejects TCP connection requests to port 60123
with EPERM. This is being attached to a non-root cgroup v2 path. The
test asserts that this works under cgroup v2-only and under a mixed
cgroup v1/v2 environment where net_classid is set in the former case.

Before fix:

  # ./test_progs -t cgroup_v1v2
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  test_cgroup_v1v2:PASS:client_fd 0 nsec
  test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  run_test:PASS:join_classid 0 nsec
  (network_helpers.c:219: errno: None) Unexpected success to connect to server
  test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 0)
  #27 cgroup_v1v2:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

After fix:

  # ./test_progs -t cgroup_v1v2
  #27 cgroup_v1v2:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-3-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Daniel Borkmann d8079d8026 bpf, selftests: Add cgroup v1 net_cls classid helpers
Minimal set of helpers for net_cls classid cgroupv1 management in order
to set an id, join from a process, initiate setup and teardown. cgroupv2
helpers are left as-is, but reused where possible.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-2-daniel@iogearbox.net
2021-09-13 16:35:58 -07:00
Nathan Chancellor d0ee23f9d7 tools: compiler-gcc.h: Guard error attribute use with __has_attribute
When building objtool with HOSTCC=clang, there are several errors along
the lines of

  orc_dump.c:201:28: error: unknown attribute 'error' ignored [-Werror,-Wunknown-attributes]

This occurs after commit 4e59869aa6 ("compiler-gcc.h: drop checks for
older GCC versions"), which removed the GCC_VERSION gating.  The removed
version check just so happened to prevent __compiletime_error() from
being defined with clang because it pretends to be GCC 4.2.1 for
compatibility but the error attribute was not added to clang until
14.0.0.

Commit 815f0ddb34 ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") and commit a3f8a30f3f ("Compiler Attributes: use
feature checks instead of version checks") refactored the handling of
attributes in the main kernel to avoid situations like this but that
refactoring has never been done for the tools directory.

Refactoring is a rather large undertaking and this has never been an
issue before so instead, just guard the definition of
__compiletime_error() with __has_attribute() so that there are no more
errors.

Fixes: 4e59869aa6 ("compiler-gcc.h: drop checks for older GCC versions")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-13 15:51:41 -07:00
Andrii Nakryiko 2f38304127 libbpf: Make libbpf_version.h non-auto-generated
Turn previously auto-generated libbpf_version.h header into a normal
header file. This prevents various tricky Makefile integration issues,
simplifies the overall build process, but also allows to further extend
it with some more versioning-related APIs in the future.

To prevent accidental out-of-sync versions as defined by libbpf.map and
libbpf_version.h, Makefile checks their consistency at build time.

Simultaneously with this change bump libbpf.map to v0.6.

Also undo adding libbpf's output directory into include path for
kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary
because libbpf_version.h is just a normal header like any other.

Fixes: 0b46b75505 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org
2021-09-13 15:36:47 -07:00
Daniel Borkmann dbd7eb14e0 bpf, selftests: Replicate tailcall limit test for indirect call case
The tailcall_3 test program uses bpf_tail_call_static() where the JIT
would patch a direct jump. Add a new tailcall_6 test program replicating
exactly the same test just ensuring that bpf_tail_call() uses a map
index where the verifier cannot make assumptions this time.

In other words, this will now cover both on x86-64 JIT, meaning, JIT
images with emit_bpf_tail_call_direct() emission as well as JIT images
with emit_bpf_tail_call_indirect() emission.

  # echo 1 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  #136/7 tailcalls/tailcall_bpf2bpf_1:OK
  #136/8 tailcalls/tailcall_bpf2bpf_2:OK
  #136/9 tailcalls/tailcall_bpf2bpf_3:OK
  #136/10 tailcalls/tailcall_bpf2bpf_4:OK
  #136/11 tailcalls/tailcall_bpf2bpf_5:OK
  #136 tailcalls:OK
  Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

  # echo 0 > /proc/sys/net/core/bpf_jit_enable
  # ./test_progs -t tailcalls
  #136/1 tailcalls/tailcall_1:OK
  #136/2 tailcalls/tailcall_2:OK
  #136/3 tailcalls/tailcall_3:OK
  #136/4 tailcalls/tailcall_4:OK
  #136/5 tailcalls/tailcall_5:OK
  #136/6 tailcalls/tailcall_6:OK
  [...]

For interpreter, the tailcall_1-6 tests are passing as well. The later
tailcall_bpf2bpf_* are failing due lack of bpf2bpf + tailcall support
in interpreter, so this is expected.

Also, manual inspection shows that both loaded programs from tailcall_3
and tailcall_6 test case emit the expected opcodes:

* tailcall_3 disasm, emit_bpf_tail_call_direct():

  [...]
   b:   push   %rax
   c:   push   %rbx
   d:   push   %r13
   f:   mov    %rdi,%rbx
  12:   movabs $0xffff8d3f5afb0200,%r13
  1c:   mov    %rbx,%rdi
  1f:   mov    %r13,%rsi
  22:   xor    %edx,%edx                 _
  24:   mov    -0x4(%rbp),%eax          |  limit check
  2a:   cmp    $0x20,%eax               |
  2d:   ja     0x0000000000000046       |
  2f:   add    $0x1,%eax                |
  32:   mov    %eax,-0x4(%rbp)          |_
  38:   nopl   0x0(%rax,%rax,1)
  3d:   pop    %r13
  3f:   pop    %rbx
  40:   pop    %rax
  41:   jmpq   0xffffffffffffe377
  [...]

* tailcall_6 disasm, emit_bpf_tail_call_indirect():

  [...]
  47:   movabs $0xffff8d3f59143a00,%rsi
  51:   mov    %edx,%edx
  53:   cmp    %edx,0x24(%rsi)
  56:   jbe    0x0000000000000093        _
  58:   mov    -0x4(%rbp),%eax          |  limit check
  5e:   cmp    $0x20,%eax               |
  61:   ja     0x0000000000000093       |
  63:   add    $0x1,%eax                |
  66:   mov    %eax,-0x4(%rbp)          |_
  6c:   mov    0x110(%rsi,%rdx,8),%rcx
  74:   test   %rcx,%rcx
  77:   je     0x0000000000000093
  79:   pop    %rax
  7a:   mov    0x30(%rcx),%rcx
  7e:   add    $0xb,%rcx
  82:   callq  0x000000000000008e
  87:   pause
  89:   lfence
  8c:   jmp    0x0000000000000087
  8e:   mov    %rcx,(%rsp)
  92:   retq
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Paul Chaignon <paul@cilium.io>
Link: https://lore.kernel.org/bpf/CAM1=_QRyRVCODcXo_Y6qOm1iT163HoiSj8U2pZ8Rj3hzMTT=HQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20210910091900.16119-1-daniel@iogearbox.net
2021-09-13 14:52:22 -07:00
Song Liu 025bd7c753 selftests/bpf: Add test for bpf_get_branch_snapshot
This test uses bpf_get_branch_snapshot from a fexit program. The test uses
a target function (bpf_testmod_loop_test) and compares the record against
kallsyms. If there isn't enough record matching kallsyms, the test fails.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-4-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Song Liu 856c02dbce bpf: Introduce helper bpf_get_branch_snapshot
Introduce bpf_get_branch_snapshot(), which allows tracing pogram to get
branch trace from hardware (e.g. Intel LBR). To use the feature, the
user need to create perf_event with proper branch_record filtering
on each cpu, and then calls bpf_get_branch_snapshot in the bpf function.
On Intel CPUs, VLBR event (raw event 0x1b00) can be use for this.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com
2021-09-13 10:53:50 -07:00
Linus Torvalds 316346243b Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version)
Merge patch series from Nick Desaulniers to update the minimum gcc
version to 5.1.

This is some of the left-overs from the merge window that I didn't want
to deal with yesterday, so it comes in after -rc1 but was sent before.

Gcc-4.9 support has been an annoyance for some time, and with -Werror I
had the choice of applying a fairly big patch from Kees Cook to remove a
fair number of initializer warnings (still leaving some), or this patch
series from Nick that just removes the source of the problem.

The initializer cleanups might still be worth it regardless, but
honestly, I preferred just tackling the problem with gcc-4.9 head-on.
We've been more aggressiuve about no longer having to care about
compilers that were released a long time ago, and I think it's been a
good thing.

I added a couple of patches on top to sort out a few left-overs now that
we no longer support gcc-4.x.

As noted by Arnd, as a result of this minimum compiler version upgrade
we can probably change our use of '--std=gnu89' to '--std=gnu11', and
finally start using local loop declarations etc.  But this series does
_not_ yet do that.

Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/lkml/CAK7LNASs6dvU6D3jL2GG3jW58fXfaj6VNOe55NJnTB8UPuk2pA@mail.gmail.com/
Link: https://github.com/ClangBuiltLinux/linux/issues/1438

* emailed patches from Nick Desaulniers <ndesaulniers@google.com>:
  Drop some straggling mentions of gcc-4.9 as being stale
  compiler_attributes.h: drop __has_attribute() support for gcc4
  vmlinux.lds.h: remove old check for GCC 4.9
  compiler-gcc.h: drop checks for older GCC versions
  Makefile: drop GCC < 5 -fno-var-tracking-assignments workaround
  arm64: remove GCC version check for ARCH_SUPPORTS_INT128
  powerpc: remove GCC version check for UPD_CONSTR
  riscv: remove Kconfig check for GCC version for ARCH_RV64I
  Kconfig.debug: drop GCC 5+ version check for DWARF5
  mm/ksm: remove old GCC 4.9+ check
  compiler.h: drop fallback overflow checkers
  Documentation: raise minimum supported version of GCC to 5.1
2021-09-13 10:43:04 -07:00