Add a stress test for context switching of the ZA register state based on
the similar tests Dave Martin wrote for FPSIMD and SVE registers. The test
loops indefinitely writing a data pattern to ZA then reading it back and
verifying that it's what was expected.
Unlike the other tests we manually assemble the SME instructions since at
present no released toolchain has SME support integrated.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-35-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As part of the generic code for signal handling test cases we parse all
signal frames to make sure they have at least the basic form we expect
and that there are no unexpected frames present in the signal context.
Add coverage of the ZA signal frame to this code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-34-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
One of the features of SME is the addition of streaming mode, in which we
have access to a set of streaming mode SVE registers at the SME vector
length. Since these are accessed using the SVE instructions, let's reuse
the existing SVE stress test, with a compile time option controlling the
few small differences needed:
- Enter streaming mode immediately on starting the program.
- In streaming mode FFR is removed so skip reading and writing FFR.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-33-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Provide RDVL helpers for SME and extend the main vector configuration tests
to cover SME.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-32-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The Scalable Matrix Extension adds a new system register TPIDR2 intended to
be used by libc for its own thread-specific use. Add some kselftests which
exercise the ABI for it.
Since this test should, with some adjustment, work for TPIDR and any other
similar registers added in future, add the tests in a separate directory
rather than placing them with the other floating point tests. Nothing
existing looked suitable, so I created a new test directory called "abi".
Since this feature is intended to be used by libc, the test is built as
freestanding code using nolibc so we don't end up with the test program
and libc both trying to manage the register simultaneously and
disrupting each other. As a result of being written using nolibc, rather
than using hwcaps to identify if SME is available in the system we check
for the default SME vector length configuration in proc; adding hwcap
support to nolibc seems like disproportionate effort and didn't feel
entirely idiomatic for what nolibc is trying to do.
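As a rough illustration (not the test's actual code), such a nolibc-friendly
check might look like the sketch below; the procfs path and helper name are
assumptions:

    /* hypothetical sketch: detect SME via the default vector length in proc */
    static int sme_present(void)
    {
            char buf[16];
            int fd, ret;

            fd = open("/proc/sys/abi/sme_default_vector_length", O_RDONLY);
            if (fd < 0)
                    return 0;       /* no SME configuration exposed */
            ret = read(fd, buf, sizeof(buf));
            close(fd);
            return ret > 0;
    }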
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-31-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The Scalable Matrix Extension (SME) introduces additional register state
with configurable vector lengths, similar to SVE but configured separately.
Extend vlset to support configuring this state with a --sme or -s command
line option.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-30-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
As with the kernel, manually encode some of the SVE instructions so that we
don't have ambitious toolchain requirements to build the tests.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-29-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The current tests use the prctls for various things but there's no
coverage of the edges of the interface so add some basics. This isn't
hugely useful as it is (it originally had some coverage for the
combinations with asymmetric mode but we removed the prctl() for that)
but it might be a helpful starting point for future work, for example
covering error handling.
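For reference, the kind of prctl() call whose edges are being exercised looks
roughly like this (an illustrative sketch, not the test itself):

    /* illustrative only: enable tagged addresses with synchronous MTE faults */
    if (prctl(PR_SET_TAGGED_ADDR_CTRL,
              PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0))
            perror("prctl(PR_SET_TAGGED_ADDR_CTRL)");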
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220419103243.24774-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we just have a big if statement with a non-specific diagnostic
checking both the mode and the tag. Since we'll need to dynamically check
for asymmetric mode support in the system, and to improve debuggability,
split these checks out.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220419103243.24774-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Help people figure out problems by printing a diagnostic when we get an
unexpected asynchronous fault.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220419103243.24774-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The MTE selftests have a helper evaluate_test() which translates a return
code into a call to ksft_test_result_*(). Currently this only handles pass
and fail, silently ignoring any other code. Update the helper to support
skipped tests and log any unknown return codes as an error so we get at
least some diagnostic if anything goes wrong.
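A minimal sketch of the resulting mapping (using the standard kselftest
helpers; the exact shape of evaluate_test() is assumed):

    switch (err) {
    case KSFT_PASS:
            ksft_test_result_pass(msg);
            break;
    case KSFT_FAIL:
            ksft_test_result_fail(msg);
            break;
    case KSFT_SKIP:
            ksft_test_result_skip(msg);
            break;
    default:
            ksft_test_result_fail("Unknown return code %d from %s",
                                  err, msg);
            break;
    }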
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220419103243.24774-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we validate that we can set the floating point state via the SVE
regset and read the data via the FPSIMD regset but we do not validate that
the opposite case works as expected. Add a test that covers this case,
noting that when reading via SVE regset the kernel has the option of
returning either SVE or FPSIMD data so we need to accept both formats.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently the sve-ptrace test for setting and reading FPSIMD data assumes
that the child will start off in FPSIMD only mode and that it can use this
to read some FPSIMD mode SVE ptrace data, skipping the test if it can't.
This isn't an assumption guaranteed by the ABI and also limits how we can
use this testcase within the program. Instead skip the initial read and
just generate a FPSIMD format buffer for the write part of the test, making
the coverage more robust in the face of future kernel and test program
changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The comment for ptrace_sve_get_fpsimd_data() doesn't describe what the test
does at all, fix that.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Now the generic code can handle kallsyms fixup properly, so there is no
need to keep the arch functions anymore.
Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now the arch-specific functions all do the same thing. When it fixes the
symbol address it needs to check the boundary between the kernel image
and modules. For the last symbol in the previous region, it cannot
know the exact size as it's discarded already. Thus it just uses a
small page size (4096) and rounds it up like the last symbol.
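Conceptually (a hedged illustration, not the exact perf code) the fixup for
that boundary symbol amounts to something like:

    /* illustration only: the size of the previous region's last symbol is
     * unknown, so round its end up to the next 4k page boundary */
    prev->end = roundup(prev->end, 4096);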
Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The symbol fixup is necessary for symbols in kallsyms since they don't
have size info. So we use the next symbol's address to calculate the
size. Now it's also used for user binaries because sometimes they lack
size info for hand-written asm functions.
There's an arch-specific function to handle kallsyms differently, but
currently it cannot distinguish kallsyms from others. Pass this
information explicitly to handle it properly. Note that those arch
functions will be moved to the generic function, so I didn't add it to
the arch functions.
Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adds a perf_event_attr test for Arm SPE in which the presence of
physical addresses is checked when the SPE unit is run with pa_enable=1.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-4-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch corrects a bug whereby SPE collection is invoked with
pa_enable=1 but synthesized events fail to show physical addresses.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.
Fixes: 54f7815efe ("perf arm-spe: Fill address info for samples")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Westmere EX v3 with events from:
https://download.01.org/perfmon/WSM-EX/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change updates descriptions.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Westmere EP-SP v3 with events from:
https://download.01.org/perfmon/WSM-EP-SP/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change updates descriptions.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Skylake Server v1.27 with
events from:
https://download.01.org/perfmon/SKX/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change updates descriptions, adds INST_DECODED.DECODERS and
corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Skylake v53 with
events from:
https://download.01.org/perfmon/SKL/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change updates descriptions, adds INST_DECODED.DECODERS and
corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Ivytown v21 with events from:
https://download.01.org/perfmon/IVT/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change fixes a spelling mistake in a description.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are generated for Icelake v1.13 with events from:
https://download.01.org/perfmon/ICL/
Using the scripts at:
https://github.com/intel/event-converter-for-linux-perf/
This change updates descriptions and adds INST_DECODED.DECODERS.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf test -F 81 ("perf record tests") -v fails on s390x on the
linux-next branch.
The test case is x86 specific and can not be executed on s390x. The test
case depends on x86 register names such as:
... | egrep -q 'available registers: AX BX CX DX ....'
Skip this test case on s390x.
Output before:
# perf test -F 81
81: perf record tests : FAILED!
#
Output after:
# perf test -F 81
81: perf record tests : Skip
#
Fixes: 24f378e660 ("perf test: Add basic perf record tests")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220428122821.3652015-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement per-subtest log collection for both parallel and sequential
test execution. This allows granular per-subtest error output in the
'All error logs' section.
Add subtest log transfer into the protocol during parallel test
execution.
Move all test log printing logic into the dump_test_log() function. One
exception is the output of test names when verbose printing is enabled.
Move test name/result printing into separate functions to avoid
repetition.
Print all successful subtest results in the log. Print only failed test
logs when a test does not have subtests, or only the failed subtests'
logs when it does.
Disable 'All error logs' output when verbose mode is enabled. This
functionality was already broken and was causing confusion.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220427041353.246007-1-mykolal@fb.com
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-27
We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).
The main changes are:
1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
information about failed CO-RE relocations, from Andrii Nakryiko.
2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
(via probe read) and referenced ones that can be passed to in-kernel helpers,
from Kumar Kartikeya Dwivedi.
3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.
4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
the effective prog array before the rcu_read_lock, from Stanislav Fomichev.
5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
for USDT arguments under riscv{32,64}, from Pu Lehui.
6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.
7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
so it can be shipped under Alpine which is musl-based, from Dominique Martinet.
8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
need to use call_rcu_tasks_trace() barrier, from KP Singh.
9) Improve libbpf API documentation and fix error return handling of various
API functions, from Grant Seltzer.
10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
length of frags + frag_list may surpass old offset limit, from Liu Jian.
11) Various improvements to prog_tests in area of logging, test execution
and by-name subtest selection, from Mykola Lysenko.
12) Simplify map_btf_id generation for all map types by moving this process
to build time with help of resolve_btfids infra, from Menglong Dong.
13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
helpers; the probing caused always to use old helpers, from Runqing Yang.
14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
tracing macros, from Vladimir Isaev.
15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel
switched to memcg-based memory accouting a while ago, from Yafang Shao.
16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.
17) Fix BPF selftests in two occasions to work around regressions caused by latest
LLVM to unblock CI until their fixes are worked out, from Yonghong Song.
18) Misc cleanups all over the place, from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add libbpf's log fixup logic selftests
libbpf: Fix up verifier log for unguarded failed CO-RE relos
libbpf: Simplify bpf_core_parse_spec() signature
libbpf: Refactor CO-RE relo human description formatting routine
libbpf: Record subprog-resolved CO-RE relocations unconditionally
selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
libbpf: Avoid joining .BTF.ext data with BPF programs by section name
libbpf: Fix logic for finding matching program for CO-RE relocation
libbpf: Drop unhelpful "program too large" guess
libbpf: Fix anonymous type check in CO-RE logic
bpf: Compute map_btf_id during build time
selftests/bpf: Add test for strict BTF type check
selftests/bpf: Add verifier tests for kptr
selftests/bpf: Add C tests for kptr
libbpf: Add kptr type tag macros to bpf_helpers.h
bpf: Make BTF type match stricter for release arguments
bpf: Teach verifier about kptr_get kfunc helpers
bpf: Wire up freeing of referenced kptr
bpf: Populate pairs of btf_id and destructor kfunc in btf
bpf: Adapt copy_map_value for multiple offset case
...
====================
Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It has been obsolete since the introduction of the 'perf record --kcore'
option.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220427141946.269523-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When multiple checksum errors occur in chk_csum_nr(), print the
number of errors as an extra message.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends chk_fail_nr to check the MP_FAIL response mibs.
Add a new argument invert for chk_fail_nr to allow it to check the
MP_FAIL TX and RX mibs from the opposite direction.
When the infinite map is received before the MP_FAIL response, the
response will be lost. A '-' can be added into fail_tx or fail_rx to
represent that MP_FAIL response TX or RX can be lost when doing the
checks.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the single subflow test case for MP_FAIL, to test the infinite
mapping case. Use the test_linkfail value to make 128KB test files.
Add a new function reset_with_fail(), which uses 'iptables' and 'tc
action pedit' rules to produce the bit flips that trigger the checksum
failures. Set validate_checksum to enable checksums for the MP_FAIL
tests without passing the '-C' argument. Set the check_invert flag to
enable the inverted bytes check for the output data in check_transfer().
Instead of the file mismatch error, this test prints out the inverted
bytes.
Add a new function pedit_action_pkts() to get the number of packets
edited by the tc pedit actions. Print this number in the output.
Also add the needed kernel config options to the selftests config file.
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tests validating that libbpf is indeed patching up BPF verifier log
with CO-RE relocation details. Also test partial and full truncation
scenarios.
This test might be a bit fragile due to changing BPF verifier log
format. If that proves to be frequently breaking, we can simplify tests
or remove the truncation subtests. But for now it seems useful to test
it in those conditions that are otherwise rarely occurring in practice.
Also test CO-RE relo failure in a subprog as that exercises the subprogram
CO-RE relocation mapping logic which doesn't work out of the box without
extra relo storage, previously done only for the gen_loader case.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-11-andrii@kernel.org
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.
Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be properly guarded in BPF code such that the BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:
123: (85) call unknown#195896080
invalid func unknown#195896080
Such an incomprehensible error message is due to libbpf "poisoning" the BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).
Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:
5: <invalid CO-RE relocation>
failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)
This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.
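For context, a "guarded" use of a potentially-failing CO-RE relocation in
BPF C looks roughly like the sketch below (type and field names are
illustrative only):

    struct task_struct___bad *t = (void *)bpf_get_current_task();

    /* if the field doesn't exist on this kernel the relocation fails, but
     * the verifier can prove the read below is dead code and drop it */
    if (bpf_core_field_exists(t->fake_field))
            val = BPF_CORE_READ(t, fake_field);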
This BPF verifier log fixup is setup to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of code looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.
If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:
processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
... truncating this on program load error isn't too bad (end user can
always increase log size, if it needs to get complete log).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.
It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.
This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.
This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.
This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there will be
a need for this information post load step).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
Enhance linked_funcs selftest with two tricky features that might not
obviously work correctly together. We add CO-RE relocations to entry BPF
programs and mark those programs as non-autoloadable with SEC("?...")
annotation. This makes sure that libbpf itself handles .BTF.ext CO-RE
relocation data matching correctly for SEC("?...") programs, as well as
ensures that BPF static linker handles this correctly (this was the case
before, no changes are necessary, but it wasn't explicitly tested).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-6-andrii@kernel.org
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.
This simplifies the corresponding joining logic and lets libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).
Fixes: a3820c4811 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
Fix the bug in bpf_object__relocate_core() which can lead to finding an
invalid matching BPF program when processing a CO-RE relocation. If a
matching program is not found, the last encountered program will be assumed
to be the correct program and thus error detection won't detect the problem.
Fixes: 9c82a63cf3 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
libbpf pretends it knows the actual limit of BPF program instructions based
on the UAPI headers it compiled with. There is neither any guarantee that
the UAPI headers match the host kernel, nor does the BPF verifier actually
use the BPF_MAXINSNS constant anymore. Just drop the unhelpful "guess"; the
BPF verifier will emit the actual reason for failure in its logs anyway.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
Currently if opening /dev/null fails then the file pointer fp is null
and further access to fp via fprintf will cause a null pointer
dereference. Fix this by returning a negative error value when a null
fp is detected.
Detected using cppcheck static analysis:
tools/testing/selftests/resctrl/fill_buf.c:124:6: note: Assuming
that condition '!fp' is not redundant
if (!fp)
^
tools/testing/selftests/resctrl/fill_buf.c:126:10: note: Null
pointer dereference
fprintf(fp, "Sum: %d ", ret);
Fixes: a2561b12fe ("selftests/resctrl: Add built in benchmark")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Ensure that the edge case where the first member type was matched
successfully even though it didn't match the BTF type of the register is
caught and rejected by the verifier.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-14-memxor@gmail.com
Reuse bpf_prog_test functions to test the support for PTR_TO_BTF_ID in
BPF map case, including some tests that verify implementation sanity and
corner cases.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-13-memxor@gmail.com
This uses the __kptr and __kptr_ref macros as well, and tries to test
the stuff that is supposed to work, since we have negative tests in
test_verifier suite. Also include some code to test map-in-map support,
such that the inner_map_meta matches the kptr_off_tab of map added as
element.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-12-memxor@gmail.com
Include convenience definitions:
__kptr: Unreferenced kptr
__kptr_ref: Referenced kptr
Users can use them to tag the pointer type meant to be used with the new
support directly in the map value definition.
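For example (an illustrative sketch; the struct and field names are made
up), a map value using both tags might be declared as:

    struct map_value {
            struct task_struct __kptr *unref_task;          /* unreferenced kptr */
            struct task_struct __kptr_ref *ref_task;        /* referenced kptr */
    };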
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com
Extending the code in previous commits, introduce referenced kptr
support, which needs to be tagged using the 'kptr_ref' tag instead. Unlike
unreferenced kptr, referenced kptrs have a lot more restrictions. In
addition to the type matching, only a newly introduced bpf_kptr_xchg
helper is allowed to modify the map value at that offset. This transfers
the referenced pointer being stored into the map, releasing the reference
state held by the program, returning the old value, and creating new
reference state for the returned pointer.
Similar to the unreferenced pointer case, the return value for this case
will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
must eventually either be released by calling the corresponding release
function or be transferred into another map.
It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
the value, and obtain the old value if any.
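In BPF C the pattern looks roughly like the following sketch (the map value
layout and the release kfunc name are placeholders, not real helpers):

    old = bpf_kptr_xchg(&v->ref_ptr, new_ptr);
    /* ownership of new_ptr has moved into the map; we now own 'old' (if
     * any) and must release it or move it into another map */
    if (old)
            obj_release(old);       /* placeholder release kfunc */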
BPF_LDX, BPF_STX, and BPF_ST cannot access a referenced kptr. A future
commit will permit using BPF_LDX for such pointers, but will attempt to
make it safe, since the lifetime of the object won't be guaranteed.
There are valid reasons to enforce the restriction of permitting only
bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
consistent in face of concurrent modification, and any prior values
contained in the map must also be released before a new one is moved
into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
returns the old value, which the verifier would require the user to
either free or move into another map, and releases the reference held
for the pointer being moved in.
In the future, direct BPF_XCHG instruction may also be permitted to work
like bpf_kptr_xchg helper.
Note that process_kptr_func doesn't have to call
check_helper_mem_access, since we already disallow rdonly/wronly flags
for map, which is what check_map_access_type checks, and we already
ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
so check_map_access is also not required.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
Add a new type flag for bpf_arg_type that when set tells verifier that
for a release function, that argument's register will be the one for
which meta.ref_obj_id will be set, and which will then be released
using release_reference. To capture the regno, introduce a new field
release_regno in bpf_call_arg_meta.
This would be required in the next patch, where we may either pass NULL
or a refcounted pointer as an argument to the release function
bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not
enough, as there is a case where the type of argument needed matches,
but the ref_obj_id is set to 0. Hence, we must enforce that whenever
meta.ref_obj_id is zero, the register that is to be released can only
be NULL for a release function.
Since we now indicate whether an argument is to be released in
bpf_func_proto itself, the is_release_function() helper has lost its
utility, hence refactor the code to work without it, and just rely on
meta.release_regno to know when to release state for a ref_obj_id.
Still, the restriction of one release argument and only one ref_obj_id
passed to BPF helper or kfunc remains. This may be lifted in the future.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com
resctrl_tests can be built and run using the kselftest framework.
Add a description of how to do so to the README.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
In the kselftest framework, all tests can be built/run at once,
and a sub test can also be built/run individually. As follows:
$ make kselftest-all TARGETS=resctrl
$ make -C tools/testing/selftests run_tests
$ make -C tools/testing/selftests TARGETS=resctrl run_tests
However, resctrl_tests cannot currently be run using the kselftest
framework; users have to change directory to
tools/testing/selftests/resctrl/, run "make" to build the executable
file "resctrl_tests", and run "sudo ./resctrl_tests" to execute the test.
To allow building/running resctrl_tests using the kselftest framework,
modify tools/testing/selftests/Makefile
and tools/testing/selftests/resctrl/Makefile.
Even after this change, users can still build/run resctrl_tests
without using the framework, as before.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> # resctrl changes
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
In the kselftest framework, if a sub test cannot run for some reason,
the test result should be marked as SKIP rather than FAIL.
Return KSFT_SKIP(4) instead of KSFT_FAIL(1) if resctrl_tests is not run
as root or is run in a test environment which does not support resctrl.
- ksft_exit_fail_msg(): returns KSFT_FAIL(1)
- ksft_exit_skip(): returns KSFT_SKIP(4)
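A sketch of the kind of change described (assuming the usual root check):

    if (geteuid() != 0)
            return ksft_exit_skip("Not running as root. Skipping...\n");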
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When testing on an Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz the resctrl
selftests fail due to timeout after exceeding the default time limit of
45 seconds. On this system the test takes about 68 seconds.
Since the failing test by default accesses a fixed size of memory, the
execution time should not vary significantly between different
environments. A new default of 120 seconds should be sufficient, yet easy
to customize thanks to the introduction of the "settings" file.
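For reference, the kselftest "settings" file is a simple key=value file, so
the new limit would be expressed as (assumed contents):

    # tools/testing/selftests/resctrl/settings
    timeout=120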
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
In the kselftest framework, a sub test is run using the timeout utility
and it will send SIGTERM to the test upon timeout.
In resctrl_tests, a child process is created by fork() to
run the benchmark, but SIGTERM is not set up in sigaction().
If a SIGTERM signal is received, the parent process will be killed,
but the child process still exists.
Kill the child process before the parent process terminates
if a SIGTERM signal is received.
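A hedged sketch of the kind of handler described (the handler and pid
variable names are assumptions):

    static void sigterm_handler(int signum)
    {
            /* forward termination to the forked benchmark child */
            if (bm_pid > 0)
                    kill(bm_pid, SIGKILL);
            exit(EXIT_FAILURE);
    }

    /* registered alongside the existing SIGINT/SIGHUP handling */
    struct sigaction sigact = { .sa_handler = sigterm_handler };

    sigaction(SIGTERM, &sigact, NULL);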
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
According to "Intel Resource Director Technology (Intel RDT) on
2nd Generation Intel Xeon Scalable Processors Reference Manual",
When the Intel Sub-NUMA Clustering(SNC) feature is enabled,
Intel CMT and MBM counters may not be accurate.
However, there does not seem to be an architectural way to detect
if SNC is enabled.
If the result of MBM&CMT test fails on Intel CPU,
print a message to let users know a possible cause of failure.
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently, resctrl_tests only has a function to detect the AMD vendor.
Since Intel CMT and MBM counters may not be accurate when the Intel
Sub-NUMA Clustering feature is enabled, resctrl_tests also needs a
function to detect the Intel vendor. And in the future, resctrl_tests
will need a function to detect other vendors, such as Arm.
Extend the function to detect the Intel vendor as well. This function
can also be easily extended to detect other vendors.
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The musl nftw implementation does not support FTW_ACTIONRETVAL. There have
been multiple attempts at pushing the feature into musl upstream, but it has
been refused or ignored every time:
https://www.openwall.com/lists/musl/2021/03/26/1
https://www.openwall.com/lists/musl/2022/01/22/1
In this case we only care about /proc/<pid>/fd/<fd>, so it's not too difficult
to reimplement directly instead, and the new implementation makes 'bpftool perf'
slightly faster because it doesn't needlessly stat/readdir unneeded directories
(54ms -> 13ms on my machine).
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220424051022.2619648-4-asmadeus@codewreck.org
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.
Remove the local CPUID wrapper and use __cpuid_count()
from kselftest.h instead.
__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be supported with gcc v3.2,
the minimal required version for stable kernels.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.
Remove the local CPUID wrapper and use __cpuid_count()
from kselftest.h instead.
__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be supported with gcc v3.2,
the minimal required version for stable kernels.
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.
Remove the local CPUID wrapper and use __cpuid_count()
from already included kselftest.h instead.
__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be compiled with gcc v3.2,
the minimal required version for stable kernels.
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: linux-mm@kvack.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Some selftests depend on information provided by the CPUID instruction.
To support this dependency the selftests implement private wrappers for
CPUID.
Duplication of the CPUID wrappers should be avoided.
Both gcc and clang/LLVM provide __cpuid_count() macros but neither
the macro nor its header file are available in all the compiler
versions that need to be supported by the selftests. __cpuid_count()
as provided by gcc is available starting with gcc v4.4, so it is
not available if the latest tests need to be run in all the
environments required to support kernels v4.9 and v4.14 that
have the minimal required gcc v3.2.
Duplicate gcc's __cpuid_count() macro to provide a centrally defined
macro for __cpuid_count() to help eliminate the duplicate CPUID wrappers
while continuing to compile in older environments.
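For reference, the macro follows gcc's __cpuid_count() calling convention,
e.g.:

    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 0xd, sub-leaf 0: enumerate XSAVE state components */
    __cpuid_count(0xd, 0, eax, ebx, ecx, edx);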
Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently the damon selftests are not built with the rest of the
selftests. We add damon to the list of targets.
Fixes: b348eb7abd ("mm/damon: add user space selftests")
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Most of the test suites in tools/testing/selftests contain a config file
that specifies which kernel config options need to be present in order for
the test suite to be able to run and perform meaningful validation. There
is no config file for the tools/testing/selftests/cgroup test suite, so
this patch adds one.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The cgroup cpu controller selftests have a test_cpucg_max() testcase
that validates the behavior of the cpu.max knob. Let's also add a
testcase that verifies that the behavior works correctly when set on a
nested cgroup.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The cgroup cpu controller test suite has a number of testcases that
validate the expected behavior of the cpu.weight knob, but none for
cpu.max. This patch fixes that by adding a testcase for cpu.max as well.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The cgroup cpu controller test suite currently contains a testcase called
test_cpucg_nested_weight_overprovisioned() which verifies the expected
behavior of cpu.weight when applied to nested cgroups. That testcase
validated the expected behavior when the processes in the leaf cgroups
overcommitted the system. This patch adds a complementary
test_cpucg_nested_weight_underprovisioned() testcase which validates
behavior when those leaf cgroups undercommit the system.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The cgroup cpu controller tests in
tools/testing/selftests/cgroup/test_cpu.c have some testcases that validate
the expected behavior of setting cpu.weight on cgroups, and then hogging
CPUs. What is still missing from the suite is a testcase that validates
nested cgroups. This patch adds test_cpucg_nested_weight_overprovisioned(),
which validates that a parent's cpu.weight will override its children if
they overcommit a host, and properly protect any sibling groups of that
parent.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently the binderfs test says what failure it encountered
without saying why it may have occurred when it fails to mount
binderfs. So, warn about enabling CONFIG_ANDROID_BINDERFS in the
running kernel.
Signed-off-by: Karthik Alapati <mail@karthek.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The link variable is already of type 'struct bpf_link *', casting it to
'struct bpf_link *' is redundant, drop it.
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
Once a line card is activated, check that the device FW version is exposed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once a line card is provisioned, check that the HW revision and INI version
are exposed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once a line card is provisioned, check the count of devices on it and
print them out.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not much of a test, but we keep on getting problems with boolean controls
not being called Switches, so let's add a few basic checks to help people
spot problems.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220421115020.14118-1-broonie@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Factor out perf_evsel__ioctl() so it can be reused.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220422162402.147958-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cpu_core and cpu_atom have different topdown event groups.
For cpu_core, --topdown equals to:
"{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,
cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,
cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"
For cpu_atom, --topdown equals to:
"{cpu_atom/topdown-retiring/,cpu_atom/topdown-bad-spec/,
cpu_atom/topdown-fe-bound/,cpu_atom/topdown-be-bound/}"
To simplify the implementation, on hybrid systems --topdown is used
together with --cputype. Without --cputype, it uses the cpu_core
topdown events by default.
# ./perf stat --topdown -a sleep 1
WARNING: default to use cpu_core topdown events
Performance counter stats for 'system wide':
retiring bad speculation frontend bound backend bound heavy operations light operations branch mispredict machine clears fetch latency fetch bandwidth memory bound Core bound
4.1% 0.0% 5.1% 90.8% 2.3% 1.8% 0.0% 0.0% 4.2% 0.9% 9.9% 81.0%
1.002624229 seconds time elapsed
# ./perf stat --topdown -a --cputype atom sleep 1
Performance counter stats for 'system wide':
retiring bad speculation frontend bound backend bound
13.5% 0.1% 31.2% 55.2%
1.002366987 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-3-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Fix header include for LLVM >= 14 when building with libclang.
- Allow access to 'data_src' for auxtrace in 'perf script' with ARM SPE perf.data
files, fixing processing data with such attributes.
- Fix error message for test case 71 ("Convert perf time to TSC") on s390, where
it is not supported.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYmNPCAAKCRCyPKLppCJ+
JxW9AQCgzYxEw5CJ+zn58lGmYJdfV5Kc6C8MPD671oo39lC49AD/Qw8tyklKTok5
hJkZ3CqahjMdN1j+xNgskXBNcJW6Rww=
=Ayk0
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v5.18-2022-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix header include for LLVM >= 14 when building with libclang.
- Allow access to 'data_src' for auxtrace in 'perf script' with ARM SPE
perf.data files, fixing processing data with such attributes.
- Fix error message for test case 71 ("Convert perf time to TSC") on
s390, where it is not supported.
* tag 'perf-tools-fixes-for-v5.18-2022-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf test: Fix error message for test case 71 on s390, where it is not supported
perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event
perf script: Always allow field 'data_src' for auxtrace
perf clang: Fix header include for LLVM >= 14
This adds an initial subset of forwarding selftests which I considered
to be relevant for DSA drivers, along with a forwarding.config that
makes it easier to run them (disables veth pair creation, makes sure MAC
addresses are unique and stable).
The intention is to request driver writers to run these selftests during
review and make sure that the tests pass, or at least that the problems
are known.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This tests the capability of switch ports to filter out undesired
traffic. Different drivers are expected to have different capabilities
here (so some may fail and some may pass), yet the test still has some
value, for example to check for regressions.
There are 2 kinds of failures: one is when a packet which should have
been accepted isn't (and that should be fixed), and the other "failure"
(as reported by the test) is when a packet could have been filtered out
(for being unnecessary) yet it was received.
The bridge driver fares particularly badly at this test:
TEST: br0: Unicast IPv4 to primary MAC address [ OK ]
TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ]
TEST: br0: Unicast IPv4 to unknown MAC address [FAIL]
reception succeeded, but should have failed
TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ]
TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL]
reception succeeded, but should have failed
TEST: br0: Multicast IPv4 to joined group [ OK ]
TEST: br0: Multicast IPv4 to unknown group [FAIL]
reception succeeded, but should have failed
TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ]
TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ]
TEST: br0: Multicast IPv6 to joined group [ OK ]
TEST: br0: Multicast IPv6 to unknown group [FAIL]
reception succeeded, but should have failed
TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ]
TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ]
mainly because it does not implement IFF_UNICAST_FLT. Yet I still think
having the test (with the failures) is useful in case somebody wants to
tackle that problem in the future, to make an easy before-and-after
comparison.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bombard a standalone switch port with various kinds of traffic to ensure
it is really standalone and doesn't leak packets to other switch ports.
Also check for switch ports in different bridges, and switch ports in a
VLAN-aware bridge but having different pvids. No forwarding should take
place in either case.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pinging an IPv6 link-local multicast address selects the link-local
unicast address of the interface as source, and we'd like to monitor for
that in tcpdump.
Add a helper to the forwarding library which retrieves the link-local
IPv6 address of an interface, to make that task easier.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the forwarding library with calls to some small C programs which
join an IP multicast group and send some packets to it. Both IPv4 and
IPv6 groups are supported. Use cases range from testing IGMP/MLD
snooping, to RX filtering, to multicast routing.
Testing multicast traffic using msend/mreceive is intended to be done
using tcpdump.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend tcpdump_start() & co. to handle multiple instances. Useful when
observing bridge operation, e.g., unicast learning/flooding, and any
case of multicast distribution (to these ports but not that one ...).
This means the interface argument is now a mandatory argument to all
tcpdump_*() functions, hence the changes to the ocelot flower test.
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For some use-cases we may want to change the tcpdump flags used in
tcpdump_start(). For instance, observing interfaces without the PROMISC
flag, e.g. to see what's really being forwarded to the bridge interface.
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By default, DSA switch ports inherit their MAC address from the DSA
master.
This works well for practical situations, but some selftests like
bridge_vlan_unaware.sh loop back 2 standalone DSA ports with 2 bridged
DSA ports, and require the bridge to forward packets between the
standalone ports.
Due to the bridge seeing that the MAC DA it needs to forward is present
as a local FDB entry (it coincides with the MAC address of the bridge
ports), the test packets are not forwarded, but terminated locally on
br0. In turn, this makes the ping and ping6 tests fail.
Address this by introducing an option to have stable MAC addresses.
When mac_addr_prepare is called, the current addresses of the netifs are
saved and replaced with 00:01:02:03:04:${netif number}. Then when
mac_addr_restore is called at the end of the test, the original MAC
addresses are restored. This ensures that the MAC addresses are unique,
which makes the test pass even for DSA ports.
The usage model is for the behavior to be opt-in via STABLE_MAC_ADDRS,
which DSA should set to true; all others behave as before. By hooking
the calls to mac_addr_prepare and mac_addr_restore within the forwarding
lib itself, we do not need to patch each individual selftest, the only
requirement is that pre_cleanup is called.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a function chk_infi_nr() to check the mibs for the
infinite mapping. Invoke it in chk_join_nr() when validate_checksum
is set.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RISC-V:
* Remove 's' & 'u' as valid ISA extension
* Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
* Fix NMI watchdog in guests on AMD
* Fix for SEV cache incoherency issues
* Don't re-acquire SRCU lock in complete_emulated_io()
* Avoid NULL pointer deref if VM creation fails
* Fix race conditions between APICv disabling and vCPU creation
* Bugfixes for disabling of APICv
* Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
* Do not use bitfields larger than 32-bits, they differ between GCC and clang
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJi3KUUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMhvQf/Yncfg3MkOvKsVxnCe7diKDTI/E2n
wBGNIcL8r7L9oIltHL4Mh7JQTacHFQOZ9PQ30NO1p+pznZ03e8LR59IF1JpP7VOU
sWrLZ5a4bIAEjOpA7Jxcee6hUBwewBauDgFLbb+YAI2lAahiH7jVfywDRife/c3k
N2LjeA75K8UvMiDCfjxxxerFJK91zaqjWlUNF2OhtFp/5pnMfS+nli9Q8QS837pZ
oUf+0Beb2RpSHan+wbYVU7X3ZLwtpR0M3w3uXOG+X3as56wDf26znXS02aSwa45x
lfX+pqJfmb4vCJJDXt6avH27EVgTq0Vew+BhQHG3VLRO6uxZ+smX6qmsuw==
=kvbw
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"The main and larger change here is a workaround for AMD's lack of
cache coherency for encrypted-memory guests.
I have another patch pending, but it's waiting for review from the
architecture maintainers.
RISC-V:
- Remove 's' & 'u' as valid ISA extension
- Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
- Fix NMI watchdog in guests on AMD
- Fix for SEV cache incoherency issues
- Don't re-acquire SRCU lock in complete_emulated_io()
- Avoid NULL pointer deref if VM creation fails
- Fix race conditions between APICv disabling and vCPU creation
- Bugfixes for disabling of APICv
- Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
- Do not use bitfields larger than 32-bits, they differ between GCC
and clang"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: selftests: introduce and use more page size-related constants
kvm: selftests: do not use bitfields larger than 32-bits for PTEs
KVM: SEV: add cache flush to solve SEV cache incoherency issues
KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
KVM: selftests: Silence compiler warning in the kvm_page_table_test
KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
KVM: SPDX style and spelling fixes
KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
RISC-V: KVM: Restrict the extensions that can be disabled
RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
It turns out that by having CONFIG_ACPI=n, we've been failing to boot
additional CPUs, and so these systems were functionally UP. The code
bloat is unfortunate for build times, but I don't see an alternative. So
this commit sets CONFIG_ACPI=y for x86_64 and i686 configs.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use bpf_link_create() API in fexit_stress test to attach FEXIT programs.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-4-andrii@kernel.org
Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() on
older kernels for programs that are attachable through
BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() more unified and
convenient interface for creating bpf_link-based attachments.
With this approach end users can just use bpf_link_create() for
tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to
care about kernel support, as libbpf will handle this transparently. On
the other hand, as newer features (like BPF cookie) are added to the
LINK_CREATE interface, they will be readily usable through the same
bpf_link_create() API without any major refactoring from the user's
standpoint.
bpf_program__attach_btf_id() is now using bpf_link_create() internally
as well and will take advantage of this unified interface when BPF
cookie is added for fentry/fexit.
Doing proactive feature detection of LINK_CREATE support for
fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF,
determining some stable and guaranteed to be in all kernels versions
target BTF type (either raw tracepoint or fentry target function),
actually attaching this program and thus potentially affecting the
performance of the host kernel briefly, etc. So instead we are taking
much simpler "lazy" approach of falling back to
bpf_raw_tracepoint_open() call only if initial LINK_CREATE command
fails. For modern kernels this will mean zero added overhead, while
older kernels will incur minimal overhead with a single fast-failing
LINK_CREATE call.
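For illustration, a rough sketch of the fallback using the public libbpf API
(the helper name is made up and error handling is simplified):

#include <bpf/bpf.h>

/* Try the modern LINK_CREATE command first; only if the kernel rejects it,
 * fall back to the legacy BPF_RAW_TRACEPOINT_OPEN attachment. */
static int attach_trace_sketch(int prog_fd, enum bpf_attach_type attach_type)
{
        int link_fd;

        link_fd = bpf_link_create(prog_fd, 0 /* target_fd */, attach_type, NULL);
        if (link_fd >= 0)
                return link_fd;         /* modern kernel: done */

        /* Older kernel: LINK_CREATE failed fast, use the legacy command.
         * A NULL name means the target is taken from the program itself. */
        return bpf_raw_tracepoint_open(NULL, prog_fd);
}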
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org
Test case 71 'Convert perf time to TSC' is not supported on s390.
Subtest 71.1 is skipped with the correct message, but subtest 71.2 is
not skipped and fails.
The root cause is function evlist__open() called from
test__perf_time_to_tsc(). evlist__open() returns -ENOENT because the
event cycles:u is not supported by the selected PMU, for example
platform s390 on z/VM or an x86_64 virtual machine.
The PMU driver returns -ENOENT in this case. This error leads to the
test failure.
Fix this by returning TEST_SKIP on -ENOENT.
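A minimal sketch of that change (simplified; the out_err label and the exact
structure of the real test differ):

        err = evlist__open(evlist);
        if (err == -ENOENT) {
                /* e.g. cycles:u not supported by the PMU: skip, don't fail */
                err = TEST_SKIP;
                goto out_err;
        }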
Output before:
71: Convert perf time to TSC:
71.1: TSC support: Skip (This architecture does not support)
71.2: Perf time to TSC: FAILED!
Output after:
71: Convert perf time to TSC:
71.1: TSC support: Skip (This architecture does not support)
71.2: Perf time to TSC: Skip (perf_read_tsc_conversion is not supported)
This also happens on an x86_64 virtual machine:
# uname -m
x86_64
$ ./perf test -F 71
71: Convert perf time to TSC :
71.1: TSC support : Ok
71.2: Perf time to TSC : FAILED!
$
Committer testing:
Continues to work on x86_64:
$ perf test 71
71: Convert perf time to TSC :
71.1: TSC support : Ok
71.2: Perf time to TSC : Ok
$
Fixes: 290fa68bdc ("perf test tsc: Fix error message when not supported")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chengdong Li <chengdongli@tencent.com>
Cc: chengdongli@tencent.com
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit bb30acae4c ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't report result if the PERF_SAMPLE_DATA_SRC bit is missed in sample
type.
The commit ffab487052 ("perf: arm-spe: Fix perf report
--mem-mode") partially fixes the issue. It adds PERF_SAMPLE_DATA_SRC
bit for Arm SPE event, this allows the perf data file generated by
kernel v5.18-rc1 or later version can be reported properly.
On the other hand, the perf tool still fails to be backward compatible
with a data file recorded by an older version of perf which contains Arm
SPE trace data. This patch is a workaround in the reporting phase: when
it detects an Arm SPE PMU event without the PERF_SAMPLE_DATA_SRC bit, it
forces the bit to be set in the sample type and prints a warning.
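The workaround amounts to something like the fragment below, where
evsel_is_arm_spe() stands in for whatever check the real code uses to
recognize the SPE event:

        if (evsel_is_arm_spe(evsel) &&
            !(evsel->core.attr.sample_type & PERF_SAMPLE_DATA_SRC)) {
                /* old perf.data file: force the bit so --mem-mode works */
                evsel->core.attr.sample_type |= PERF_SAMPLE_DATA_SRC;
                pr_warning("Forcing PERF_SAMPLE_DATA_SRC for Arm SPE data\n");
        }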
Fixes: bb30acae4c ("perf report: Bail out --mem-mode if mem info is not available")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the command 'perf script -F,+data_src' is used to dump memory samples
with Arm SPE trace data, it reports an error:
# perf script -F,+data_src
Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field.
This is because the 'dummy:u' event lacks the DATA_SRC bit in its sample
type. So if a file contains AUX area tracing data, always allow the
field 'data_src' to be selected as an option for perf script.
Fixes: e55ed3423c ("perf arm-spe: Synthesize memory event")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The header TargetRegistry.h has moved in LLVM/clang 14.
Committer notes:
The problem as noticed when building in ubuntu:22.04:
90 98.61 ubuntu:22.04 : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory
23 | #include "llvm/Support/TargetRegistry.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
Fixed after applying this patch.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088
Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_underprovisioned()
that verifies that processes with different cpu.weight that are all running
on an underprovisioned system, still get roughly the same amount of cpu
time.
Because test_cpucg_weight_underprovisioned() is very similar to
test_cpucg_weight_overprovisioned(), this patch also pulls the common logic
into a separate helper function that is invoked from both testcases, and
which uses function pointers to invoke the unique portions of the
testcases.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_overprovisioned()
that verifies the expected behavior of creating multiple processes with
different cpu.weight, on a system that is overprovisioned.
So as to avoid code duplication, this patch also updates cpu_hog_func_param
to take a new hog_clock_type enum which informs how time is counted in
hog_cpus_timed() (either process time or wall clock time).
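The resulting parameter struct looks roughly like this (field names are
illustrative, not necessarily the exact ones used in test_cpu.c):

#include <time.h>

enum hog_clock_type {
        CPU_HOG_CLOCK_PROCESS,  /* count consumed CPU (process) time */
        CPU_HOG_CLOCK_WALL,     /* count elapsed wall-clock time */
};

struct cpu_hog_func_param {
        int nprocs;                     /* how many CPUs to hog */
        struct timespec ts;             /* how long to hog them for */
        enum hog_clock_type clock_type; /* how elapsed time is measured */
};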
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_stats() that verifies the
expected behavior of the cpu.stat interface. In doing so, we define a
new hog_cpus_timed() function which takes a cpu_hog_func_param struct
that configures how many CPUs it uses, and how long it runs. Future
patches will also spawn threads that hog CPUs, so this function will
eventually serve those use-cases as well.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The cgroup selftests suite currently contains tests that validate various
aspects of cgroup, such as validating the expected behavior for memory
controllers, the expected behavior of cgroup.procs, etc. There are no tests
that validate the expected behavior of the cgroup cpu controller.
This patch therefore adds a new test_cpu.c file that will contain cpu
controller testcases. The file currently only contains a single testcase
that validates creating nested cgroups with cgroup.subtree_control
including cpu. Future patches will add more sophisticated testcases that
validate functional aspects of the cpu controller.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru().
Fixes: 9bbdc0f324 ("xarray: use kmem_cache_alloc_lru to allocate xa_node")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Li Wang <liwang@redhat.com>
For hybrid events, by default 'perf stat' aggregates and reports the event
counts per PMU.
# ./perf stat -e cycles -a sleep 1
Performance counter stats for 'system wide':
14,066,877,268 cpu_core/cycles/
6,814,443,147 cpu_atom/cycles/
1.002760625 seconds time elapsed
Sometimes, it's also useful to aggregate event counts from all PMUs.
Create a new option '--hybrid-merge' to enable that behavior and report
the counts without PMUs.
# ./perf stat -e cycles -a --hybrid-merge sleep 1
Performance counter stats for 'system wide':
20,732,982,512 cycles
1.002776793 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A metric such as 'Kernel_Utilization' may come from different PMUs and
consist of different events.
For core,
Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread
For atom,
Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core
The metric group string for core is:
'{cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.thread_p/metric-id=cpu_clk_unhalted.thread_p:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W#cpu_core'
The metric group string for atom is:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W#cpu_atom'
That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W"
is from cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}"
is from the cpu_atom PMU. Next, check whether the events in the group are
valid on that PMU. If one event is not valid on that PMU, the associated
group is removed internally.
In this example, cpu_clk_unhalted.thread is valid on cpu_core and
cpu_clk_unhalted.core is valid on cpu_atom. So the checks for these two
groups are passed.
Before:
# ./perf stat -M Kernel_Utilization -a sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
anon group { CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD, CPU_CLK_UNHALTED.THREAD }
Performance counter stats for 'system wide':
17,639,501 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 1.00 Kernel_Utilization
17,578,757 cpu_atom/CPU_CLK_UNHALTED.CORE:k/
1,005,350,226 ns duration_time
43,012,352 cpu_core/CPU_CLK_UNHALTED.THREAD_P:k/ # 0.99 Kernel_Utilization
17,608,010 cpu_atom/CPU_CLK_UNHALTED.THREAD_P:k/
43,608,755 cpu_core/CPU_CLK_UNHALTED.THREAD/
17,630,838 cpu_atom/CPU_CLK_UNHALTED.THREAD/
1,005,350,226 ns duration_time
1.005350226 seconds time elapsed
After:
# ./perf stat -M Kernel_Utilization -a sleep 1
Performance counter stats for 'system wide':
17,981,895 CPU_CLK_UNHALTED.CORE [cpu_atom] # 1.00 Kernel_Utilization
17,925,405 CPU_CLK_UNHALTED.CORE:k [cpu_atom]
1,004,811,366 ns duration_time
41,246,425 CPU_CLK_UNHALTED.THREAD_P:k [cpu_core] # 0.99 Kernel_Utilization
41,819,129 CPU_CLK_UNHALTED.THREAD [cpu_core]
1,004,811,366 ns duration_time
1.004811366 seconds time elapsed
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge misc fixes from Andrew Morton:
"13 patches.
Subsystems affected by this patch series: mm (memory-failure, memcg,
userfaultfd, hugetlbfs, mremap, oom-kill, kasan, hmm), and kcov"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()
kcov: don't generate a warning on vm_insert_page()'s failure
MAINTAINERS: add Vincenzo Frascino to KASAN reviewers
oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
selftest/vm: add skip support to mremap_test
selftest/vm: support xfail in mremap_test
selftest/vm: verify remap destination address in mremap_test
selftest/vm: verify mmap addr in mremap_test
mm, hugetlb: allow for "high" userspace addresses
userfaultfd: mark uffd_wp regardless of VM_WRITE flag
memcg: sync flush only if periodic flush is delayed
mm/memory-failure.c: skip huge_zero_page in memory_failure()
mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
Move the libbpf init code into a single function, so that we have a single
place doing that.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220422100025.1469207-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The '--lto' option is a confusing way of telling objtool to do stack
validation despite it being a linked object. It's no longer needed now
that an explicit '--stackval' option exists. The '--vmlinux' option is
also redundant.
Remove both options in favor of a straightforward '--link' option which
identifies a linked object.
Also, implicitly set '--link' with a warning if the user forgets to do
so and we can tell that it's a linked object. This makes it easier for
manual vmlinux runs.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/dcd3ceffd15a54822c6183e5766d21ad06082b45.1650300597.git.jpoimboe@redhat.com
Objtool has some hacks in place to work around toolchain limitations
which otherwise would break no-instrumentation rules. Make the hacks
explicit (and optional for other arches) by turning it into a cmdline
option and kernel config option.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/b326eeb9c33231b9dfbb925f194ed7ee40edcd7c.1650300597.git.jpoimboe@redhat.com
Objtool secretly does a jump label hack to overcome the limitations of
the toolchain. Make the hack explicit (and optional for other arches)
by turning it into a cmdline option and kernel config option.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/3bdcbfdd27ecb01ddec13c04bdf756a583b13d24.1650300597.git.jpoimboe@redhat.com
Now that CONFIG_STACK_VALIDATION is frame-pointer specific, do the same
for the '--stackval' option. Now the '--no-fp' option is redundant and
can be removed.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f563fa064b3b63d528de250c72012d49e14742a3.1650300597.git.jpoimboe@redhat.com
Now that stack validation is an optional feature of objtool, add
CONFIG_OBJTOOL and replace most usages of CONFIG_STACK_VALIDATION with
it.
CONFIG_STACK_VALIDATION can now be considered to be frame-pointer
specific. CONFIG_UNWINDER_ORC is already inherently valid for live
patching, so no need to "validate" it.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/939bf3d85604b2a126412bf11af6e3bd3b872bcb.1650300597.git.jpoimboe@redhat.com
Extricate ibt from validate_branch() so they can be executed (or ported)
independently from each other.
While shuffling code around, simplify and improve the ibt logic:
- Ignore an explicit list of known sections which reference functions
for reasons other than indirect branching to them. This helps prevent
unnecessary sealing.
- Warn on missing !ENDBR for all other sections, not just .data and
.rodata. This finds additional warnings, because there are sections
other than .[ro]data which reference function pointers. For example,
the ksymtab sections which are used for exporting symbols.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/fd1435e46bb95f81031b8fb1fa360f5f787e4316.1650300597.git.jpoimboe@redhat.com
To help prevent objtool users from having to do math to convert function
addresses to section addresses, and to help out with finding data
addresses reported by IBT validation, add an option to print the section
address in addition to the function address.
Normal:
vmlinux.o: warning: objtool: fixup_exception()+0x2d1: unreachable instruction
With '--sec-address':
vmlinux.o: warning: objtool: fixup_exception()+0x2d1 (.text+0x76c51): unreachable instruction
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/2cea4d5299d53d1a4c09212a6ad7820aa46fda7a.1650300597.git.jpoimboe@redhat.com
The parentheses in the "func()+off" address output are inconsistent with
how the kernel prints function addresses, breaking Peter's scripts.
Remove them.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f2bec70312f62ef4f1ea21c134d9def627182ad3.1650300597.git.jpoimboe@redhat.com
Objtool has a fairly singular focus. It runs on object files and does
validations and transformations which can be combined in various ways.
The subcommand model has never been a good fit, making it awkward to
combine and remove options.
Remove the "check" and "orc" subcommands in favor of a more traditional
cmdline option model. This makes it much more flexible to use, and
easier to port individual features to other arches.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/5c61ebf805e90aefc5fa62bc63468ffae53b9df6.1650300597.git.jpoimboe@redhat.com
Split the existing options into two groups: actions, which actually do
something; and options, which modify the actions in some way.
Also there's no need to have short flags for all the non-action options.
Reserve short flags for the more important actions.
While at it:
- change a few of the short flags to be more intuitive
- make option descriptions more consistently descriptive
- sort options in the source like they are when printed
- move options to a global struct
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/9dcaa752f83aca24b1b21f0b0eeb28a0c181c0b0.1650300597.git.jpoimboe@redhat.com
The OPTION_GROUP option type is a way of grouping certain options
together in the printed usage text. It happens to be completely broken,
thanks to the fact that the subcmd option sorting just sorts everything,
without regard for grouping. Luckily, nobody uses this option anyway,
though that will change shortly.
Fix it by sorting each group individually.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/e167ea3a11e2a9800eb062c1fd0f13e9cd05140c.1650300597.git.jpoimboe@redhat.com
Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.
Much head-scratching and cursing later, the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mcount_loc are all similarly affected.
Consider:
foo-weak.c:
extern void __SCT__foo(void);
__attribute__((weak)) void foo(void)
{
return __SCT__foo();
}
foo.c:
extern void __SCT__foo(void);
extern void my_foo(void);
void foo(void)
{
my_foo();
return __SCT__foo();
}
These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):
foo-weak.o:
0000000000000000 <foo>:
0: e9 00 00 00 00 jmpq 5 <foo+0x5> 1: R_X86_64_PLT32 __SCT__foo-0x4
foo.o:
0000000000000000 <foo>:
0: 48 83 ec 08 sub $0x8,%rsp
4: e8 00 00 00 00 callq 9 <foo+0x9> 5: R_X86_64_PLT32 my_foo-0x4
9: 48 83 c4 08 add $0x8,%rsp
d: e9 00 00 00 00 jmpq 12 <foo+0x12> e: R_X86_64_PLT32 __SCT__foo-0x4
Now, when we link these two files together, you get something like
(ld -r -o foos.o foo-weak.o foo.o):
foos.o:
0000000000000000 <foo-0x10>:
0: e9 00 00 00 00 jmpq 5 <foo-0xb> 1: R_X86_64_PLT32 __SCT__foo-0x4
5: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1)
f: 90 nop
0000000000000010 <foo>:
10: 48 83 ec 08 sub $0x8,%rsp
14: e8 00 00 00 00 callq 19 <foo+0x9> 15: R_X86_64_PLT32 my_foo-0x4
19: 48 83 c4 08 add $0x8,%rsp
1d: e9 00 00 00 00 jmpq 22 <foo+0x12> 1e: R_X86_64_PLT32 __SCT__foo-0x4
Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when run on
linked objects, but that seems to be working (fingers crossed).
So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):
foo-weak.o:
Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
foo.o:
Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 .text + d
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
foos.o:
Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 .text + 1d
000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.
*HOWEVER*, when the toolchain strips unused section symbols it
generates things like this (using new enough binutils):
foo-weak.o:
Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
foo.o:
Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000200000002 R_X86_64_PC32 0000000000000000 foo + d
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
foos.o:
Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0000000100000002 R_X86_64_PC32 0000000000000000 foo + 0
0000000000000004 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
0000000000000008 0000000100000002 R_X86_64_PC32 0000000000000000 foo + d
000000000000000c 0000000d00000002 R_X86_64_PC32 0000000000000000 __SCT__foo + 1
And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.
This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of; however, in this
case that goes terribly wrong!
As such, teach objtool to create a section symbol when there isn't
one.
Fixes: 44f6a7c075 ("objtool: Fix seg fault with Clang non-section symbols")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org
Allow the mremap test to be skipped due to errors such as failing to
parse the mmap_min_addr sysctl.
Link: https://lkml.kernel.org/r/20220420215721.4868-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use ksft_test_result_xfail for the tests which are expected to fail.
Link: https://lkml.kernel.org/r/20220420215721.4868-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy
existing mappings. This causes a segfault when regions such as text are
remapped and the permissions are changed.
Verify the requested mremap destination address does not overlap any
existing mappings by using mmap's MAP_FIXED_NOREPLACE flag. Keep
incrementing the destination address until a valid mapping is found or
fail the current test once the max address is reached.
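A minimal sketch of the probing loop, assuming MAP_FIXED_NOREPLACE is
available (names are illustrative, not the selftest's):

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Walk candidate destinations until one maps cleanly, i.e. does not overlap
 * an existing mapping; unmap it again and hand the address to mremap(). */
static void *find_free_dest(char *hint, size_t size, char *max_addr)
{
        char *addr;

        for (addr = hint; addr + size <= max_addr; addr += size) {
                void *p = mmap(addr, size, PROT_READ | PROT_WRITE,
                               MAP_FIXED_NOREPLACE | MAP_PRIVATE | MAP_ANONYMOUS,
                               -1, 0);
                if (p != MAP_FAILED) {
                        munmap(p, size);        /* range is free */
                        return p;
                }
                /* overlap with an existing mapping: try the next candidate */
        }
        return NULL;    /* max address reached: fail this test */
}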
Link: https://lkml.kernel.org/r/20220420215721.4868-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avoid calling mmap with requested addresses that are less than the
system's mmap_min_addr. When run as root, mmap returns EACCES when
trying to map addresses < mmap_min_addr. This is not one of the error
codes for the condition to retry the mmap in the test.
Rather than arbitrarily retrying on EACCES, don't attempt an mmap until
addr > vm.mmap_min_addr.
Add a munmap call after an alignment check as the mappings are retained
after the retry and can reach the vm.max_map_count sysctl.
Link: https://lkml.kernel.org/r/20220420215721.4868-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clean up code that was hardcoding masks for various fields,
now that the masks are included in processor.h.
For more cleanup, define PAGE_SIZE and PAGE_MASK just like in Linux.
PAGE_SIZE in particular was defined by several tests.
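For reference, the definitions mirror the kernel's own (a sketch; the
selftest header may use different integer types):

#define PAGE_SHIFT      12
#define PAGE_SIZE       (1ULL << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE - 1))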
Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Red Hat's QE team reported test failure on access_tracking_perf_test:
Testing guest mode: PA-bits:ANY, VA-bits:48, 4K pages
guest physical test memory offset: 0x3fffbffff000
Populating memory : 0.684014577s
Writing to populated memory : 0.006230175s
Reading from populated memory : 0.004557805s
==== Test Assertion Failure ====
lib/kvm_util.c:1411: false
pid=125806 tid=125809 errno=4 - Interrupted system call
1 0x0000000000402f7c: addr_gpa2hva at kvm_util.c:1411
2 (inlined by) addr_gpa2hva at kvm_util.c:1405
3 0x0000000000401f52: lookup_pfn at access_tracking_perf_test.c:98
4 (inlined by) mark_vcpu_memory_idle at access_tracking_perf_test.c:152
5 (inlined by) vcpu_thread_main at access_tracking_perf_test.c:232
6 0x00007fefe9ff81ce: ?? ??:0
7 0x00007fefe9c64d82: ?? ??:0
No vm physical memory at 0xffbffff000
I can easily reproduce it with an Intel(R) Xeon(R) CPU E5-2630 with 46-bit
PA.
It turns out that the address translation for clearing idle page tracking
returned a wrong result; addr_gva2gpa()'s last step, which is based on
"pte[index[0]].pfn", did the calculation with 40 bits length and the
high 12 bits got truncated. In above case the GPA address to be returned
should be 0x3fffbffff000 for GVA 0xc0000000, but it got truncated into
0xffbffff000 and the subsequent gpa2hva lookup failed.
The width of operations on bit fields greater than 32-bit is
implementation defined, and differs between GCC (which uses the bitfield
precision) and clang (which uses 64-bit arithmetic), so this is a
potential minefield. Remove the bit fields and use manual masking
instead.
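To illustrate the pitfall (this is not the selftest's actual code): with a
40-bit 'pfn' bit field the arithmetic may be carried out in the bit field's
precision, whereas plain masking on a uint64_t is well defined everywhere:

#include <stdint.h>

struct pte_bits {               /* problematic: bit field wider than 32 bits */
        uint64_t present:1;
        uint64_t ignored:11;
        uint64_t pfn:40;
        uint64_t rsvd:12;
};

#define PTE_PFN_MASK    0x000ffffffffff000ULL   /* bits 51:12 */

static inline uint64_t pte_to_paddr(uint64_t pte)
{
        return pte & PTE_PFN_MASK;      /* portable: keeps all PFN bits */
}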
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2075036
Reported-by: Nana Liu <nanliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current release - regressions:
- rxrpc: restore removed timer deletion
Current release - new code bugs:
- gre: fix device lookup for l3mdev use-case
- xfrm: fix egress device lookup for l3mdev use-case
Previous releases - regressions:
- sched: cls_u32: fix netns refcount changes in u32_change()
- smc: fix sock leak when release after smc_shutdown()
- xfrm: limit skb_page_frag_refill use to a single page
- eth: atlantic: invert deep par in pm functions, preventing null
derefs
- eth: stmmac: use readl_poll_timeout_atomic() in atomic state
Previous releases - always broken:
- gre: fix skb_under_panic on xmit
- openvswitch: fix OOB access in reserve_sfa_size()
- dsa: hellcreek: calculate checksums in tagger
- eth: ice: fix crash in switchdev mode
- eth: igc:
- fix infinite loop in release_swfw_sync
- fix scheduling while atomic
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmJhLbcSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOknxkP/jiAATyBt/HQFykQNiZ7cdrcv1gzJ3Lg
BmrV1QmbrNwfffBtmBHRliP7x0vNF6fV8LUKjfyQh1YgJw8I9F/FDH/1fojhBZq/
JJpZrh+TFikBBM4RDMJ0aQi6ssOEo8S9gfN4W48F/49O4S3Q/Gbgv7Lk0jL8utRz
7RgGUVxX+xOSklvh2Tn/xHdYPeebPhLojiKhmH+l6xghyDEUHkemF3AkLwV9QMnq
LXmNP4y100xcdCW1bLbyVcq0lbwdLSg4SL+2wC2bvgEDRR0MUezQyNxD6Oqrmusn
sASZYgNK92R9ekLBqTX/QwV/XIT+17hclTk4u0eV8GnemnibqOq7DhDqtKyeAzbD
mfU6Z5Ku6LuXA1U+2w1jxnd4cJTacA+dCRKcQD91ReguBbCd6zOweB996iBNLucK
Kf+r6qWWLxt+JmhSexb/T+oQHsdgvIPSQXNHUH2W8w+2DdTB/EPcSL76DlbZUxrP
up4EC3Nr3oxJjHbv7Iq4d9mHuRlwoOOpNJ3mARlfRDL6iuL0zECTweST3qT9YyIH
Cz4FGj7kwEDTxGtufoTVia+/JmS39f2lBrMKuhbTVo+qcYhs3zJM4Ki9bAgOKXqI
Qf+I73x9yQZ182afq4DsRXLnq/BajmRMyX2/kebY8KsARzRsPAktBhsT17SI6tUG
3MiLiHiIb0qM
=thBq
-----END PGP SIGNATURE-----
Merge tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from xfrm and can.
Current release - regressions:
- rxrpc: restore removed timer deletion
Current release - new code bugs:
- gre: fix device lookup for l3mdev use-case
- xfrm: fix egress device lookup for l3mdev use-case
Previous releases - regressions:
- sched: cls_u32: fix netns refcount changes in u32_change()
- smc: fix sock leak when release after smc_shutdown()
- xfrm: limit skb_page_frag_refill use to a single page
- eth: atlantic: invert deep par in pm functions, preventing null
derefs
- eth: stmmac: use readl_poll_timeout_atomic() in atomic state
Previous releases - always broken:
- gre: fix skb_under_panic on xmit
- openvswitch: fix OOB access in reserve_sfa_size()
- dsa: hellcreek: calculate checksums in tagger
- eth: ice: fix crash in switchdev mode
- eth: igc:
- fix infinite loop in release_swfw_sync
- fix scheduling while atomic"
* tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
drivers: net: hippi: Fix deadlock in rr_close()
selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets
selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets
nfc: MAINTAINERS: add Bug entry
net: stmmac: Use readl_poll_timeout_atomic() in atomic state
doc/ip-sysctl: add bc_forwarding
netlink: reset network and mac headers in netlink_dump()
net: mscc: ocelot: fix broken IP multicast flooding
net: dsa: hellcreek: Calculate checksums in tagger
net: atlantic: invert deep par in pm functions, preventing null derefs
can: isotp: stop timeout monitoring when no first frame was sent
bonding: do not discard lowest hash bit for non layer3+4 hashing
net: lan966x: Make sure to release ptp interrupt
ipv6: make ip6_rt_gc_expire an atomic_t
net: Handle l3mdev in ip_tunnel_init_flow
l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu
net/sched: cls_u32: fix possible leak in u32_init_knode()
net/sched: cls_u32: fix netns refcount changes in u32_change()
powerpc: Update MAINTAINERS for ibmvnic and VAS
net: restore alpha order to Ethernet devices in config
...
When compiling kvm_page_table_test.c, I get this compiler warning
with gcc 11.2:
kvm_page_table_test.c: In function 'pre_init_before_test':
../../../../tools/include/linux/kernel.h:44:24: warning: comparison of
distinct pointer types lacks a cast
44 | (void) (&_max1 == &_max2); \
| ^~
kvm_page_table_test.c:281:21: note: in expansion of macro 'max'
281 | alignment = max(0x100000, alignment);
| ^~~
Fix it by adjusting the type of the absolute value.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20220414103031.565037-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Switching to libbpf 1.0 API broke test_lpm_map and test_lru_map as error
reporting changed. Instead of setting errno and returning -1, bpf calls
now return -Exxx directly.
Drop errno checks and look at return code directly.
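The adjustment looks roughly like this fragment (a sketch, not the exact
test diff):

        /* before: assert(ret == -1 && errno == EEXIST); */
        ret = bpf_map_update_elem(map_fd, &key, &value, BPF_NOEXIST);
        assert(ret == -EEXIST); /* libbpf 1.0 returns the -Exxx code directly */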
Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421094320.1563570-1-asavkov@redhat.com
I am getting the following compilation error for prog_tests/uprobe_autoattach.c:
tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c: In function ‘test_uprobe_autoattach’:
./test_progs.h:209:26: error: pointer ‘mem’ may be used after ‘free’ [-Werror=use-after-free]
The value of mem is now used in one of the asserts, which is why it may be
confusing compilers. However, it is not dereferenced. Silence this by moving
free(mem) after the assert block.
Fixes: 1717e24801 ("selftests/bpf: Uprobe tests should verify param/return values")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220421132317.1583867-1-asavkov@redhat.com
Switching to the libbpf 1.0 API broke test_sock and test_sysctl as they
check that the return value of bpf_prog_attach() is exactly -1. Switch the
check to '< 0' instead.
Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421130104.1582053-1-asavkov@redhat.com
This adds documentation for the following API functions:
- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
This updates usage of the following API functions within
libbpf so their newly added error return is checked:
- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
This adds an error return to the following API functions:
- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.
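A usage sketch of the new return values (hypothetical caller, not part of
this patch):

#include <bpf/libbpf.h>

static int retarget_prog(struct bpf_program *prog)
{
        int err;

        /* fails with -EBUSY if bpf_object__load() has already run */
        err = bpf_program__set_type(prog, BPF_PROG_TYPE_TRACING);
        if (err)
                return err;

        return bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY);
}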
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
These functions are currently only available on architectures that have
the my_syscall6() macro implemented. This is because these functions use
malloc(), malloc() uses mmap(), and mmap() depends on the my_syscall6()
macro. On architectures that don't support my_syscall6(), these functions
will always return NULL with errno set to ENOSYS.
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
size_t strnlen(const char *str, size_t maxlen);
The strnlen() function returns the number of bytes in the string
pointed to by str, excluding the terminating null byte ('\0'), but at
most maxlen. In doing this, strnlen() looks only at the first maxlen
characters in the string pointed to by str and never beyond str[maxlen-1].
The first use case of this function is for determining the memory
allocation size in the strndup() function.
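A minimal implementation sketch matching that description (nolibc's actual
code may differ in details):

static __attribute__((unused))
size_t strnlen(const char *str, size_t maxlen)
{
        size_t len;

        for (len = 0; len < maxlen && str[len]; len++)
                ;
        return len;
}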
Link: https://lore.kernel.org/lkml/CAOG64qMpEMh+EkOfjNdAoueC+uQyT2Uv3689_sOr37-JxdJf4g@mail.gmail.com
Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Implement basic dynamic allocator functions. These functions are
currently only available on architectures that have the nolibc mmap()
syscall implemented. They are not a super-fast memory allocator, but at
least they can satisfy the basic need of having a heap without libc.
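A sketch of the approach: each allocation is its own anonymous mapping with
a small header recording the length, so that free() can munmap() it again
(illustrative, not nolibc's exact code):

#include <stddef.h>
#include <sys/mman.h>

struct nolibc_heap {
        size_t len;     /* length of the whole mapping, header included */
        char user_p[];  /* memory handed back to the caller */
};

static void *malloc_sketch(size_t len)
{
        struct nolibc_heap *heap;

        len  = sizeof(*heap) + len;
        heap = mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
        if (heap == MAP_FAILED)
                return NULL;    /* e.g. ENOSYS when mmap() itself is missing */

        heap->len = len;
        return heap->user_p;
}

static void free_sketch(void *ptr)
{
        struct nolibc_heap *heap;

        if (!ptr)
                return;
        /* nolibc uses its container_of() helper here; open-coded for brevity */
        heap = (struct nolibc_heap *)((char *)ptr -
                                      offsetof(struct nolibc_heap, user_p));
        munmap(heap, heap->len);
}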
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Implement the `offsetof()` and `container_of()` macros. The first use case
of these macros is for `malloc()`, `realloc()` and `free()`.
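The classic definitions look like this (a sketch; nolibc's versions may
differ cosmetically):

#define offsetof(TYPE, FIELD)   ((size_t)&((TYPE *)0)->FIELD)

#define container_of(PTR, TYPE, FIELD) \
        ((TYPE *)(((char *)(PTR)) - offsetof(TYPE, FIELD)))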
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Implement mmap() and munmap(). Currently, they are only available for
architectures that have the my_syscall6 macro. For architectures that
don't have it, these functions will return -1 with errno set to ENOSYS
(Function not implemented).
This has been tested on x86 and i386.
Notes for i386:
1) The common mmap() syscall implementation uses __NR_mmap2 instead
of __NR_mmap.
2) The offset must be shifted-right by 12-bit.
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
On i386, the 6th argument of a syscall goes in %ebp. However, neither
Clang nor GCC can use %ebp in the clobber list or in the "r" constraint
without -fomit-frame-pointer. To keep it usable for any kind of
compilation, the workaround below is implemented.
1) Push the 6-th argument.
2) Push %ebp.
3) Load the 6-th argument from 4(%esp) to %ebp.
4) Do the syscall (int $0x80).
5) Pop %ebp (restore the old value of %ebp).
6) Add %esp by 4 (undo the stack pointer).
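Put together, the workaround looks roughly like the macro below (a sketch,
not necessarily byte-for-byte the committed code):

#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)                 \
({                                                                            \
        long _ret;                                                            \
        long _num  = (long)(num);                                             \
        register long _arg1 __asm__("ebx") = (long)(arg1);                    \
        register long _arg2 __asm__("ecx") = (long)(arg2);                    \
        register long _arg3 __asm__("edx") = (long)(arg3);                    \
        register long _arg4 __asm__("esi") = (long)(arg4);                    \
        register long _arg5 __asm__("edi") = (long)(arg5);                    \
        long _arg6 = (long)(arg6);      /* can only live in memory */         \
                                                                              \
        __asm__ volatile (                                                    \
                "pushl %[_arg6]\n\t"            /* 1) push the 6th arg   */   \
                "pushl %%ebp\n\t"               /* 2) save %ebp          */   \
                "movl  4(%%esp), %%ebp\n\t"     /* 3) load 6th arg       */   \
                "int   $0x80\n\t"               /* 4) do the syscall     */   \
                "popl  %%ebp\n\t"               /* 5) restore %ebp       */   \
                "addl  $4, %%esp\n\t"           /* 6) undo the push      */   \
                : "=a"(_ret)                                                  \
                : "r"(_arg1), "r"(_arg2), "r"(_arg3), "r"(_arg4),             \
                  "r"(_arg5), [_arg6] "m"(_arg6), "0"(_num)                   \
                : "memory", "cc"                                              \
        );                                                                    \
        _ret;                                                                 \
})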
Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Building with clang yields the following error:
```
<inline asm>:3:1: error: _start changed binding to STB_GLOBAL
.global _start
^
1 error generated.
```
Make sure to specify only one of `.global _start` and `.weak _start`.
Remove `.global _start`.
Cc: llvm@lists.linux.dev
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Replace `asm` with `__asm__` to support compilation with the -std flag.
Using `asm` with the -std flag makes GCC think `asm()` is a function call
instead of inline assembly.
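A trivial before/after example of the change:

        /* before: with e.g. -std=c89, GCC treats this as a call to an
         * undeclared function named "asm" */
        asm ("nop");

        /* after: accepted regardless of the selected C dialect */
        __asm__ ("nop");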
GCC doc says:
For the C language, the `asm` keyword is a GNU extension. When
writing C code that can be compiled with `-ansi` and the `-std`
options that select C dialects without GNU extensions, use
`__asm__` instead of `asm`.
Link: https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html
Reported-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The old link no longer works, update it.
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
When building with gcc at -O0 we're seeing link errors due to the
"environ" variable being referenced by getenv(). The problem is that
at -O0 gcc will not inline getenv() and will not drop the external
reference. One solution would be to locally declare the variable as
weak, but then it would appear in all programs even those not using
it, and would be confusing to users of getenv() who would forget to
set environ to envp.
An alternate approach used in this patch consists in always inlining
the outer part of getenv() that references this extern so that it's
always dropped when not used. The biggest part of the function was
now moved to a new function called _getenv() that's still not inlined
by default.
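The resulting split looks roughly like this (sketch): a non-inlined
_getenv() that does the actual scan, and an always-inline getenv() wrapper
holding the only reference to "environ":
```
static __attribute__((unused))
char *_getenv(const char *name, char **environ)
{
	int idx, i;

	if (environ) {
		for (idx = 0; environ[idx]; idx++) {
			for (i = 0; name[i] && name[i] == environ[idx][i];)
				i++;
			if (!name[i] && environ[idx][i] == '=')
				return &environ[idx][i + 1];
		}
	}
	return NULL;
}

static inline __attribute__((unused,always_inline))
char *getenv(const char *name)
{
	extern char **environ;

	/* the extern reference is dropped when getenv() is unused */
	return _getenv(name, environ);
}
```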
Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
clang wants to use strlen() for __builtin_strlen() at -O0. We don't
really care about -O0 but it at least ought to build, so let's make
sure we don't choke on this by dropping the optimization for
constant strings in this case.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The Makefile in tools/ is used to forward options to the makefiles
in the various subdirs. Let's add nolibc there so that it becomes
possible to make tools/nolibc_headers_standalone from the main tree
to simply create a completely usable sysroot.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This provides a target "headers_standalone" which installs the nolibc's
arch-specific headers with "arch.h" taken from the current arch (or a
concatenation of both i386 and x86_64 for arch=x86), then installs
kernel headers. This creates a convenient sysroot which is directly
usable by a bare-metal compiler to create any executable.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
- POLLIN etc were missing, so poll() could only be used with timeouts.
- WNOHANG was not defined and is convenient to check if a child is still
running
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This is essentially for completeness as it's not the most often used
in regtests.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
We need these functions all the time, including when checking environment
variables and parsing command-line arguments. These implementations were
optimized for code size on a wide range of compilers (22 bytes for
strcmp() and 33 for strncmp(), return instruction included).
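Compact versions along these lines (sketch, assuming size_t is available):
```
static int strcmp(const char *a, const char *b)
{
	unsigned int c;
	int diff;

	while (!(diff = (unsigned char)*a++ - (c = (unsigned char)*b++)) && c)
		;
	return diff;
}

static int strncmp(const char *a, const char *b, size_t size)
{
	unsigned int c;
	int diff = 0;

	while (size-- &&
	       !(diff = (unsigned char)*a++ - (c = (unsigned char)*b++)) && c)
		;
	return diff;
}
```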
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
%p remains quite useful in test code, and the code path can easily be
merged with the existing "%x" one, adding only ~50 bytes, so let's
add it.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This implementation relies on an extern definition of the environ
variable, that the caller must declare and initialize from envp.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
It's often convenient to support this, especially in test programs where
a NULL may correspond to an allocation error or a non-existing value.
Let's make printf("%s") support being passed a NULL. In this case it
prints "(null)" like glibc's printf().
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
libgcc uses it for certain divide functions, so it must be exported. Like
for memset() we do that in its own section so that the linker can strip
it when not needed.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that a few basic include files are provided, some simple portable
programs may build, which will save them from having to surround their
includes with #ifndef NOLIBC. This patch mentions how to proceed, and
enumerates the list of files that are covered.
A comprehensive list of required include files is available here:
https://en.cppreference.com/w/c/header
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The time() syscall is used by a few simple applications, and is trivial
to implement based on gettimeofday() that we already have. Let's create
the file to ease porting and provide the function. It never returns any
error, though it may segfault in case of invalid pointer, like other
implementations relying on gettimeofday().
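The whole function is essentially a thin wrapper (sketch, assuming
nolibc's gettimeofday(), time_t and struct timeval are already available):
```
static time_t time(time_t *tptr)
{
	struct timeval tv;

	/* cannot fail: the pointers passed here are valid */
	gettimeofday(&tv, NULL);

	if (tptr)
		*tptr = tv.tv_sec;
	return tv.tv_sec;
}
```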
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This function is normally found in signal.h, and providing the file
eases porting of existing programs. Let's move it there.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This call is trivial to implement based on select(), and it completes
sleep() and msleep(), so let's add it.
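A minimal sketch of such a usleep() built on select():
```
static int usleep(unsigned int usecs)
{
	struct timeval my_timer = {
		.tv_sec  = usecs / 1000000,
		.tv_usec = usecs % 1000000,
	};

	/* no fds to watch: select() is only used as a sub-second timer */
	return select(0, 0, 0, 0, &my_timer);
}
```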
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These functions are normally provided by unistd.h. For ease of porting,
let's create the file and move them there.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This allows us to provide a minimal errno.h to ease porting applications
that use it.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
"clang -Os" and "gcc -Ofast" without -ffreestanding may ignore memset()
and memmove(), hoping to provide their builtin equivalents, and finally
not find them. Thus we must export these functions for these rare cases.
Note that as they're set in their own sections, they will be eliminated
by the linker if not used. In addition, they do not prevent gcc from
identifying them and replacing them with the shorter "rep movsb" or
"rep stosb" when relevant.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These ones are often used and commonly set by applications to fallback
values. Let's fix them both to agree on PATH_MAX=4096 by default, as is
already present in linux/limits.h.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
By doing so we can link together multiple C files that have been compiled
with nolibc and which each have a _start symbol.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Some functions like raise() and memcpy() are permanently exported because
they're needed by libgcc on certain platforms. However most of the time
they are not needed and needlessly take space.
Let's move them to their own sub-section, called .text.nolibc_<function>.
This allows ld to get rid of them if unused when passed --gc-sections.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
While these functions are often dangerous, forcing the user to work
around their absence is often much worse. Let's provide small versions
of each of them. The respective sizes in bytes on a few architectures
are:
strncat(): x86:0x33 mips:0x68 arm:0x3c
strlcat(): x86:0x25 mips:0x4c arm:0x2c
The two are quite different, and strncat() is even different from
strncpy() in that it limits the amount of data it copies and will always
terminate the output by one zero, while strlcat() will always limit the
total output to the specified size and will put a zero if possible.
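Sketches approximating the semantics described above (the actual nolibc
code is size-optimized differently):
```
static char *strncat(char *dst, const char *src, size_t size)
{
	char *orig = dst;

	while (*dst)
		dst++;

	/* copy at most <size> bytes from <src>, then always terminate */
	while (size && (*dst = *src)) {
		src++;
		dst++;
		size--;
	}
	*dst = 0;
	return orig;
}

static size_t strlcat(char *dst, const char *src, size_t size)
{
	size_t len = 0;

	/* the existing string counts towards the returned length */
	while (dst[len])
		len++;

	/* append what fits, always leaving room for the trailing zero */
	while (*src) {
		if (len + 1 < size) {
			dst[len] = *src;
			dst[len + 1] = 0;
		}
		src++;
		len++;
	}
	return len;
}
```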
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These are minimal variants. strncpy() always fills the destination for
<size> chars, while strlcpy() copies no more than <size> including the
zero and returns the source's length. The respective sizes on various
archs are:
strncpy(): x86:0x1f mips:0x30 arm:0x20
strlcpy(): x86:0x17 mips:0x34 arm:0x1a
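Minimal sketches matching the behaviour described above:
```
static char *strncpy(char *dst, const char *src, size_t size)
{
	size_t len;

	/* once '\0' is copied, <src> stops advancing and zeroes pad the rest */
	for (len = 0; len < size; len++)
		if ((dst[len] = *src))
			src++;
	return dst;
}

static size_t strlcpy(char *dst, const char *src, size_t size)
{
	size_t len;

	for (len = 0; src[len]; len++) {
		if (len + 1 < size)
			dst[len] = src[len];
	}
	if (size)
		dst[len < size ? len : size - 1] = 0;
	/* return the source's length */
	return len;
}
```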
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The direction test inside the loop was not always completely optimized,
resulting in a larger than necessary function. This change adds a
direction variable that is set out of the loop. Now the function is down
to 48 bytes on x86, 32 on ARM and 68 on mips. It's worth noting that other
approaches were attempted (including relying on the up and down functions)
but they were only slightly beneficial on x86 and cost more on others.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Till now memcpy() relies on memmove(), but it's always included for libgcc,
so we have a larger than needed function. Let's implement two unidirectional
variants to copy from bottom to top and from top to bottom, and use the
former for memcpy(). The variants are optimized to be compact, and at the
same time the compiler is sometimes able to detect the loop and to replace
it with a "rep movsb". The new function is 24 bytes instead of 52 on x86_64.
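The bottom-to-top variant used for memcpy() is essentially (sketch; the
helper name follows nolibc's naming convention):
```
/* copies from the lowest address upwards; safe when <dst> <= <src>
 * or when the areas do not overlap.
 */
static void *_nolibc_memcpy_up(void *dst, const void *src, size_t len)
{
	size_t pos = 0;

	while (pos < len) {
		((char *)dst)[pos] = ((const char *)src)[pos];
		pos++;
	}
	return dst;
}

static void *memcpy(void *dst, const void *src, size_t len)
{
	return _nolibc_memcpy_up(dst, src, len);
}
```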
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These syscalls never fail so there is no need to extract and set errno
for them.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
raise() doesn't set errno, so there's no point calling kill(), better
call sys_kill(), which also reduces the function's size.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The build of printf() on mips requires libgcc for functions __ashldi3 and
__lshrdi3 due to 64-bit shifts when scanning the input number. These are
not really needed in fact since we scan the number 4 bits at a time. Let's
arrange the loop to perform two 32-bit shifts instead on 32-bit platforms.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Let's pass a vararg to open() so that it remains compatible with existing
code. The arg is only dereferenced when flags contain O_CREAT. The function
is generally not inlined anymore, causing an extra call (total 16 extra
bytes) but it's still optimized for constant propagation, limiting the
excess to no more than 16 bytes in practice when open() is called without
O_CREAT, and ~40 with O_CREAT, which remains reasonable.
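The shape of the change is roughly the following (sketch; sys_open() and
SET_ERRNO() stand for nolibc's internal syscall wrapper and errno helper,
so treat them as assumptions here):
```
#include <stdarg.h>

static int open(const char *path, int flags, ...)
{
	mode_t mode = 0;
	int ret;

	/* the third argument is only fetched when O_CREAT is passed */
	if (flags & O_CREAT) {
		va_list args;

		va_start(args, flags);
		mode = va_arg(args, mode_t);
		va_end(args);
	}

	ret = sys_open(path, flags, mode);
	if (ret < 0) {
		SET_ERRNO(-ret);
		ret = -1;
	}
	return ret;
}
```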
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
It doesn't contain the text for the error codes, but instead displays
"errno=" followed by the errno value. Just like the regular perror(), if
a non-empty message is passed, it is emitted first, followed by ": " and
then the errno code. The message is emitted on stderr.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These ones are found in some examples found in man pages and ease
portability tests.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This adds a minimal vfprintf() implementation as well as the commonly
used fprintf() and printf() that rely on it.
For now the function supports:
- formats: %s, %c, %u, %d, %x
- modifiers: %l and %ll
- unknown chars are considered as modifiers and are ignored
It is designed to remain minimalist; despite this, printf() is 549 bytes
on x86_64. It would be wise not to add too many formats.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
We'll use it to write substrings. It relies on a simpler _fwrite() that
only takes one size. fputs() was also modified to rely on it.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The standard puts() function always emits the trailing LF, which makes it
inconvenient for small string concatenation. fputs() ought to be used
instead, but it requires a FILE*.
This adds 3 dummy FILE* values (stdin, stdout, stderr) which are in fact
pointers to a one-byte struct FILE. We reserve 3 pointer values for them,
-3, -2 and -1, so that they are ordered, easing the tests and the mapping
to integers.
From this, fgetc(), fputc(), fgets() and fputs() were implemented, and
the previous putchar() and getchar() now remap to these. The standard
getc() and putc() macros were also implemented as pointing to these
ones.
There is absolutely no buffering, fgetc() and fgets() read one byte at
a time, fputc() writes one byte at a time, and only fputs() which knows
the string's length writes all of it at once.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
These are 64-bit variants of the itoa() and utoa() functions. Reentrant
variants are also provided, and they use the same itoa_buffer. The functions are
a bit larger than the previous ones in 32-bit mode (86 and 98 bytes on
x86_64 and armv7 respectively), which is why we continue to provide them
as separate functions.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The original ltoa() function and the reentrant one ltoa_r() present a
number of drawbacks. The divide by 10 generates calls to external code
from libgcc_s, and the number does not necessarily start at the beginning
of the buffer.
Let's rewrite these functions so that they do not involve a divide and
only use loops on powers of 10, and implement both signed and unsigned
variants, always starting from the buffer's first character. Instead of
using a static buffer for each function, we're now using a common one.
In order to avoid confusion with the ltoa() name, the new functions are
called itoa_r() and utoa_r() to distinguish the signed and unsigned
versions, and, for the convenience of their callers, these functions now
return the number of characters emitted. The ltoa_r() function is just
an inline mapping to the signed one which returns the buffer.
The functions are quite small (86 bytes on x86_64, 68 on armv7) and
do not depend anymore on external code.
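The unsigned variant gives the idea: the digit for each power of ten is
found by repeated subtraction, so no division (and no libgcc helper) is
needed (sketch):
```
static int utoa_r(unsigned long in, char *buffer)
{
	unsigned long lim;
	int digits = 0;
	int pos = (~0UL > 0xfffffffful) ? 19 : 9; /* highest power of 10 */
	int dig;

	do {
		/* compute lim = 10^pos without external helpers */
		for (dig = 0, lim = 1; dig < pos; dig++)
			lim *= 10;

		/* emit a digit once started, or for the last position */
		if (digits || in >= lim || !pos) {
			for (dig = 0; in >= lim; dig++)
				in -= lim;
			buffer[digits++] = '0' + dig;
		}
	} while (pos--);

	buffer[digits] = 0;
	return digits;
}
```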
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This function is not standard and performs the opposite of atol(). Let's
move it next to atol(). It's been split into a reentrant function and one
using a static buffer.
There is no definition left for it in nolibc.h now.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The makedev() man page says it's supposed to be a macro and that some
OSes have it with the other ones in sys/types.h so it now makes sense
to move it to types.h as a macro. Let's also define major() and
minor() that perform the reverse operation.
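With the traditional 8-bit minor / 12-bit major encoding, the three macros
boil down to (sketch):
```
#define makedev(major, minor)  ((dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
#define major(dev)             ((unsigned int)(((dev) >> 8) & 0xfff))
#define minor(dev)             ((unsigned int)((dev) & 0xff))
```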
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The macro was hard-coded to 256 but it's common to see it redefined.
Let's support this and make sure we always allocate enough entries for
the cases where it isn't a multiple of 32.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
FD_SET, FD_CLR, FD_ISSET, FD_ZERO are often expected to be macros and
not functions. In addition we already have a file dedicated to such
macros and types used by syscalls, it's types.h, so let's move them
there and turn them to macros. FD_CLR() and FD_ISSET() were missing,
so they were added. FD_ZERO() now deals with its own loop so that it
doesn't rely on memset() that sets one byte at a time.
Cc: David Laight <David.Laight@aculab.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
In fact there's only isdigit() for now. More should definitely be added.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The string manipulation functions (mem*, str*) are now found in
string.h. The file depends on almost nothing and will be
usable from other includes if needed. Maybe more functions could
be added.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The new file stdlib.h contains the definitions of functions that
are usually found in stdlib.h. Many more could certainly be added.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The syscall definitions were moved to sys.h. They were arranged
in a more easily maintainable order, whereby the sys_xxx() and xxx()
functions are grouped together, which also highlights the occasional
mappings such as wait() relying on wait4().
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
In order to ease maintenance, this splits the arch-specific code into
one file per architecture. A common file "arch.h" is used to include the
right file among arch-* based on the detected architecture. Projects
which are already split per architecture could simply rename these
files to $arch/arch.h and get rid of the common arch.h. For this
reason, include guards were placed into each arch-specific file.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The macros and type definitions used by a number of syscalls were moved
to types.h where they will be easier to maintain. A few of them
are arch-specific and must not be moved there (e.g. O_*, sys_stat_struct).
A warning about them was placed at the top of the file.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The ordering of includes and definitions for now is a bit of a mess, as
for example asm/signal.h is included after int definitions, but plenty of
structures are defined later as they rely on other includes.
Let's move the standard type definitions to a dedicated file that is
included first. We also move NULL there. This way all other includes
are aware of it, and we can bring asm/signal.h back to the top of the
file.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The torture.sh script provides extra memory for scftorture and rcuscale.
However, the total memory provided is only 1G, which is less than the
2G that is required for KASAN testing. This commit therefore ups the
torture.sh script's 1G to 2G.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that the Tasks RCU flavors are selected by their users rather than
by the rcutorture scenarios, torture.sh fails when attempting to run
NOPREEMPT scenarios for refscale and rcuscale. This commit therefore
makes torture.sh specify CONFIG_TASKS_TRACE_RCU=y to avoid such failure.
Why not also CONFIG_TASKS_RCU? Because tracing selects this one.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
KASAN allots significant memory to track allocation state, and the amount
of memory has increased recently, which results in frequent OOMs on a
few of the rcutorture scenarios. This commit therefore provides 2G of
memory for --kasan runs, up from the 512M default.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, torture.sh saves only the build output and exit code from the
"make allmodconfig" test. This commit also saves the .config file.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
There is an extraneous "scf" in the per_version_boot_params shell function
used by scftorture. No harm done in that it is just passed as an argument
to the /init program in initrd, but this commit nevertheless removes it.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that CONFIG_PREEMPT_DYNAMIC=y is the default, kernels that are
ostensibly built with CONFIG_PREEMPT_NONE=y or CONFIG_PREEMPT_VOLUNTARY=y
are now actually built with CONFIG_PREEMPT=y, but are by default booted
so as to disable preemption. Although this allows much more flexibility
from a single kernel binary, it means that the current rcutorture
scenarios won't find build errors that happen only when preemption is
fully disabled at build time.
This commit therefore adds CONFIG_PREEMPT_DYNAMIC=n to several scenarios,
and while in the area switches one from CONFIG_PREEMPT_NONE=y to
CONFIG_PREEMPT_VOLUNTARY=y to add coverage of this Kconfig option.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit passes the csdlock_debug=1 kernel parameter in order to
enable CSD-lock stall reports for torture.sh scftorture runs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The kvm-again.sh script reruns a previously built set of kernels, so
the vmlinux files are associated with that previous run, not this one.
This results in kvm-find_errors.sh reporting spurious failed-build errors.
This commit therefore omits the vmlinux check for kvm-again.sh runs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adjusts the scftorture PREEMPT and NOPREEMPT scenarios to
account for the TASKS_RCU Kconfig option being explicitly selected rather
than computed in isolation.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks Rude and RCU Tasks Trace. Unless that kernel builds rcuscale,
whether built-in or as a module, in which case these RCU Tasks flavors are
(unnecessarily) built in. This both increases kernel size and increases
the complexity of certain tracing operations. This commit therefore
decouples the presence of rcuscale from the presence of RCU Tasks Rude
and RCU Tasks Trace.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks. Unless that kernel builds rcuscale, whether built-in or as
a module, in which case RCU Tasks is (unnecessarily) built. This both
increases kernel size and increases the complexity of certain tracing
operations. This commit therefore decouples the presence of rcuscale
from the presence of RCU Tasks.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks Rude and RCU Tasks Trace. Unless that kernel builds refscale,
whether built-in or as a module, in which case these RCU Tasks flavors are
(unnecessarily) built in. This both increases kernel size and increases
the complexity of certain tracing operations. This commit therefore
decouples the presence of refscale from the presence of RCU Tasks Rude
and RCU Tasks Trace.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks. Unless that kernel builds refscale, whether built-in or as a
module, in which case RCU Tasks is (unnecessarily) built in. This both
increases kernel size and increases the complexity of certain tracing
operations. This commit therefore decouples the presence of refscale
from the presence of RCU Tasks.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcutorture test suite makes double use of the rcutorture.stat_interval
module parameter. As its name suggests, it controls the frequency
of statistics printing, but it also controls the rcu_torture_writer()
stall timeout. The current setting of 15 seconds works surprisingly well.
However, given that the RCU tasks stall-warning timeout is ten -minutes-,
15 seconds is too short for TASKS02, which runs a non-preemptible kernel
on a single CPU.
This commit therefore adds checks for per-scenario specification of the
rcutorture.stat_interval module parameter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that CONFIG_PREEMPT_DYNAMIC=y is the default, TASKS02 no longer
builds a pure non-preemptible kernel that uses Tiny RCU. This commit
therefore fixes this new hole in rcutorture testing by adding
CONFIG_PREEMPT_DYNAMIC=n to the TASKS02 rcutorture scenario.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Unless a kernel builds rcutorture, whether built-in or as a module, that
kernel is also built with CONFIG_TASKS_RUDE_RCU, whether anything else
needs Tasks Rude RCU or not. This unnecessarily increases kernel size.
This commit therefore decouples the presence of rcutorture from the
presence of RCU Tasks Rude.
However, there is a need to select CONFIG_TASKS_RUDE_RCU for testing
purposes. Except that casual users must not be bothered with
questions -- for them, this needs to be fully automated. There is
thus a CONFIG_FORCE_TASKS_RUDE_RCU that selects CONFIG_TASKS_RUDE_RCU,
is user-selectable, but which depends on CONFIG_RCU_EXPERT.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks. Unless that kernel builds rcutorture, whether built-in or as
a module, in which case RCU Tasks is (unnecessarily) used. This both
increases kernel size and increases the complexity of certain tracing
operations. This commit therefore decouples the presence of rcutorture
from the presence of RCU Tasks.
However, there is a need to select CONFIG_TASKS_RCU for testing purposes.
Except that casual users must not be bothered with questions -- for them,
this needs to be fully automated. There is thus a CONFIG_FORCE_TASKS_RCU
that selects CONFIG_TASKS_RCU, is user-selectable, but which depends
on CONFIG_RCU_EXPERT.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Unless a kernel builds rcutorture, whether built-in or as a module, that
kernel is also built with CONFIG_TASKS_TRACE_RCU, whether anything else
needs Tasks Trace RCU or not. This unnecessarily increases kernel size.
This commit therefore decouples the presence of rcutorture from the
presence of RCU Tasks Trace.
However, there is a need to select CONFIG_TASKS_TRACE_RCU for
testing purposes. Except that casual users must not be bothered with
questions -- for them, this needs to be fully automated. There is thus
a CONFIG_FORCE_TASKS_TRACE_RCU that selects CONFIG_TASKS_TRACE_RCU,
is user-selectable, but which depends on CONFIG_RCU_EXPERT.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, any kernel built with CONFIG_PREEMPTION=y also gets
CONFIG_TASKS_RCU=y, which is not helpful to people trying to build
preemptible kernels of minimal size.
Because CONFIG_TASKS_RCU=y is needed only in kernels doing tracing of
one form or another, this commit moves from TASKS_RCU deciding when it
should be enabled to the tracing Kconfig options explicitly selecting it.
This allows building preemptible kernels without TASKS_RCU, if desired.
This commit also updates the SRCU-N and TREE09 rcutorture scenarios
in order to avoid Kconfig errors that would otherwise result from
CONFIG_TASKS_RCU being selected without its CONFIG_RCU_EXPERT dependency
being met.
[ paulmck: Apply BPF_SYSCALL feedback from Andrii Nakryiko. ]
Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Use bpf_prog_test_run_opts to test the skb_load_bytes function. Test
the behavior both when the offset is greater than INT_MAX and when it is
a normal value.
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220416105801.88708-4-liujian56@huawei.com
It bothered me that during benchmarking using 'perf stat' (to collect
for example CPU cache events) I could not simultaneously retrieve the
time spent in user or kernel mode in a machine-readable format.
When running 'perf stat' the output for humans contains the times
reported by rusage and wait4.
$ perf stat -e cache-misses:u -- true
Performance counter stats for 'true':
4,206 cache-misses:u
0.001113619 seconds time elapsed
0.001175000 seconds user
0.000000000 seconds sys
But 'perf stat's machine-readable format does not provide this information.
$ perf stat -x, -e cache-misses:u -- true
4282,,cache-misses:u,492859,100.00,,
I found no way to retrieve this information using the available events
while using machine-readable output.
This patch adds two new tool internal events 'user_time' and
'system_time', similarly to the already present 'duration_time' event.
Both events use the already collected rusage information obtained by
wait4 and tracked in the global ru_stats.
Examples presenting cache-misses and rusage information in both human
and machine-readable form:
$ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
Performance counter stats for 'grep -q -r duration_time .':
67,422,542 ns duration_time:u
50,517,000 ns user_time:u
16,839,000 ns system_time:u
30,937 cache-misses:u
0.067422542 seconds time elapsed
0.050517000 seconds user
0.016839000 seconds sys
$ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
72134524,ns,duration_time:u,72134524,100.00,,
65225000,ns,user_time:u,65225000,100.00,,
6865000,ns,system_time:u,6865000,100.00,,
38705,,cache-misses:u,71189328,100.00,,
Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220420102354.468173-3-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is preparation for exporting rusage values as tool events.
Add new global stats tracking the values obtained via rusage.
For now only ru_utime and ru_stime are part of the tracked stats.
Both are stored as nanoseconds to be consistent with 'duration_time',
although the finest resolution the struct timeval data in rusage
provides is microseconds.
Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220420102354.468173-2-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When one requests debuginfod, either via the --debuginfod option or with a
perf-config value, complain if perf is not built with debuginfod support.
Signed-off-by: Martin Liška <mliska@suse.cz>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.
Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.
In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.
Fixes: d01724dd2a ("selftests: mlxsw: spectrum-2: Add a test for VxLAN flooding with IPv6")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.
Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.
In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.
Fixes: 94d302deae ("selftests: mlxsw: Add a test for VxLAN flooding")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add riscv-specific USDT argument specification parsing logic.
riscv USDT argument format is shown below:
- Memory dereference case:
"size@off(reg)", e.g. "-8@-88(s0)"
- Constant value case:
"size@val", e.g. "4@5"
- Register read case:
"size@reg", e.g. "-8@a1"
s8 would be marked as poisoned even though it is a valid riscv register,
so we need to alias it in advance. Both RV32 and RV64 have been tested.
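A standalone illustration of how the three forms can be told apart with
sscanf(), in the same spirit as the other arch parsers (demo only, with
made-up names; the real libbpf code additionally maps the register name to
its pt_regs offset):
```
#include <stdio.h>

enum arg_kind { ARG_REG_DEREF, ARG_CONST, ARG_REG, ARG_UNKNOWN };

static enum arg_kind classify_usdt_arg(const char *spec)
{
	char reg[16];
	long off;
	int sz;

	/* "size@off(reg)", e.g. "-8@-88(s0)" */
	if (sscanf(spec, " %d @ %ld ( %15[a-z0-9] )", &sz, &off, reg) == 3)
		return ARG_REG_DEREF;
	/* "size@val", e.g. "4@5" */
	if (sscanf(spec, " %d @ %ld", &sz, &off) == 2)
		return ARG_CONST;
	/* "size@reg", e.g. "-8@a1" */
	if (sscanf(spec, " %d @ %15[a-z0-9]", &sz, reg) == 2)
		return ARG_REG;
	return ARG_UNKNOWN;
}

int main(void)
{
	printf("%d %d %d\n",
	       classify_usdt_arg("-8@-88(s0)"),	/* ARG_REG_DEREF */
	       classify_usdt_arg("4@5"),	/* ARG_CONST     */
	       classify_usdt_arg("-8@a1"));	/* ARG_REG       */
	return 0;
}
```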
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com
The usdt_cookie is defined as __u64, so it should not be passed
around as a long, because it would be truncated to 32 bits
on 32-bit platforms.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com
Drop the duplicate min() macro definition in mq_perf_tests.c and use MIN()
from sys/param.h instead.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This is the mips variant of commit <3990b5baf225> ("selftests/ftrace:
Add s390 support for kprobe args tests").
Signed-off-by: Ze Zhang <zhangze@loongson.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This is the mips variant of commit <3990b5baf225> ("selftests/ftrace:
Add s390 support for kprobe args tests").
Signed-off-by: Ze Zhang <zhangze@loongson.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Add a few test cases that ensure we catch cases of badly ordered type
tags in modifier chains.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220419164608.1990559-3-memxor@gmail.com
Take advantage of the new libbpf feature for declarative non-autoloaded BPF
program SEC() definitions in a few tests that exercise a single program at
a time out of the many available programs within a single BPF object.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-2-andrii@kernel.org
Establish a SEC("?abc") naming convention (i.e., adding a question mark in
front of an otherwise normal section name) that allows setting the
corresponding program's autoload property to false. This is effectively
just a declarative way to do bpf_program__set_autoload(prog, false).
Having a way to do this declaratively in BPF code itself is useful and
convenient for various scenarios. E.g., for testing, when BPF object
consists of multiple independent BPF programs that each needs to be
tested separately. Opting out all of them by default and then setting
autoload to true for just one of them at a time simplifies testing code
(see next patch for few conversions in BPF selftests taking advantage of
this new feature).
Another real-world use case is in libbpf-tools for cases when different
BPF programs have to be picked depending on particulars of the host
kernel due to various incompatible changes (like kernel function renames
or signature change, or to pick kprobe vs fentry depending on
corresponding kernel support for the latter). Marking all the different
BPF program candidates as non-autoloaded declaratively makes this more
obvious in the BPF source code and allows simpler user-space code.
When a BPF program is marked as SEC("?abc"), it is otherwise treated just
like SEC("abc"), and the section name reported by
bpf_program__section_name() will be "abc".
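A hypothetical example of the second use case (names made up for
illustration): two alternative implementations live in one object, neither
is auto-loaded, and user space enables exactly one of them before loading.
```
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* neither program is loaded automatically because of the leading '?' */
SEC("?kprobe/do_unlinkat")
int handle_kprobe(void *ctx)
{
	return 0;
}

SEC("?fentry/do_unlinkat")
int handle_fentry(void *ctx)
{
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```
User space then calls bpf_program__set_autoload() on whichever candidate
matches the running kernel and loads the object as usual.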
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org
Objtool's function fallthrough detection only works on C objects.
The distinction between C and assembly objects no longer makes sense
with objtool running on vmlinux.o.
Now that copy_user_64.S has been fixed up, and an objtool sibling call
detection bug has been fixed, the asm code is in "compliance" and this
hack is no longer needed. Remove it.
Fixes: ed53a0d971 ("x86/alternative: Use .ibt_endbr_seal to seal indirect calls")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/b434cff98eca3a60dcc64c620d7d5d405a0f441c.1649718562.git.jpoimboe@redhat.com
In add_jump_destinations(), sibling call detection requires 'insn->func'
to be valid. But alternative instructions get their 'func' set in
handle_group_alt(), which runs *after* add_jump_destinations(). So
sibling calls in alternatives code don't get properly detected.
Fix that by changing the initialization order: call
add_special_section_alts() *before* add_jump_destinations().
This also means the special case for a missing 'jump_dest' in
add_jump_destinations() can be removed, as it has already been dealt
with.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/c02e0a0a2a4286b5f848d17c77fdcb7e0caf709c.1649718562.git.jpoimboe@redhat.com
For most sibling calls, 'jump_dest' is NULL because objtool treats the
jump like a call and sets 'call_dest'. But there are a few edge cases
where that's not true. Make it consistent to avoid unexpected behavior.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/8737d6b9d1691831aed73375f444f0f42da3e2c9.1649718562.git.jpoimboe@redhat.com
When a "!ENDBR" warning is reported for a data section, objtool just
prints the text address of the relocation target twice, without giving
any clues about the location of the original data reference:
vmlinux.o: warning: objtool: dcbnl_netdevice_event()+0x0: .text+0xb64680: data relocation to !ENDBR: dcbnl_netdevice_event+0x0
Instead, print the address of the data reference, in addition to the
address of the relocation target.
vmlinux.o: warning: objtool: dcbnl_nb+0x0: .data..read_mostly+0xe260: data relocation to !ENDBR: dcbnl_netdevice_event+0x0
Fixes: 89bc853eae ("objtool: Find unused ENDBR instructions")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/762e88d51300e8eaf0f933a5b0feae20ac033bea.1650300597.git.jpoimboe@redhat.com
GCC-8 isn't clever enough to figure out that cpu_start_entry() does not
return, while objtool is. This results in (unreachable) code after the call
in start_secondary(). Give GCC a hand so that they all agree on things.
vmlinux.o: warning: objtool: start_secondary()+0x10e: unreachable
Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220408094718.383658532@infradead.org
The llvm patch [1] enabled opaque pointers, which caused a failure in the
'exhandler' selftest.
...
; work = task->task_works;
7: (79) r1 = *(u64 *)(r6 +2120) ; R1_w=ptr_callback_head(off=0,imm=0) R6_w=ptr_task_struct(off=0,imm=0)
; func = work->func;
8: (79) r2 = *(u64 *)(r1 +8) ; R1_w=ptr_callback_head(off=0,imm=0) R2_w=scalar()
; if (!work && !func)
9: (4f) r1 |= r2
math between ptr_ pointer and register with unbounded min value is not allowed
below is insn 10 and 11
10: (55) if r1 != 0 goto +5
11: (18) r1 = 0 ll
...
In llvm, the code generation of 'r1 |= r2' happened in the codegen
SelectionDAG phase due to the difference between opaque and non-opaque pointers.
Without [1], the related code looks like:
r2 = *(u64 *)(r6 + 2120)
r1 = *(u64 *)(r2 + 8)
if r2 != 0 goto +6 <LBB0_4>
if r1 != 0 goto +5 <LBB0_4>
r1 = 0 ll
...
I haven't found a good way to fix this issue in llvm. So let's work around
the problem first so that bpf CI won't be blocked.
[1] https://reviews.llvm.org/D123300
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419050900.3136024-1-yhs@fb.com
Pull turbostat changes for 5.19 from Len Brown:
"Chen Yu (1):
tools/power turbostat: Support thermal throttle count print
Dan Merillat (1):
tools/power turbostat: fix dump for AMD cpus
Len Brown (5):
tools/power turbostat: tweak --show and --hide capability
tools/power turbostat: fix ICX DRAM power numbers
tools/power turbostat: be more useful as non-root
tools/power turbostat: No build warnings with -Wextra
tools/power turbostat: version 2022.04.16
Sumeet Pawnikar (2):
tools/power turbostat: Add Power Limit4 support
tools/power turbostat: print power values upto three decimal
Zephaniah E. Loss-Cutler-Hull (2):
tools/power turbostat: Allow -e for all names.
tools/power turbostat: Allow printing header every N iterations"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2022.04.16
tools/power turbostat: No build warnings with -Wextra
tools/power turbostat: be more useful as non-root
tools/power turbostat: fix ICX DRAM power numbers
tools/power turbostat: Support thermal throttle count print
tools/power turbostat: Allow printing header every N iterations
tools/power turbostat: Allow -e for all names.
tools/power turbostat: print power values upto three decimal
tools/power turbostat: Add Power Limit4 support
tools/power turbostat: fix dump for AMD cpus
tools/power turbostat: tweak --show and --hide capability
This is a pre-req to add separate logging for each subtest in
test_progs.
Move all the mutable test data to the test_result struct.
Move per-test init/de-init into the run_one_test function.
Consolidate data aggregation and final log output in
calculate_and_print_summary function.
As a side effect, this patch fixes double counting of errors
for subtests and possible duplicate output of subtest log
on failures.
Also, add prog_tests_framework.c test to verify some of the
counting logic.
As part of verification, confirmed that number of reported
tests is the same before and after the change for both parallel
and sequential test execution.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220418222507.1726259-1-mykolal@fb.com
Update the topic of BTCLEAR.ANY and add additional uncore event names
as per:
https://github.com/intel/event-converter-for-linux-perf/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the topic of ASSISTS.ANY as per:
https://github.com/intel/event-converter-for-linux-perf/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
JSON uncore events are generated for Skylake Server for v1.26
with events from:
https://download.01.org/perfmon/SKX/
New event names are added, that match the original JSON names,
due to an update to:
https://github.com/intel/event-converter-for-linux-perf/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
JSON uncore events are generated for CascadeLake Server for v1.14 with
events from:
https://download.01.org/perfmon/CLX/
New event names are added, that match the original JSON names,
due to an update to:
https://github.com/intel/event-converter-for-linux-perf/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Apply cstate fix from:
https://github.com/intel/event-converter-for-linux-perf/
so that metrics for cstates that exist on the particular architecture
are generated. This corrects issues with metric testing.
Also correct topic of ASSISTS.ANY event.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Apply cstate fix from:
https://github.com/intel/event-converter-for-linux-perf/
so that metrics for cstates that exist on the particular architecture
are generated. This corrects issues with metric testing.
Also correct topic of ASSISTS.ANY event.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce basic line card manipulation which consists of provisioning,
unprovisioning and activation of a line card.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Use either MSR_TSX_FORCE_ABORT or MSR_IA32_TSX_CTRL to disable TSX to
cover all CPUs which allow to disable it.
- Disable TSX development mode at boot so that a microcode update which
provides TSX development mode does not suddenly make the system
vulnerable to TSX Asynchronous Abort.
Merge tag 'x86-urgent-2022-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Two x86 fixes related to TSX:
- Use either MSR_TSX_FORCE_ABORT or MSR_IA32_TSX_CTRL to disable TSX
to cover all CPUs which allow to disable it.
- Disable TSX development mode at boot so that a microcode update
which provides TSX development mode does not suddenly make the
system vulnerable to TSX Asynchronous Abort"
* tag 'x86-urgent-2022-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsx: Disable TSX development mode at boot
x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits
Add a new neighbour cache entry in STALE state for routers on receiving
an unsolicited (gratuitous) neighbour advertisement with
target link-layer-address option specified.
This is similar to the arp_accept configuration for IPv4.
A new sysctl endpoint is created to turn on this behaviour:
/proc/sys/net/ipv6/conf/interface/accept_unsolicited_na.
Signed-off-by: Arun Ajith S <aajith@arista.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't exit if used this way:
sudo setcap cap_sys_nice,cap_sys_rawio=+ep ./turbostat
sudo chmod +r /dev/cpu/*/msr
./turbostat
note: cap_sys_admin is now also needed for the perf IPC counter:
sudo setcap cap_sys_admin,cap_sys_nice,cap_sys_rawio=+ep ./turbostat
Reported-by: Artem S. Tashkinov <aros@gmx.com>
Reported-by: Toby Broom <tbroom@outlook.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ICX (and its duplicates) require special hard-coded DRAM RAPL units,
rather than using the generic RAPL energy units.
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The turbostat data is collected by end users for power evaluation. However,
it looks like we are missing enough thermal context there. Already a couple of
times we have found power management developers asking for something like this:
grep -r . /sys/devices/system/cpu/cpu*/thermal_throttle/*
Print the per-core thermal throttle count so as to provide sufficient thermal
context.
turbostat -i 5 -s Core,CPU,CoreThr
Core CPU CoreThr
- - 104
0 0 61
0 4
1 1 0
1 5
2 2 104
2 6
3 3 7
3 7
Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This gives the ability to reprint the header every N iterations, so you
can ensure that a scrolling display always has the header visible
somewhere on the screen.
Signed-off-by: Zephaniah E. Loss-Cutler-Hull <zephaniah@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Currently, there are a number of variables which are displayed by
default, enabled with -e all, and listed by --list, but which you can
not give to --enable/-e.
So you can enable CPU0c1 (in the bic array), but you can't enable C1 or
C1% (not in the bic array, but exists in sysfs).
This runs counter to both the documentation and user expectations, and
it's just not very user friendly.
As such, the mechanism used by --hide has been duplicated, and is now
also used by --enable, so we can handle unknown names gracefully.
Note: One impact of this is that truly unknown fields given to --enable
will no longer generate errors, they will be silently ignored, as --hide
does.
Signed-off-by: Zephaniah E. Loss-Cutler-Hull <zephaniah@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Print power values up to three decimal places in watts.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
turbostat --Dump exits early with status 243 (-13).
get_counters() calls get_msr_sum() on Zen CPUs
for MSR_PKG_ENERGY_STAT, but per_cpu_msr_sum
has not been initialized.
Signed-off-by: Dan Merillat <git@dan.eginity.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This Kselftest fixes update consists of a mqueue perf test memory leak
bug fix. mq_perf_tests fail to call CPU_FREE to free memory allocated
by CPU_SET.
Merge tag 'linux-kselftest-fixes-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"A mqueue perf test memory leak bug fix.
mq_perf_tests failed to call CPU_FREE to free memory allocated by
CPU_SET"
* tag 'linux-kselftest-fixes-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
The 'perf bench numa' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp 1
Snippet of code:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>
bind_to_node() uses "sched_getaffinity" to save the original cpumask, and
this call is returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To
overcome this 1024-CPU mask size limitation of cpu_set_t, change the
mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to allocate the
cpumask and CPU_ALLOC_SIZE for its size.
Apart from fixing this for "orig_mask", apply the same logic to "mask" as
well, which is used for setaffinity, so that the mask size is large enough
to represent the number of possible CPUs in the system.
sched_getaffinity is used in one more place in perf numa bench, in the
"bind_to_cpu" function; apply the same logic there too. Though
currently no failure is reported from there, it is better to make
getaffinity work with system configurations having more CPUs than the
default mask size supported by glibc.
Also fix "sched_setaffinity" to use a mask size which is large enough to
represent the number of possible CPUs in the system.
Fixed all places where "bind_cpumask" which is part of "struct
thread_data" is used such that bind_cpumask works in all configuration.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf numa bench test fails with error:
Testcase:
./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk
Failure snippet:
<<>>
Running 'numa/mem' benchmark:
# Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk"
perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed.
<<>>
The testcase uses CPUs 0 and 8. In the function "parse_setup_cpu_list",
there is a check to see whether the cpu number is greater than the
maximum number of CPUs possible in the system, i.e. via "if (bind_cpu_0 >=
g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {".
But it can happen that the system has, say, 48 CPUs while only CPUs 0-7
are online and the rest are offline. Since "g->p.nr_cpus" is 48, the
function will go ahead and also set the bit for CPU 8 in the cpumask
(td->bind_cpumask).
The bind_to_cpumask function is then called to set the affinity using
sched_setaffinity and that cpumask. Since CPU 8 is not present,
sched_setaffinity will fail here with EINVAL.
Fix this issue by adding a check to make sure that the CPUs provided in
the input arguments are online before proceeding further, and skip the
test otherwise. For this, include a new helper function "is_cpu_online"
in "tools/perf/util/header.c".
Since the "BIT(x)" definition will now get included from header.h, remove
it from bench/numa.c.
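A minimal sketch of an is_cpu_online()-style check, assuming it reads the
sysfs "online" attribute (the actual helper in tools/perf/util/header.c
may differ in detail; CPU0 typically has no such file and is always
online):

#include <stdio.h>

static int is_cpu_online_sketch(unsigned int cpu)
{
	char path[64], online = '0';
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%u/online", cpu);
	f = fopen(path, "r");
	if (!f)
		return cpu == 0;	/* no file: assume the boot CPU, online */
	if (fscanf(f, "%c", &online) != 1)
		online = '0';
	fclose(f);
	return online == '1';
}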
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Per-thread mode doesn't have specific CPUs for events; add checks for
this case.
Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an
out of bound array access.
Fixes: 7954f71689 ("perf record: Introduce thread affinity and mmap masks")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexey Bayduraev <alexey.bayduraev@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220414014642.3308206-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The SPE integration in Perf has quite a few usability quirks that
can't be found by just reading the reference manual. So document this
and at the same time add a summary of the feature that is also hard to
find elsewhere.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Co-authored-by: Al Grant <al.grant@arm.com>
Co-authored-by: Luke Dare <Luke.Dare@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220413084021.2556142-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evsel::sample_id is an xyarray which can cause a segfault when
accessed beyond its size. e.g.
# perf record -e intel_pt// -C 1 sleep 1
Segmentation fault (core dumped)
#
That is happening because a dummy event is opened to capture text poke
events across all CPUs; however, the mmap logic allocates according
to the number of user_requested_cpus.
In general, perf sometimes uses the evsel cpus to open events, and
sometimes the evlist user_requested_cpus. However, it is not necessary
to determine which case is which, because the opened event file
descriptors are also in an xyarray, the size of which can be used
to correctly allocate the size of the sample_id xyarray, because there
is one ID per file descriptor.
Note, in the affected code path, perf_evsel fd array is subsequently
used to get the file descriptor for the mmap, so it makes sense for the
xyarrays to be the same size there.
Fixes: d1a177595b ("libperf: Adopt perf_evlist__mmap()/munmap() from tools/perf")
Fixes: 246eba8e90 ("perf tools: Add support for PERF_RECORD_TEXT_POKE")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org # 5.5+
Link: https://lore.kernel.org/r/20220413114232.26914-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
hashmap__new() returns ERR_PTR(-ENOMEM) when it fails, so we should use
IS_ERR() to check it in the error handling path.
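A small hedged sketch of the corrected pattern (the hash and equal
callbacks are caller-supplied placeholders, and the header paths are
illustrative, not the code being fixed):

#include <linux/err.h>		/* IS_ERR(), from tools/include */
#include "hashmap.h"		/* header path is illustrative */

static struct hashmap *new_map_checked(hashmap_hash_fn hash,
				       hashmap_equal_fn equal)
{
	struct hashmap *map = hashmap__new(hash, equal, NULL);

	/* a NULL check would miss the error: failure is ERR_PTR(-ENOMEM) */
	if (IS_ERR(map))
		return NULL;
	return map;
}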
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220413093302.2538128-1-lv.ruyi@zte.com.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ACPICA commit 738d7b0726e6c0458ef93c0a01c0377490888d1e
Affects all source modules and utility signons.
Link: https://github.com/acpica/acpica/commit/738d7b07
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Build of intel-speed-select will fail if you run:
$ LDFLAGS="-Wl,--as-needed" /usr/bin/make V=1
...
gcc -O2 -Wall -g -D_GNU_SOURCE -Iinclude -I/usr/include/libnl3 -Wl,--as-needed -lnl-genl-3 -lnl-3 intel-speed-select-in.o -o intel-speed-select
/usr/bin/ld: intel-speed-select-in.o: in function `handle_event':
(...)/linux/tools/power/x86/intel-speed-select/hfi-events.c:189: undefined reference to `nlmsg_hdr'
...
In this case the problem is that link order matters when using the
-Wl,--as-needed flag: symbols not needed at the point where a library
appears on the command line are discarded.
So, since intel-speed-select-in.o comes after the libraries, by that point
the libraries/symbols have already been discarded and missing/undefined
references are reported.
To fix this, make sure we specify LDFLAGS after the object file.
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Link: https://lore.kernel.org/r/20220404210525.725611-1-herton@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add a boilerplate test loop to run all the tests
in fib_rule_tests.sh.
Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix incorrect debug message:
Attempting to add event pmu 'intel_pt' with '' that may result in
non-fatal errors
which always appears with perf record -vv and intel_pt e.g.
perf record -vv -e intel_pt//u uname
The message is incorrect because there will never be non-fatal errors.
Suppress the message if the PMU is 'selectable' i.e. meant to be
selected directly as an event.
Fixes: 4ac22b484d ("perf parse-events: Make add PMU verbose output clearer")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Miscellaneous bugfixes
* A small cleanup for the new workqueue code
* Documentation syntax fix
RISC-V:
* Remove hgatp zeroing in kvm_arch_vcpu_put()
* Fix alignment of the guest_hang() in KVM selftest
* Fix PTE A and D bits in KVM selftest
* Missing #include in vcpu_fp.c
ARM:
* Some PSCI fixes after introducing PSCIv1.1 and SYSTEM_RESET2
* Fix the MMU write-lock not being taken on THP split
* Fix mixed-width VM handling
* Fix potential UAF when debugfs registration fails
* Various selftest updates for all of the above
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86:
- Miscellaneous bugfixes
- A small cleanup for the new workqueue code
- Documentation syntax fix
RISC-V:
- Remove hgatp zeroing in kvm_arch_vcpu_put()
- Fix alignment of the guest_hang() in KVM selftest
- Fix PTE A and D bits in KVM selftest
- Missing #include in vcpu_fp.c
ARM:
- Some PSCI fixes after introducing PSCIv1.1 and SYSTEM_RESET2
- Fix the MMU write-lock not being taken on THP split
- Fix mixed-width VM handling
- Fix potential UAF when debugfs registration fails
- Various selftest updates for all of the above"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (24 commits)
KVM: x86: hyper-v: Avoid writing to TSC page without an active vCPU
KVM: SVM: Do not activate AVIC for SEV-enabled guest
Documentation: KVM: Add SPDX-License-Identifier tag
selftests: kvm: add tsc_scaling_sync to .gitignore
RISC-V: KVM: include missing hwcap.h into vcpu_fp
KVM: selftests: riscv: Fix alignment of the guest_hang() function
KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table
RISC-V: KVM: Don't clear hgatp CSR in kvm_arch_vcpu_put()
selftests: KVM: Free the GIC FD when cleaning up in arch_timer
selftests: KVM: Don't leak GIC FD across dirty log test iterations
KVM: Don't create VM debugfs files outside of the VM directory
KVM: selftests: get-reg-list: Add KVM_REG_ARM_FW_REG(3)
KVM: avoid NULL pointer dereference in kvm_dirty_ring_push
KVM: arm64: selftests: Introduce vcpu_width_config
KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
KVM: arm64: vgic: Remove unnecessary type castings
KVM: arm64: Don't split hugepages outside of MMU write lock
KVM: arm64: Drop unneeded minor version check from PSCI v1.x handler
KVM: arm64: Actually prevent SMC64 SYSTEM_RESET2 from AArch32
KVM: arm64: Generally disallow SMC64 for AArch32 guests
...
The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate
CPU set. This cpu set is used further in pthread_attr_setaffinity_np
and by pthread_create in the code. But in current code, allocated
cpu set is not freed.
Fix this issue by adding CPU_FREE in the "shutdown" function which
is called in most of the error/exit path for the cleanup. There are
few error paths which exit without using shutdown. Add a common goto
error path with CPU_FREE for these cases.
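A minimal sketch of the resulting shape, assuming the set is handed to
pthread_attr_setaffinity_np as described above (this is not the actual
mq_perf_tests code):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static int run_with_cpu_set(long nr_cpus)
{
	cpu_set_t *cpu_set = CPU_ALLOC(nr_cpus);
	size_t size = CPU_ALLOC_SIZE(nr_cpus);
	pthread_attr_t attr;
	int ret = -1;

	if (!cpu_set)
		return -1;
	CPU_ZERO_S(size, cpu_set);
	CPU_SET_S(0, size, cpu_set);

	if (pthread_attr_init(&attr))
		goto err_out;			/* early error path still frees the set */
	if (pthread_attr_setaffinity_np(&attr, size, cpu_set))
		goto err_attr;
	/* ... create threads, run the test ... */
	ret = 0;
err_attr:
	pthread_attr_destroy(&attr);
err_out:
	CPU_FREE(cpu_set);			/* released on both success and error */
	return ret;
}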
Fixes: 7820b0715b ("tools/selftests: add mq_perf_tests")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Boilerplate for testing static mdb entries. This first test verifies
adding and removing host mdb entries for all supported types: IPv4,
IPv6, and MAC multicast.
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Including nolibc.h multiple times results in build errors due to multiple
definitions. Let's add a guard against multiple inclusions.
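A sketch of what such a guard looks like (the actual macro name used in
nolibc.h is an assumption here):

#ifndef _NOLIBC_H
#define _NOLIBC_H

/* ... entire nolibc.h contents ... */

#endif /* _NOLIBC_H */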
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This arch doesn't provide the old-style select() syscall, so we have to
use pselect6().
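A rough sketch of the idea, following nolibc conventions (my_syscall6() is
nolibc's raw syscall helper; the exact upstream implementation may differ):

static long sys_select(int nfds, fd_set *rfds, fd_set *wfds, fd_set *efds,
		       struct timeval *timeout)
{
	struct timespec t;

	if (timeout) {
		t.tv_sec  = timeout->tv_sec;
		t.tv_nsec = timeout->tv_usec * 1000;
	}
	/* the last argument is the optional sigmask; NULL means "none" */
	return my_syscall6(__NR_pselect6, nfds, rfds, wfds, efds,
			   timeout ? &t : NULL, NULL);
}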
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
For consecutive numbers the lscpu command collapses the output and just
shows the range with start and end. The processors are numbered that
way on POWER8.
$ sudo ppc64_cpu --smt=8
$ lscpu | grep '^NUMA node'
NUMA node(s): 2
NUMA node0 CPU(s): 0-79
NUMA node8 CPU(s): 80-159
This causes the heuristic for detecting the number of threads per core,
which looks for the number after the first comma, to fail, and QEMU aborts
because of invalid arguments.
$ lscpu | grep '^NUMA node0' | sed -e 's/^[^,-]*(,|\-)\([0-9]*\),.*$/\1/'
NUMA node0 CPU(s): 0-79
But the lscpu command shows the number of threads per core:
$ sudo ppc64_cpu --smt=8
$ lscpu | grep 'Thread(s) per core'
Thread(s) per core: 8
$ sudo ppc64_cpu --smt=off
$ lscpu | grep 'Thread(s) per core'
Thread(s) per core: 1
This commit therefore directly uses that value and replaces use of grep
with "sed -n" and its "p" command.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit weakens the checks of the kvm.sh script's --torture parameter
and the kvm-recheck.sh script's parsing so that experimental torture tests
may be created without updating these two scripts. The changes required
are to the appropriate Makefile and Kconfig file, plus a directory
whose name begins with "X" must be added to the rcutorture/configs file.
This new directory's name can then be passed in via the kvm.sh script's
--torture parameter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The torture.sh script normally runs unattended, so there is not much
point in the "ssh" command asking for a password. This commit therefore
adds the "-o Batchmode=yes" argument to each "ssh" command to cause it
to fail rather than ask for a password.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
An "echo" slipped in between an "ssh" and the "ret=$?" that was intended
to collect its exit code, which prevents torture.sh from detecting
"ssh" failure. This commit therefore reassociates the two.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, the rcupdate.rcu_normal and rcupdate.rcu_expedited kernel
boot parameters are not regularly tested. The potential addition of
polled expedited grace-period APIs increases the amount of code that is
affected by these kernel boot parameters. This commit therefore adds a
"--do-rt" argument to torture.sh to exercise these kernel-boot options.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since the registers used are x[0-31] and sp. The format is
slightly different compared to x86_64 (a rough parsing sketch follows the
list below). Possible forms are:
- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
'perf test's shell runner will just run everything in the tests
directory (as long as it's not another directory and does not begin
with a dot), but sometimes you find files in there that are not shell
scripts - perf.data output, for example, if you do some testing - and
then the next time you run perf test it tries to run these.
Check that the files are executable, so that only files actually intended
to be test scripts are run, and not just some "random junk" files sitting
there.
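The check itself boils down to an executable-bit test on each candidate
file; a minimal sketch of the predicate (not the actual perf scanner code):

#include <unistd.h>

static int looks_like_test_script(const char *path)
{
	/* only executable files are treated as test scripts */
	return access(path, X_OK) == 0;
}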
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20220309122859.31487-1-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change adds the symbol offset to the data exported for each
call-chain entry. This cannot be calculated from the script with
only the ip value and no related mmap information.
In addition, also export the source file and line information, if
available, to avoid an external lookup if this information is needed.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/164554263724.752731.14651017093796049736.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The tsc_scaling_sync binary should be present in the .gitignore
file so that git ignores it.
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220406063715.55625-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Check dumping of mptcp listener sockets:
1. filter by dport should not return any results
2. filter by sport should return listen sk
3. filter by saddr+sport should return listen sk
4. no filter should return listen sk
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Replace unnecessary list_for_each_entry_continue() in nf_tables,
from Jakob Koschel.
2) Add struct nf_conntrack_net_ecache to conntrack event cache and
use it, from Florian Westphal.
3) Refactor ctnetlink_dump_list(), also from Florian.
4) Bump module reference counter on cttimeout object addition/removal,
from Florian.
5) Consolidate nf_log MAC printer, from Phil Sutter.
6) Add basic logging support for unknown ethertype, from Phil Sutter.
7) Consolidate check for sysctl nf_log_all_netns toggle, also from Phil.
8) Replace hardcode value in nft_bitwise, from Jeremy Sowden.
9) Rename BASIC-like goto tags in nft_bitwise to more meaningful names,
also from Jeremy.
10) nft_fib support for reverse path filtering with policy-based routing
on iif. Extend selftests to cover for this new usecase, from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It's now possible to use the fib expression in the forward chain (where
both the input and output interfaces are known).
Add a simple test case for this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A microcode update on some Intel processors causes all TSX transactions
to always abort by default[*]. Microcode also added functionality to
re-enable TSX for development purposes. With this microcode loaded, if
tsx=on was passed on the cmdline, and TSX development mode was already
enabled before the kernel boot, it may make the system vulnerable to TSX
Asynchronous Abort (TAA).
To be on the safer side, unconditionally disable TSX development mode
during boot. If a viable use case appears, this can be revisited later.
[*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557
[ bp: Drop unstable web link, massage heavily. ]
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
We have switched to memcg-based memory accounting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
libbpf_set_strict_mode always returns 0, so we don't need to check whether
the return value is 0 or not.
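As a small usage sketch of the replacement (the commented load step is just
a placeholder for whatever the tool does next):

#include <bpf/libbpf.h>

int main(void)
{
	/* always returns 0; on kernels without memcg-based accounting
	 * libbpf bumps RLIMIT_MEMLOCK itself, otherwise it is a no-op */
	libbpf_set_strict_mode(LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK);

	/* ... open/load/attach BPF objects as usual ... */
	return 0;
}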
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-4-laoar.shao@gmail.com
We have switched to memcg-based memory accounting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now. After
this change, the header tools/testing/selftests/bpf/bpf_rlimit.h can be
removed.
This patch also removes the useless header sys/resource.h from many files
in tools/testing/selftests/bpf/.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-3-laoar.shao@gmail.com
Background:
Libbpf automatically replaces calls to the BPF bpf_probe_read_{kernel,user}
[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that the
kernel doesn't support the new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which the bpf_probe_read_kernel API is invoked and lets the
kernel check whether the new API is valid. If the loading fails, libbpf
considers the new API invalid and replaces it with the old API.
static int probe_kern_probe_read_kernel(void)
{
	struct bpf_insn insns[] = {
		BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */
		BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */
		BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
		BPF_EXIT_INSN(),
	};
	int fd, insn_cnt = ARRAY_SIZE(insns);

	fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
			   "GPL", insns, insn_cnt, NULL);
	return probe_fd(fd);
}
Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall matches the LINUX_VERSION_CODE.
If it does not match, the bpf syscall fails. However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
to always fail on old versions. This means that libbpf will replace the
new API with the old one even though the kernel supports the new one.
Solution:
After a discussion in [1], the solution is to use the
BPF_PROG_TYPE_TRACEPOINT program type instead of BPF_PROG_TYPE_KPROBE,
because the kernel does not enforce a version check for tracepoint
programs. I tested the patch on old kernels (4.18 and 4.19) and it works
well.
[0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
[1] Closes: https://github.com/libbpf/libbpf/issues/473
Signed-off-by: Runqing Yang <rainkin1993@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
Improve subtest selection logic when using -t/-a/-d parameters.
In particular, more than one subtest can be specified or a
combination of tests / subtests.
-a send_signal -d send_signal/send_signal_nmi* - runs send_signal
test without nmi tests
-a send_signal/send_signal_nmi*,find_vma - runs two send_signal
subtests and find_vma test
-a 'send_signal*' -a find_vma -d send_signal/send_signal_nmi* -
runs two send_signal tests and the find_vma test, and disables two
send_signal nmi subtests
-t send_signal -t find_vma - runs two *send_signal* tests and one
*find_vma* test
This will allow us to have granular control over which subtests
to disable in the CI system instead of disabling whole tests.
Also, add new selftest to avoid possible regression when
changing prog_test test name selection logic.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409001750.529930-1-mykolal@fb.com
- Use local labels in the exception table macros to avoid symbol
conflicts with clang LTO builds
- A couple of fixes to objtool checking of the relatively newly added
SLS and IBT code
- Rename a local var in the WARN* macro machinery to prevent shadowing
Merge tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix the MSI message data struct definition
- Use local labels in the exception table macros to avoid symbol
conflicts with clang LTO builds
- A couple of fixes to objtool checking of the relatively newly added
SLS and IBT code
- Rename a local var in the WARN* macro machinery to prevent shadowing
* tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/msi: Fix msi message data shadow struct
x86/extable: Prefer local labels in .set directives
x86,bpf: Avoid IBT objtool warning
objtool: Fix SLS validation for kcov tail-call replacement
objtool: Fix IBT tail-call detection
x86/bug: Prevent shadowing in __WARN_FLAGS
x86/mm/tlb: Revert retpoline avoidance approach
- Fix the clang command line option probing and remove some options to filter
out, fixing the build with the latest clang versions.
- Fix 'perf bench' futex and epoll benchmarks to deal with machines with more
than 1K CPUs.
- Fix 'perf test tsc' error message when not supported.
- Remap perf ring buffer if there is no space for event, fixing perf usage
in 32-bit ChromeOS.
- Drop objdump stderr to avoid getting stuck waiting for stdout output in
'perf annotate'.
- Fix up garbled output by now showing unwind error messages when augmenting
frame in best effort mode.
- Fix perf's libperf_print callback, use the va_args eprintf() variant.
- Sync vhost and arm64 cputype headers with the kernel sources.
- Fix 'perf report --mem-mode' with ARM SPE.
- Add missing external commands ('perf iiostat', etc) to 'perf --list-cmds'.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix the clang command line option probing and remove some options to
filter out, fixing the build with the latest clang versions
- Fix 'perf bench' futex and epoll benchmarks to deal with machines
with more than 1K CPUs
- Fix 'perf test tsc' error message when not supported
- Remap perf ring buffer if there is no space for event, fixing perf
usage in 32-bit ChromeOS
- Drop objdump stderr to avoid getting stuck waiting for stdout output
in 'perf annotate'
- Fix up garbled output by now showing unwind error messages when
augmenting frame in best effort mode
- Fix perf's libperf_print callback, use the va_args eprintf() variant
- Sync vhost and arm64 cputype headers with the kernel sources
- Fix 'perf report --mem-mode' with ARM SPE
- Add missing external commands ('iiostat', etc) to 'perf --list-cmds'
* tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
perf tools: Add external commands to list-cmds
perf docs: Add perf-iostat link to manpages
perf session: Remap buf if there is no space for event
perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
perf tools: Fix perf's libperf_print callback
perf: arm-spe: Fix perf report --mem-mode
perf unwind: Don't show unwind error messages when augmenting frame pointer stack
tools headers arm64: Sync arm64's cputype.h with the kernel sources
perf test tsc: Fix error message when not supported
perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
perf python: Fix probing for some clang command line options
tools build: Filter out options and warnings not supported by clang
tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
tools include UAPI: Sync linux/vhost.h with the kernel sources
- Fix a compile error in the nvdimm unit tests
- Fix a shadowed variable warning in the CXL PCI driver
Merge tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull cxl and nvdimm fixes from Dan Williams:
- Fix a compile error in the nvdimm unit tests
- Fix a shadowed variable warning in the CXL PCI driver
* tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
cxl/pci: Drop shadowed variable
tools/testing/nvdimm: Fix security_init() symbol collision
If objdump writes to stderr it can block waiting for it to be read. As
perf doesn't read stderr, progress stops, with perf waiting for
stdout output.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lexi Shao <shaolexi@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220407230503.1265036-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from the users' perspective.
Adding the external commands to the commands array with a NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.
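A hedged sketch of the shape of the change (the struct layout follows
perf.c, but the entries shown are examples rather than the full table):

struct cmd_struct {
	const char *cmd;
	int (*fn)(int argc, const char **argv);
	int option;
};

/* external commands get a NULL handler: they show up in --list-cmds but
 * are still dispatched as separate perf-<cmd> executables/scripts */
static struct cmd_struct commands[] = {
	{ "archive", NULL, 0 },
	{ "iostat",  NULL, 0 },
};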
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a perf event doesn't fit into the remaining buffer space, return NULL
to remap the buffer and fetch the event again.
Keep the logic to error out on inadequate input from fuzzing.
This fixes perf failing on ChromeOS (with 32b userspace):
$ perf report -v -i perf.data
...
prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data?
Error:
failed to process sample
Fixes: 57fc032ad6 ("perf session: Avoid infinite loop when seeing invalid header.size")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Denis Nikitin <denik@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench epoll all
Result snippet:
<<>>
Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs.
perf: pthread_create: No such file or directory
<<>>
In the epoll benchmarks (ctl, wait) pthread_create is invoked in
do_threads from the respective bench_epoll_* function. Though the logs
show a direct failure from pthread_create, the actual failure is from
"sched_setaffinity" returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024-CPU mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
The patch addresses this by fixing all the epoll benchmarks to use
CPU_ALLOC to allocate the cpumask, CPU_ALLOC_SIZE for its size, and
CPU_SET_S to set the mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf bench futex' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench futex all
Failure snippet:
<<>>Running futex/hash benchmark...
perf: pthread_create: No such file or directory
<<>>
In all the futex benchmarks (i.e. hash, lock-api, requeue, wake,
wake-parallel), pthread_create is invoked in the respective bench_futex_*
function. Though the logs show a direct failure from pthread_create,
strace logs showed that the actual failure is from "sched_setaffinity"
returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024-CPU mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
The patch addresses this by fixing all the futex benchmarks to use
CPU_ALLOC to allocate the cpumask, CPU_ALLOC_SIZE for its size, and
CPU_SET_S to set the mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
eprintf() does not expect va_list as the type of the 4th parameter.
Use veprintf() because it does.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 428dab813a ("libperf: Merge libperf_set_print() into libperf_init()")
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220408132625.2451452-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit bb30acae4c ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't allow opening the file unless one of the events has
PERF_SAMPLE_DATA_SRC set.
SPE doesn't have this set even though synthetic memory data is generated
after it is decoded. Fix this issue by setting DATA_SRC on SPE events.
This has no effect on the data collected because the SPE driver doesn't
do anything with that flag and doesn't generate samples.
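A small hedged illustration of the kind of change described (the exact
spot in the SPE setup code where the attribute is adjusted is an
assumption here):

#include <linux/perf_event.h>

static void spe_mark_data_src(struct perf_event_attr *attr)
{
	/* lets "perf report --mem-mode" accept the file; the SPE driver
	 * itself ignores the bit and generates no extra samples */
	attr->sample_type |= PERF_SAMPLE_DATA_SRC;
}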
Fixes: bb30acae4c ("perf report: Bail out --mem-mode if mem info is not available")
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220408144056.1955535-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit b9f6fbb3b2 ("perf arm64: Inject missing frames when
using 'perf record --call-graph=fp'") intended to add a 'best effort'
DWARF unwind that improved the frame pointer stack in most scenarios.
It's expected that the unwind will fail sometimes, but this shouldn't be
reported as an error. It only works when the return address can be
determined from the contents of the link register alone.
Fix the error shown when the unwinder requires extra registers by adding
a new flag that suppresses error messages. This flag is not set in the
normal --call-graph=dwarf unwind mode so that behavior is not changed.
Fixes: b9f6fbb3b2 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
83bea32ac7 ("arm64: Add part number for Arm Cortex-A78AE")
That addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Chanho Park <chanho61.park@samsung.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default `perf test tsc` does not return an error message when the
child process detects that the kernel does not support it. Instead, the
child process prints an error message to stderr; unfortunately, stderr is
redirected to /dev/null when verbose <= 0.
This patch does the following:
- Return TEST_SKIP to the parent process instead of TEST_OK when
perf_read_tsc_conversion() is not supported.
- Add a new subtest that checks whether TSC is supported on the current
architecture, by moving existing code to a separate function.
Doing this avoids two places in test__perf_time_to_tsc() that
return TEST_SKIP.
- Extend the test suite definition to contain the above two subtests.
The current test_suite and test_case structs do not support printing the
skip reason when the number of subtests is less than 1. To print the skip
reason, it is necessary to extend the current test suite definition.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220408084748.43707-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using -ffat-lto-objects in the python feature test when building with
clang-13 results in:
clang-13: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
error: command '/usr/sbin/clang' failed with exit code 1
cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [Makefile.perf:639: /tmp/build/perf/python/perf.so] Error 1
Noticed when building on a docker.io/library/archlinux:base container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The clang compiler complains about some options even without a source
file being available, while others require one, so use the simple
tools/build/feature/test-hello.c file.
Then check for the "is not supported" string in its output, in addition
to the "unknown argument" already being looked for.
This was noticed when building with clang-13, where -ffat-lto-objects
isn't supported: since we were looking just for "unknown argument"
and not providing source code to clang, the option was mistakenly assumed
to be available and was not filtered from the set of command line options
provided to clang, leading to a build failure.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These make the feature check fail when using clang, so remove them just
as is done in tools/perf/Makefile.config to build perf itself.
Adding -Wno-compound-token-split-by-macro to tools/perf/Makefile.config
when building with clang is also necessary to avoid these warnings
turned into errors (-Werror):
CC /tmp/build/perf/util/scripting-engines/trace-event-perl.o
In file included from util/scripting-engines/trace-event-perl.c:35:
In file included from /usr/lib64/perl5/CORE/perl.h:4085:
In file included from /usr/lib64/perl5/CORE/hv.h:659:
In file included from /usr/lib64/perl5/CORE/hv_func.h:34:
In file included from /usr/lib64/perl5/CORE/sbox32_hash.h:4:
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:38: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^~~~~~~~~~
/usr/lib64/perl5/CORE/perl.h:737:29: note: expanded from macro 'STMT_START'
# define STMT_START (void)( /* gcc supports "({ STATEMENTS; })" */
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: '{' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:49: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:87:41: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
v ^= (v>>23); \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: ')' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:88:3: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
} STMT_END
^~~~~~~~
/usr/lib64/perl5/CORE/perl.h:738:21: note: expanded from macro 'STMT_END'
# define STMT_END )
^
Please refer to the discussion on the Link: tag below, where Nathan
clarifies the situation:
<quote>
acme> And then get to the problems at the end of this message, which seem
acme> similar to the problem described here:
acme>
acme> From Nathan Chancellor <>
acme> Subject [PATCH] mwifiex: Remove unnecessary braces from HostCmd_SET_SEQ_NO_BSS_INFO
acme>
acme> https://lkml.org/lkml/2020/9/1/135
acme>
acme> So perhaps in this case its better to disable that
acme> -Werror,-Wcompound-token-split-by-macro when building with clang?
Yes, I think that is probably the best solution. As far as I can tell,
at least in this file and context, the warning appears harmless, as the
"create a GNU C statement expression from two different macros" is very
much intentional, based on the presence of PERL_USE_GCC_BRACE_GROUPS.
The warning is fixed in upstream Perl by just avoiding creating GNU C
statement expressions using STMT_START and STMT_END:
https://github.com/Perl/perl5/issues/18780
https://github.com/Perl/perl5/pull/18984
If I am reading the source code correctly, an alternative to disabling
the warning would be specifying -DPERL_GCC_BRACE_GROUPS_FORBIDDEN but it
seems like that might end up impacting more than just this site,
according to the issue discussion above.
</quote>
Based-on-a-patch-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YkxWcYzph5pC1EK8@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like it's done for ldopts and for both in tools/perf/Makefile.config.
Using `` to initialize PERL_EMBED_CCOPTS somehow precludes using:
$(filter-out SOMETHING_TO_FILTER,$(PERL_EMBED_CCOPTS))
And we need to do it to allow for building with versions of clang where
some gcc options selected by distros are not available.
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YktYX2OnLtyobRYD@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The guest_hang() function is used as the default exception handler
for various KVM selftests applications by setting its address in
the vstvec CSR. The vstvec CSR requires the exception handler base address
to be at least 4-byte aligned, so this patch fixes the alignment of the
guest_hang() function.
Fixes: 3e06cdf105 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Supporting hardware updates of the PTE A and D bits is optional for any
RISC-V implementation, so the current software strategy is to always set
these bits in both G-stage (hypervisor) and VS-stage (guest kernel).
If PTE A and D bits are not set by software (hypervisor or guest)
then RISC-V implementations not supporting hardware updates of these
bits will cause traps even for perfectly valid PTEs.
Based on above explanation, the VS-stage page table created by various
KVM selftest applications is not correct because PTE A and D bits are
not set. This patch fixes VS-stage page table programming of PTE A and
D bits for KVM selftests.
Fixes: 3e06cdf105 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-09
We've added 63 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 4852 insertions(+), 619 deletions(-).
The main changes are:
1) Add libbpf support for USDT (User Statically-Defined Tracing) probes.
USDTs are an abstraction built on top of uprobes, critical for tracing
and BPF, and widely used in production applications, from Andrii Nakryiko.
2) While Andrii was adding support for x86{-64}-specific logic of parsing
USDT argument specification, Ilya followed-up with USDT support for s390
architecture, from Ilya Leoshkevich.
3) Support name-based attaching for uprobe BPF programs in libbpf. The format
supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g.
attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc")
now, from Alan Maguire.
4) Various load/store optimizations for the arm64 JIT to shrink the image
size by using arm64 str/ldr immediate instructions. Also enable pointer
authentication to verify return address for JITed code, from Xu Kuohai.
5) BPF verifier fixes for write access checks to helper functions, e.g.
rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that
write into passed buffers, from Kumar Kartikeya Dwivedi.
6) Fix overly excessive stack map allocation for its base map structure and
buckets which slipped-in from cleanups during the rlimit accounting removal
back then, from Yuntao Wang.
7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter
connection tracking tuple direction, from Lorenzo Bianconi.
8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde.
9) Minor cleanups all over the place from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
bpf: Fix excessive memory allocation in stack_map_alloc()
selftests/bpf: Fix return value checks in perf_event_stackmap test
selftests/bpf: Add CO-RE relos into linked_funcs selftests
libbpf: Use weak hidden modifier for USDT BPF-side API functions
libbpf: Don't error out on CO-RE relos for overriden weak subprogs
samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread
libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
libbpf: Use strlcpy() in path resolution fallback logic
libbpf: Add s390-specific USDT arg spec parsing logic
libbpf: Make BPF-side of USDT support work on big-endian machines
libbpf: Minor style improvements in USDT code
libbpf: Fix use #ifdef instead of #if to avoid compiler warning
libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
selftests/bpf: Uprobe tests should verify param/return values
libbpf: Improve string parsing for uprobe auto-attach
libbpf: Improve library identification for uprobe binary path resolution
selftests/bpf: Test for writes to map key from BPF helpers
selftests/bpf: Test passing rdonly mem to global func
bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access
bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access
...
====================
Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The bpf_get_stackid() function may also return 0 on success as per UAPI BPF
helper documentation. Therefore, correct checks from 'val > 0' to 'val >= 0'
to ensure that they cover all possible success return values.
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408041452.933944-1-ytcoode@gmail.com
Add CO-RE relocations into __weak subprogs for multi-file linked_funcs
selftest to make sure libbpf handles such combination well.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-4-andrii@kernel.org
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that the previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overridden by either a previous weak
subprogram instance or by a corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.
But libbpf is over-zealous when processing CO-RE relocations and errors
out when a CO-RE relocation belonging to such an eliminated weak
subprogram is encountered. Instead of erroring out on this expected
situation, log a
debug-level message and skip the relocation.
Fixes: db2b8b0642 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
Starting with the new perf-event support in the nvdimm core, the
nfit_test mock module stops compiling. Rename its security_init() to
nfit_security_init().
tools/testing/nvdimm/test/nfit.c:1845:13: error: conflicting types for ‘security_init’; have ‘void(struct nfit_test *)’
1845 | static void security_init(struct nfit_test *t)
| ^~~~~~~~~~~~~
In file included from ./include/linux/perf_event.h:61,
from ./include/linux/nd.h:11,
from ./drivers/nvdimm/nd-core.h:11,
from tools/testing/nvdimm/test/nfit.c:19:
Fixes: 9a61d0838c ("drivers/nvdimm: Add nvdimm pmu structure")
Cc: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/164904238610.1330275.1889212115373993727.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
During BTF fix-up for global variables, a global variable can be
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.
This is not a problem when using BPF static linking, as the BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.
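A small sketch of the kind of binding check this implies; this is a
generic ELF fragment under assumed context, not libbpf's exact code:

  #include <elf.h>
  #include <stdbool.h>

  /* accept weak bindings alongside global ones when matching variable symbols */
  static bool sym_is_global_var(const Elf64_Sym *sym)
  {
      int bind = ELF64_ST_BIND(sym->st_info);

      return bind == STB_GLOBAL || bind == STB_WEAK;
  }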
Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
The Coverity static analyzer complains that strcpy() can cause a buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.
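For reference, a sketch of the bounded-copy pattern; libbpf_strlcpy() is
libbpf's internal helper, and the buffer and source names below are
illustrative:

  char path[PATH_MAX];

  /* copies at most sizeof(path) - 1 bytes and always NUL-terminates */
  libbpf_strlcpy(path, candidate_path, sizeof(path));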
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().
Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of
the actual argument size. On little-endian the relevant argument bits
end up in the lower bits of val, and later on the code that handles
all the argument types expects them to be there.
On big-endian they end up in the upper bits of val, breaking that
expectation. Fix by right-shifting val on big-endian.
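A minimal sketch of the deref path with the fix; the arg_spec field name
and surrounding context are assumptions for illustration:

  __u64 val;
  long err;

  err = bpf_probe_read_user(&val, sizeof(val), (void *)addr);
  if (err)
      return err;
  #if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  	val >>= arg_spec->arg_bitshift;  /* move the argument bits into the low bits, as on LE */
  #endif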
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com
Fix several typos and references to non-existing headers.
Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with
the rest of the bpf code - see commit 45f2bebc80 ("libbpf: Fix
endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for
rationale).
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com
As reported by Naresh:
perf build errors on i386 [1] on Linux next-20220407 [2]
usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0
[-Werror=undef]
1181 | #if __x86_64__
| ^~~~~~~~~~
usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0
[-Werror=undef]
1196 | #if __x86_64__
| ^~~~~~~~~~
cc1: all warnings being treated as errors
Use #ifdef instead of #if to avoid this.
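The fix boils down to testing for the macro's existence rather than its
value, roughly like this (the surrounding arch branches are illustrative):

  /* __x86_64__ is simply undefined on i386, so "#if __x86_64__" trips -Werror=undef */
  #ifdef __x86_64__
  /* x86-64 USDT arg-spec parsing */
  #elif defined(__i386__)
  /* i386 USDT arg-spec parsing */
  #else
  /* unsupported architecture */
  #endif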
Fixes: 4c59e584d1 ("libbpf: Add x86-specific USDT arg spec parsing logic")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org
uprobe/uretprobe tests don't do any validation of arguments/return values,
and without this we can't be sure we are attached to the right function,
or that we are indeed attached to a uprobe or uretprobe. To fix this,
record the argument and return value for auto-attached functions and ensure
they match expectations. We also need to filter by PID to ensure we do
not pick up stray malloc()s, since auto-attach traces libc system-wide.
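A sketch of what the BPF side of such a check can look like; the section
spec, variable names and argument type are illustrative assumptions, not
the selftest's exact code:

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  int pid;                 /* set from user space before attaching */
  unsigned long malloc_sz; /* recorded argument, compared against the expected value */

  SEC("uprobe/libc.so.6:malloc")
  int BPF_KPROBE(handle_malloc, unsigned long size)
  {
      if ((bpf_get_current_pid_tgid() >> 32) != pid)
          return 0;  /* auto-attach traces libc system-wide, so drop other processes */
      malloc_sz = size;
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";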
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-4-git-send-email-alan.maguire@oracle.com
For uprobe auto-attach, the SEC() name parsing can be simplified to a
single sscanf(); the return value of that sscanf() can then be used to
distinguish between sections that simply specify "u[ret]probe" (and thus
cannot auto-attach) and those that specify
"u[ret]probe/binary_path:function+offset" etc.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com
In the process of doing path resolution for uprobe attach, libraries are
identified by matching a ".so" substring in the binary_path.
This matches a lot of patterns that do not conform to the
library.so[.version] format, so instead match a ".so" _suffix_, and if
that fails, match a ".so." substring for the versioned library case.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com
In order to correctly destroy a VM, all references to the VM must be
freed. The arch_timer selftest creates a VGIC for the guest, which
itself holds a reference to the VM.
Close the GIC FD when cleaning up a VM.
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220406235615.1447180-4-oupton@google.com
dirty_log_perf_test instantiates a VGICv3 for the guest (if supported by
hardware) to reduce the overhead of guest exits. However, the test does
not actually close the GIC fd when cleaning up the VM between test
iterations, meaning that the VM is never actually destroyed in the
kernel.
While this is generally a bad idea, the bug was detected because the
kernel started spewing warnings about duplicate debugfs entries, as
subsequent VMs happened to reuse the same FD while the old debugfs
directory was still present.
Abstract away the notion of setup/cleanup of the GIC FD from the test
by creating arch-specific helpers for test setup/cleanup. Close the GIC
FD on VM cleanup and do nothing for the other architectures.
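A rough sketch of the shape of such a cleanup helper; the function and
variable names are illustrative, not necessarily what the selftest uses:

  static int gic_fd = -1;

  static void arch_cleanup_vm(struct kvm_vm *vm)
  {
      if (gic_fd >= 0) {
          close(gic_fd);   /* drop the VGIC's reference so the VM is actually destroyed */
          gic_fd = -1;
      }
  }

  /* non-arm64 builds provide an empty stub instead */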
Fixes: c340f7899a ("KVM: selftests: Add vgic initialization for dirty log perf test for ARM")
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220406235615.1447180-3-oupton@google.com
When testing a kernel with commit a5905d6af4 ("KVM: arm64:
Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated"),
the get-reg-list test outputs:
vregs: Number blessed registers: 234
vregs: Number registers: 238
vregs: There are 1 new registers.
Consider adding them to the blessed reg list with the following lines:
KVM_REG_ARM_FW_REG(3),
vregs: PASS
...
That output inspired two changes: 1) add the new register to the
blessed list and 2) explain why "Number registers" is actually four
larger than "Number blessed registers" (on the system used for
testing), even though only one register is being stated as new.
The reason is that some registers are host dependent and they get
filtered out when comparing with the blessed list. The system
used for the test apparently had three filtered registers.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220316125129.392128-1-drjones@redhat.com
Alexei Starovoitov says:
====================
pull-request: bpf 2022-04-06
We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).
The main changes are:
1) rethook related fixes, from Jiri and Masami.
2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.
3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf: selftests: Test fentry tracing a struct_ops program
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
rethook: Fix to use WRITE_ONCE() for rethook:: Handler
selftests/bpf: Fix warning comparing pointer to 0
bpf: Fix sparse warnings in kprobe_multi_resolve_syms
bpftool: Explicit errno handling in skeletons
====================
Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When invoking the bpf_for_each_map_elem callback, we are passed a
PTR_TO_MAP_KEY. Previously, writes to it through a helper may have been
allowed, but the fix in the previous patches is meant to prevent that
case. The test case tries to pass it as writable memory to a helper, and
fails the test if the program passes the verifier.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220319080827.73251-6-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add two test cases. The first passes a read-only map value pointer to a
global func, which should be rejected. The same code path checks it for
kfuncs, so that is covered as well. The second exercises the previously
missing check for PTR_TO_MEM's MEM_RDONLY flag and tries to write to a
read-only memory pointer. Without the prior patches, both of these tests
fail.
Reviewed-by: Hao Luo <haoluo@google.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220319080827.73251-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
bpf_map_value_size() uses num_possible_cpus() to determine map size, but
some of the tests only allocate enough memory for online CPUs. This
results in out-of-bounds writes in userspace during bpf(BPF_MAP_LOOKUP_ELEM)
syscalls when the number of online CPUs is lower than the number of
possible CPUs. Fix this by switching from get_nprocs_conf() to
bpf_num_possible_cpus() when determining the number of processors in
these tests (test_progs/netcnt and test_cgroup_storage).
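A minimal user-space sketch of sizing a per-CPU value buffer this way; the
map fd, key, element type and accumulator are assumptions for the example:

  int nr_cpus = bpf_num_possible_cpus();  /* selftests helper from bpf_util.h */
  __u64 *values = calloc(nr_cpus, sizeof(*values));

  /* the kernel writes one slot per *possible* CPU, so the buffer must be that big */
  if (values && bpf_map_lookup_elem(map_fd, &key, values) == 0) {
      for (int i = 0; i < nr_cpus; i++)
          total += values[i];
  }
  free(values);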
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406085408.339336-1-asavkov@redhat.com
The function does not check that parsing_end is false after parsing the
argument. Thus, if the final part of the argument is something like '4-',
which is invalid, parse_num_list() will silently discard it instead of
returning -EINVAL.
Before:
$ ./test_progs -n 2,4-
#2 atomic_bounds:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
After:
$ ./test_progs -n 2,4-
Failed to parse test numbers.
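The fix amounts to a final check along these lines (a sketch, not
necessarily the exact patch):

  /* at the end of parse_num_list(), after the parsing loop */
  if (parsing_end)
      return -EINVAL;  /* e.g. "2,4-": the last range never got an upper bound */
  return 0;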
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406003622.73539-1-ytcoode@gmail.com
The previous commit fixed support for dual-stack sockets in
bpf_tcp_check_syncookie. This commit adjusts the selftest to verify the
fixed functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-2-maximmi@nvidia.com
Introduce a test for aarch64 that ensures non-mixed-width vCPUs
(all 64-bit vCPUs or all 32-bit vCPUs) can be configured, and that
mixed-width vCPUs cannot be configured.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220329031924.619453-3-reijiw@google.com
Extend the urandom_read helper binary to include USDTs in 4 combinations:
with or without a semaphore (refcounted and non-refcounted) and based in
either the executable or a shared library. We also extend urandom_read
with the ability to report its own PID to the parent process and wait for
the parent process to ready itself up for tracing urandom_read. We utilize
popen() and the underlying pipe properties for proper signaling.
Once urandom_read is ready, we add a few tests to validate that libbpf's
USDT attachment handles all the above combinations of semaphore (or lack
of it) and executable vs shared-library USDTs. Also, we validate that
libbpf handles shared libraries both with a PID filter and without one
(i.e., -1 for the PID argument).
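For context, this is roughly what a semaphore-guarded USDT call site looks
like with sys/sdt.h; the provider, probe and variable names here are
illustrative, not necessarily the ones urandom_read uses:

  #define _SDT_HAS_SEMAPHORES 1
  #include <sys/sdt.h>

  /* the semaphore is named <provider>_<probe>_semaphore and lives in .probes */
  unsigned short urand_read_sema_semaphore __attribute__((section(".probes")));

  static void report_iter(int iter)
  {
      /* only evaluate and emit the probe when a tracer has bumped the semaphore */
      if (urand_read_sema_semaphore)
          STAP_PROBE1(urand, read_sema, iter);
  }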
Having the shared library case tested with and without PID is important
because internal logic differs on kernels that don't support BPF
cookies. On such older kernels, attaching to USDTs in shared libraries
without specifying a concrete PID doesn't work in principle, because it's
impossible to determine the shared library's load address to derive
absolute IPs for uprobe attachments. Without absolute IPs, it's impossible
to perform a correct lookup of the USDT spec based on the uprobe's absolute
IP (the only kind available from BPF at runtime). This is not a problem on
newer kernels with BPF cookie support, as we don't need the IP-to-ID lookup
because the BPF cookie value *is* the spec ID.
So having those two situations as separate subtests is good because
libbpf CI is able to test the latest selftests against old kernels (e.g.,
4.9 and 5.5): we'll be able to disable PID-less shared-library attachment
for old kernels, but still leave the PID-specific one enabled to validate
that this legacy logic works correctly.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-8-andrii@kernel.org
Add a semaphore-based USDT to test_progs itself and write basic tests to
validate both auto-attachment and manual attachment logic, as well as
BPF-side functionality.
Also add subtests to validate that libbpf properly deduplicates USDT
specs and handles spec overflow situations correctly, as well as proper
"rollback" of partially-attached multi-spec USDT.
The BPF side of the selftest intentionally consists of two files to
validate that the usdt.bpf.h header can be included from multiple source
code files that are subsequently linked into the final BPF object file
without causing any symbol duplication or other issues. We are validating
that the __weak maps and bpf_usdt_xxx() API functions defined in
usdt.bpf.h do work as intended.
USDT selftests utilize sys/sdt.h header that on Ubuntu systems comes
from systemtap-sdt-devel package. But to simplify everyone's life,
including CI but especially casual contributors to bpf/bpf-next that
are trying to build selftests, I've checked in sys/sdt.h header from [0]
directly. This way it will work on all architectures and distros without
having to figure it out for every relevant combination and adding any
extra implicit package dependencies.
[0] https://sourceware.org/git?p=systemtap.git;a=blob_plain;f=includes/sys/sdt.h;h=ca0162b4dc57520b96638c8ae79ad547eb1dd3a1;hb=HEAD
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-7-andrii@kernel.org
Add x86/x86_64-specific USDT argument specification parsing. Each
architecture will require its own logic, as all this is arch-specific
assembly-based notation. Architectures that libbpf doesn't support for
USDTs will pr_warn() with a specific error and return -ENOTSUP.
We use sscanf() as a very powerful and easy-to-use string parser. The
spaces in sscanf's format string mean "skip any whitespace", which is a
pretty nifty (and somewhat little-known) feature.
All this was tested on a little-endian architecture, so the bit shifts
are probably off on big-endian, which our CI will hopefully prove.
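As an illustration of that sscanf() style, here is a sketch of parsing one
common x86 form, "size@offset(%reg)" (e.g. "-4@-20(%rbp)"); the format
string and names are assumptions for this example, not the final code:

  int arg_sz, len;
  long off;
  char reg[16];

  /* the spaces in the format skip any whitespace between tokens */
  if (sscanf(arg_str, " %d @ %ld ( %%%15[^)] ) %n", &arg_sz, &off, reg, &len) == 3) {
      /* register-dereference arg: |arg_sz| bytes at reg + off, sign taken from arg_sz */
  }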
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org
The last part of the architecture-agnostic user-space USDT handling logic
is to set up the BPF spec and, optionally, IP-to-ID maps from user space.
usdt_manager performs a compact spec ID allocation to utilize fixed-size
BPF maps as efficiently as possible. We also use a hashmap to deduplicate
USDT arg spec strings and map identical strings to a single USDT spec,
minimizing the necessary BPF map size. usdt_manager supports arbitrary
sequences of attachment and detachment, both of the same USDT and of
multiple different USDTs, and internally maintains a free list of unused
spec IDs. bpf_link_usdt's logic is extended with proper setup and
teardown of this spec ID free list and the supporting BPF maps.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org
Implement the architecture-agnostic parts of the USDT parsing logic. The
code is the documentation in this case; it's futile to try to succinctly
describe how USDT parsing is done with any sort of concreteness. But
still, USDTs are recorded in a special ELF notes section (.note.stapsdt),
where each USDT call site is described separately. Along with the USDT
provider and USDT name, each such note contains a USDT argument
specification, which uses assembly-like syntax to describe how to fetch
the value of a USDT argument. A USDT arg spec could be just a constant,
a register, or a register dereference (the most common cases on x86_64),
but it can technically be something much more complicated, like an offset
relative to a global symbol. One of the later patches will implement the
most common subset of this for the x86 and x86-64 architectures, which
seems to handle a lot of real-world production applications.
The USDT arg spec contains a compact encoding allowing usdt.bpf.h from
the previous patch to handle the above 3 cases. Instead of recording which
register might be needed, we encode the register's offset within struct
pt_regs to simplify the BPF-side implementation. A USDT argument can be of
different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle
this, libbpf pre-calculates the necessary bit shifts to do proper casting
and sign-extension in a short sequence of left and right shifts.
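A small sketch of that shift-based cast; the helper name and parameters
are illustrative, but the technique matches the description above:

  /* narrow an 8-byte read down to the argument's true size and signedness */
  static __u64 cast_arg(__u64 val, int arg_sz_bytes, bool arg_signed)
  {
      int shift = 64 - arg_sz_bytes * 8;  /* pre-computed once when parsing the spec */

      if (shift == 0)
          return val;
      if (arg_signed)
          return (__u64)(((__s64)val << shift) >> shift);  /* arithmetic shift sign-extends */
      return (val << shift) >> shift;                       /* logical shift zero-extends */
  }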
The rest is in the code with sometimes extensive comments and references
to external "documentation" for USDTs.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org
Wire up libbpf USDT support APIs without yet implementing all the
nitty-gritty details of USDT discovery, spec parsing, and BPF map
initialization.
The user-visible user-space API is simple and conceptually very similar
to the uprobe API.
The bpf_program__attach_usdt() API allows programmatically attaching a
given BPF program to a USDT, specified through a binary path (executable
or shared lib), USDT provider and name. Also, just like in the uprobe
case, a PID filter is specified (0 - self, -1 - any process, or a specific
PID). Optionally, a USDT cookie value can be specified. Such a single API
invocation will try to discover the given USDT in the specified binary and
will use (potentially many) BPF uprobes to attach this program in the
correct locations.
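A short usage sketch of that call; the skeleton program name, binary path
and probe names are placeholders for this example:

  LIBBPF_OPTS(bpf_usdt_opts, opts, .usdt_cookie = 0xcafe);
  struct bpf_link *link;

  link = bpf_program__attach_usdt(skel->progs.handle_setjmp,
                                  -1 /* any process */, "libc.so.6",
                                  "libc", "setjmp", &opts);
  if (!link)
      fprintf(stderr, "USDT attach failed: %d\n", -errno);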
Just like with any bpf_program__attach_xxx() API, a bpf_link is returned
that represents this attachment. It is a virtual BPF link that doesn't
have a direct kernel object, as it can consist of multiple underlying BPF
uprobe links. As such, attachment is not an atomic operation and there can
be a brief moment when some USDT call sites are attached while others are
still in the process of attaching. This should be taken into consideration
by the user. But bpf_program__attach_usdt() guarantees that in the case of
success all USDT call sites are successfully attached, or all the
successful attachments will be detached as soon as some USDT call site
fails to be attached. So, in theory, there could be cases of a failed
bpf_program__attach_usdt() call which did trigger a few USDT program
invocations. This is unavoidable due to the multi-uprobe nature of USDT
and has to be handled by the user, if it's important to create an illusion
of atomicity.
USDT BPF programs themselves are marked in BPF source code either as
SEC("usdt"), in which case they won't be auto-attached through the
skeleton's <skel>__attach() method, or with a full definition, which
follows the spirit of fully-specified uprobes:
SEC("usdt/<path>:<provider>:<name>"). In the latter case the skeleton's
attach method will attempt auto-attachment. Similarly, the generic
bpf_program__attach() will have enough information to go off of for
parameterless attachment.
USDT BPF programs are actually uprobes, and as such for kernel they are
marked as BPF_PROG_TYPE_KPROBE.
Another part of this patch is USDT-related feature probing:
- BPF cookie support detection from user-space;
- detection of kernel support for auto-refcounting of USDT semaphore.
The latter is optional. If the kernel doesn't support this feature and
the USDT doesn't rely on USDT semaphores, no error is returned. But if
libbpf detects that the USDT requires setting semaphores and the kernel
doesn't support this, libbpf errors out with an explicit pr_warn()
message. Libbpf doesn't support poking the process's memory directly to
increment the semaphore value, like BCC does on legacy kernels, due to
the inherent raciness and danger of such process memory manipulation.
Libbpf lets the kernel take care of this properly or gives up.
Logistically, all the extra USDT-related infrastructure of libbpf is put
into a separate usdt.c file and abstracted behind struct usdt_manager.
Each bpf_object has a lazily-initialized usdt_manager pointer, which is
only instantiated if an attempt is made to attach USDT programs. Closing
the BPF object frees up usdt_manager resources. usdt_manager keeps track
of USDT spec ID assignment and a few other small things.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org
Add the BPF-side implementation of libbpf-provided USDT support. This
consists of a single-header library, usdt.bpf.h, which is meant to be used
from the user's BPF-side source code. This header is added to the list of
installed libbpf headers, alongside bpf_helpers.h and others.
The BPF-side implementation consists of two BPF maps:
- a spec map, which contains "a USDT spec" encoding the information
necessary to fetch USDT arguments and other information
(argument count, user-provided cookie value, etc.) at runtime;
- an IP-to-spec-ID map, which is only used on kernels that don't support
the BPF cookie feature. It allows looking up the spec ID based on the
place in the user application that triggers the USDT program.
These maps have default sizes, 256 and 1024, which are chosen
conservatively so as not to waste a lot of space while still handling a
lot of common cases. But there could be cases when a user application
needs to either trace a lot of different USDTs, or the USDTs are heavily
inlined and their arguments are located in a lot of differing locations.
For such cases it might be necessary to size those maps up, which libbpf
allows by overriding the BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT
macros.
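That override happens in the user's BPF source before the header is
included; a minimal sketch (the chosen sizes are arbitrary, and the
include path may differ between the selftests and an installed libbpf):

  /* in the BPF-side .c file, before pulling in the USDT helpers */
  #define BPF_USDT_MAX_SPEC_CNT 1024
  #define BPF_USDT_MAX_IP_CNT   4096
  #include <bpf/usdt.bpf.h>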
One important aspect to keep in mind: a single USDT (the user-space
equivalent of a kernel tracepoint) can have multiple USDT "call sites".
That is, a single logical USDT is triggered from multiple places in the
user application. This can happen due to function inlining. Each such
inlined instance of a USDT invocation can have its own unique USDT
argument specification (instructions about the location of the value of
each USDT argument). So while a USDT looks very similar to a usual uprobe
or kernel tracepoint, under the hood it's actually a collection of
uprobes, each potentially needing a different spec to know how to fetch
its arguments.
The user-visible API consists of three helper functions:
- bpf_usdt_arg_cnt(), which returns the number of arguments of the current
USDT;
- bpf_usdt_arg(), which reads the value of the specified USDT argument (by
its zero-indexed position) and returns it as a 64-bit value;
- bpf_usdt_cookie(), which functions like the BPF cookie for USDT
programs; this is necessary as libbpf doesn't allow specifying the actual
BPF cookie and utilizes it internally for the USDT support implementation.
Each of the bpf_usdt_xxx() APIs expects the struct pt_regs * context
passed into the BPF program. On kernels that don't support BPF cookies it
is used to fetch the absolute IP address of the underlying uprobe.
usdt.bpf.h also provides the BPF_USDT() macro, which functions like
BPF_PROG() and BPF_KPROBE() and allows a much more user-friendly way to
get access to USDT arguments, if the USDT definition is static and known
to the user. It is expected that the majority of use cases won't have to
use bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will
cover all their needs.
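A small BPF-side sketch putting these pieces together; the section spec,
probe names and argument list are illustrative assumptions (and the
include path may differ between the selftests and an installed libbpf):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/usdt.bpf.h>

  /* explicit spec, so the skeleton's attach method can auto-attach it */
  SEC("usdt/libc.so.6:libc:setjmp")
  int BPF_USDT(handle_setjmp, void *env, int savemask)
  {
      bpf_printk("setjmp: %d args, cookie %llx",
                 bpf_usdt_arg_cnt(ctx), bpf_usdt_cookie(ctx));
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";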
Last, usdt.bpf.h utilizes BPF CO-RE for one single purpose: to detect
kernel support for the BPF cookie. If the BPF CO-RE dependency is
undesirable, the user application can redefine BPF_USDT_HAS_BPF_COOKIE to
either a boolean constant (or, equivalently, zero and non-zero), or even
point it to its own .rodata variable that can be set from the
application's user-space code. It is important that
BPF_USDT_HAS_BPF_COOKIE is known to the BPF verifier as a static value
(thus .rodata and not just .data), as otherwise the BPF code will still
contain the bpf_get_attach_cookie() BPF helper call and will fail
validation at runtime, if not dead-code eliminated.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org
Since not all compilers have a function attribute to disable KCOV
instrumentation, objtool can rewrite KCOV instrumentation in noinstr
functions as per commit:
f56dae88a8 ("objtool: Handle __sanitize_cov*() tail calls")
However, this has subtle interaction with the SLS validation from
commit:
1cc1e4c8aa ("objtool: Add straight-line-speculation validation")
In that when a tail-call instruction is replaced with a RET an
additional INT3 instruction is also written, but is not represented in
the decoded instruction stream.
This then leads to false positive missing INT3 objtool warnings in
noinstr code.
Instead of adding additional struct instruction objects, mark the RET
instruction with retpoline_safe to suppress the warning (since we know
there really is an INT3).
Fixes: 1cc1e4c8aa ("objtool: Add straight-line-speculation validation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220323230712.GA8939@worktop.programming.kicks-ass.net
Objtool reports:
arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx() falls through to next function poly1305_blocks_x86_64()
arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_emit_avx() falls through to next function poly1305_emit_x86_64()
arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx2() falls through to next function poly1305_blocks_x86_64()
Which reads like:
0000000000000040 <poly1305_blocks_x86_64>:
40: f3 0f 1e fa endbr64
...
0000000000000400 <poly1305_blocks_avx>:
400: f3 0f 1e fa endbr64
404: 44 8b 47 14 mov 0x14(%rdi),%r8d
408: 48 81 fa 80 00 00 00 cmp $0x80,%rdx
40f: 73 09 jae 41a <poly1305_blocks_avx+0x1a>
411: 45 85 c0 test %r8d,%r8d
414: 0f 84 2a fc ff ff je 44 <poly1305_blocks_x86_64+0x4>
...
These are simple conditional tail-calls and *should* be recognised as
such by objtool, however due to a mistake in commit 08f87a93c8
("objtool: Validate IBT assumptions") this is failing.
Specifically, the jump_dest is +4, which means the instruction pointed
at will not be ENDBR, and as such it will fail the second clause of
is_first_func_insn() that was supposed to capture this exact case.
Instead, have is_first_func_insn() look at the previous instruction.
Fixes: 08f87a93c8 ("objtool: Validate IBT assumptions")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220322115125.811582125@infradead.org
attach_probe selftest fails on Debian-based distros with `failed to
resolve full path for 'libc.so.6'`. The reason is that these distros
embraced multiarch to the point where even for the "main" architecture
they store libc in /lib/<triple>.
This is configured in /etc/ld.so.conf, and in theory it's possible to
replicate the loader's parsing and processing logic in libbpf; however,
a much simpler solution is to just enumerate the known library paths.
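A sketch of what such an enumeration can look like; the directory list
here is illustrative, not libbpf's exact table:

  /* well-known locations to try when only "libc.so.6"-style names are given */
  static const char * const lib_dirs[] = {
      "/usr/lib64",
      "/usr/lib",
      "/usr/lib/x86_64-linux-gnu",   /* Debian/Ubuntu multiarch triple */
      "/usr/lib/aarch64-linux-gnu",
  };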
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404225020.51029-1-iii@linux.ibm.com
Before, our help output contained lines like
--kconfig_add KCONFIG_ADD
--qemu_config qemu_config
--jobs jobs
They're not very helpful.
The former kind comes from the automatic 'metavar' we get from argparse:
the uppercase version of the flag name.
The latter kind is where we manually specified the metavar as the flag
name.
After:
--build_dir DIR
--make_options X=Y
--kunitconfig PATH
--kconfig_add CONFIG_X=Y
--arch ARCH
--cross_compile PREFIX
--qemu_config FILE
--jobs N
--timeout SECONDS
--raw_output [{all,kunit}]
--json [FILE]
This patch tries to make the code more clear by specifying the _type_ of
input we expect, e.g. --build_dir is a DIR, --qemu_config is a FILE.
I also switched it to uppercase since it looked more clearly like
placeholder text that way.
This patch also changes --raw_output to specify `choices` to make it
more clear what the options are, and this way argparse can validate it
for us, as shown by the added test case.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
attach_probe selftest fails on aarch64 with `failed to create kprobe
'sys_nanosleep+0x0' perf event: No such file or directory`. This is
because, like on several other architectures, nanosleep has a prefix.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404142101.27900-1-iii@linux.ibm.com
In addition to displaying the program type in bpftool prog show,
this enables us to query bpf_prog_type_syscall
availability through the feature probe, as well as to see
which helpers are available in those programs (such as
bpf_sys_bpf and bpf_sys_close).
Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220331154555.422506-2-milan@mdaverde.com
The script for checking that various lists of types in bpftool remain in
sync with the UAPI BPF header uses a regex to parse enum bpf_prog_type.
If this enum contains a set of values different from the list of program
types in bpftool, it complains.
This script should have reported the addition, some time ago, of the new
BPF_PROG_TYPE_SYSCALL, which was not added to bpftool's program types
list. It failed to do so because it failed to parse that new type from
the enum. This is because the new value, in the BPF header, has an
explanatory comment on the same line, and the regex does not support
that.
Let's update the script to support parsing enum values when they have
comments on the same line.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404140944.64744-1-quentin@isovalent.com
Filling log files with color codes makes diffs and other comparisons
difficult. Only emit vt100 codes when stdout is a TTY.
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Before, kunit.py always printed "arch": "UM" in its json output, but...
1. With `kunit.py parse`, we could be parsing output from anywhere, so
we can't say that.
2. Capitalizing it is probably wrong, as it's `ARCH=um`
3. Commit 87c9c16317 ("kunit: tool: add support for QEMU") made it so
kunit.py could knowingly run a different arch, yet we'd still always
claim "UM".
This patch addresses all of those. E.g.
1.
$ ./tools/testing/kunit/kunit.py parse .kunit/test.log --json | grep -o '"arch.*' | sort -u
"arch": "",
2.
$ ./tools/testing/kunit/kunit.py run --json | ...
"arch": "um",
3.
$ ./tools/testing/kunit/kunit.py run --json --arch=x86_64 | ...
"arch": "x86_64",
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When using --json, kunit.py run/exec/parse will produce results in
KernelCI json format.
As part of that, we include the build_dir that was used, and we
(incorrectly) hardcode in the arch, etc.
We'll want a way to plumb more values (as well as the correct `arch`),
so this patch groups those fields into kunit_json.Metadata type.
This patch should have no user visible changes.
And since we only used build_dir in KunitParseRequest for json, we can
now move it out of that struct and add it into KunitExecRequest, which
needs it and used to get it via inheritance.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use a more idiomatic check that a list is non-empty (`if mylist:`) and
simplify the function body by dedenting and using a dict to map from the
kunit TestStatus enum to the KernelCI json status string.
The dict hopefully makes it less likely to have bugs like commit
9a6bb30a88 ("kunit: tool: fix --json output for skipped tests").
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
--build_dir is set to a default of '.kunit' since commit ddbd60c779
("kunit: use --build_dir=.kunit as default"), but even before then it
was explicitly set to ''.
So outside of one unit test, there was no way for the build_dir to ever
be None, and we can simplify the code by fixing the unit test and
enforcing that via updated type annotations.
E.g. this lets us drop `get_file_path()` since it's now exactly
equivalent to os.path.join().
Note: there are some `if build_dir` checks, which also fail if build_dir
is explicitly set to '', that just guard against passing "O=" to make.
But running `make O=` works just fine, so drop these checks.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
We have formally required python3.7+ since commit df4b0807ca
("kunit: tool: Assert the version requirement"), so we can just use
@dataclasses.dataclass instead.
In kunit_config.py, we used namedtuple to create a hashable type that
had `name` and `value` fields and had to subclass it to define a custom
`__str__()`.
@dataclass lets us just define one type instead.
In qemu_config.py, we use namedtuple to allow modules to define various
parameters. Using @dataclass, we can add type-annotations for all these
fields, making our code more typesafe and making it easier for users to
figure out how to define new configs.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit be886ba90c ("kunit: run kunit_tool from any directory")
introduced this variable, but it was unused even in that commit.
Since it's still unused now and callers can instead use
get_kernel_root_path(), delete this var.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Currently kunit_json.get_json_result() will write the JSON-ified test
output to json_path, but only if it's not "stdout".
Instead, move that responsibility entirely over to the one caller.
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
FIXTURE_VARIANT data is passed to FIXTURE_SETUP and TEST_F as "variant".
In some cases, the variant will change the setup, such that expectations
also change on teardown. Also pass variant to FIXTURE_TEARDOWN.
The new FIXTURE_TEARDOWN logic is identical to that in FIXTURE_SETUP,
right above.
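As an illustration, a sketch of a fixture that uses the variant during
teardown; the fixture name, fields, variant values and include path are
invented for this example:

  #include <fcntl.h>
  #include <stdbool.h>
  #include <unistd.h>
  #include "../kselftest_harness.h"

  FIXTURE(fd_fixture) {
      int fd;
  };

  FIXTURE_VARIANT(fd_fixture) {
      bool open_file;
  };

  FIXTURE_VARIANT_ADD(fd_fixture, with_file) { .open_file = true, };
  FIXTURE_VARIANT_ADD(fd_fixture, without_file) { .open_file = false, };

  FIXTURE_SETUP(fd_fixture)
  {
      self->fd = variant->open_file ? open("/dev/null", O_RDONLY) : -1;
  }

  FIXTURE_TEARDOWN(fd_fixture)
  {
      /* variant is now available here, mirroring FIXTURE_SETUP */
      if (variant->open_file)
          close(self->fd);
  }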
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201210231010.420298-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The kselftest test harness has traditionally not run the registered
TEARDOWN handler when a test encountered an ASSERT. This creates
unexpected situations and tests need to be very careful about using
ASSERT, which seems a needless hurdle for test writers.
Because of the harness's design for optional failure handlers, the
original implementation of ASSERT used an abort() to immediately
stop execution, but that meant the context for running teardown was
lost. Instead, use setjmp/longjmp so that teardown can be done.
Failed SETUP routines continue to not be followed by TEARDOWN, though.
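The underlying pattern, stripped of harness specifics, looks roughly like
this generic sketch (not the harness's actual macros or structure):

  #include <setjmp.h>
  #include <stdio.h>

  static jmp_buf failure_jmp;

  #define MY_ASSERT(cond) do { if (!(cond)) longjmp(failure_jmp, 1); } while (0)

  static void run_one_test(void (*body)(void))
  {
      /* setup would run here */
      if (!setjmp(failure_jmp))
          body();              /* a failing MY_ASSERT() longjmps back here */
      else
          puts("test failed");
      /* teardown now runs whether or not an assertion fired */
  }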
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
I fixed a few warnings like this in commit e2aa5e650b
("selftests: fixup build warnings in pidfd / clone3 tests"), but I
missed this one by mistake. Since this variable is unused, remove it.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The way the test target was defined before, when building with clang we
get a command line like this:
clang -Wall -Werror -g -I../../../../usr/include/ \
regression_enomem.c ../pidfd/pidfd.h -o regression_enomem
This yields an error, because clang thinks we want to produce both a *.o
file and a precompiled header:
clang: error: cannot specify -o when generating multiple output files
gcc, for whatever reason, doesn't exhibit the same behavior which I
suspect is why the problem wasn't noticed before.
This can be fixed simply by using the LOCAL_HDRS infrastructure the
selftests lib.mk provides. It does the right thing and marks the target
as depending on the header (so if the header changes, we rebuild), but
it filters the header out of the compiler command line, so we don't get
the error described above.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
In order to successfully build all these 32-bit tests, the 32-bit gcc
and glibc packages, named gcc-32bit and glibc-devel-static-32bit on SUSE,
need to be installed.
This patch adds this information to warn_32bit_failure.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fix the following coccicheck warning:
tools/testing/selftests/proc/proc-pid-vm.c:371:26-27:
WARNING: Use ARRAY_SIZE
tools/testing/selftests/proc/proc-pid-vm.c:420:26-27:
WARNING: Use ARRAY_SIZE
It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.
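The pattern being suggested is the usual one; a minimal standalone example
(not the selftest's code):

  #include <stdio.h>

  #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

  static const char * const names[] = { "one", "two", "three" };

  int main(void)
  {
      for (unsigned int i = 0; i < ARRAY_SIZE(names); i++)  /* no hard-coded count */
          puts(names[i]);
      return 0;
  }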
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fix the following coccicheck warning:
tools/testing/selftests/vDSO/vdso_test_correctness.c:309:46-47:
WARNING: Use ARRAY_SIZE
tools/testing/selftests/vDSO/vdso_test_correctness.c:373:46-47:
WARNING: Use ARRAY_SIZE
It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Those were added as part of the SMAP enablement but SMAP is currently
an integral part of kernel proper and there's no need to disable it
anymore.
Rip out that functionality. Leave --uaccess on by default for objtool, as
this is what objtool should do by default anyway.
If disabling SMAP is still needed, clearcpuid=smap can be used.
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220127115626.14179-4-bp@alien8.de
Since CO-RE relocations are an optional part of the .BTF.ext ELF section,
we should skip parsing them instead of returning -EINVAL if the header size
is less than offsetofend(struct btf_ext_header, core_relo_len).
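In other words, the check relaxes to something like the following sketch
(offsetofend() here is the usual kernel/libbpf-internal helper macro;
the surrounding names are assumptions):

  if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
      return 0;  /* old producer without a CO-RE relo sub-section: nothing to parse */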
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com