Commit Graph

34208 Commits

Author SHA1 Message Date
Michal Luczaj 6c77ae716d KVM: selftests: Clean up misnomers in xen_shinfo_test
As discussed[*], relabel the poorly named structs to align with the
current KVM nomenclature.

Old names are a leftover from before commit 52491a38b2 ("KVM:
Initialize gfn_to_pfn_cache locks in dedicated helper"), which i.a.
introduced kvm_gpc_init() and renamed kvm_gfn_to_pfn_cache_init()/
_destroy() to kvm_gpc_activate()/_deactivate(). Partly in an effort
to avoid implying that the cache really is destroyed/freed.

While at it, get rid of #define GPA_INVALID, which being used as a GFN,
is not only misnamed, but also unnecessarily reinvents a UAPI constant.

No functional change intended.

[*] https://lore.kernel.org/r/Y5yZ6CFkEMBqyJ6v@google.com

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230206202430.1898057-1-mhal@rbox.co
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-08 06:38:48 -08:00
Shaoqin Huang 7ae69d7087 selftests: KVM: Replace optarg with arg in guest_modes_cmdline
The parameter arg in guest_modes_cmdline not being used now, and the
optarg should be replaced with arg in guest_modes_cmdline.

And this is the chance to change strtoul() to atoi_non_negative(), since
guest mode ID will never be negative.

Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: e42ac777d6 ("KVM: selftests: Factor out guest mode code")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Reviewed-by: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20230202025716.216323-1-shahuang@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-08 06:38:47 -08:00
Thomas Richter 6a5558f116 perf tools: Fix perf tool build error in util/pfm.c
I have downloaded linux-next and build the perf tool using

  # make LIBPFM4=1

to have libpfm4 support built into perf. The build fails:

 # make LIBPFM4=1
....
INSTALL libbpf_headers
  CC      util/pfm.o
util/pfm.c: In function ‘print_libpfm_event’:
util/pfm.c:189:9: error: too many arguments to function ‘print_cb->print_event’
  189 |         print_cb->print_event(print_state,
      |         ^~~~~~~~
util/pfm.c:220:25: error: too many arguments to function ‘print_cb->print_event’
  220 |                         print_cb->print_event(print_state,

The build error is caused by commit d9dc8874d6 ("perf pmu-events:
Remove now unused event and metric variables") which changes the
function prototype of

  struct print_callbacks {
      ...
      void (*print_event)(...);  --> last two parameters removed.
  };

but does not adjust the usage of this function prototype in util/pfm.c.
In file util/pfm.c function print_event() is still invoked with 13
parameters instead of 11. The compile fails.

When I adjust the file util/pfm.c as in this patch, the build works file.
Please check this patch for correctness, I have just fixed the compile
issue.

Fixes: d9dc8874d6 ("perf pmu-events: Remove now unused event and metric variables")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: egorenar@linux.ibm.com
Cc: linux-kernel-next@vger.kernel.org
Link: https://lore.kernel.org/r/20230207140447.1827741-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 11:07:47 -03:00
Yicong Yang ffd1240e8f perf tools: Fix auto-complete on aarch64
On aarch64 CPU related events are not under event_source/devices/cpu/events,
they're under event_source/devices/armv8_pmuv3_0/events on my machine.
Using current auto-complete script will generate below error:

  [root@localhost bin]# perf stat -e
  ls: cannot access '/sys/bus/event_source/devices/cpu/events': No such file or directory

Fix this by not testing /sys/bus/event_source/devices/cpu/events on
aarch64 machine.

Fixes: 74cd5815d9 ("perf tool: Improve bash command line auto-complete for multiple events with comma")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Cc: prime.zeng@hisilicon.com
Link: https://lore.kernel.org/r/20230207035057.43394-1-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 10:38:10 -03:00
Namhyung Kim 1bece1351c perf lock contention: Support old rw_semaphore type
The old kernel has a different type of the owner field in rwsem.  We can
check it using bpf_core_type_matches() builtin in clang but it also
needs its own version check since it's available on recent versions.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230207002403.63590-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 10:35:17 -03:00
Namhyung Kim 3477f079fe perf lock contention: Add -o/--lock-owner option
When there're many lock contentions in the system, people sometimes want
to know who caused the contention, IOW who's the owner of the locks.

The -o/--lock-owner option tries to follow the lock owners for the
contended mutexes and rwsems from BPF, and then attributes the
contention time to the owner instead of the waiter.  It's a best effort
approach to get the owner info at the time of the contention and doesn't
guarantee to have the precise tracking of owners if it's changing over
time.

Currently it only handles mutex and rwsem that have owner field in their
struct and it basically points to a task_struct that owns the lock at
the moment.

Technically its type is atomic_long_t and it comes with some LSB bits
used for other meanings.  So it needs to clear them when casting it to a
pointer to task_struct.

Also the atomic_long_t is a typedef of the atomic 32 or 64 bit types
depending on arch which is a wrapper struct for the counter value.  I'm
not aware of proper ways to access those kernel atomic types from BPF so
I just read the internal counter value directly.  Please let me know if
there's a better way.

When -o/--lock-owner option is used, it goes to the task aggregation
mode like -t/--threads option does.  However it cannot get the owner for
other lock types like spinlock and sometimes even for mutex.

  $ sudo ./perf lock con -abo -- ./perf bench sched pipe
  # Running 'sched/pipe' benchmark:
  # Executed 1000000 pipe operations between two processes

       Total time: 4.766 [sec]

         4.766540 usecs/op
           209795 ops/sec
   contended   total wait     max wait     avg wait          pid   owner

         403    565.32 us     26.81 us      1.40 us           -1   Unknown
           4     27.99 us      8.57 us      7.00 us      1583145   sched-pipe
           1      8.25 us      8.25 us      8.25 us      1583144   sched-pipe
           1      2.03 us      2.03 us      2.03 us         5068   chrome

As you can see, the owner is unknown for the most cases.  But if we
filter only for the mutex locks, it'd more likely get the onwers.

  $ sudo ./perf lock con -abo -Y mutex -- ./perf bench sched pipe
  # Running 'sched/pipe' benchmark:
  # Executed 1000000 pipe operations between two processes

       Total time: 4.910 [sec]

         4.910435 usecs/op
           203647 ops/sec
   contended   total wait     max wait     avg wait          pid   owner

           2     15.50 us      8.29 us      7.75 us      1582852   sched-pipe
           7      7.20 us      2.47 us      1.03 us           -1   Unknown
           1      6.74 us      6.74 us      6.74 us      1582851   sched-pipe

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230207002403.63590-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 10:33:32 -03:00
Namhyung Kim 55e391852e perf lock contention: Fix to save callstack for the default modified
The previous change missed to set the con->save_callstack for the
LOCK_AGGR_CALLER mode resulting in no caller information.

Fixes: ebab291641 ("perf lock contention: Support filters for different aggregation")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230207002403.63590-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-08 10:33:02 -03:00
Matthieu Baerts 070d6dafac selftests: mptcp: stop tests earlier
These 'endpoint' tests from 'mptcp_join.sh' selftest start a transfer in
the background and check the status during this transfer.

Once the expected events have been recorded, there is no reason to wait
for the data transfer to finish. It can be stopped earlier to reduce the
execution time by more than half.

For these tests, the exchanged data were not verified. Errors, if any,
were ignored but that's fine, plenty of other tests are looking at that.
It is then OK to mute stderr now that we are sure errors will be printed
(and still ignored) because the transfer is stopped before the end.

Fixes: e274f71540 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:39:34 +00:00
Paolo Abeni a635a8c3df selftests: mptcp: allow more slack for slow test-case
A test-case is frequently failing on some extremely slow VMs.
The mptcp transfer completes before the script is able to do
all the required PM manipulation.

Address the issue in the simplest possible way, making the
transfer even more slow.

Additionally dump more info in case of failures, to help debugging
similar problems in the future and init dump_stats var.

Fixes: e274f71540 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/323
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:39:34 +00:00
Gavin Shan 45f679550d KVM: selftests: Assign guest page size in sync area early in memslot_perf_test
The guest page size in the synchronization area is needed by all test
cases. So it's reasonable to set it in the unified preparation function
(prepare_vm()).

Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/20230118092133.320003-3-gshan@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-07 15:44:19 -08:00
Gavin Shan e5b426879f KVM: selftests: Remove duplicate VM creation in memslot_perf_test
Remove a spurious call to __vm_create_with_one_vcpu() that was introduced
by a merge gone sideways.

Fixes: eb5618911a ("Merge tag 'kvmarm-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Link: https://lore.kernel.org/r/20230118092133.320003-2-gshan@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-07 15:37:25 -08:00
Lorenzo Bianconi 02fc0e73e8 libbpf: Always use libbpf_err to return an error in bpf_xdp_query()
In order to properly set errno, rely on libbpf_err utility routine in
bpf_xdp_query() to return an error to the caller.

Fixes: 04d58f1b26 ("libbpf: add API to get XDP/XSK supported features")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/827d40181f9f90fb37702f44328e1614df7c0503.1675768112.git.lorenzo@kernel.org
2023-02-07 22:56:11 +01:00
Ian Rogers e0975ab92f tools/resolve_btfids: Tidy HOST_OVERRIDES
Don't set EXTRA_CFLAGS to HOSTCFLAGS, ensure CROSS_COMPILE isn't
passed through.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230202224253.40283-1-irogers@google.com
2023-02-07 22:48:56 +01:00
Jiri Olsa 56a2df7615 tools/resolve_btfids: Compile resolve_btfids as host program
Making resolve_btfids to be compiled as host program so
we can avoid cross compile issues as reported by Nathan.

Also we no longer need HOST_OVERRIDES for BINARY target,
just for 'prepare' targets.

Fixes: 13e07691a1 ("tools/resolve_btfids: Alter how HOSTCC is forced")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/bpf/20230202112839.1131892-1-jolsa@kernel.org
2023-02-07 22:47:16 +01:00
Dan Williams dbe9f7d1e1 Merge branch 'for-6.3/cxl-events' into cxl/next
Add the CXL event and interrupt support for the v6.3 update.
2023-02-07 11:14:06 -08:00
Dan Williams 5485eb9559 Merge branch 'for-6.3/cxl' into cxl/next
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3.
Resolve a conflict with the fix and move of cxl_report_and_clear() from
pci.c to core/pci.c.
2023-02-07 11:12:24 -08:00
Mark Brown 2c4192c0a7 kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
During early development a dependedncy was added on having FA64
available so we could use the full FPSIMD register set in the signal
handler which got copied over into the SSVE+ZA registers test case.
Subsequently the ABI was finialised so the handler is run with streaming
mode disabled meaning this is redundant but the dependency was never
removed, do so now.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230202-arm64-kselftest-sve-za-fa64-v1-1-5c5f3dabe441@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-07 18:34:13 +00:00
Mark Brown 6012b82020 kselftest/arm64: Copy whole EXTRA context
When copying the EXTRA context our calculation of the amount of data we
need to copy is incorrect, we only calculate the amount of data needed
within uc_mcontext.__reserved, not taking account of the fixed portion
of the context. Add in the offset of the reserved data so that we copy
everything we should.

This will only cause test failures in cases where the last context in the
EXTRA context is smaller than the missing data since we don't currently
validate any of the register data and all the buffers we copy into are
statically allocated so default to zero meaning that if we walk beyond the
end of what we copied we'll encounter what looks like a context with magic
and length both 0 which is a valid terminator record.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230201-arm64-kselftest-full-extra-v1-1-93741f32dd29@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-07 18:33:46 +00:00
Janis Schoetterl-Glausch 0dd714bfd2 KVM: s390: selftest: memop: Add cmpxchg tests
Test successful exchange, unsuccessful exchange, storage key protection
and invalid arguments.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230207164225.2114706-1-scgl@linux.ibm.com
Message-Id: <20230207164225.2114706-1-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:06:00 +01:00
Janis Schoetterl-Glausch dc55ceaef6 KVM: s390: selftest: memop: Fix integer literal
The address is a 64 bit value, specifying a 32 bit value can crash the
guest. In this case things worked out with -O2 but not -O0.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Fixes: 1bb873495a ("KVM: s390: selftests: Add more copy memop tests")
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20230206164602.138068-8-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-8-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch 12c12a9924 KVM: s390: selftest: memop: Fix wrong address being used in test
The guest code sets the key for mem1 only. In order to provoke a
protection exception the test codes needs to address mem1.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-7-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-7-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch 76a2ee43ed KVM: s390: selftest: memop: Fix typo
"acceeded" isn't a word, should be "exceeded".

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-6-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-6-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch 06e5da81c6 KVM: s390: selftest: memop: Add bad address test
Add a test that tries a real write to a bad address.
The existing CHECK_ONLY test doesn't cover all paths.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-5-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-5-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch de14e014a7 KVM: s390: selftest: memop: Move testlist into main
This allows checking if the necessary requirements for a test case are
met via an arbitrary expression. In particular, it is easy to check if
certain bits are set in the memop extension capability.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-4-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-4-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch 12d2707419 KVM: s390: selftest: memop: Replace macros by functions
Replace the DEFAULT_* test helpers by functions, as they don't
need the extra flexibility.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-3-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-3-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Janis Schoetterl-Glausch 7d42b38de9 KVM: s390: selftest: memop: Pass mop_desc via pointer
The struct is quite large, so this seems nicer.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20230206164602.138068-2-scgl@linux.ibm.com
Message-Id: <20230206164602.138068-2-scgl@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Nina Schoetterl-Glausch d7b9dc1403 KVM: selftests: Compile s390 tests with -march=z10
The guest used in s390 kvm selftests is not be set up to handle all
instructions the compiler might emit, i.e. vector instructions, leading
to crashes.
Limit what the compiler emits to the oldest machine model currently
supported by Linux.

Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20230127174552.3370169-1-nsg@linux.ibm.com
Message-Id: <20230127174552.3370169-1-nsg@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2023-02-07 18:05:59 +01:00
Vladimir Oltean bbb253b206 selftests: ocelot: tc_flower_chains: make test_vlan_ingress_modify() more comprehensive
We have two IS1 filters of the OCELOT_VCAP_KEY_ANY key type (the one with
"action vlan pop" and the one with "action vlan modify") and one of the
OCELOT_VCAP_KEY_IPV4 key type (the one with "action skbedit priority").
But we have no IS1 filter with the OCELOT_VCAP_KEY_ETYPE key type, and
there was an uncaught breakage there.

To increase test coverage, convert one of the OCELOT_VCAP_KEY_ANY
filters to OCELOT_VCAP_KEY_ETYPE, by making the filter also match on the
MAC SA of the traffic sent by mausezahn, $h1_mac.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230205192409.1796428-2-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-07 12:20:21 +01:00
Aaron Thompson 647037adca Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
This reverts commit 115d9d77bb.

The pages being freed by memblock_free_late() have already been
initialized, but if they are in the deferred init range,
__free_one_page() might access nearby uninitialized pages when trying to
coalesce buddies. This can, for example, trigger this BUG:

  BUG: unable to handle page fault for address: ffffe964c02580c8
  RIP: 0010:__list_del_entry_valid+0x3f/0x70
   <TASK>
   __free_one_page+0x139/0x410
   __free_pages_ok+0x21d/0x450
   memblock_free_late+0x8c/0xb9
   efi_free_boot_services+0x16b/0x25c
   efi_enter_virtual_mode+0x403/0x446
   start_kernel+0x678/0x714
   secondary_startup_64_no_verify+0xd2/0xdb
   </TASK>

A proper fix will be more involved so revert this change for the time
being.

Fixes: 115d9d77bb ("mm: Always release pages to the buddy allocator in memblock_free_late().")
Signed-off-by: Aaron Thompson <dev@aaront.org>
Link: https://lore.kernel.org/r/20230207082151.1303-1-dev@aaront.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2023-02-07 13:07:37 +02:00
Colin Ian King 8306829bf8 selftests/bpf: Fix spelling mistake "detecion" -> "detection"
There is a spelling mistake in a literal string. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230206092229.46416-1-colin.i.king@gmail.com
2023-02-06 14:51:30 -08:00
Hao Xiang d1d7730ff8 libbpf: Correctly set the kernel code version in Debian kernel.
In a previous commit, Ubuntu kernel code version is correctly set
by retrieving the information from /proc/version_signature.

commit<5b3d72987701d51bf31823b39db49d10970f5c2d>
(libbpf: Improve LINUX_VERSION_CODE detection)

The /proc/version_signature file doesn't present in at least the
older versions of Debian distributions (eg, Debian 9, 10). The Debian
kernel has a similar issue where the release information from uname()
syscall doesn't give the kernel code version that matches what the
kernel actually expects. Below is an example content from Debian 10.

release: 4.19.0-23-amd64
version: #1 SMP Debian 4.19.269-1 (2022-12-20) x86_64

Debian reports incorrect kernel version in utsname::release returned
by uname() syscall, which in older kernels (Debian 9, 10) leads to
kprobe BPF programs failing to load due to the version check mismatch.

Fortunately, the correct kernel code version presents in the
utsname::version returned by uname() syscall in Debian kernels. This
change adds another get kernel version function to handle Debian in
addition to the previously added get kernel version function to handle
Ubuntu. Some minor refactoring work is also done to make the code more
readable.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Signed-off-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230203234842.2933903-1-hao.xiang@bytedance.com
2023-02-06 14:30:50 -08:00
Mark Brown d236505478 KVM: selftests: Enable USERFAULTFD
The page_fault_test KVM selftest requires userfaultfd but the config
fragment for the KVM selftests does not enable it, meaning that those tests
are skipped in CI systems that rely on appropriate settings in the config
fragments except on S/390 which happens to have it in defconfig. Enable
the option in the config fragment so that the tests get run.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230202-kvm-selftest-userfaultfd-v1-1-8186ac5a33a5@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-06 19:35:09 +00:00
Athira Rajeev 34266f904a perf test bpf: Skip test if kernel-debuginfo is not present
Perf BPF filter test fails in environment where "kernel-debuginfo"
is not installed.

Test failure logs:

  <<>>
  42: BPF filter                            :
  42.1: Basic BPF filtering                 : Ok
  42.2: BPF pinning                         : Ok
  42.3: BPF prologue generation             : FAILED!
  <<>>

Enabling verbose option provided debug logs, which says debuginfo
needs to be installed. Snippet of verbose logs:

  <<>>
  42.3: BPF prologue generation                                       :
  --- start ---
  test child forked, pid 28218
  <<>>
  Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo
  package.
  bpf_probe: failed to convert perf probe events
  Failed to add events selected by BPF
  test child finished with -1
  ---- end ----
  BPF filter subtest 3: FAILED!
  <<>>

Here the subtest "BPF prologue generation" failed and logs shows
debuginfo is needed. After installing kernel-debuginfo package, testcase
passes.

The "BPF prologue generation" subtest failed because, the do_test()
returns TEST_FAIL without checking the error type returned by
parse_events_load_bpf_obj().

parse_events_load_bpf_obj() can also return error of type -ENODATA
incase kernel-debuginfo package is not installed. Fix this by adding
check for -ENODATA error.

Test result after the patch changes:

Test failure logs:

  <<>>
  42: BPF filter                 :
  42.1: Basic BPF filtering      : Ok
  42.2: BPF pinning              : Ok
  42.3: BPF prologue generation  : Skip (clang/debuginfo isn't installed or environment missing BPF support)
  <<>>

Fixes: ba1fae431e ("perf test: Add 'perf test BPF'")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/linux-perf-users/Y7bIk77mdE4j8Jyi@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 15:01:23 -03:00
Athira Rajeev 67ef66bad4 perf probe: Update the exit error codes in function try_to_find_probe_trace_event
try_to_find_probe_trace_events() uses return error code as ENOENT in two
places.

First place is after open_debuginfo() when opening debuginfo fails and
secondly, after when not finding the probe point.

This function is invoked during BPF load and there are other exit points
in this code path which returns ENOENT. This makes it difficult to
understand the exact reason for exit.

Patches changes the exit code from ENOENT to:

- ENODATA when it fails to find debuginfo

- ENODEV when it fails to find probe point

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230105121742.92249-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 15:00:05 -03:00
Kan Liang 4e846311a9 perf script: Fix missing Retire Latency fields option documentation
The 'perf script' documentation is missing the fields option for Retire
Latency. Add it.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230206162100.3329395-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 14:57:50 -03:00
Kan Liang 957ed139d7 perf event x86: Add retire_lat when synthesizing PERF_SAMPLE_WEIGHT_STRUCT
In arch_perf_synthesize_sample_weight(), the retire_lat was mistakenly
missed, add it.

  perf test -v "x86 sample parsing"
   74: x86 Sample parsing                                              :
  --- start ---
  test child forked, pid 72526
  Samples differ at 'retire_lat'
  parsing failed for sample_type 0x1000000
  test child finished with -1
  ---- end ----
  x86 Sample parsing: FAILED!

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230206162100.3329395-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 14:56:22 -03:00
Kan Liang e65f91b20c perf test x86: Support the retire_lat (Retire Latency) sample_type check
Add test for the new field for Retire Latency in the X86 specific test.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230202192209.1795329-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 11:53:07 -03:00
Sebastian Andrzej Siewior 5646bbd668 selftests: Emit a warning if getcpu() is missing on 32bit
The VDSO implementation for getcpu() has been wired up on 32bit so warn if
missing.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20221125094216.3663444-4-bigeasy@linutronix.de
2023-02-06 15:48:54 +01:00
Athira Rajeev ee739f132f perf test bpf: Check for libtraceevent support
The "bpf" tests fails in environment with missing libtraceevent support
as below:

  # ./perf test 36
  36: BPF filter                          :
  36.1: Basic BPF filtering               : FAILED!
  36.2: BPF pinning                       : FAILED!
  36.3: BPF prologue generation           : FAILED!

The environment has clang but missing the libtraceevent devel. Hence
perf is compiled without libtraceevent support.

Detailed logs:
	./perf test -v "Basic BPF filtering"

	Failed to add BPF event syscalls:sys_enter_epoll_pwait
	bpf: tracepoint call back failed, stop iterate
	Failed to add events selected by BPF

The bpf tests tris to add probe event which fails at
"parse_events_add_tracepoint" function due to missing libtraceevent. Add
check for "HAVE_LIBTRACEEVENT" in the "tests/bpf.c" before proceeding
with the test.

With the change,

	# ./perf test 36
 	36: BPF filter                    :
 	36.1: Basic BPF filtering         : Skip (not compiled in or missing libtraceevent support)
 	36.2: BPF pinning                 : Skip (not compiled in or missing libtraceevent support)
 	36.3: BPF prologue generation     : Skip (not compiled in or missing libtraceevent support)

Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230131135001.54578-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 11:40:59 -03:00
Arnaldo Carvalho de Melo ab809efaeb Merge remote-tracking branch 'torvalds/master' into perf/core
To sync with libbpf, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-06 09:25:56 -03:00
Petr Machata 3446dcd7df selftests: forwarding: bridge_mdb_max: Add a new selftest
Add a suite covering mcast_n_groups and mcast_max_groups bridge features.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:27 +00:00
Petr Machata 9ae8546973 selftests: forwarding: lib: Add helpers to build IGMP/MLD leave packets
The testsuite that checks for mcast_max_groups functionality will need to
wipe the added groups as well. Add helpers to build an IGMP or MLD packets
announcing that host is leaving a given group.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:27 +00:00
Petr Machata 705d4bc7b6 selftests: forwarding: lib: Allow list of IPs for IGMPv3/MLDv2
The testsuite that checks for mcast_max_groups functionality will need
to generate IGMP and MLD packets with configurable number of (S,G)
addresses. To that end, further extend igmpv3_is_in_get() and
mldv2_is_in_get() to allow a list of IP addresses instead of one
address.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:27 +00:00
Petr Machata 506a1ac9d3 selftests: forwarding: lib: Parameterize IGMPv3/MLDv2 generation
In order to generate IGMPv3 and MLDv2 packets on the fly, the
functions that generate these packets need to be able to generate
packets for different groups and different sources. Generating MLDv2
packets further needs the source address of the packet for purposes of
checksum calculation. Add the necessary parameters, and generate the
payload accordingly by dispatching to helpers added in the previous
patches.

Adjust the sole client, bridge_mdb.sh, as well.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:26 +00:00
Petr Machata 952e0ee38c selftests: forwarding: lib: Add helpers for checksum handling
In order to generate IGMPv3 and MLDv2 packets on the fly, we will need
helpers to calculate the packet checksum.

The approach presented in this patch revolves around payload templates
for mausezahn. These are mausezahn-like payload strings (01:23:45:...)
with possibly one 2-byte sequence replaced with the word PAYLOAD. The
main function is payload_template_calc_checksum(), which calculates
RFC 1071 checksum of the message. There are further helpers to then
convert the checksum to the payload format, and to expand it.

For IPv6, MLDv2 message checksum is computed using a pseudoheader that
differs from the header used in the payload itself. The fact that the
two messages are different means that the checksum needs to be
returned as a separate quantity, instead of being expanded in-place in
the payload itself. Furthermore, the pseudoheader includes a length of
the message. Much like the checksum, this needs to be expanded in
mausezahn format. And likewise for number of addresses for (S,G)
entries. Thus we have several places where a computed quantity needs
to be presented in the payload format. Add a helper u16_to_bytes(),
which will be used in all these cases.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:26 +00:00
Petr Machata fcf4927632 selftests: forwarding: lib: Add helpers for IP address handling
In order to generate IGMPv3 and MLDv2 packets on the fly, we will need
helpers to expand IPv4 and IPv6 addresses given as parameters in
mausezahn payload notation. Add helpers that do it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:26 +00:00
Petr Machata f7ccf60c4a selftests: forwarding: bridge_mdb: Fix a typo
Add the letter missing from the word "INCLUDE".

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:26 +00:00
Petr Machata 344dd2c9e7 selftests: forwarding: Move IGMP- and MLD-related functions to lib
These functions will be helpful for other testsuites as well. Extract them
to a common place.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:26 +00:00
Greg Kroah-Hartman d38e781ea0 Linux 6.2-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPgG/geHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGidAH/j0pUbmniNM3aft8
 5O1XxltlOZgqxju8OQhkiuX9VnQuuSeTNhAuj6jfcQOLpXDvlKsVfrpBxEfXalML
 yead9QAy/OT3hNGQxifMQzGbsmaWH1kgxxnv2lKo2eNP5KYZ+rec+IgtQhgX9quN
 q/N/ymd2/ju9AEOVRYwG1PD1EiIwTd3xPfBZl4Z35J+Ym+6NRrlXmN6J9LwslXmJ
 /P4cwgdf9/TuYAB96EshRHsZ4Tk/Gxkn15uyhLaWBmrZ/LAF5JGbClEfFlCWgjwv
 F5SGv3F+O6HSv49W9a24XtenE+3tw78AbCdqKZyzbb2whhv1eY99vKSbGnCahc/y
 O0VBP/g=
 =YoPk
 -----END PGP SIGNATURE-----

Merge 6.2-rc7 into char-misc-next

We need the char-misc driver fixes in here as other patches depend on
them.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-06 08:35:30 +01:00
Linus Torvalds c00f4ddde0 ARM64:
- Yet another fix for non-CPU accesses to the memory backing
   the VGICv3 subsystem
 
 - A set of fixes for the setlftest checking for the S1PTW
   behaviour after the fix that went in ealier in the cycle
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmPeZxwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMh7Af/W3CX09KfdKYidbnF4Vx5rkaBaseP
 ox77Qco8j6DcC3fazs5TZtiN7pOSf6488VmbS43HkfZUoNAWmXn8tvwvX+a4pNE9
 yYjXQYRzOjyShPJIsBehlocn3F5G1KvZ8rVl5deVjBRtY5Vd1NzliFPL/EVZiVBk
 l6IHi0rawUTaSGJ9ZLBglQ9wEuAGv1R0SkjhwlXQ6AbupKB1n3tvgOSlKWbwP9fZ
 qJN0mOy38lp2YNRtsoZPiyf9AxGXd9XX6twQ4bhDr9h540HrbzzFh8mWu536X7mg
 dhWwggHidzN8Jzih2TnPByqorGIiJ4ab1iU6udp2X0oiaacfoIFzh8vrlA==
 =7/HY
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Yet another fix for non-CPU accesses to the memory backing the
     VGICv3 subsystem

   - A set of fixes for the setlftest checking for the S1PTW behaviour
     after the fix that went in ealier in the cycle"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: aarch64: Test read-only PT memory regions
  KVM: selftests: aarch64: Fix check of dirty log PT write
  KVM: selftests: aarch64: Do not default to dirty PTE pages on all S1PTWs
  KVM: selftests: aarch64: Relax userfaultfd read vs. write checks
  KVM: arm64: Allow no running vcpu on saving vgic3 pending table
  KVM: arm64: Allow no running vcpu on restoring vgic3 LPI pending status
  KVM: arm64: Add helper vgic_write_guest_lock()
2023-02-04 11:21:27 -08:00
Paolo Bonzini 25b72cf7da KVM/arm64 fixes for 6.2, take #3
- Yet another fix for non-CPU accesses to the memory backing
   the VGICv3 subsystem
 
 - A set of fixes for the setlftest checking for the S1PTW
   behaviour after the fix that went in ealier in the cycle
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmPWwGEPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDi/wP/3kbZrZ+y/YNcYioQwqibRS5DKuACKXM1dbh
 sMX0e8t3frmkrfkHZ1FsBNjSWtDLmRbjANNDWi8ypAXaPVm7/0whFqkJgyPWDO+v
 /1VXYMwMjy2zpWfPGPu+/fQL0Ninp+EfLP3Y2/Lr8VW5rH21bfuQ1rm41ucK/jB5
 IsMiQ+YObZUTrSq22fHfNJKc8fysSqeMHW96bl0QnJxf6aDDieZFGF9rlRQf/faq
 lPux0faasgQC0VgXlokWGdU1x5kXIf3Ta4VtiKARKNwxziuG8B484+5hHXvoBR1h
 bXFJJUQjQs2qBuH75BJftini9fvWvQPgbk4NvkD1tlyMhlZ5w2MTTKB4QmuW/WDT
 OGuGXAcuP2stm0dUaSn1aCwzfYgtihssp+RCAB5DOoL64i/CtHl+FJgz8wZfDPRk
 UNXdK2JccDfD6bGv/kQqPJoozjI5e8Ha2ks1O4IPHIDpIsVMIWRRGULgIRvLaHaS
 iaR7Vx+XgzW50Knj++S85eak/aTSkVaykYZIiiB4DTai1/XuAZfMA79X6IvQLxHq
 419FHmXwhJmYdWZ/JFBXWnbR6wRJiv4TR23A5u8X6o/YgBn6fmwAt6o8Avk1quZQ
 mslRPHG45hM/7Z7uSEsIQnbVVnHPhbaKr3GmHlJJ4zXRI8GaSMe23wpnJdUj1q9a
 w1Oe0rpq
 =2l/n
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.2, take #3

- Yet another fix for non-CPU accesses to the memory backing
  the VGICv3 subsystem

- A set of fixes for the setlftest checking for the S1PTW
  behaviour after the fix that went in ealier in the cycle
2023-02-04 08:57:43 -05:00
Mark Brown 69218b59be kselftest/alsa: Run PCM tests for multiple cards in parallel
With each test taking 4 seconds the runtime of pcm-test can add up. Since
generally each card in the system is physically independent and will be
unaffected by what's going on with other cards we can mitigate this by
testing each card in parallel. Make a list of cards as we enumerate the
system and then start a thread for each, then join the threads to ensure
they have all finished. The threads each run the same tests we currently
run for each PCM on the card before exiting.

The list of PCMs is kept global since it helps with global operations
like working out our planned number of tests and identifying missing PCMs
and it seemed neater to check for PCMs on the right card in the card
thread than make every PCM loop iterate over cards as well.

We don't run per-PCM tests in parallel since in embedded systems it can
be the case that resources are shared between the PCMs and operations on
one PCM on a card may constrain what can be done on another PCM on the same
card leading to potentially unstable results.

We use a mutex to ensure that the reporting of results is serialised and we
don't have issues with anything like the current test number, we could do
this in the kselftest framework but it seems like this might cause problems
for other tests that are doing lower level testing and building in
constrained environments such as nolibc so this seems more sensible.

Note that the ordering of the tests can't be guaranteed as things stand,
this does not seem like a major problem since the numbering of tests often
changes as test programs are changed so results parsers are expected to
rely on the test name rather than the test numbers. We also now prefix the
machine generated test name when printing the description of the test since
this is logged before streaming starts.

On my two card desktop system this reduces the overall runtime by a
third.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230203-alsa-pcm-test-card-thread-v1-1-59941640ebba@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-04 09:36:35 +01:00
Florian Lehner 17c9b4e1a7 bpf: fix typo in header for bpf_perf_prog_read_value
Fix a simple typo in the documentation for bpf_perf_prog_read_value.

Signed-off-by: Florian Lehner <dev@der-flo.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230203121439.25884-1-dev@der-flo.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-03 22:11:21 -08:00
Shaoqin Huang 6043829fdb KVM: selftests: Remove redundant setbuf()
Since setbuf(stdout, NULL) has been called in kvm_util.c with
__attribute((constructor)). Selftests no need to setup it in their own
code.

Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20230203061038.277655-1-shahuang@redhat.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-02-03 21:26:48 +00:00
Kan Liang 17f248aa86 perf script: Support Retire Latency
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports the number of
elapsed core clocks between the retirement of the instruction indicated
by the Instruction Pointer field of the PEBS record and the retirement
of the prior instruction. That's quite useful to display the information
with perf script.

Add a new field retire_lat for the Retire Latency information.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-9-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:26:40 -03:00
Kan Liang d7d213e04c perf report: Support Retire Latency
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports pipeline stall of
this instruction compared to the previous instruction in cycles.  That's
quite useful to display the information with perf mem report.

The p_stage_cyc for Power is also from the var3_w. Union the p_stage_cyc
and retire_lat to share the code.

Implement X86 specific codes to display the X86 specific header.

Add a new sort key retire_lat for the Retire Latency.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:24:02 -03:00
Namhyung Kim ebab291641 perf lock contention: Support filters for different aggregation
It'd be useful to filter other than the current aggregation mode.  For
example, users may want to see callstacks for specific locks only.  Or
they may want tasks from a certain callstack.

The tracepoints already collected the information but it needs to check
the condition again when processing the event.  And it needs to change
BPF to allow the key combinations.

The lock contentions on 'rcu_state' spinlock can be monitored:

  $ sudo perf lock con -abv -L rcu_state sleep 1
  ...
   contended   total wait     max wait     avg wait         type   caller

           4    151.39 us     62.57 us     37.85 us     spinlock   rcu_core+0xcb
                          0xffffffff81fd1666  _raw_spin_lock_irqsave+0x46
                          0xffffffff8172d76b  rcu_core+0xcb
                          0xffffffff822000eb  __softirqentry_text_start+0xeb
                          0xffffffff816a0ba9  __irq_exit_rcu+0xc9
                          0xffffffff81fc0112  sysvec_apic_timer_interrupt+0xa2
                          0xffffffff82000e46  asm_sysvec_apic_timer_interrupt+0x16
                          0xffffffff81d49f78  cpuidle_enter_state+0xd8
                          0xffffffff81d4a259  cpuidle_enter+0x29
           1     30.21 us     30.21 us     30.21 us     spinlock   rcu_core+0xcb
                          0xffffffff81fd1666  _raw_spin_lock_irqsave+0x46
                          0xffffffff8172d76b  rcu_core+0xcb
                          0xffffffff822000eb  __softirqentry_text_start+0xeb
                          0xffffffff816a0ba9  __irq_exit_rcu+0xc9
                          0xffffffff81fc00c4  sysvec_apic_timer_interrupt+0x54
                          0xffffffff82000e46  asm_sysvec_apic_timer_interrupt+0x16
           1     28.84 us     28.84 us     28.84 us     spinlock   rcu_accelerate_cbs_unlocked+0x40
                          0xffffffff81fd1c60  _raw_spin_lock+0x30
                          0xffffffff81728cf0  rcu_accelerate_cbs_unlocked+0x40
                          0xffffffff8172da82  rcu_core+0x3e2
                          0xffffffff822000eb  __softirqentry_text_start+0xeb
                          0xffffffff816a0ba9  __irq_exit_rcu+0xc9
                          0xffffffff81fc0112  sysvec_apic_timer_interrupt+0xa2
                          0xffffffff82000e46  asm_sysvec_apic_timer_interrupt+0x16
                          0xffffffff81d49f78  cpuidle_enter_state+0xd8
  ...

To see tasks calling 'rcu_core' function:

  $ sudo perf lock con -abt -S rcu_core sleep 1
   contended   total wait     max wait     avg wait          pid   comm

          19     23.46 us      2.21 us      1.23 us            0   swapper
           2     18.37 us     17.01 us      9.19 us      2061859   ThreadPoolForeg
           3      5.76 us      1.97 us      1.92 us         3909   pipewire-pulse
           1      2.26 us      2.26 us      2.26 us      1809271   MediaSu~isor #2
           1      1.97 us      1.97 us      1.97 us      1514882   Chrome_ChildIOT
           1       987 ns       987 ns       987 ns         3740   pipewire-pulse

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:12:26 -03:00
Namhyung Kim 16cad1d359 perf lock contention: Use lock_stat_find{,new}
This is a preparation work to support complex keys of BPF maps.  Now it
has single value key according to the aggregation mode like stack_id or
pid.  But we want to use a combination of those keys.

Then lock_contention_read() should still aggregate the result based on
the key that was requested by user.  The other key info will be used for
filtering.

So instead of creating a lock_stat entry always, Check if it's already
there using lock_stat_find() first.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:12:26 -03:00
Namhyung Kim 492fef218a perf lock contention: Factor out lock_contention_get_name()
The lock_contention_get_name() returns a name for the lock stat entry
based on the current aggregation mode.  As it's called sequentially in a
single thread, it can return the address of a static buffer for symbol
and offset of the caller.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230203021324.143540-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:12:26 -03:00
Rob Herring 7105311c2d perf arm-spe: Add raw decoding for SPEv1.2 previous branch address
Arm SPEv1.2 adds a new optional address packet type: previous branch
target. The recorded address is the target virtual address of the most
recently taken branch in program order.

Add support for decoding the address packet in raw dumps.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230203162401.132931-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:12:25 -03:00
Ian Rogers b777b3d255 perf jevents: Run metric_test.py at compile-time
Add a target that generates a log file for running metric_test.py and
make this a dependency on generating pmu-events.c. The log output is
displayed if the test fails like (the test was modified to make it
fail):

```
  TEST    /tmp/perf/pmu-events/metric_test.log
F......
======================================================================
FAIL: test_Brackets (__main__.TestMetricExpressions)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tools/perf/pmu-events/metric_test.py", line 33, in test_Brackets
    self.assertEqual((a * b + c).ToPerfJson(), 'a * b + d')
AssertionError: 'a * b + c' != 'a * b + d'
- a * b + c
?         ^
+ a * b + d
?         ^

----------------------------------------------------------------------
Ran 7 tests in 0.004s

FAILED (failures=1)
make[3]: *** [pmu-events/Build:32: /tmp/perf/pmu-events/metric_test.log] Error 1
```

However, normal execution will just show the TEST line.

This is roughly modeled on fortify testing in the kernel lib directory.

Modify metric_test.py so that it is executable. This is necessary when
PYTHON isn't specified in the build, the normal case.

Use variables to make the paths to files clearer and more consistent.

Committer notes:

Add pmu-events/metric_test.log to tools/perf/.gitignore and to the
'clean' target on tools/perf/Makefile.perf.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 17:11:39 -03:00
Linus Torvalds 0c272a1d33 25 hotfixes, mainly for MM. 13 are cc:stable.
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY9x+swAKCRDdBJ7gKXxA
 joPwAP95XqB7gzy2l1Mc++Ta7Ih0fS34Pj1vTAxwsRQnqzr6rwD/QOt3YU9KgXpy
 D7Fp8NnaQZq6m5o8cvV5+fBqA3uarAM=
 =IIB8
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "25 hotfixes, mainly for MM.  13 are cc:stable"

* tag 'mm-hotfixes-stable-2023-02-02-19-24-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
  mm: memcg: fix NULL pointer in mem_cgroup_track_foreign_dirty_slowpath()
  Kconfig.debug: fix the help description in SCHED_DEBUG
  mm/swapfile: add cond_resched() in get_swap_pages()
  mm: use stack_depot_early_init for kmemleak
  Squashfs: fix handling and sanity checking of xattr_ids count
  sh: define RUNTIME_DISCARD_EXIT
  highmem: round down the address passed to kunmap_flush_on_unmap()
  migrate: hugetlb: check for hugetlb shared PMD in node migration
  mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
  mm/MADV_COLLAPSE: catch !none !huge !bad pmd lookups
  Revert "mm: kmemleak: alloc gray object for reserved region with direct map"
  freevxfs: Kconfig: fix spelling
  maple_tree: should get pivots boundary by type
  .mailmap: update e-mail address for Eugen Hristev
  mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
  squashfs: harden sanity check in squashfs_read_xattr_id_table
  ia64: fix build error due to switch case label appearing next to declaration
  mm: multi-gen LRU: fix crash during cgroup migration
  Revert "mm: add nodes= arg to memory.reclaim"
  zsmalloc: fix a race with deferred_handles storing
  ...
2023-02-03 10:01:57 -08:00
Ian Rogers e30f34053e tools build: Add test echo-cmd
Add quiet_cmd_test so that:
$(Q)$(call echo-cmd,test)

will print:
TEST   <path>

This is useful for executing compile-time tests similar to what
happens for fortify tests in the kernel's lib directory.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:22 -03:00
Ian Rogers d2e3dc829e perf jevents: Correct bad character encoding
A character encoding issue added a "3D" character that breaks the
metrics test.

Fixes: 40769665b6 ("perf jevents: Parse metrics during conversion")
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 3340a08354 perf pmu-events: Fix testing with JEVENTS_ARCH=all
The #slots literal will return NAN when not on ARM64 which causes a
perf test failure when not on an ARM64 for a JEVENTS_ARCH=all build:
..
 10.4: Parsing of PMU event table metrics with fake PMUs             : FAILED!
..
Add an is_test boolean so that the failure can be avoided when running
as a test.

Fixes: acef233b7c ("perf pmu: Add #slots literal support for arm64")
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 5a09b1fd1b perf jevents: Add model list option
This allows the set of generated jevents events and metrics be limited
to a subset of the model names. Appropriate if trying to minimize the
binary size where only a set of models are possible.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 62774db2a0 perf jevents: Generate metrics and events as separate tables
Turn a perf json event into an event, metric or both. This reduces the
number of events needed to scan to find an event or metric. As events
no longer need the relatively seldom used metric fields, 4 bytes is
saved per event. This reduces the big C string's size by 335kb (14.8%)
on x86.

Note, for the test PMU architecture pme_test_soc_cpu is renamed
pmu_events__test_soc_cpu for consistency with the event vs metric
naming convention.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers f8ea2c1524 perf pmu-events: Introduce pmu_metrics_table
Add a metrics table that is just a cast from pmu_events_table. This
changes the APIs so that event and metric usage of the underlying
table is different. For the no jevents case the tables are already
separate, later changes will separate the tables for the jevents case.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 9f587cc93f perf jevents: Combine table prefix and suffix writing
Combine into a single function to simplify, in a later change, writing
metrics separately.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 6f8f98ab6c perf stat: Remove evsel metric_name/expr
Metrics are their own unit and these variables held broken metrics
previously and now just hold the value NULL. Remove code that used
these variables.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers d9dc8874d6 perf pmu-events: Remove now unused event and metric variables
Previous changes separated the uses of pmu_event and pmu_metric,
however, both structures contained all the variables of event and
metric. This change removes the event variables from metric and the
metric variables from event.

Note, this change removes the setting of evsel's metric_name/expr as
these fields are no longer part of struct pmu_event. The metric
remains but is no longer implicitly requested when the event is. This
impacts a few Intel uncore events, however, as the ScaleUnit is shared
by the event and the metric this utility is questionable. Also the
MetricNames look broken (contain spaces) in some cases and when trying
to use the functionality with '-e' the metrics fail but regular
metrics with '-M' work. For example, on SkylakeX '-M' works:

```
$ perf stat -M LLC_MISSES.PCIE_WRITE -a sleep 1

 Performance counter stats for 'system wide':

                 0      UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 #  57896.0 Bytes  LLC_MISSES.PCIE_WRITE  (49.84%)
             7,174      UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1                                        (49.85%)
                 0      UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3                                        (50.16%)
                63      UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0                                        (50.15%)

       1.004576381 seconds time elapsed
```

whilst the event '-e' version is broken even with --group/-g (fwiw, we should also remove -g [1]):

```
$ perf stat -g -e LLC_MISSES.PCIE_WRITE -g -a sleep 1
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART2 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART1 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART3 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE
Add UNC_IIO_DATA_REQ_OF_CPU.MEM_WRITE.PART0 event to groups to get metric expression for LLC_MISSES.PCIE_WRITE

 Performance counter stats for 'system wide':

            27,316 Bytes LLC_MISSES.PCIE_WRITE

       1.004505469 seconds time elapsed
```

The code also carries warnings where the user is supposed to select
events for metrics [2] but given the lack of use of such a feature,
let's clean the code and just remove.

[1] https://lore.kernel.org/lkml/20220707195610.303254-1-irogers@google.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/stat-shadow.c?id=01b8957b738f42f96a130079bc951b3cc78c5b8a#n425

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:54:21 -03:00
Ian Rogers 96d2a74618 perf pmu-events: Separate the metrics from events for no jevents
Separate the event and metric table when building without jevents. Add
find_core_metrics_table and perf_pmu__find_metrics_table while
renaming existing utilities to be event specific, so that users can
find the right table for their need.

Committer notes:

Fix the build on aarch64 with:

  tools/perf/arch/arm64/util/pmu.c
  @@ -32,7 +32,7 @@ const struct pmu_events_table *pmu_events_table__find(void)
  -               return perf_pmu__find_table(pmu);
  +               return perf_pmu__find_events_table(pmu);

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-03 13:53:52 -03:00
Srinivas Pandruvada d1fcb7493f tools/power/x86/intel-speed-select: v1.14 release
This release adds following change:
- Minor fixes for coverity static analysis
- Don't read cpufreq on offline CPUs
- SST turbo-freq enable on auto mode when user disables SMT from
kernel command line
- Fix uncore frequency display
- Set uncore frequency max/min limits on perf level change

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 10:01:50 +01:00
Srinivas Pandruvada 2612ae5961 tools/power/x86/intel-speed-select: Adjust uncore max/min frequency
When perf level is changed, uncore limits can change. Set the uncore
limits via Linux uncore sysfs, when user changes perf level with
-o option.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 10:01:50 +01:00
Zhang Rui 61f9fdcdcd tools/power/x86/intel-speed-select: Add Emerald Rapid quirk
Need memory frequency quirk as Sapphire Rapids in Emerald Rapids.
So add Emerald Rapids CPU model check in is_spr_platform().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: Subject, changelog and code edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 10:00:24 +01:00
Srinivas Pandruvada 0d5eea3527 tools/power/x86/intel-speed-select: Fix display of uncore min frequency
Uncore P1 is not uncore minmum frequency. This is uncore base frequency.
Correct display from uncore-frequency-min(MHz)
to uncore-frequency-base(Mhz).

To get uncore min frequency use mailbox command
CONFIG_TDP_GET_RATIO_INFO. Use this mailbox to get uncore frequency
limits when present.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 10:00:24 +01:00
Srinivas Pandruvada 6ed9e36315 tools/power/x86/intel-speed-select: turbo-freq auto mode with SMT off
When SMT is disabled from kernel command line, sibling CPUs still
appears in the sysfs as offline CPUs. This is a problem when turbo-freq
is enabled in auto mode. They are still assigned to CLOS value
of 3 as they are still in the present CPU list. But they are not in the
sibling list of a CPU. When the CPU is a high priority CPU, because of
sibling it will be still set to CLOS to 3 as CLOS is assigned at core
level not at CPU level.

So, avoid setting CLOS 3 to offline CPU.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:57:10 +01:00
Srinivas Pandruvada cf3b8e8f55 tools/power/x86/intel-speed-select: cpufreq reads on offline CPUs
Due to some recent kernel changes, reading cpufreq attributes like
scaling_max_freq on offline CPUs returns error. So avoid reading
cpufreq attributes on offline CPUs.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:57:07 +01:00
Zhang Rui 689dfc9e40 tools/power/x86/intel-speed-select: Use null-terminated string
strlen() and strtok() takes null-termimated strings as input.
Make sure these strings are null-terminated before using them.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:57:05 +01:00
Zhang Rui 8a44d27542 tools/power/x86/intel-speed-select: Remove duplicate dup()
Remove the duplicate dup() invocation.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:57:01 +01:00
Zhang Rui 364ba3b711 tools/power/x86/intel-speed-select: Handle open() failure case
Add handling for open() failure case to make sure a valid file
descriptor is passed to dup().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:56:58 +01:00
Zhang Rui b8bebc8e58 tools/power/x86/intel-speed-select: Remove unused non_block flag
variable 'non_block' is always 0, thus remove the variable and the
handling for "non_block != 0" case.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:56:53 +01:00
Zhang Rui 507fa17a6c tools/power/x86/intel-speed-select: Remove wrong check in set_isst_id()
struct isst_id *id is a pointer, comparing it with less than zero is wrong.

The check is there to make sure the id->pkg and id->die is set to -1, when
it is illegal or unavailable. Here comparing with MAX_PACKAGE_COUNT and
MAX_DIE_PER_PACKAGE is sufficient.

Hence remove the wrong check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[srinivas.pandruvada@linux.intel.com: Subject and changelog edits]
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03 09:55:26 +01:00
Kees Cook 4cf1fe34fd kselftest: vm: add tests for memory-deny-write-execute
Add some tests to cover the new PR_SET_MDWE prctl.

Link: https://lkml.kernel.org/r/20230119160344.54358-3-joey.gouly@arm.com
Co-developed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: nd <nd@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Topi Miettinen <toiwoton@gmail.com>
Cc: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:25 -08:00
Herton R. Krzesinski e6d2c436ff tools/mm: allow users to provide additional cflags/ldflags
Right now there is no way to provide additional cflags/ldflags when
building tools/vm binaries.  And using eg.  make CFLAGS=<options> will
override the CFLAGS being set in the Makefile, making the build fail since
it requires the include of the ../lib dir (for libapi).

This change then allows you to specify:
CFLAGS=<options> LDFLAGS=<options> make V=1 -C tools/vm

And the options will be correctly appended as can be seen from the
make output.

Link: https://lkml.kernel.org/r/20230116224921.4106324-1-herton@redhat.com
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Justin Forbes <jforbes@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Scott Weaver <scweaver@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:24 -08:00
NeilBrown 2973d8229b mm: discard __GFP_ATOMIC
__GFP_ATOMIC serves little purpose.  Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.

It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark.  It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.

__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this.  We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.

__GFP_ATOMIC allows the "watermark_boost" to be side-stepped.  It is
probable that testing ALLOC_HARDER is a better fit here.

__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep.  This should test __GFP_DIRECT_RECLAIM instead.

This patch:
 - removes __GFP_ATOMIC
 - allows __GFP_HIGH allocations to ignore watermark boosting as well
   as GFP_ATOMIC requests.
 - makes other adjustments as suggested by the above.

The net result is not change to GFP_ATOMIC allocations.  Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges.  This affects:
  xen, dm, md, ntfs3
  the vermillion frame buffer
  hibernation
  ksm
  swap
all of which likely produce more benefit than cost if these selected
allocation are more likely to succeed quickly.

[mgorman: Minor adjustments to rework on top of a series]
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name
Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:13 -08:00
Vernon Yang c5d5546ea0 maple_tree: remove the parameter entry of mas_preallocate
The parameter entry of mas_preallocate is not used, so drop it.

Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com
Signed-off-by: Vernon Yang <vernon2gm@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:32:52 -08:00
SeongJae Park 75cb348714 selftests/damon/debugfs_rm_non_contexts: hide expected write error messages
A selftest case for DAMON debugfs interface has a test for expected
failure.  To make the test output clean, hide the expected failure error
message.

Link: https://lkml.kernel.org/r/20230110190400.119388-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:32:52 -08:00
SeongJae Park 16ddcb1549 selftests/damon/sysfs: hide expected write failures
DAMON selftests for sysfs (sysfs.sh) tests if some writes to DAMON sysfs
interface files fails as expected.  It makes the test results noisy with
the failure error message because it tests a number of such failures. 
Redirect the expected failure error messages to /dev/null to make the
results clean.

Link: https://lkml.kernel.org/r/20230110190400.119388-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:32:52 -08:00
Tonghao Zhang 377c16fa3f bpftool: profile online CPUs instead of possible
The number of online cpu may be not equal to possible cpu.
"bpftool prog profile" can not create pmu event on possible
but on online cpu.

$ dmidecode -s system-product-name
PowerEdge R620
$ cat /sys/devices/system/cpu/possible
0-47
$ cat /sys/devices/system/cpu/online
0-31

Disable cpu dynamically:
$ echo 0 > /sys/devices/system/cpu/cpuX/online

If one cpu is offline, perf_event_open will return ENODEV.
To fix this issue:
* check value returned and skip offline cpu.
* close pmu_fd immediately on error path, avoid fd leaking.

Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Signed-off-by: Tonghao Zhang <tong@infragraf.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20230202131701.29519-1-tong@infragraf.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-02 22:20:14 -08:00
Greg Kroah-Hartman 196db6bb44 1st set of IIO new device support, features and cleanups for the 6.3 cycle
The usual mixed bag. So far this has been a quiet cycle for IIO.
 
 New device support
 * adi,ad8686
   - Add support for the AD5337 DAC - ID and 8 bit channel support.
 * maxim,max5522
   - New driver for this 2 channel DAC.
 * nxp,imx93-adc
   - New driver for this SoC ADC which is a fresh IP that will probably
     turn up in additional SoCs going forwards.
 * st,magn
   - Add support for magnetometer part of LSM303C which is very similar
     to standalone LIS3MDL already supported.
 * ti,ads7924
   - New driver for this 4 channel, 12-bit I2C ADC.
 * ti,lmp92064
   - New driver for this 12 bit SPI ADC.
 * ti,tmag5273
   - New driver for this 3D Hall-Effect Sensor.
 
 Features
 * core
   - Add a standard structure for the value pairs in IIO_VAL_INT_PLUS_MICRO
     available attributes and similar.
 * cirrus,ep93xx
   - Add DT binding docs and convert driver to DT based probing.
   - Enable testing building with CONFIG_COMPILE_TEST.
 * st,stm32-dfsdm
   - Enable ID register support for discovery of hardware capabilities on
     some devices.
 
 Cleanups and minor fixes
 * core
   - Drop the custom iio_sysfs_match_string_with_gaps().
     The special ability of this function to skip gaps in an array
     was never used by any upstream driver.
   - Sort headers whilst touching this file.
 * tools
   - Fix memory leak in iio_utils.c
 * various
   - leftover i2c probe_new() conversions.
   - scnprintf() -> sysfs_emit() cleanups.
   - hand rolled devm enables -> devm_regulator[_bulk]_get_enable()
   - typo fixes
   - dt-binding cleanup (whitespace, excess quotes and similar)
 * adi,ad7746
   - Set variable without pointless conditional.
 * fsl,mma9551
   - Squash false positives about use of uninitialized variable where
     garbage undergoes an endian conversion before being ignored.
 * measspec,ms5611
   - Switch to fully devm_ managed probe() and so drop explicit remove()
 * qcom,spmi-adc
   - Use dev_err_probe() to suppress deferred print.
 * qcom,spmi-adc5
   - Define a missing channel used for battery identification.
 * qcom,spmi-iadc
   - Document a compatible seen in wild.
 * semtech,sx9360
   - Fix units on semtech,resolution dt-binding.
 * sensiron,scd30
   - dev_err_probe() usage to simplify error paths a little.
 * st,lsm6dsx
   - Add missing mount matrix for the gyro IIO device.
 * taos,tsl2563
   - Respect firmware configured interrupt polarity if present.
   - Use i2c_smbus_write_word_data() in a few cases not previously covered.
   - Factor out duplicated interrupt configuration.
   - Switch to GENMASK() / BIT() from hand coded equivalents.
   - Tidy up unused definitions.
   - Use dev_err_probe() as appropriate.
   - Drop platform_data as no in kernel users and there are better ways to
     do equivalent if any are added.
   - Add local struct device variable to tidy up code.
   - Avoid dance via i2c_client to get the drvdata.
   - Tidy up headers ordering and Makefile ordering.
 * ti,adc128s052
   - Use new spi_get_device_match_data().
   - Drop ACPI_PTR() protection.
   - Sort headers whilst here.
   - Use asm instead of incorrect include of asm-generic/unaligned.h
 * vishay,vcn4000
   - Interrupt support for vcnl4040 (lots of refactoring needed)
 * xilinx,ams
   - Use fwnode_device_is_compatible() instead of open coding it.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAmPb+A4RHGppYzIzQGtl
 cm5lbC5vcmcACgkQVIU0mcT0FoiXQw/+N0ubxewmq7dUfhgzTv2GLjhNBlVJXgC0
 GyXNEF3r2PMmKHQCpUHcJo2NxsOx4tpYhb0J61kuaCjBlRR3ON+KRKNWK+PLVGZh
 I/ljRDxHo0vw9RQs1d+MRZAobZDVkEvJOoovtQC4Xna8vs+HDNVgpiQA/xVCHTF4
 2qxqlRMloTpI4RprLTu6oeDgCthX3RcMM9yw+9N2dcQBw1MxqEZzkJej2dOJncz7
 AJ7Wtw62EDhG+rOaWQHNUMTxOxVBZkJd3KEvc2aKDT8D08iZRtpWCGauaOrRn6Ss
 IddopUW7ihAyyxDYXxLzSRJLs4XJ7t5HlKUJX4tSyuoMU/roPP8ACYte5SbiC9eD
 wJ65wWQqkCIhgOD6KP2Q44GYDtWCARioun181Et4mcJTPduUiKZxRhyD6aiXcLGr
 vJWaDcOmu1OtUbM57cNDaft3bEi8qh1+uY05ex7TRyonlw+d9QQvVBkaCWBvJwIm
 DHJf6p9V6srxHhpsQziG3xYHzacc4GNP7+nr/rQw7lvatJYEYujUJEKfJ1QsDn9G
 gNMkZeSga1b2Tsvg4InNPfrzO+K3XlT1HSF2nBZE9+o+w++5oX3DQng+2nnapzL0
 jTTfpw6B4p40hbJB0bykfoy+XeDpgPhacUZvWwKmwLIy2/LnsorJgIc9aNkOfsvx
 a+/m7/8/zaM=
 =Dfec
 -----END PGP SIGNATURE-----

Merge tag 'iio-for-6.3a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

1st set of IIO new device support, features and cleanups for the 6.3 cycle

The usual mixed bag. So far this has been a quiet cycle for IIO.

New device support
* adi,ad8686
  - Add support for the AD5337 DAC - ID and 8 bit channel support.
* maxim,max5522
  - New driver for this 2 channel DAC.
* nxp,imx93-adc
  - New driver for this SoC ADC which is a fresh IP that will probably
    turn up in additional SoCs going forwards.
* st,magn
  - Add support for magnetometer part of LSM303C which is very similar
    to standalone LIS3MDL already supported.
* ti,ads7924
  - New driver for this 4 channel, 12-bit I2C ADC.
* ti,lmp92064
  - New driver for this 12 bit SPI ADC.
* ti,tmag5273
  - New driver for this 3D Hall-Effect Sensor.

Features
* core
  - Add a standard structure for the value pairs in IIO_VAL_INT_PLUS_MICRO
    available attributes and similar.
* cirrus,ep93xx
  - Add DT binding docs and convert driver to DT based probing.
  - Enable testing building with CONFIG_COMPILE_TEST.
* st,stm32-dfsdm
  - Enable ID register support for discovery of hardware capabilities on
    some devices.

Cleanups and minor fixes
* core
  - Drop the custom iio_sysfs_match_string_with_gaps().
    The special ability of this function to skip gaps in an array
    was never used by any upstream driver.
  - Sort headers whilst touching this file.
* tools
  - Fix memory leak in iio_utils.c
* various
  - leftover i2c probe_new() conversions.
  - scnprintf() -> sysfs_emit() cleanups.
  - hand rolled devm enables -> devm_regulator[_bulk]_get_enable()
  - typo fixes
  - dt-binding cleanup (whitespace, excess quotes and similar)
* adi,ad7746
  - Set variable without pointless conditional.
* fsl,mma9551
  - Squash false positives about use of uninitialized variable where
    garbage undergoes an endian conversion before being ignored.
* measspec,ms5611
  - Switch to fully devm_ managed probe() and so drop explicit remove()
* qcom,spmi-adc
  - Use dev_err_probe() to suppress deferred print.
* qcom,spmi-adc5
  - Define a missing channel used for battery identification.
* qcom,spmi-iadc
  - Document a compatible seen in wild.
* semtech,sx9360
  - Fix units on semtech,resolution dt-binding.
* sensiron,scd30
  - dev_err_probe() usage to simplify error paths a little.
* st,lsm6dsx
  - Add missing mount matrix for the gyro IIO device.
* taos,tsl2563
  - Respect firmware configured interrupt polarity if present.
  - Use i2c_smbus_write_word_data() in a few cases not previously covered.
  - Factor out duplicated interrupt configuration.
  - Switch to GENMASK() / BIT() from hand coded equivalents.
  - Tidy up unused definitions.
  - Use dev_err_probe() as appropriate.
  - Drop platform_data as no in kernel users and there are better ways to
    do equivalent if any are added.
  - Add local struct device variable to tidy up code.
  - Avoid dance via i2c_client to get the drvdata.
  - Tidy up headers ordering and Makefile ordering.
* ti,adc128s052
  - Use new spi_get_device_match_data().
  - Drop ACPI_PTR() protection.
  - Sort headers whilst here.
  - Use asm instead of incorrect include of asm-generic/unaligned.h
* vishay,vcn4000
  - Interrupt support for vcnl4040 (lots of refactoring needed)
* xilinx,ams
  - Use fwnode_device_is_compatible() instead of open coding it.

* tag 'iio-for-6.3a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (71 commits)
  iio: adc: ad7291: Fix indentation error by adding extra spaces
  iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_config_word()
  iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_status_word()
  dt-bindings: iio/proximity: semtech,sx9360: Fix 'semtech,resolution' type
  iio: imu: fix spdx format
  iio: adc: imx93: Fix spelling mistake "geting" -> "getting"
  dt-bindings: iio: cleanup examples - indentation
  dt-bindings: iio: use lowercase hex in examples
  dt-bindings: iio: correct node names in examples
  dt-bindings: iio: minor whitespace cleanups
  dt-bindings: iio: drop unneeded quotes
  dt-bindings: iio: adc: Add NXP IMX93 ADC
  iio: adc: add imx93 adc support
  dt-bindings: iio: adc: add Texas Instruments ADS7924
  iio: adc: ti-ads7924: add Texas Instruments ADS7924 driver
  iio: imu: st_lsm6dsx: add 'mount_matrix' sysfs entry to gyro channel.
  iio: imu: st_lsm6dsx: fix naming of 'struct iio_info' in st_lsm6dsx_shub.c.
  iio: light: vcnl4000: Add interrupt support for vcnl4040
  iio: light: vcnl4000: Make irq handling more generic
  iio: light: vcnl4000: Prepare for more generic setup
  ...
2023-02-03 07:01:54 +01:00
Lorenzo Bianconi 4dba3e7852 selftests/bpf: introduce XDP compliance test tool
Introduce xdp_features tool in order to test XDP features supported by
the NIC and match them against advertised ones.
In order to test supported/advertised XDP features, xdp_features must
run on the Device Under Test (DUT) and on a Tester device.
xdp_features opens a control TCP channel between DUT and Tester devices
to send control commands from Tester to the DUT and a UDP data channel
where the Tester sends UDP 'echo' packets and the DUT is expected to
reply back with the same packet. DUT installs multiple XDP programs on the
NIC to test XDP capabilities and reports back to the Tester some XDP stats.
Currently xdp_features supports the following XDP features:
- XDP_DROP
- XDP_ABORTED
- XDP_PASS
- XDP_TX
- XDP_REDIRECT
- XDP_NDO_XMIT

Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/7c1af8e7e6ef0614cf32fa9e6bdaa2d8d605f859.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:24 -08:00
Lorenzo Bianconi 84050074e5 selftests/bpf: add test for bpf_xdp_query xdp-features support
Introduce a self-test to verify libbpf bpf_xdp_query capability to dump
the xdp-features supported by the device (lo and veth in this case).

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/534550318a2c883e174811683909544c63632f05.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:24 -08:00
Lorenzo Bianconi 04d58f1b26 libbpf: add API to get XDP/XSK supported features
Extend bpf_xdp_query routine in order to get XDP/XSK supported features
of netdev over route netlink interface.
Extend libbpf netlink implementation in order to support netlink_generic
protocol.

Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a72609ef4f0de7fee5376c40dbf54ad7f13bfb8d.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:24 -08:00
Lorenzo Bianconi 8f1669319c libbpf: add the capability to specify netlink proto in libbpf_netlink_send_recv
This is a preliminary patch in order to introduce netlink_generic
protocol support to libbpf.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/7878a54667e74afeec3ee519999c044bd514b44c.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:23 -08:00
Jakub Kicinski d3d854fd6a netdev-genl: create a simple family for netdev stuff
Add a Netlink spec-compatible family for netdevs.
This is a very simple implementation without much
thought going into it.

It allows us to reap all the benefits of Netlink specs,
one can use the generic client to issue the commands:

  $ ./cli.py --spec netdev.yaml --dump dev_get
  [{'ifindex': 1, 'xdp-features': set()},
   {'ifindex': 2, 'xdp-features': {'basic', 'ndo-xmit', 'redirect'}},
   {'ifindex': 3, 'xdp-features': {'rx-sg'}}]

the generic python library does not have flags-by-name
support, yet, but we also don't have to carry strings
in the messages, as user space can get the names from
the spec.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/327ad9c9868becbe1e601b580c962549c8cd81f2.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:23 -08:00
Tiezhu Yang 150809082a selftests/bpf: Use semicolon instead of comma in test_verifier.c
Just silence the following checkpatch warning:

WARNING: Possible comma where semicolon could be used

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1675319486-27744-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:38:32 -08:00
Tiezhu Yang e2bd974298 tools/bpf: Use tab instead of white spaces to sync bpf.h
Just silence the following build warning:

Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1675319486-27744-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:38:32 -08:00
Ilya Leoshkevich 354bb4a0e0 selftests/bpf: Initialize tc in xdp_synproxy
xdp_synproxy/xdp fails in CI with:

    Error: bpf_tc_hook_create: File exists

The XDP version of the test should not be calling bpf_tc_hook_create();
the reason it's happening anyway is that if we don't specify --tc on the
command line, tc variable remains uninitialized.

Fixes: 784d5dc0ef ("selftests/bpf: Add selftests for raw syncookie helpers in TC mode")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Reported-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230202235335.3403781-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:10:07 -08:00
Jakub Kicinski 82b4a9412b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/core/gro.c
  7d2c89b325 ("skb: Do mix page pool and page referenced frags in GRO")
  b1a78b9b98 ("net: add support for ipv4 big tcp")
https://lore.kernel.org/all/20230203094454.5766f160@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-02 14:49:55 -08:00
Linus Torvalds edb9b8f380 Including fixes from bpf, can and netfilter.
Current release - regressions:
 
  - phy: fix null-deref in phy_attach_direct
 
  - mac802154: fix possible double free upon parsing error
 
 Previous releases - regressions:
 
  - bpf: preserve reg parent/live fields when copying range info,
    prevent mis-verification of programs as safe
 
  - ip6: fix GRE tunnels not generating IPv6 link local addresses
 
  - phy: dp83822: fix null-deref on DP83825/DP83826 devices
 
  - sctp: do not check hb_timer.expires when resetting hb_timer
 
  - eth: mtk_sock: fix SGMII configuration after phylink conversion
 
 Previous releases - always broken:
 
  - eth: xdp: execute xdp_do_flush() before napi_complete_done()
 
  - skb: do not mix page pool and page referenced frags in GRO
 
  - bpf:
    - fix a possible task gone issue with bpf_send_signal[_thread]()
    - fix an off-by-one bug in bpf_mem_cache_idx() to select
      the right cache
    - add missing btf_put to register_btf_id_dtor_kfuncs
    - sockmap: fon't let sock_map_{close,destroy,unhash} call itself
 
  - gso: fix null-deref in skb_segment_list()
 
  - mctp: purge receive queues on sk destruction
 
  - fix UaF caused by accept on already connected socket in exotic
    socket families
 
  - tls: don't treat list head as an entry in tls_is_tx_ready()
 
  - netfilter: br_netfilter: disable sabotage_in hook after first
    suppression
 
  - wwan: t7xx: fix runtime PM implementation
 
 Misc:
 
  - MAINTAINERS: spring cleanup of networking maintainers
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmPcKlkACgkQMUZtbf5S
 IrtHgxAAxvuXeN9kQ+/gDYbxTa4Fc3gwCAA7SdJiBdfxP9xJAUEBs9TNfh5MPynb
 iIWl9tTe1OOQtWJTnp5CKMuegUj4hlJVYuLu4MK36hW56Q41cbCFvhsDk/1xoFTF
 2mtmEr5JiXZfgEdf9/tEITTGKgfANsbXOAzL4b5HcV9qtvwvEd2aFxll3VO9N2Xn
 k6o59Lr2mff7NnXQsrVkTLBxXRK9oir8VwyTnCs7j+T7E8Qe4hIkx2LDWcDiRC1e
 vSxfoWFoe+uOGTQMxlarMOAgAOt7nvOQngwCaaVpMBa/OLzGo51Clf98cpPm+cSi
 18ZjmA8r5owGghd75p2PtxcND8U1vXeeRJW+FKK60K8qznMc0D4s7HX+wHBfevnV
 U649F0OHsNsX0Y75Z5sqA1eSmJbTR9QeyiRbuS6nfOnT7ZKDw6vPJrKGJRN789y0
 erzG/JwrfnrCDavXfNNEgk0mMlP0squemffWjuH6FfL4Pt4/gUz63Nokr5vRFSns
 yskHZgXVs4gq++yrbZDZjKfQaPDkA1P/z6VADVRWh4Y5m1m6GtvDUy6vJxK7/hUs
 5giqmhbAAQq0U+2L5RIaoYGfjk2UaDcunNpqTCgjdCdMXx9Yz41QBloFgBjRshp/
 3kg6ElYPzflXXCYK4pGRrzqO4Gqz1Hrvr9YuT+kka+wrpZPDtWc=
 =OGU7
 -----END PGP SIGNATURE-----

Merge tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, can and netfilter.

  Current release - regressions:

   - phy: fix null-deref in phy_attach_direct

   - mac802154: fix possible double free upon parsing error

  Previous releases - regressions:

   - bpf: preserve reg parent/live fields when copying range info,
     prevent mis-verification of programs as safe

   - ip6: fix GRE tunnels not generating IPv6 link local addresses

   - phy: dp83822: fix null-deref on DP83825/DP83826 devices

   - sctp: do not check hb_timer.expires when resetting hb_timer

   - eth: mtk_sock: fix SGMII configuration after phylink conversion

  Previous releases - always broken:

   - eth: xdp: execute xdp_do_flush() before napi_complete_done()

   - skb: do not mix page pool and page referenced frags in GRO

   - bpf:
      - fix a possible task gone issue with bpf_send_signal[_thread]()
      - fix an off-by-one bug in bpf_mem_cache_idx() to select the right
        cache
      - add missing btf_put to register_btf_id_dtor_kfuncs
      - sockmap: fon't let sock_map_{close,destroy,unhash} call itself

   - gso: fix null-deref in skb_segment_list()

   - mctp: purge receive queues on sk destruction

   - fix UaF caused by accept on already connected socket in exotic
     socket families

   - tls: don't treat list head as an entry in tls_is_tx_ready()

   - netfilter: br_netfilter: disable sabotage_in hook after first
     suppression

   - wwan: t7xx: fix runtime PM implementation

  Misc:

   - MAINTAINERS: spring cleanup of networking maintainers"

* tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
  mtk_sgmii: enable PCS polling to allow SFP work
  net: mediatek: sgmii: fix duplex configuration
  net: mediatek: sgmii: ensure the SGMII PHY is powered down on configuration
  MAINTAINERS: update SCTP maintainers
  MAINTAINERS: ipv6: retire Hideaki Yoshifuji
  mailmap: add John Crispin's entry
  MAINTAINERS: bonding: move Veaceslav Falico to CREDITS
  net: openvswitch: fix flow memory leak in ovs_flow_cmd_new
  net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC
  virtio-net: Keep stop() to follow mirror sequence of open()
  selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking
  selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs
  selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided
  selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning
  can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq
  can: isotp: split tx timer into transmission and timeout
  can: isotp: handle wait_event_interruptible() return values
  can: raw: fix CAN FD frame transmissions over CAN XL devices
  can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate
  hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()
  ...
2023-02-02 14:03:31 -08:00
Ian Rogers db95818e88 perf pmu-events: Add separate metric from pmu_event
Create a new pmu_metric for the metric related variables from pmu_event
but that is initially just a clone of pmu_event. Add iterators for
pmu_metric and use in places that metrics are desired rather than
events. Make the event iterator skip metric only events, and the metric
iterator skip event only events.

Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Ian Rogers df5499ddb8 perf jevents: Rewrite metrics in the same file with each other
Rewrite metrics within the same file in terms of each other. For example, on Power8
other_stall_cpi is rewritten from:

"PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL"

to:

"stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi"
Which more closely matches the definition on Power9.

To avoid recomputation decorate the function with a cache.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Ian Rogers 2efbb73d46 perf jevents metric: Add ability to rewrite metrics in terms of others
Add RewriteMetricsInTermsOfOthers that iterates over pairs of names and
expressions trying to replace an expression, within the current
expression, with its name.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Ian Rogers 3241cd11d9 perf jevents metric: Correct Function equality
rhs may not be defined, say for source_count, so add a guard.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230126233645.200509-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Sandipan Das 8eaf8ec3c0 perf session: Show branch speculation info in raw dump
Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for purposes of code optimization.

E.g.

  $ perf record -j any,u ./test_branch
  $ perf report --dump-raw-trace

Before:

  [...]
  8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
  ... branch stack: nr:16
  .....  0: 00000000004b52fd -> 00000000004f82c0 0 cycles  P   0
  .....  1: ffffffff8220137c -> 00000000004b52f0 0 cycles M    0
  .....  2: 000000000041d1c4 -> 00000000004b52f0 0 cycles  P   0
  .....  3: 00000000004e7ead -> 000000000041d1b0 0 cycles M    0
  .....  4: 00000000004e7f91 -> 00000000004e7ead 0 cycles  P   0
  .....  5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles  P   0
  .....  6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M    0
  .....  7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M    0
  .....  8: 00000000004e7f60 -> 00000000004e7df0 0 cycles  P   0
  .....  9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M    0
  ..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles  P   0
  ..... 11: 000000000043306a -> 000000000041d840 0 cycles  P   0
  ..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M    0
  ..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles  P   0
  ..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M    0
  ..... 15: 000000000041d89b -> 000000000041e487 0 cycles  P   0
   ... thread: test_branch:7952
   ...... dso: /data/sandipan/test_branch
  [...]

After:

  [...]
  8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0
  ... branch stack: nr:16
  .....  0: 00000000004b52fd -> 00000000004f82c0 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  .....  1: ffffffff8220137c -> 00000000004b52f0 0 cycles M    0  NON_SPEC_CORRECT_PATH
  .....  2: 000000000041d1c4 -> 00000000004b52f0 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  .....  3: 00000000004e7ead -> 000000000041d1b0 0 cycles M    0  NON_SPEC_CORRECT_PATH
  .....  4: 00000000004e7f91 -> 00000000004e7ead 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  .....  5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  .....  6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M    0  SPEC_CORRECT_PATH
  .....  7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M    0  NON_SPEC_CORRECT_PATH
  .....  8: 00000000004e7f60 -> 00000000004e7df0 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  .....  9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M    0  NON_SPEC_CORRECT_PATH
  ..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  ..... 11: 000000000043306a -> 000000000041d840 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  ..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M    0  NON_SPEC_CORRECT_PATH
  ..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles  P   0  NON_SPEC_CORRECT_PATH
  ..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M    0  NON_SPEC_CORRECT_PATH
  ..... 15: 000000000041d89b -> 000000000041e487 0 cycles  P   0  NON_SPEC_CORRECT_PATH
   ... thread: test_branch:7952
   ...... dso: /data/sandipan/test_branch
  [...]

With the addition of new branch flags, the "brstacksym" fields in perf
script output now shows speculation information after the branch type.
Change the regular expressions accordingly for the test to pass. Since
branch speculation information may vary across platforms, the test does
not look for specific values.

E.g.

  $ perf test -v 110

Before:

  110: Check branch stack sampling                                     :
  --- start ---
  test child forked, pid 54154
  Testing user branch stack sampling
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL$ /tmp/__perf_test.program.AfhUI/perf.script
  + cleanup
  + rm -rf /tmp/__perf_test.program.AfhUI
  test child finished with -1
  ---- end ----
  Check branch stack sampling: FAILED!

After:

  110: Check branch stack sampling                                     :
  --- start ---
  test child forked, pid 43716
  Testing user branch stack sampling
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_bench+0x66/brstack_foo+0x0/P/-/-/0/IND_CALL/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_foo+0x1b/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_bench+0x58/brstack_foo+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bar\+[^ ]*/CALL/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_bench+0x5d/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_bar\+[^ ]*/brstack_foo\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_bar+0x31/brstack_foo+0x20/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_foo\+[^ ]*/brstack_bench\+[^ ]*/RET/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_foo+0x36/brstack_bench+0x5d/P/-/-/0/RET/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_bench\+[^ ]*/COND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack_bench+0x76/brstack_bench+0x7d/P/-/-/0/COND/NON_SPEC_CORRECT_PATH
  + grep -E -m1 ^brstack\+[^ ]*/brstack\+[^ ]*/UNCOND/.*$ /tmp/__perf_test.program.xgzAi/perf.script
  brstack+0x5a/brstack+0x41/P/-/-/0/UNCOND/NON_SPEC_CORRECT_PATH
  + set +x
  Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ)
  Testing branch stack filtering permutation (call,CALL|SYSCALL)
  Testing branch stack filtering permutation (cond,COND)
  Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET)
  Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND)
  Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND)
  Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET)
  test child finished with 0
  ---- end ----
  Check branch stack sampling: Ok

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/048d67c9de3cc8e3dbf19aaa7ff718dec91364c5.1675333809.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Sandipan Das 6ade6c6460 perf script: Show branch speculation info
Show the branch speculation info if provided by the branch recording
hardware feature. This can be useful for optimizing code further.

The speculation info is appended to the end of the list of fields so any
existing tools that use "/" as a delimiter for access fields via an index
remain unaffected. Also show "-" instead of "N/A" when speculation info
is unavailable because "/" is used as the field separator.

E.g.

  $ perf record -j any,u,save_type ./test_branch
  $ perf script --fields brstacksym

Before:

  [...]
  check_match+0x60/strcmp+0x0/P/-/-/0/CALL
  do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL
  [...]

After:

  [...]
  check_match+0x60/strcmp+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH
  [...]

The bitfield swapping scheme used duing sample parsing has changed
because of the addition of new branch flags, namely "spec", "new_type"
and "priv". Earlier, these were all part of the "reserved" field but
now, each of these fields get swapped separately. Change the expected
flag values accordingly for the test to pass.

E.g.

  $ perf test -v 27

Before:

   27: Sample parsing                                                  :
  --- start ---
  test child forked, pid 61979
  parsing failed for sample_type 0x800
  test child finished with -1
  ---- end ----
  Sample parsing: FAILED!

After:

   27: Sample parsing                                                  :
  --- start ---
  test child forked, pid 63293
  test child finished with 0
  ---- end ----
  Sample parsing: Ok

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/56e272583552526e999ba0b536ac009ae3613966.1675333809.git.sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Namhyung Kim 79b7ca7802 perf test: Add more test cases for perf lock contention
Check callstack filter with two different aggregation mode.

  $ sudo ./perf test -v contention
   88: kernel lock contention analysis test                            :
  --- start ---
  test child forked, pid 83416
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  Testing perf lock contention --callstack-filter with task aggregation
  test child finished with 0
  ---- end ----
  kernel lock contention analysis test: Ok

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230202050455.2187592-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 17:18:31 -03:00
Tiezhu Yang 540f8b5640 perf bench syscall: Add execve syscall benchmark
This commit adds the execve syscall benchmark, more syscall benchmarks
can be added in the future.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-5-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Tiezhu Yang 391f84e555 perf bench syscall: Add getpgid syscall benchmark
This commit adds a simple getpgid syscall benchmark, more syscall
benchmarks can be added in the future.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-4-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Tiezhu Yang 3fe91f3262 perf bench syscall: Introduce bench_syscall_common()
In the current code, there is only a basic syscall benchmark via
getppid, this is not enough. Introduce bench_syscall_common() so that we
can add more syscalls to benchmark.

This is preparation for later patch, no functionality change.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Tiezhu Yang 1bad502775 tools x86: Keep list sorted by number in unistd_{32,64}.h
It is better to keep list sorted by number in unistd_{32,64}.h,
so that we can add more syscall number to a proper position.

This is preparation for later patch, no functionality change.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Diederik de Haas a912f5975f perf test: Replace legacy `...` with $(...)
As detailed in https://www.shellcheck.net/wiki/SC2006:

The use of `...` is legacy syntax with several issues:
1. It has a series of undefined behaviors related to quoting in POSIX.
2. It imposes a custom escaping mode with surprising results.
3. It's exceptionally hard to nest.

$(...) command substitution has none of these problems,
and is therefore strongly encouraged.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-3-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Diederik de Haas 5b420cf003 perf test: Replace 'grep | wc -l' with 'grep -c'
To count the number of results from grep, use the '-c' parameter
instead of piping it to 'wc'.

See also https://www.shellcheck.net/wiki/SC2126

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-2-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim dd15480a3d perf stat: Hide invalid uncore event output for aggr mode
The current display code for perf stat iterates given cpus and build the
aggr map to collect the event data for the aggregation mode.

But uncore events have their own cpu maps and it won't guarantee that
it'd match to the aggr map.  For example, per-package uncore events
would generate a single value for each socket.  When user asks per-core
aggregation mode, the output would contain 0 values for other cores.

Thus it needs to check the uncore PMU's cpumask and if it matches to the
current aggregation id.

Before:
  $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0              1               3.73 Joules power/energy-pkg/
  S0-D0-C1              0      <not counted> Joules power/energy-pkg/
  S0-D0-C2              0      <not counted> Joules power/energy-pkg/
  S0-D0-C3              0      <not counted> Joules power/energy-pkg/

         1.001404046 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
  	echo 0 > /proc/sys/kernel/nmi_watchdog
  	perf stat ...
  	echo 1 > /proc/sys/kernel/nmi_watchdog

The core 1, 2 and 3 should not be printed because the event is handled
in a cpu in the core 0 only.  With this change, the output becomes like
below.

After:
  $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0              1               2.09 Joules power/energy-pkg/

Fixes: b897613510 ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim 7b204399ae perf lock contention: Add -S/--callstack-filter option
The -S/--callstack-filter is to limit display entries having the given
string in the callstack (not only in the caller in the output).

The following example shows lock contention results if the callstack
has 'net' substring somewhere.  Note that the caller '__dev_queue_xmit'
does not match to it, but it has 'inet6_csk_xmit' in the callstack.

This applies even if you don't use -v option to show the full callstack.

  $ sudo ./perf lock con -abv -S net sleep 1
  ...
   contended   total wait     max wait     avg wait         type   caller

           5     70.20 us     16.13 us     14.04 us     spinlock   __dev_queue_xmit+0xb6d
                          0xffffffffa5dd1c60  _raw_spin_lock+0x30
                          0xffffffffa5b8f6ed  __dev_queue_xmit+0xb6d
                          0xffffffffa5cd8267  ip6_finish_output2+0x2c7
                          0xffffffffa5cdac14  ip6_finish_output+0x1d4
                          0xffffffffa5cdb477  ip6_xmit+0x457
                          0xffffffffa5d1fd17  inet6_csk_xmit+0xd7
                          0xffffffffa5c5f4aa  __tcp_transmit_skb+0x54a
                          0xffffffffa5c6467d  tcp_keepalive_timer+0x2fd

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230126000936.3017683-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim 3fd7a168bf perf script: Add 'cgroup' field for output
There's no field for the cgroup, let's add one.  To do that, users need to
specify --all-cgroup option for perf record to capture the cgroup info.

  $ perf record --all-cgroups -- true

  $ perf script -F comm,pid,cgroup
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...

If it's recorded without the --all-cgroups, it'd complain.

  $ perf script -F comm,pid,cgroup
  Samples for 'cycles:u' event do not have CGROUP attribute set. Cannot print 'cgroup' field.
  Hint: run 'perf record --all-cgroups ...'

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230126213610.3381147-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Ross Zwisler 1df49ef9ee perf tools docs: Use canonical ftrace path
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.

But, from Documentation/trace/ftrace.rst:

  Before 4.1, all ftrace tracing control files were within the debugfs
  file system, which is typically located at /sys/kernel/debug/tracing.
  For backward compatibility, when mounting the debugfs file system,
  the tracefs file system will be automatically mounted at:

  /sys/kernel/debug/tracing

A few spots in the perf docs still refer to this older debugfs path, so
let's update them to avoid confusion.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: linux-trace-kernel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20230130181915.1113313-5-zwisler@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Rob Herring 2889959489 perf arm-spe: Only warn once for each unsupported address packet
Unknown address packet indexes are not an error as the Arm architecture
can (and has with SPEv1.2) define new ones and implementation defined
ones are also allowed. The error message for every occurrence of the
packet is needlessly noisy as well. Change the message to print just
once for each unknown index.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230127205546.667740-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Krister Johansen 1c24956542 perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.

On this system, stext fell within the vmalloc segment.  The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information.  In this case, however, there were two: a very large
vmalloc segment, and the text segment.  This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments.  However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace.  The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.

Instead, select the smallest matching segment that contains stext
instead of the first matching segment.  This allows us to match the text
segment instead of vmalloc, if one is contained within the other.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Reaver <me@davidreaver.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Athira Rajeev 3980ee9ad8 perf probe: Fix usage when libtraceevent is missing
While parsing the tracepoint events in parse_events_add_tracepoint()
function, code checks for HAVE_LIBTRACEEVENT support. This is needed
since libtraceevent is necessary for tracepoint. But while adding probe
points, check for LIBTRACEEVENT is not done in case of perf probe.
Hence, in environment with missing libtraceevent-devel, it is observed
that adding a probe point shows below message though it can't be used
via perf record.

Example:

Adding probe point:
	./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
	Added new event:
	  probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)

	You can now use it in all perf tools, such as:

		perf record -e probe:vfs_getname -aR sleep 1

But trying perf record:
	./perf  record -e probe:vfs_getname -aR sleep 1
	event syntax error: 'probe:vfs_getname'
				\___ unsupported tracepoint
	libtraceevent is necessary for tracepoint support
	Run 'perf list' for a list of valid events

The builtin tool like perf record needs libtraceevent to
parse tracefs. But still the probe can be used by enabling
via tracefs. Patch fixes the probe usage message to the user
based on presence of libtraceevent. With the fix,

 # ./perf probe 'pmu:myprobe=schedule'
 Added new event:
   pmu:myprobe          (on schedule)

 perf is not linked with libtraceevent, to use the new probe you can use tracefs:

	cd /sys/kernel/tracing/
	echo 1 > events/pmu/myprobe/enable
	echo 1 > tracing_on
	cat trace_pipe
	Before removing the probe, echo 0 > events/pmu/myprobe/enable

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230131134748.54567-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Adrian Hunter ce4c8e7966 perf symbols: Get symbols for .plt.got for x86-64
For x86_64, determine a symbol for .plt.got entries. That requires
computing the target offset and finding that in .rela.dyn, which in
turn means .rela.dyn needs to be sorted by offset.

Example:

  In this example, the GNU C Library is using .plt.got for malloc and
  free.

  Before:

    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.027 MB perf.data ]
    $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp1.txt

  After:

    $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp2.txt
    $ diff /tmp/cmp1.txt /tmp/cmp2.txt | head -12
    15509,15510c15509,15510
    < 27046.755390907:      7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428380 offset_0x28380@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    < 27046.755390907:      7f0b29428384 offset_0x28380@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    ---
    > 27046.755390907:      7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428380 malloc@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    > 27046.755390907:      7f0b29428384 malloc@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    15821,15822c15821,15822
    < 27046.755394865:      7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428370 offset_0x28370@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    < 27046.755394865:      7f0b29428374 offset_0x28370@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    ---
    > 27046.755394865:      7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428370 free@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    > 27046.755394865:      7f0b29428374 free@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:01 -03:00
Adrian Hunter 51a188ad8c perf symbols: Start adding support for .plt.got for x86
For x86, .plt.got is used, for example, when the address is taken of a
dynamically linked function. Start adding support by synthesizing a
symbol for each entry. A subsequent patch will attempt to get a better
name for the symbol.

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltgot.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    void callfn(void (*fn)(void))
    {
            fn();
    }

    int main()
    {
            fn4();
            fn1();
            callfn(fn3);
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -o tstpltgot tstpltgot.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -SW tstpltgot | grep 'Name\|plt\|dyn'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 6] .dynsym           DYNSYM          00000000000003d8 0003d8 0000f0 18   A  7   1  8
      [ 7] .dynstr           STRTAB          00000000000004c8 0004c8 0000c6 00   A  0   0  1
      [10] .rela.dyn         RELA            00000000000005d8 0005d8 0000d8 18   A  6   0  8
      [11] .rela.plt         RELA            00000000000006b0 0006b0 000048 18  AI  6  24  8
      [13] .plt              PROGBITS        0000000000001020 001020 000040 10  AX  0   0 16
      [14] .plt.got          PROGBITS        0000000000001060 001060 000020 10  AX  0   0 16
      [15] .plt.sec          PROGBITS        0000000000001080 001080 000030 10  AX  0   0 16
      [23] .dynamic          DYNAMIC         0000000000003d90 002d90 000210 10  WA  7   0  8
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltgot , filter callfn @ ./tstpltgot' ./tstpltgot
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.011 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    28393.810326915:   tr strt                               0 [unknown] =>     562350baa1b2 main+0x0
    28393.810326915:   tr end  call               562350baa1ba main+0x8 =>     562350baa090 fn4@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1bf main+0xd
    28393.810326917:   tr end  call               562350baa1bf main+0xd =>     562350baa080 fn1@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1c4 main+0x12
    28393.810326917:   call                       562350baa1ce main+0x1c =>     562350baa199 callfn+0x0
    28393.810326917:   tr end  call               562350baa1ad callfn+0x14 =>     7f607d36110f fn3+0x0
    28393.810326922:   tr strt                               0 [unknown] =>     562350baa1af callfn+0x16
    28393.810326922:   return                     562350baa1b1 callfn+0x18 =>     562350baa1d3 main+0x21
    28393.810326922:   tr end  call               562350baa1d3 main+0x21 =>     562350baa0a0 fn2@plt+0x0
    28393.810326924:   tr strt                               0 [unknown] =>     562350baa1d8 main+0x26
    28393.810326924:   tr end  call               562350baa1d8 main+0x26 =>     562350baa060 [unknown]  <- call to fn3 via .plt.got
    28393.810326925:   tr strt                               0 [unknown] =>     562350baa1dd main+0x2b
    28393.810326925:   tr end  return             562350baa1e3 main+0x31 =>     7f607d029d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    28393.810326915:   tr strt                               0 [unknown] =>     562350baa1b2 main+0x0
    28393.810326915:   tr end  call               562350baa1ba main+0x8 =>     562350baa090 fn4@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1bf main+0xd
    28393.810326917:   tr end  call               562350baa1bf main+0xd =>     562350baa080 fn1@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1c4 main+0x12
    28393.810326917:   call                       562350baa1ce main+0x1c =>     562350baa199 callfn+0x0
    28393.810326917:   tr end  call               562350baa1ad callfn+0x14 =>     7f607d36110f fn3+0x0
    28393.810326922:   tr strt                               0 [unknown] =>     562350baa1af callfn+0x16
    28393.810326922:   return                     562350baa1b1 callfn+0x18 =>     562350baa1d3 main+0x21
    28393.810326922:   tr end  call               562350baa1d3 main+0x21 =>     562350baa0a0 fn2@plt+0x0
    28393.810326924:   tr strt                               0 [unknown] =>     562350baa1d8 main+0x26
    28393.810326924:   tr end  call               562350baa1d8 main+0x26 =>     562350baa060 offset_0x1060@plt+0x0
    28393.810326925:   tr strt                               0 [unknown] =>     562350baa1dd main+0x2b
    28393.810326925:   tr end  return             562350baa1e3 main+0x31 =>     7f607d029d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:31:06 -03:00
Daniel Bristot de Oliveira 5def33df84 rtla/timerlat: Add auto-analysis support to timerlat top
Currently, timerlat top displays the timerlat tracer latency results, saving
the intuitive timerlat trace for the developer to analyze.

This patch goes a step forward in the automaton of the scheduling latency
analysis by providing a summary of the root cause of a latency higher than
the passed "stop tracing" parameter if the trace stops.

The output is intuitive enough for non-expert users to have a general idea
of the root cause by looking at each factor's contribution percentage while
keeping the technical detail in the output for more expert users to start
an in dept debug or to correlate a root cause with an existing one.

The terminology is in line with recent industry and academic publications
to facilitate the understanding of both audiences.

Here is one example of tool output:
 ----------------------------------------- %< -----------------------------------------------------
  # taskset -c 0 timerlat -a 40 -c 1-23 -q
                                     Timer Latency
    0 00:00:12   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
  CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
    1 #12322     |        0         0         1        15 |       10         3         9        31
    2 #12322     |        3         0         1        12 |       10         3         9        23
    3 #12322     |        1         0         1        21 |        8         2         8        34
    4 #12322     |        1         0         1        17 |       10         2        11        33
    5 #12322     |        0         0         1        12 |        8         3         8        25
    6 #12322     |        1         0         1        14 |       16         3        11        35
    7 #12322     |        0         0         1        14 |        9         2         8        29
    8 #12322     |        1         0         1        22 |        9         3         9        34
    9 #12322     |        0         0         1        14 |        8         2         8        24
   10 #12322     |        1         0         0        12 |        9         3         8        24
   11 #12322     |        0         0         0        15 |        6         2         7        29
   12 #12321     |        1         0         0        13 |        5         3         8        23
   13 #12319     |        0         0         1        14 |        9         3         9        26
   14 #12321     |        1         0         0        13 |        6         2         8        24
   15 #12321     |        1         0         1        15 |       12         3        11        27
   16 #12318     |        0         0         1        13 |        7         3        10        24
   17 #12319     |        0         0         1        13 |       11         3         9        25
   18 #12318     |        0         0         0        12 |        8         2         8        20
   19 #12319     |        0         0         1        18 |       10         2         9        28
   20 #12317     |        0         0         0        20 |        9         3         8        34
   21 #12318     |        0         0         0        13 |        8         3         8        28
   22 #12319     |        0         0         1        11 |        8         3        10        22
   23 #12320     |       28         0         1        28 |       41         3        11        41
  rtla timerlat hit stop tracing
  ## CPU 23 hit stop tracing, analyzing it ##
  IRQ handler delay:				      	    27.49 us (65.52 %)
  IRQ latency:						    28.13 us
  Timerlat IRQ duration:				     9.59 us (22.85 %)
  Blocking thread:					     3.79 us (9.03 %)
			objtool:49256    		     3.79 us
    Blocking thread stacktrace
		-> timerlat_irq
		-> __hrtimer_run_queues
		-> hrtimer_interrupt
		-> __sysvec_apic_timer_interrupt
		-> sysvec_apic_timer_interrupt
		-> asm_sysvec_apic_timer_interrupt
		-> _raw_spin_unlock_irqrestore
		-> cgroup_rstat_flush_locked
		-> cgroup_rstat_flush_irqsafe
		-> mem_cgroup_flush_stats
		-> mem_cgroup_wb_stats
		-> balance_dirty_pages
		-> balance_dirty_pages_ratelimited_flags
		-> btrfs_buffered_write
		-> btrfs_do_write_iter
		-> vfs_write
		-> __x64_sys_pwrite64
		-> do_syscall_64
		-> entry_SYSCALL_64_after_hwframe
  ------------------------------------------------------------------------
    Thread latency:					    41.96 us (100%)

  The system has exit from idle latency!
    Max timerlat IRQ latency from idle: 17.48 us in cpu 4
  Saving trace to timerlat_trace.txt
 ----------------------------------------- >% -----------------------------------------------------

In this case, the major factor was the delay suffered by the IRQ handler
that handles timerlat wakeup: 65.52 %. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (objtool:49256) disabled interrupts
via raw spin lock operations inside mem cgroup, while doing write
syscall in a btrfs file system.

A simple search for the function name on Google shows that this is
a legit case for disabling the interrupts:

  cgroup: Use irqsave in cgroup_rstat_flush_locked()
  lore.kernel.org/linux-mm/20220301122143.1521823-2-bigeasy@linutronix.de/

The output also prints other reasons for the latency root cause, such as:

  - an IRQ that happened before the IRQ handler that caused delays
  - The interference from NMI, IRQ, Softirq, and Threads

The details about how these factors affect the scheduling latency
can be found here:

   https://bristot.me/demystifying-the-real-time-linux-latency/

Link: https://lkml.kernel.org/r/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-02 10:48:04 -05:00
Daniel Bristot de Oliveira 27e348b221 rtla/timerlat: Add auto-analysis core
Currently, timerlat displays a summary of the timerlat tracer results
saving the trace if the system hits a stop condition.

While this represented a huge step forward, the root cause was not
that is accessible to non-expert users.

The auto-analysis fulfill this gap by parsing the trace timerlat runs,
printing an intuitive auto-analysis.

Link: https://lkml.kernel.org/r/1ee073822f6a2cbb33da0c817331d0d4045e837f.1675179318.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-02 10:48:03 -05:00
Ross Zwisler cf3e025186 PM: tools: use canonical ftrace path
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.

But, from Documentation/trace/ftrace.rst:

  Before 4.1, all ftrace tracing control files were within the debugfs
  file system, which is typically located at /sys/kernel/debug/tracing.
  For backward compatibility, when mounting the debugfs file system,
  the tracefs file system will be automatically mounted at:

  /sys/kernel/debug/tracing

A few scripts in tools/power still refer to this older debugfs path, so
let's update them to avoid confusion.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-02 15:45:35 +01:00
Andrei Gherzan 329c9cd769 selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking
The test tool can check that the zerocopy number of completions value is
valid taking into consideration the number of datagram send calls. This can
catch the system into a state where the datagrams are still in the system
(for example in a qdisk, waiting for the network interface to return a
completion notification, etc).

This change adds a retry logic of computing the number of completions up to
a configurable (via CLI) timeout (default: 2 seconds).

Fixes: 79ebc3c260 ("net/udpgso_bench_tx: options to exercise TX CMSG")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02 13:29:51 +01:00
Andrei Gherzan dafe93b9ee selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs
"udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs
subsequently and while doing so, there is a chance that the rx one is not
ready to accept socket connections. This racing bug could fail the test
with at least one of the following:

./udpgso_bench_tx: connect: Connection refused
./udpgso_bench_tx: sendmsg: Connection refused
./udpgso_bench_tx: write: Connection refused

This change addresses this by making udpgro_bench.sh wait for the rx
program to be ready before firing off the tx one - up to a 10s timeout.

Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02 13:29:51 +01:00
Andrei Gherzan db9b47ee9f selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided
Leaving unrecognized arguments buried in the output, can easily hide a
CLI/script typo. Avoid this by exiting when wrong arguments are provided to
the udpgso_bench test programs.

Fixes: 3a687bef14 ("selftests: udp gso benchmark")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02 13:29:51 +01:00
Andrei Gherzan c03c80e3a0 selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning
This change fixes the following compiler warning:

/usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may
be used uninitialized [-Wmaybe-uninitialized]
   40 |     __error_noreturn (__status, __errnum, __format,
   __va_arg_pack ());
         |
	 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	 udpgso_bench_rx.c: In function ‘main’:
	 udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here
	   253 |         int ret, len, gso_size, budget = 256;

Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02 13:29:51 +01:00
Ye Xingchen 4bc32df7a9 selftests/bpf: Remove duplicate include header in xdp_hw_metadata
The linux/net_tstamp.h is included more than once, thus clean it up.

Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/202301311440516312161@zte.com.cn
2023-02-02 11:52:48 +01:00
Stanislav Fomichev 8b79b34a66 selftests/bpf: Don't refill on completion in xdp_metadata
We only need to consume TX completion instead of refilling 'fill' ring.
It's currently not an issue because we never RX more than 8 packets.

Fixes: e2a46d54d7 ("selftests/bpf: Verify xdp_metadata xdp->af_xdp path")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230201233640.367646-1-sdf@google.com
2023-02-02 11:41:27 +01:00
Adrian Hunter a1ab12856f perf symbols: Allow for static executables with .plt
A statically linked executable can have a .plt due to IFUNCs, in which
case .symtab is used not .dynsym. Check the section header link to see
if that is the case, and then use symtab instead.

Example:

  Before:

    $ cat tstifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    int main()
    {
            thing();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -static -Wall -Wextra -Wno-uninitialized -o tstifuncstatic tstifunc.c
    $ readelf -SW tstifuncstatic | grep 'Name\|plt\|dyn'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 4] .rela.plt         RELA            00000000004002e8 0002e8 000258 18  AI 29  20  8
      [ 6] .plt              PROGBITS        0000000000401020 001020 000190 00  AX  0   0 16
      [20] .got.plt          PROGBITS        00000000004c5000 0c4000 0000e0 08  WA  0   0  8
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstifuncstatic' ./tstifuncstatic
    thing1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.008 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    15786.690189535:   tr strt                               0 [unknown] =>           4017cd main+0x0
    15786.690189535:   tr end  call                     4017d5 main+0x8 =>           401170 [unknown]
    15786.690197660:   tr strt                               0 [unknown] =>           4017da main+0xd
    15786.690197660:   tr end  return                   4017e0 main+0x13 =>           401c1a __libc_start_call_main+0x6a

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    15786.690189535:   tr strt                               0 [unknown] =>           4017cd main+0x0
    15786.690189535:   tr end  call                     4017d5 main+0x8 =>           401170 thing_ifunc@plt+0x0
    15786.690197660:   tr strt                               0 [unknown] =>           4017da main+0xd
    15786.690197660:   tr end  return                   4017e0 main+0x13 =>           401c1a __libc_start_call_main+0x6a

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:51:51 -03:00
Adrian Hunter 60fbb3e49a perf symbols: Allow for .plt without header
A static executable can have a .plt due to the presence of IFUNCs.  In
that case the .plt does not have a header. Check for whether there is a
header by comparing the number of entries to the number of relocation
entries.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:51:31 -03:00
Adrian Hunter b7dbc0be6e perf symbols: Add support for IFUNC symbols for x86_64
For x86_64, the GNU linker is putting IFUNC information in the relocation
addend, so use it to try to find a symbol for plt entries that refer to
IFUNCs.

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            thing();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -Wno-uninitialized -o tstpltifunc tstpltifunc.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -rW tstpltifunc | grep -A99 plt
    Relocation section '.rela.plt' at offset 0x738 contains 8 entries:
        Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
    0000000000003f98  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
    0000000000003fa8  0000000400000007 R_X86_64_JUMP_SLOT     0000000000000000 __stack_chk_fail@GLIBC_2.4 + 0
    0000000000003fb0  0000000500000007 R_X86_64_JUMP_SLOT     0000000000000000 fn1 + 0
    0000000000003fb8  0000000600000007 R_X86_64_JUMP_SLOT     0000000000000000 fn3 + 0
    0000000000003fc0  0000000800000007 R_X86_64_JUMP_SLOT     0000000000000000 fn4 + 0
    0000000000003fc8  0000000900000007 R_X86_64_JUMP_SLOT     0000000000000000 fn2 + 0
    0000000000003fd0  0000000b00000007 R_X86_64_JUMP_SLOT     0000000000000000 getrandom@GLIBC_2.25 + 0
    0000000000003fa0  0000000000000025 R_X86_64_IRELATIVE                        125d
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltifunc' ./tstpltifunc
    thing2
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.016 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    21860.073683659:   tr strt                               0 [unknown] =>     561e212c42be main+0x0
    21860.073683659:   tr end  call               561e212c42c6 main+0x8 =>     561e212c4110 fn4@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42cb main+0xd
    21860.073683661:   tr end  call               561e212c42cb main+0xd =>     561e212c40f0 fn1@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42d0 main+0x12
    21860.073683661:   tr end  call               561e212c42d0 main+0x12 =>     561e212c40d0 offset_0x10d0@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42d5 main+0x17
    21860.073698451:   tr end  call               561e212c42d5 main+0x17 =>     561e212c4120 fn2@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42da main+0x1c
    21860.073698451:   tr end  call               561e212c42da main+0x1c =>     561e212c4100 fn3@plt+0x0
    21860.073698452:   tr strt                               0 [unknown] =>     561e212c42df main+0x21
    21860.073698452:   tr end  return             561e212c42e5 main+0x27 =>     7fb51cc29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    21860.073683659:   tr strt                               0 [unknown] =>     561e212c42be main+0x0
    21860.073683659:   tr end  call               561e212c42c6 main+0x8 =>     561e212c4110 fn4@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42cb main+0xd
    21860.073683661:   tr end  call               561e212c42cb main+0xd =>     561e212c40f0 fn1@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42d0 main+0x12
    21860.073683661:   tr end  call               561e212c42d0 main+0x12 =>     561e212c40d0 thing_ifunc@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42d5 main+0x17
    21860.073698451:   tr end  call               561e212c42d5 main+0x17 =>     561e212c4120 fn2@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42da main+0x1c
    21860.073698451:   tr end  call               561e212c42da main+0x1c =>     561e212c4100 fn3@plt+0x0
    21860.073698452:   tr strt                               0 [unknown] =>     561e212c42df main+0x21
    21860.073698452:   tr end  return             561e212c42e5 main+0x27 =>     7fb51cc29d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:50:52 -03:00
Adrian Hunter 05963491c0 perf symbols: Record whether a symbol is an alias for an IFUNC symbol
To assist with synthesizing plt symbols for IFUNCs, record whether a
symbol is an alias of an IFUNC symbol.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:49:29 -03:00
Adrian Hunter 78250284b1 perf symbols: Sort plt relocations for x86
For x86, with the addition of IFUNCs, relocation information becomes
disordered with respect to plt. Correct that by sorting the relocations by
offset.

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            thing();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -Wno-uninitialized -o tstpltifunc tstpltifunc.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -rW tstpltifunc | grep -A99 plt
    Relocation section '.rela.plt' at offset 0x738 contains 8 entries:
        Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
    0000000000003f98  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
    0000000000003fa8  0000000400000007 R_X86_64_JUMP_SLOT     0000000000000000 __stack_chk_fail@GLIBC_2.4 + 0
    0000000000003fb0  0000000500000007 R_X86_64_JUMP_SLOT     0000000000000000 fn1 + 0
    0000000000003fb8  0000000600000007 R_X86_64_JUMP_SLOT     0000000000000000 fn3 + 0
    0000000000003fc0  0000000800000007 R_X86_64_JUMP_SLOT     0000000000000000 fn4 + 0
    0000000000003fc8  0000000900000007 R_X86_64_JUMP_SLOT     0000000000000000 fn2 + 0
    0000000000003fd0  0000000b00000007 R_X86_64_JUMP_SLOT     0000000000000000 getrandom@GLIBC_2.25 + 0
    0000000000003fa0  0000000000000025 R_X86_64_IRELATIVE                        125d
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltifunc' ./tstpltifunc
    thing2
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.029 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    20417.302513948:   tr strt                               0 [unknown] =>     5629a74892be main+0x0
    20417.302513948:   tr end  call               5629a74892c6 main+0x8 =>     5629a7489110 fn2@plt+0x0
    20417.302513949:   tr strt                               0 [unknown] =>     5629a74892cb main+0xd
    20417.302513949:   tr end  call               5629a74892cb main+0xd =>     5629a74890f0 fn3@plt+0x0
    20417.302513950:   tr strt                               0 [unknown] =>     5629a74892d0 main+0x12
    20417.302513950:   tr end  call               5629a74892d0 main+0x12 =>     5629a74890d0 __stack_chk_fail@plt+0x0
    20417.302528114:   tr strt                               0 [unknown] =>     5629a74892d5 main+0x17
    20417.302528114:   tr end  call               5629a74892d5 main+0x17 =>     5629a7489120 getrandom@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892da main+0x1c
    20417.302528115:   tr end  call               5629a74892da main+0x1c =>     5629a7489100 fn4@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892df main+0x21
    20417.302528115:   tr end  return             5629a74892e5 main+0x27 =>     7ff14da29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    20417.302513948:   tr strt                               0 [unknown] =>     5629a74892be main+0x0
    20417.302513948:   tr end  call               5629a74892c6 main+0x8 =>     5629a7489110 fn4@plt+0x0
    20417.302513949:   tr strt                               0 [unknown] =>     5629a74892cb main+0xd
    20417.302513949:   tr end  call               5629a74892cb main+0xd =>     5629a74890f0 fn1@plt+0x0
    20417.302513950:   tr strt                               0 [unknown] =>     5629a74892d0 main+0x12
    20417.302513950:   tr end  call               5629a74892d0 main+0x12 =>     5629a74890d0 offset_0x10d0@plt+0x0
    20417.302528114:   tr strt                               0 [unknown] =>     5629a74892d5 main+0x17
    20417.302528114:   tr end  call               5629a74892d5 main+0x17 =>     5629a7489120 fn2@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892da main+0x1c
    20417.302528115:   tr end  call               5629a74892da main+0x1c =>     5629a7489100 fn3@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892df main+0x21
    20417.302528115:   tr end  return             5629a74892e5 main+0x27 =>     7ff14da29d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:49:05 -03:00
Adrian Hunter b2529f829a perf symbols: Add support for x86 .plt.sec
The section .plt.sec was originally added for MPX and was first called
.plt.bnd. While MPX has been deprecated, .plt.sec is now also used for
IBT.  On x86_64, IBT may be enabled by default, but can be switched off
using gcc option -fcf-protection=none, or switched on by -z ibt or -z
ibtplt. On 32-bit, option -z ibt or -z ibtplt will enable IBT.

With .plt.sec, calls are made into .plt.sec instead of .plt, so it makes
more sense to put the symbols there instead of .plt. A notable
difference is that .plt.sec does not have a header entry.

For x86, when synthesizing symbols for plt, use offset and entry size of
.plt.sec instead of .plt when there is a .plt.sec section.

Example on Ubuntu 22.04 gcc 11.3:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstplt.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -z ibt -o tstplt tstplt.c -L . -ltstpltlib -Wl,-rpath=$(pwd)
    $ readelf -SW tstplt | grep 'plt\|Name'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [11] .rela.plt         RELA            0000000000000698 000698 000060 18  AI  6  24  8
      [13] .plt              PROGBITS        0000000000001020 001020 000050 10  AX  0   0 16
      [14] .plt.got          PROGBITS        0000000000001070 001070 000010 10  AX  0   0 16
      [15] .plt.sec          PROGBITS        0000000000001080 001080 000040 10  AX  0   0 16
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstplt' ./tstplt
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.015 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    38970.522546686:   tr strt                               0 [unknown] =>     55fc222a81a9 main+0x0
    38970.522546686:   tr end  call               55fc222a81b1 main+0x8 =>     55fc222a80a0 [unknown]
    38970.522546687:   tr strt                               0 [unknown] =>     55fc222a81b6 main+0xd
    38970.522546687:   tr end  call               55fc222a81b6 main+0xd =>     55fc222a8080 [unknown]
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81bb main+0x12
    38970.522546688:   tr end  call               55fc222a81bb main+0x12 =>     55fc222a80b0 [unknown]
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81c0 main+0x17
    38970.522546688:   tr end  call               55fc222a81c0 main+0x17 =>     55fc222a8090 [unknown]
    38970.522546689:   tr strt                               0 [unknown] =>     55fc222a81c5 main+0x1c
    38970.522546894:   tr end  return             55fc222a81cb main+0x22 =>     7f3a4dc29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    38970.522546686:   tr strt                               0 [unknown] =>     55fc222a81a9 main+0x0
    38970.522546686:   tr end  call               55fc222a81b1 main+0x8 =>     55fc222a80a0 fn4@plt+0x0
    38970.522546687:   tr strt                               0 [unknown] =>     55fc222a81b6 main+0xd
    38970.522546687:   tr end  call               55fc222a81b6 main+0xd =>     55fc222a8080 fn1@plt+0x0
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81bb main+0x12
    38970.522546688:   tr end  call               55fc222a81bb main+0x12 =>     55fc222a80b0 fn2@plt+0x0
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81c0 main+0x17
    38970.522546688:   tr end  call               55fc222a81c0 main+0x17 =>     55fc222a8090 fn3@plt+0x0
    38970.522546689:   tr strt                               0 [unknown] =>     55fc222a81c5 main+0x1c
    38970.522546894:   tr end  return             55fc222a81cb main+0x22 =>     7f3a4dc29d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:48:31 -03:00
Adrian Hunter 66fe2d53a0 perf symbols: Correct plt entry sizes for x86
In 32-bit executables the .plt entry size can be set to 4 when it is really
16. In fact the only sizes used for x86 (32 or 64 bit) are 8 or 16, so
check for those and, if not, use the alignment to choose which it is.

Example on Ubuntu 22.04 gcc 11.3:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstplt.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -m32 -Wall -Wextra -shared -o libtstpltlib32.so tstpltlib.c
    $ gcc -m32 -Wall -Wextra -o tstplt32 tstplt.c -L . -ltstpltlib32 -Wl,-rpath=$(pwd)
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstplt32' ./tstplt32
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.011 MB perf.data ]
    $ readelf -SW tstplt32 | grep 'plt\|Name'
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [10] .rel.plt          REL             0000041c 00041c 000028 08  AI  5  22  4
      [12] .plt              PROGBITS        00001030 001030 000060 04  AX  0   0 16   <- ES is 0x04, should be 0x10
      [13] .plt.got          PROGBITS        00001090 001090 000008 08  AX  0   0  8
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    17894.383903029:   tr strt                               0 [unknown] =>         565b81cd main+0x0
    17894.383903029:   tr end  call                   565b81d4 main+0x7 =>         565b80d0 __x86.get_pc_thunk.bx+0x0
    17894.383903031:   tr strt                               0 [unknown] =>         565b81d9 main+0xc
    17894.383903031:   tr end  call                   565b81df main+0x12 =>         565b8070 [unknown]
    17894.383903032:   tr strt                               0 [unknown] =>         565b81e4 main+0x17
    17894.383903032:   tr end  call                   565b81e4 main+0x17 =>         565b8050 [unknown]
    17894.383903033:   tr strt                               0 [unknown] =>         565b81e9 main+0x1c
    17894.383903033:   tr end  call                   565b81e9 main+0x1c =>         565b8080 [unknown]
    17894.383903033:   tr strt                               0 [unknown] =>         565b81ee main+0x21
    17894.383903033:   tr end  call                   565b81ee main+0x21 =>         565b8060 [unknown]
    17894.383903237:   tr strt                               0 [unknown] =>         565b81f3 main+0x26
    17894.383903237:   tr end  return                 565b81fc main+0x2f =>         f7c21519 [unknown]

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    17894.383903029:   tr strt                               0 [unknown] =>         565b81cd main+0x0
    17894.383903029:   tr end  call                   565b81d4 main+0x7 =>         565b80d0 __x86.get_pc_thunk.bx+0x0
    17894.383903031:   tr strt                               0 [unknown] =>         565b81d9 main+0xc
    17894.383903031:   tr end  call                   565b81df main+0x12 =>         565b8070 fn4@plt+0x0
    17894.383903032:   tr strt                               0 [unknown] =>         565b81e4 main+0x17
    17894.383903032:   tr end  call                   565b81e4 main+0x17 =>         565b8050 fn1@plt+0x0
    17894.383903033:   tr strt                               0 [unknown] =>         565b81e9 main+0x1c
    17894.383903033:   tr end  call                   565b81e9 main+0x1c =>         565b8080 fn2@plt+0x0
    17894.383903033:   tr strt                               0 [unknown] =>         565b81ee main+0x21
    17894.383903033:   tr end  call                   565b81ee main+0x21 =>         565b8060 fn3@plt+0x0
    17894.383903237:   tr strt                               0 [unknown] =>         565b81f3 main+0x26
    17894.383903237:   tr end  return                 565b81fc main+0x2f =>         f7c21519 [unknown]

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:47:46 -03:00
Athira Rajeev 766b0beedb perf tests shell: Fix check for libtracevent support
Test “Use vfs_getname probe to get syscall args filenames” fails in
environment with missing libtraceevent support as below:

  82: Use vfs_getname probe to get syscall args filenames             :
  --- start ---
  test child forked, pid 304726
  Recording open file:
  event syntax error: 'probe:vfs_getname*'
                       \___ unsupported tracepoint

  libtraceevent is necessary for tracepoint support
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  test child finished with -1
  ---- end ----
  Use vfs_getname probe to get syscall args filenames: FAILED!

The environment has debuginfo but is missing the libtraceevent devel.

Hence perf is compiled without libtraceevent support.  The test tries to
add probe “probe:vfs_getname” and then uses it with “perf record”.  This
fails at function “parse_events_add_tracepoint" due to missing
libtraceevent.

Similarly "probe libc's inet_pton & backtrace it with ping" test slso
fails with same reason.

Add a function in 'perf test shell' library to check if perf record with
—dry-run reports any error on missing support for libtraceevent. Update
both the tests to use this new function “skip_no_probe_record_support”
before proceeding With using probe point via perf builtin record.

With the change,

  82: Use vfs_getname probe to get syscall args filenames             :
  --- start ---
  test child forked, pid 305014
  Recording open file:
  libtraceevent is necessary for tracepoint support
  test child finished with -2
  ---- end ----
  Use vfs_getname probe to get syscall args filenames: Skip

   81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 305036
  libtraceevent is necessary for tracepoint support
  test child finished with -2
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Skip

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kjain@linux.ibm.com,
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:44:26 -03:00
Athira Rajeev 84cce3d60c perf tests shell: Add check for perf data file in record+probe_libc_inet_pton test
The "probe libc's inet_pton & backtrace it with ping" test installs a
uprobe and uses perf record/script to check the backtrace. Currently
even if the "perf record" fails, the test reports success. Logs below:

  # ./perf test -v "probe libc's inet_pton & backtrace it with ping"
  81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 304211
  failed to open /tmp/perf.data.Btf: No such file or directory
  test child finished with 0
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Ok

Fix this by adding check for presence of perf.data file
before proceeding with "perf script".

With the patch changes, test reports fail correctly.

 # ./perf test -v "probe libc's inet_pton & backtrace it with ping"
 81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 304358
  FAIL: perf record failed to create "/tmp/perf.data.Uoi"
  test child finished with -1
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: FAILED!

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:44:18 -03:00
Namhyung Kim e072b097d2 perf test: Add pipe mode test to the Intel PT test suite
The test_pipe() function will check perf report and perf inject with
pipe input.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:31:15 -03:00
Namhyung Kim 14bf478441 perf session: Avoid calling lseek(2) for pipe
We should not call lseek(2) for pipes as it won't work.  And we already
in the proper place to read the data for AUXTRACE.  Add the comment like
in the PERF_RECORD_HEADER_TRACING_DATA.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:31:04 -03:00
Namhyung Kim aeb802f872 perf intel-pt: Do not try to queue auxtrace data on pipe
When it processes AUXTRACE_INFO, it calls to auxtrace_queue_data() to
collect AUXTRACE data first.  That won't work with pipe since it needs
lseek() to read the scattered aux data.

  $ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100
  # To display the perf.data header info, please use --header/--header-only options.
  #
  0x4118 [0xa0]: failed to process type: 70
  Error:
  failed to process sample

For the pipe mode, it can handle the aux data as it gets.  But there's
no guarantee it can get the aux data in time.  So the following warning
will be shown at the beginning:

  WARNING: Intel PT with pipe mode is not recommended.
           The output cannot relied upon.  In particular,
           time stamps and the order of events may be incorrect.

Fixes: dbd134322e ("perf intel-pt: Add support for decoding AUX area samples")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:30:05 -03:00
Emanuele Giuseppe Esposito eb98192576 KVM: selftests: Verify APIC_ID is set when forcing x2APIC=>xAPIC transition
Add a sub-test to verify that KVM stuffs the APIC_ID when userspace forces
a transition from x2APIC to xAPIC without first disabling the APIC.  Such
a transition is architecturally disallowed (WRMSR will #GP), but needs to
be handled by KVM to allow userspace to emulate RESET (ignoring that
userspace should also stuff local APIC state on RESET).

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Link: https://lore.kernel.org/r/20230109130605.2013555-3-eesposit@redhat.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-01 16:22:54 -08:00
Jesper Dangaard Brouer e8a3c8bd68 selftests/bpf: xdp_hw_metadata use strncpy for ifname
The ifname char pointer is taken directly from the command line
as input and the string is copied directly into struct ifreq
via strcpy. This makes it easy to corrupt other members of ifreq
and generally do stack overflows.

Most often the ioctl will fail with:

 ./xdp_hw_metadata: ioctl(SIOCETHTOOL): Bad address

As people will likely copy-paste code for getting NIC queue
channels (rxq_num) and enabling HW timestamping (hwtstamp_ioctl)
lets make this code a bit more secure by using strncpy.

Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527272543.937063.16993147790832546209.stgit@firesoul
2023-02-02 00:49:04 +01:00
Jesper Dangaard Brouer 7bd4224dee selftests/bpf: xdp_hw_metadata correct status value in error(3)
The glibc error reporting function error():

 void error(int status, int errnum, const char *format, ...);

The status argument should be a positive value between 0-255 as it
is passed over to the exit(3) function as the value as the shell exit
status. The least significant byte of status (i.e., status & 0xFF) is
returned to the shell parent.

Fix this by using 1 instead of -1. As 1 corresponds to C standard
constant EXIT_FAILURE.

Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527272038.937063.9137108142012298120.stgit@firesoul
2023-02-02 00:48:49 +01:00
Jesper Dangaard Brouer a19a62e564 selftests/bpf: xdp_hw_metadata cleanup cause segfault
Using xdp_hw_metadata I experince Segmentation fault after seeing
"detaching bpf program....".

On my system the segfault happened when accessing bpf_obj->skeleton
in xdp_hw_metadata__destroy(bpf_obj) call. That doesn't make any sense
as this memory have not been freed by program at this point in time.

Prior to calling xdp_hw_metadata__destroy(bpf_obj) the function
close_xsk() is called for each RX-queue xsk.  The real bug lays
in close_xsk() that unmap via munmap() the wrong memory pointer.
The call xsk_umem__delete(xsk->umem) will free xsk->umem, thus
the call to munmap(xsk->umem, UMEM_SIZE) will have unpredictable
behavior. And man page explain subsequent references to these
pages will generate SIGSEGV.

Unmapping xsk->umem_area instead removes the segfault.

Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527271533.937063.5717065138099679142.stgit@firesoul
2023-02-02 00:48:22 +01:00
Jesper Dangaard Brouer 3fd9dcd689 selftests/bpf: xdp_hw_metadata clear metadata when -EOPNOTSUPP
The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of
the availability of rx_timestamp and rx_hash in data_meta area. The
kernel-side BPF-prog code doesn't initialize these members when kernel
returns an error e.g. -EOPNOTSUPP.  This memory area is not guaranteed to
be zeroed, and can contain garbage/previous values, which will be read
and interpreted by AF_XDP userspace side.

Tested this on different drivers. The experiences are that for most
packets they will have zeroed this data_meta area, but occasionally it
will contain garbage data.

Example of failure tested on ixgbe:

 poll: 1 (0)
 xsk_ring_cons__peek: 1
 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
 rx_hash: 3697961069
 rx_timestamp:  9024981991734834796 (sec:9024981991.7348)
 0x18ec788: complete idx=8 addr=8000

Converting to date:

 date -d @9024981991
 2255-12-28T20:26:31 CET

I choose a simple fix in this patch. When kfunc fails or isn't supported
assign zero to the corresponding struct meta value.

It's up to the individual BPF-programmer to do something smarter e.g.
that fits their use-case, like getting a software timestamp and marking
a flag that gives the type of timestamp.

Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527271027.937063.5177725618616476592.stgit@firesoul
2023-02-02 00:48:07 +01:00
Jesper Dangaard Brouer f2922f77a6 selftests/bpf: Fix unmap bug in prog_tests/xdp_metadata.c
The function close_xsk() unmap via munmap() the wrong memory pointer.
The call xsk_umem__delete(xsk->umem) have already freed xsk->umem.
Thus the call to munmap(xsk->umem, UMEM_SIZE) will have unpredictable
behavior that can lead to Segmentation fault elsewhere, as man page
explain subsequent references to these pages will generate SIGSEGV.

Fixes: e2a46d54d7 ("selftests/bpf: Verify xdp_metadata xdp->af_xdp path")
Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527517464.938135.13750760520577765269.stgit@firesoul
2023-02-02 00:45:57 +01:00
David Vernet 6aed15e330 selftests/bpf: Add testcase for static kfunc with unused arg
kfuncs are allowed to be static, or not use one or more of their
arguments. For example, bpf_xdp_metadata_rx_hash() in net/core/xdp.c is
meant to be implemented by drivers, with the default implementation just
returning -EOPNOTSUPP. As described in [0], such kfuncs can have their
arguments elided, which can cause BTF encoding to be skipped. The new
__bpf_kfunc macro should address this, and this patch adds a selftest
which verifies that a static kfunc with at least one unused argument can
still be encoded and invoked by a BPF program.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230201173016.342758-5-void@manifault.com
2023-02-02 00:25:14 +01:00
David Vernet 400031e05a bpf: Add __bpf_kfunc tag to all kfuncs
Now that we have the __bpf_kfunc tag, we should use add it to all
existing kfuncs to ensure that they'll never be elided in LTO builds.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20230201173016.342758-4-void@manifault.com
2023-02-02 00:25:14 +01:00
Vipin Sharma 6032526123 KVM: selftests: Test Hyper-V extended hypercall exit to userspace
Hyper-V extended hypercalls by default exit to userspace. Verify
userspace gets the call, update the result and then verify in guest
correct result is received.

Add KVM_EXIT_HYPERV to list of "known" hypercalls so errors generate
pretty strings.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20221212183720.4062037-14-vipinsh@google.com
[sean: add KVM_EXIT_HYPERV to exit_reasons_known]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-01 14:31:27 -08:00
Namhyung Kim 1746212dae perf inject: Use perf_data__read() for auxtrace
In copy_bytes(), it reads the data from the (input) fd and writes it to
the output file.  But it does with the read(2) unconditionally which
caused a problem of mixing buffered vs unbuffered I/O together.

You can see the problem when using pipes.

  $ perf record -e intel_pt// -o- true | perf inject -b > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  0x45c0 [0x30]: failed to process type: 71

It should use perf_data__read() to honor the 'use_stdio' setting.

Fixes: 601366678c ("perf data: Allow to use stdio functions for pipe mode")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 19:22:07 -03:00
Vipin Sharma f65092015a KVM: selftests: Replace hardcoded Linux OS id with HYPERV_LINUX_OS_ID
Use HYPERV_LINUX_OS_ID macro instead of hardcoded 0x8100 << 48

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20221212183720.4062037-12-vipinsh@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-01 13:47:56 -08:00
Vipin Sharma c4a46627e5 KVM: selftests: Test Hyper-V extended hypercall enablement
Test Hyper-V extended hypercall, HV_EXT_CALL_QUERY_CAPABILITIES
(0x8001), access denied and invalid parameter cases.

Access is denied if CPUID.0x40000003.EBX BIT(20) is not set.
Invalid parameter if call has fast bit set.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Link: https://lore.kernel.org/r/20221212183720.4062037-11-vipinsh@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-02-01 13:46:24 -08:00
Linus Torvalds 9f266ccaa2 virtio,vhost,vdpa: fixes
Just small bugfixes all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmPTs30PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpjzIIAMxHPX4KamjTQ0C9A4uPoWbdLAJfft6GpjK0
 To2aZSwv48Evjk9eilZIU7QOmU5vAYiyvK27MMUZQ74W/1maZUWGF3Hu3g2TwBNj
 qcxNKIZu9eurrEvYSgIAbH2R4al3WhdQjF+h/ZUmYO2RM//X64a2LM6n2p6aFZjs
 yBnR3LOiKC7TIkO7YEVyiIQT3ayulu3viLeL0d2HsQJKDQ60YZX1NDBOXgY0UsIq
 gG50l3Nr3RA7CgeJyWOB1x9/8DouXbg9uaHu6WR1+qTwFSW++Z07iz3V/ZYzPmNX
 1GaeJUNhOIR5F4DvhND3Me1WcuaE28j6Flu8SECYNgnV8WG4lSg=
 =EjKi
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Just small bugfixes all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa: ifcvf: Do proper cleanup if IFCVF init fails
  vhost-scsi: unbreak any layout for response
  tools/virtio: fix the vringh test for virtio ring changes
  vhost/net: Clear the pending messages when the backend is removed
2023-02-01 10:31:53 -08:00
Mark Brown a7db82f18c kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA
The current signal handling tests for SME do not account for the fact that
unlike SVE all SME vector lengths are optional so we can't guarantee that
we will encounter the minimum possible VL, they will hang enumerating VLs
on such systems. Abort enumeration when we find the lowest VL in the newly
added ssve_za_regs test.

Fixes: bc69da5ff0 ("kselftest/arm64: Verify simultaneous SSVE and ZA context generation")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-2-d47c13dc8e1e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-01 17:21:33 +00:00
Mark Brown 5f38923853 kselftest/arm64: Fix enumeration of systems without 128 bit SME
The current signal handling tests for SME do not account for the fact that
unlike SVE all SME vector lengths are optional so we can't guarantee that
we will encounter the minimum possible VL, they will hang enumerating VLs
on such systems. Abort enumeration when we find the lowest VL.

Fixes: 4963aeb35a ("kselftest/arm64: signal: Add SME signal handling tests")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-1-d47c13dc8e1e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-01 17:21:33 +00:00
Mark Brown 4365eec819 kselftest/arm64: Don't require FA64 for streaming SVE tests
During early development a dependedncy was added on having FA64
available so we could use the full FPSIMD register set in the signal
handler.  Subsequently the ABI was finialised so the handler is run with
streaming mode disabled meaning this is redundant but the dependency was
never removed, do so now.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselfetest-ssve-fa64-v1-1-f418efcc2b60@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-02-01 17:19:36 +00:00
Petr Machata bd32ff6872 selftests: net: forwarding: lib: Drop lldpad_app_wait_set(), _del()
The existing users of these helpers have been converted to iproute2 dcb.
Drop the helpers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 21:02:11 -08:00
Petr Machata 5b3ef0452c selftests: mlxsw: qos_defprio: Convert from lldptool to dcb
Set up default port priority through the iproute2 dcb tool, which is easier
to understand and manage.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 21:02:11 -08:00
Petr Machata 10d5bd0b69 selftests: mlxsw: qos_dscp_router: Convert from lldptool to dcb
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 21:02:11 -08:00
Petr Machata 1680801ef6 selftests: mlxsw: qos_dscp_bridge: Convert from lldptool to dcb
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 21:02:11 -08:00
Jakub Kicinski 981cbcb030 tools: net: use python3 explicitly
The scripts require Python 3 and some distros are dropping
Python 2 support.

Reported-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 5c6674f6eb tools: ynl: load jsonschema on demand
The CLI script tries to validate jsonschema by default.
It's seems better to validate too many times than too few.
However, when copying the scripts to random servers having
to install jsonschema is tedious. Load jsonschema via
importlib, and let the user opt out.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 8dfec0a888 tools: ynl: use operation names from spec on the CLI
When I wrote the first version of the Python code I was quite
excited that we can generate class methods directly from the
spec. Unfortunately we need to use valid identifiers for method
names (specifically no dashes are allowed). Don't reuse those
names on the CLI, it's much more natural to use the operation
names exactly as listed in the spec.

Instead of:
  ./cli --do rings_get
use:
  ./cli --do rings-get

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 4cd2796f3f tools: ynl: support pretty printing bad attribute names
One of my favorite features of the Netlink specs is that they
make decoding structured extack a ton easier.
Implement pretty printing bad attribute names in YNL.

For example it will now say:

  'bad-attr': '.header.flags'

rather than the useless:

  'bad-attr-offs': 32

Proof:

  $ ./cli.py --spec ethtool.yaml --do rings_get \
     --json '{"header":{"dev-index":1, "flags":4}}'
  Netlink error: Invalid argument
  nl_len = 68 (52) nl_flags = 0x300 nl_type = 2
	error: -22	extack: {'msg': 'reserved bit set',
				 'bad-attr': '.header.flags'}

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 90256f3f80 tools: ynl: support multi-attr
Ethtool uses mutli-attr, add the support to YNL.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski fd0616d342 tools: ynl: support directional enum-model in CLI
Support families which use different IDs for messages
to and from the kernel.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 19b64b48a3 tools: ynl: add support for types needed by ethtool
Ethtool needs support for handful of extra types.
It doesn't have the definitions section yet.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 30a5c6c810 tools: ynl: use the common YAML loading and validation code
Adapt the common object hierarchy in code gen and CLI.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 3aacf82813 tools: ynl: add an object hierarchy to represent parsed spec
There's a lot of copy and pasting going on between the "cli"
and code gen when it comes to representing the parsed spec.
Create a library which both can use.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski 4e4480e89c tools: ynl: move the cli and netlink code around
Move the CLI code out of samples/ and the library part
of it into tools/net/ynl/lib/. This way we can start
sharing some code with the code gen.

Initially I thought that code gen is too C-specific to
share anything but basic stuff like calculating values
for enums can easily be shared.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Jakub Kicinski eaf317e7d2 tools: ynl-gen: prevent do / dump reordering
An earlier fix tried to address generated code jumping around
one code-gen run to another. Turns out dict()s are already
ordered since Python 3.7, the problem is that we iterate over
operation modes using a set(). Sets are unordered in Python.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-31 20:36:03 -08:00
Andrew Morton 5ab0fc155d Sync mm-stable with mm-hotfixes-stable to pick up dependent patches
Merge branch 'mm-hotfixes-stable' into mm-stable
2023-01-31 17:25:17 -08:00
Andreas Ziegler 1fab1469b6 tools/tracing/rtla: osnoise_hist: display average with two-digit precision
Calculate average value in osnoise-hist summary with two-digit
precision to avoid displaying too optimitic results.

Link: https://lkml.kernel.org/r/20230103103400.275566-3-br015@umbiko.net

Signed-off-by: Andreas Ziegler <br015@umbiko.net>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-31 19:23:36 -05:00
Andreas Ziegler fe137a4fe0 tools/tracing/rtla: osnoise_hist: use total duration for average calculation
Sampled durations must be weighted by observed quantity, to arrive at a correct
average duration value.

Perform calculation of total duration by summing (duration * count).

Link: https://lkml.kernel.org/r/20230103103400.275566-2-br015@umbiko.net

Fixes: 829a6c0b56 ("rtla/osnoise: Add the hist mode")

Signed-off-by: Andreas Ziegler <br015@umbiko.net>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-31 19:23:36 -05:00
zhang songyi a37380ef8b tools/rv: Remove unneeded semicolon
The semicolon after the "}" is unneeded.

Link: https://lore.kernel.org/linux-trace-devel/202212191431057948891@zte.com.cn

Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-31 19:21:12 -05:00
Waiman Long e5ae880384 cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask()
It was found that the check to see if a partition could use up all
the cpus from the parent cpuset in update_parent_subparts_cpumask()
was incorrect. As a result, it is possible to leave parent with no
effective cpu left even if there are tasks in the parent cpuset. This
can lead to system panic as reported in [1].

Fix this probem by updating the check to fail the enabling the partition
if parent's effective_cpus is a subset of the child's cpus_allowed.

Also record the error code when an error happens in update_prstate()
and add a test case where parent partition and child have the same cpu
list and parent has task. Enabling partition in the child will fail in
this case.

[1] https://www.spinics.net/lists/cgroups/msg36254.html

Fixes: f0af1bfc27 ("cgroup/cpuset: Relax constraints to partition & cpus changes")
Cc: stable@vger.kernel.org # v6.1
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-01-31 12:14:02 -10:00
Mark Brown b2ab432bcf kselftest/arm64: Remove redundant _start labels from zt-test
The newly added zt-test program copied the pattern from the other FP
stress test programs of having a redundant _start label which is
rejected by clang, as we did in a parallel series for the other tests
remove the label so we can build with clang.

No functional change.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230130-arm64-fix-sme2-clang-v1-1-3ce81d99ea8f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31 16:02:13 +00:00
Mark Brown 89ff30b9b7 kselftest/arm64: Limit the maximum VL we try to set via ptrace
When SVE was initially merged we chose to export the maximum VQ in the ABI
as being 512, rather more than the architecturally supported maximum of 16.
For the ptrace tests this results in us generating a lot of test cases and
hence log output which are redundant since a system couldn't possibly
support them. Instead only check values up to the current architectural
limit, plus one more so that we're covering the constraining of higher
vector lengths.

This makes no practical difference to our test coverage, speeds things up
on slower consoles and makes the output much more managable.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111-arm64-kselftest-ptrace-max-vl-v1-1-8167f41d1ad8@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-31 15:33:23 +00:00
Ingo Molnar 57a30218fa Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-rc6' into sched/core, to pick up fixes

Pick up fixes before merging another batch of cpuidle updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31 15:01:20 +01:00
Mathieu Desnoyers 145df2fdc3 selftests: core: Fix incorrect kernel headers search path
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org> # 5.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-01-30 15:04:52 -07:00
Mathieu Desnoyers 612cf4d283 selftests: clone3: Fix incorrect kernel headers search path
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>  # 5.18+
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-01-30 15:01:52 -07:00
Mathieu Desnoyers 7482c19173 selftests: arm64: Fix incorrect kernel headers search path
Use $(KHDR_INCLUDES) as lookup path for kernel headers. This prevents
building against kernel headers from the build environment in scenarios
where kernel headers are installed into a specific output directory
(O=...).

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: <stable@vger.kernel.org>  # 5.18+
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-01-30 15:00:39 -07:00
Mike Leach c6535b6ba9 perf cs-etm: Update decoder code for OpenCSD version 1.4
OpenCSD version 1.4 is released with support for FEAT_ITE.

This adds a new packet type, with associated output element ID in the
packet type enum - OCSD_GEN_TRC_ELEM_INSTRUMENTATION.

As we just ignore this packet in perf, add to the switch statement to
avoid the "enum not handled in switch error", but conditionally so as
not to break the perf build for older OpenCSD installations.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120153706.20388-1-mike.leach@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-30 14:54:13 -03:00
Naveen N. Rao dfadf8b315 perf test: Fix DWARF unwind test by adding non-inline to expected function in a backtrace
'DWARF unwind' 'perf test' can sometimes fail:

  $ perf test -v 74
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
   74: Test dwarf unwind                                               :
  --- start ---
  test child forked, pid 3785254
  Problems creating module maps, continuing anyway...
  Problems creating module maps, continuing anyway...
  unwind: test__arch_unwind_sample:ip = 0x102d0ad4c (0x36ad4c)
  unwind: access_mem addr 0x7fffc33128c8, val 1031c3228, offset 120
  unwind: access_mem addr 0x7fffc33128d0, val 12427cc70, offset 128
  <snip>
  unwind: test_dwarf_unwind__krava_3:ip = 0x102b8768b (0x1e768b)
  unwind: access_mem addr 0x7fffc3313048, val 7fffc3313050, offset 2040
  unwind: access_mem addr 0x7fffc3313060, val 102b8777c, offset 2064
  unwind: test_dwarf_unwind__krava_2:ip = 0x102b8770b (0x1e770b)
  unwind: access_mem addr 0x7fffc3313088, val 7fffc3313090, offset 2104
  unwind: access_mem addr 0x7fffc33130a0, val 102b87890, offset 2128
  unwind: test_dwarf_unwind__krava_1:ip = 0x102b8777b (0x1e777b)
  unwind: access_mem addr 0x7fffc3313108, val 10323a274, offset 2232
  unwind: access_mem addr 0x7fffc3313110, val ffffffffffffffff, offset 2240
  unwind: access_mem addr 0x7fffc3313118, val 102c08ed0, offset 2248
  unwind: access_mem addr 0x7fffc3313120, val 1031db000, offset 2256
  unwind: access_mem addr 0x7fffc3313128, val 7fffc3313130, offset 2264
  unwind: access_mem addr 0x7fffc3313140, val 102b45ee8, offset 2288
  unwind: '':ip = 0x102b8788f (0x1e788f)
  failed: got unresolved address 0x102b8788f
  unwind: failed with 'no error'
  got wrong number of stack entries 0 != 8
  test child finished with -1
  ---- end ----
  Test dwarf unwind: FAILED!

We expect to resolve test__dwarf_unwind as the last symbol, but that
function can be optimized away:

  $ objdump -tT /usr/bin/perf | grep dwarf_unwind
  000000000083b018 g    DO .data	0000000000000040  Base        tests__dwarf_unwind
  00000000001e7750 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_1
  00000000001e76e0 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_2
  00000000001e7620 g    DF .text	00000000000000b4  Base        0x60 test_dwarf_unwind__krava_3
  00000000001e74f0 g    DF .text	0000000000000128  Base        0x60 test_dwarf_unwind__compare
  00000000001e7350 g    DF .text	000000000000019c  Base        0x60 test_dwarf_unwind__thread
  000000000083b000 g    DO .data	0000000000000018  Base        suite__dwarf_unwind

Fix this similar to commit fdf7c49c20 ("perf tests: Fix dwarf
unwind for stripped binaries") by marking the function as a global and
adding the 'noinline' attribute to it.

With this patch:

  $ objdump -tT perf | grep dwarf_unwind
  000000000083b018 g    DO .data	0000000000000040  Base        tests__dwarf_unwind
  00000000001e80f0 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_1
  00000000001e8080 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_2
  00000000001e7fc0 g    DF .text	00000000000000b4  Base        0x60 test_dwarf_unwind__krava_3
  00000000001e7e90 g    DF .text	0000000000000128  Base        0x60 test_dwarf_unwind__compare
  00000000001e7cf0 g    DF .text	000000000000019c  Base        0x60 test_dwarf_unwind__thread
  00000000001e8160 g    DF .text	0000000000000248  Base        0x60 test__dwarf_unwind
  000000000083b000 g    DO .data	0000000000000018  Base        suite__dwarf_unwind
  $ ./perf test 74
   74: Test dwarf unwind                                               : Ok

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20230125123442.107156-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-30 14:54:13 -03:00
Michael Schmitz be6c50d315 selftests/seccomp: Add m68k support
Add m68k seccomp definitions to seccomp_bpf self test code.

Tested on ARAnyM.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230112035529.13521-4-schmitzmic@gmail.com
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2023-01-30 16:40:15 +01:00
Mark Brown 68efe8f7a1 KVM: selftests: Fix build of rseq test
The KVM rseq test is failing to build in -next due to a commit merged
from the tip tree which adds a wrapper for sys_getcpu() to the rseq
kselftests, conflicting with the wrapper already included in the KVM
selftest:

rseq_test.c:48:13: error: conflicting types for 'sys_getcpu'
   48 | static void sys_getcpu(unsigned *cpu)
          |             ^~~~~~~~~~
In file included from rseq_test.c:23:
../rseq/rseq.c:82:12: note: previous definition of 'sys_getcpu' was here
   82 | static int sys_getcpu(unsigned *cpu, unsigned *node)
          |            ^~~~~~~~~~

Fix this by removing the local wrapper and moving the result check up to
the caller.

Fixes: 99babd04b2 ("selftests/rseq: Implement rseq numa node id field selftest")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/r/20230106-fix-kvm-rseq-build-v1-1-b704d9831d02@kernel.org
2023-01-30 16:29:45 +01:00
Ilya Leoshkevich ee105d5a50 selftests/bpf: Trim DENYLIST.s390x
Now that trampoline is implemented, enable a number of tests on s390x.
18 of the remaining failures have to do with either lack of rethook
(fixed by [1]) or syscall symbols missing from BTF (fixed by [2]).

Do not re-classify the remaining failures for now; wait until the
s390/for-next fixes are merged and re-classify only the remaining few.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=for-next&id=1a280f48c0e403903cf0b4231c95b948e664f25a
[2] https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git/commit/?h=for-next&id=2213d44e140f979f4b60c3c0f8dd56d151cc8692

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-9-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29 19:16:29 -08:00
Ilya Leoshkevich af320fb7dd selftests/bpf: Fix s390x vmlinux path
After commit edd4a86673 ("s390/boot: get rid of startup archive")
there is no more compressed/ subdirectory.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-8-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29 19:16:29 -08:00
Ilya Leoshkevich 7ce878ca81 selftests/bpf: Fix sk_assign on s390x
sk_assign is failing on an s390x machine running Debian "bookworm" for
2 reasons: legacy server_map definition and uninitialized addrlen in
recvfrom() call.

Fix by adding a new-style server_map definition and dropping addrlen
(recvfrom() allows NULL values for src_addr and addrlen).

Since the test should support tc built without libbpf, build the prog
twice: with the old-style definition and with the new-style definition,
then select the right one at runtime. This could be done at compile
time too, but this would not be cross-compilation friendly.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-29 19:16:28 -08:00
Ricardo Koller 08ddbbdf0b KVM: selftests: aarch64: Test read-only PT memory regions
Extend the read-only memslot tests in page_fault_test to test
read-only PT (Page table) memslots. Note that this was not allowed
before commit 406504c7b0 ("KVM: arm64: Fix S1PTW handling on RO
memslots") as all S1PTW faults were treated as writes which resulted
in an (unrecoverable) exception inside the guest.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-5-ricarkol@google.com
2023-01-29 18:49:08 +00:00
Ricardo Koller 8b03c97fa6 KVM: selftests: aarch64: Fix check of dirty log PT write
The dirty log checks are mistakenly testing the first page in the page
table (PT) memory region instead of the page holding the test data
page PTE.  This wasn't an issue before commit 406504c7b0 ("KVM:
arm64: Fix S1PTW handling on RO memslots") as all PT pages (including
the first page) were treated as writes.

Fix the page_fault_test dirty logging tests by checking for the right
page: the one for the PTE of the data test page.

Fixes: a4edf25b3e ("KVM: selftests: aarch64: Add dirty logging tests into page_fault_test")
Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-4-ricarkol@google.com
2023-01-29 18:49:08 +00:00
Ricardo Koller 42561751ea KVM: selftests: aarch64: Do not default to dirty PTE pages on all S1PTWs
Only Stage1 Page table walks (S1PTW) trying to write into a PTE should
result in the PTE page being dirty in the log.  However, the dirty log
tests in page_fault_test default to treat all S1PTW accesses as writes.
Fix the relevant tests by asserting dirty pages only for S1PTW writes,
which in these tests only applies to when Hardware management of the Access
Flag is enabled.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-3-ricarkol@google.com
2023-01-29 18:49:08 +00:00
Ricardo Koller 0dd8d22a88 KVM: selftests: aarch64: Relax userfaultfd read vs. write checks
Only Stage1 Page table walks (S1PTW) writing a PTE on an unmapped page
should result in a userfaultfd write. However, the userfaultfd tests in
page_fault_test wrongly assert that any S1PTW is a PTE write.

Fix this by relaxing the read vs. write checks in all userfaultfd
handlers.  Note that this is also an attempt to focus less on KVM (and
userfaultfd) behavior, and more on architectural behavior. Also note
that after commit 406504c7b0 ("KVM: arm64: Fix S1PTW handling on RO
memslots"), the userfaultfd fault (S1PTW with AF on an unmaped PTE
page) is actually a read: the translation fault that comes before the
permission fault.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230127214353.245671-2-ricarkol@google.com
2023-01-29 18:49:08 +00:00
Ilya Leoshkevich 42fae973c2 libbpf: Fix BPF_PROBE_READ{_STR}_INTO() on s390x
BPF_PROBE_READ_INTO() and BPF_PROBE_READ_STR_INTO() should map to
bpf_probe_read() and bpf_probe_read_str() respectively in order to work
correctly on architectures with !ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-24-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:45:14 -08:00
Ilya Leoshkevich 25c76ed428 libbpf: Fix unbounded memory access in bpf_usdt_arg()
Loading programs that use bpf_usdt_arg() on s390x fails with:

    ; if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
    128: (79) r1 = *(u64 *)(r10 -24)      ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
    129: (25) if r1 > 0xb goto pc+83      ; frame1: R1_w=scalar(umax=11,var_off=(0x0; 0xf))
    ...
    ; arg_spec = &spec->args[arg_num];
    135: (79) r1 = *(u64 *)(r10 -24)      ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
    ...
    ; switch (arg_spec->arg_type) {
    139: (61) r1 = *(u32 *)(r2 +8)
    R2 unbounded memory access, make sure to bounds check any such access

The reason is that, even though the C code enforces that
arg_num < BPF_USDT_MAX_ARG_CNT, the verifier cannot propagate this
constraint to the arg_spec assignment yet. Help it by forcing r1 back
to stack after comparison.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-23-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:45:14 -08:00
Ilya Leoshkevich e85465e420 libbpf: Simplify barrier_var()
Use a single "+r" constraint instead of the separate "=r" and "0".

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-22-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:45:14 -08:00
Ilya Leoshkevich 1b5e385325 selftests/bpf: Fix profiler on s390x
Use bpf_probe_read_kernel() and bpf_probe_read_kernel_str() instead
of bpf_probe_read() and bpf_probe_read_kernel().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-21-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:45:14 -08:00
Ilya Leoshkevich 438a2edf26 selftests/bpf: Fix xdp_synproxy/tc on s390x
Use the correct datatype for the values map values; currently the test
works by accident, since on little-endian machines it is sometimes
acceptable to access u64 as u32.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-20-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:45:14 -08:00
Ilya Leoshkevich d504270a23 selftests/bpf: Fix vmlinux test on s390x
Use a syscall macro to access the nanosleep()'s first argument;
currently the code uses gprs[2] instead of orig_gpr2.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-18-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:09 -08:00
Ilya Leoshkevich 26e8a01494 selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
s390x cache line size is 256 bytes, so skb_shared_info must be aligned
on a much larger boundary than for x86.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-17-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:09 -08:00
Ilya Leoshkevich 207612eb12 selftests/bpf: Fix test_lsm on s390x
Use syscall macros to access the setdomainname() arguments; currently
the code uses gprs[2] instead of orig_gpr2 for the first argument.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-16-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:09 -08:00
Ilya Leoshkevich be6b5c10ec selftests/bpf: Add a sign-extension test for kfuncs
s390x ABI requires the caller to zero- or sign-extend the arguments.
eBPF already deals with zero-extension (by definition of its ABI), but
not with sign-extension.

Add a test to cover that potentially problematic area.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-15-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:09 -08:00
Ilya Leoshkevich 80a611904e selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
sizeof(struct bpf_local_storage_elem) is 512 on s390x:

    struct bpf_local_storage_elem {
            struct hlist_node          map_node;             /*     0    16 */
            struct hlist_node          snode;                /*    16    16 */
            struct bpf_local_storage * local_storage;        /*    32     8 */
            struct callback_head       rcu __attribute__((__aligned__(8))); /*    40    16 */

            /* XXX 200 bytes hole, try to pack */

            /* --- cacheline 1 boundary (256 bytes) --- */
            struct bpf_local_storage_data sdata __attribute__((__aligned__(256))); /*   256     8 */

            /* size: 512, cachelines: 2, members: 5 */
            /* sum members: 64, holes: 1, sum holes: 200 */
            /* padding: 248 */
            /* forced alignments: 2, forced holes: 1, sum forced holes: 200 */
    } __attribute__((__aligned__(256)));

As the existing comment suggests, use a larger number in order to be
future-proof.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-14-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 2934565f04 selftests/bpf: Check stack_mprotect() return value
If stack_mprotect() succeeds, errno is not changed. This can produce
misleading error messages, that show stale errno.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-13-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 06cea99e68 selftests/bpf: Fix cgrp_local_storage on s390x
Sync the definition of socket_cookie between the eBPF program and the
test. Currently the test works by accident, since on little-endian it
is sometimes acceptable to access u64 as u32.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-12-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 06c1865b0b selftests/bpf: Fix xdp_do_redirect on s390x
s390x cache line size is 256 bytes, so skb_shared_info must be aligned
on a much larger boundary than for x86. This makes the maximum packet
size smaller.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-11-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 56e1a50483 selftests/bpf: Fix verify_pkcs7_sig on s390x
Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not
defined on all architectures.

While at it, improve the error handling: do not hide the verifier log,
and check the return values of bpf_probe_read_kernel() and
bpf_copy_from_user().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-10-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 98e13848cf selftests/bpf: Fix decap_sanity_ns cleanup
decap_sanity prints the following on the 1st run:

    decap_sanity: sh: 1: Syntax error: Bad fd number

and the following on the 2nd run:

    Cannot create namespace file "/run/netns/decap_sanity_ns": File exists

The problem is that the cleanup command has a typo and does nothing.
Fix the typo.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-9-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 804acdd251 selftests/bpf: Set errno when urand_spawn() fails
The result of urand_spawn() is checked with ASSERT_OK_PTR, which treats
NULL as success if errno == 0.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-8-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 31da9be64a selftests/bpf: Fix kfree_skb on s390x
h_proto is big-endian; use htons() in order to make comparison work on
both little- and big-endian machines.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-7-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 6eab2370d1 selftests/bpf: Fix symlink creation error
When building with O=, the following error occurs:

    ln: failed to create symbolic link 'no_alu32/bpftool': No such file or directory

Adjust the code to account for $(OUTPUT).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-6-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich b14b01f281 selftests/bpf: Fix liburandom_read.so linker error
When building with O=, the following linker error occurs:

    clang: error: no such file or directory: 'liburandom_read.so'

Fix by adding $(OUTPUT) to the linker search path.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-5-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:30:08 -08:00
Ilya Leoshkevich 8fb9fb2f17 selftests/bpf: Query BPF_MAX_TRAMP_LINKS using BTF
Do not hard-code the value, since for s390x it will be smaller than
for x86.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-4-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-28 12:29:41 -08:00
Andrei Gherzan a6efc42a86 selftest: net: Improve IPV6_TCLASS/IPV6_HOPLIMIT tests apparmor compatibility
"tcpdump" is used to capture traffic in these tests while using a random,
temporary and not suffixed file for it. This can interfere with apparmor
configuration where the tool is only allowed to read from files with
'known' extensions.

The MINE type application/vnd.tcpdump.pcap was registered with IANA for
pcap files and .pcap is the extension that is both most common but also
aligned with standard apparmor configurations. See TCPDUMP(8) for more
details.

This improves compatibility with standard apparmor configurations by
using ".pcap" as the file extension for the tests' temporary files.

Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-28 13:55:12 +00:00
Jakub Kicinski 2d104c390f bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RqJgAKCRDbK58LschI
 gw2IAP9G5uhFO5abBzYLupp6SY3T5j97MUvPwLfFqUEt7EXmuwEA2lCUEWeW0KtR
 QX+QmzCa6iHxrW7WzP4DUYLue//FJQY=
 =yYqA
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next 2023-01-28

We've added 124 non-merge commits during the last 22 day(s) which contain
a total of 124 files changed, 6386 insertions(+), 1827 deletions(-).

The main changes are:

1) Implement XDP hints via kfuncs with initial support for RX hash and
   timestamp metadata kfuncs, from Stanislav Fomichev and
   Toke Høiland-Jørgensen.
   Measurements on overhead: https://lore.kernel.org/bpf/875yellcx6.fsf@toke.dk

2) Extend libbpf's bpf_tracing.h support for tracing arguments of
   kprobes/uprobes and syscall as a special case, from Andrii Nakryiko.

3) Significantly reduce the search time for module symbols by livepatch
   and BPF, from Jiri Olsa and Zhen Lei.

4) Enable cpumasks to be used as kptrs, which is useful for tracing
   programs tracking which tasks end up running on which CPUs
   in different time intervals, from David Vernet.

5) Fix several issues in the dynptr processing such as stack slot liveness
   propagation, missing checks for PTR_TO_STACK variable offset, etc,
   from Kumar Kartikeya Dwivedi.

6) Various performance improvements, fixes, and introduction of more
   than just one XDP program to XSK selftests, from Magnus Karlsson.

7) Big batch to BPF samples to reduce deprecated functionality,
   from Daniel T. Lee.

8) Enable struct_ops programs to be sleepable in verifier,
   from David Vernet.

9) Reduce pr_warn() noise on BTF mismatches when they are expected under
   the CONFIG_MODULE_ALLOW_BTF_MISMATCH config anyway, from Connor O'Brien.

10) Describe modulo and division by zero behavior of the BPF runtime
    in BPF's instruction specification document, from Dave Thaler.

11) Several improvements to libbpf API documentation in libbpf.h,
    from Grant Seltzer.

12) Improve resolve_btfids header dependencies related to subcmd and add
    proper support for HOSTCC, from Ian Rogers.

13) Add ipip6 and ip6ip decapsulation support for bpf_skb_adjust_room()
    helper along with BPF selftests, from Ziyang Xuan.

14) Simplify the parsing logic of structure parameters for BPF trampoline
    in the x86-64 JIT compiler, from Pu Lehui.

15) Get BTF working for kernels with CONFIG_RUST enabled by excluding
    Rust compilation units with pahole, from Martin Rodriguez Reboredo.

16) Get bpf_setsockopt() working for kTLS on top of TCP sockets,
    from Kui-Feng Lee.

17) Disable stack protection for BPF objects in bpftool given BPF backends
    don't support it, from Holger Hoffstätte.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (124 commits)
  selftest/bpf: Make crashes more debuggable in test_progs
  libbpf: Add documentation to map pinning API functions
  libbpf: Fix malformed documentation formatting
  selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata
  selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket.
  bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().
  bpf/selftests: Verify struct_ops prog sleepable behavior
  bpf: Pass const struct bpf_prog * to .check_member
  libbpf: Support sleepable struct_ops.s section
  bpf: Allow BPF_PROG_TYPE_STRUCT_OPS programs to be sleepable
  selftests/bpf: Fix vmtest static compilation error
  tools/resolve_btfids: Alter how HOSTCC is forced
  tools/resolve_btfids: Install subcmd headers
  bpf/docs: Document the nocast aliasing behavior of ___init
  bpf/docs: Document how nested trusted fields may be defined
  bpf/docs: Document cpumask kfuncs in a new file
  selftests/bpf: Add selftest suite for cpumask kfuncs
  selftests/bpf: Add nested trust selftests suite
  bpf: Enable cpumasks to be queried and used as kptrs
  bpf: Disallow NULLable pointers for trusted kfuncs
  ...
====================

Link: https://lore.kernel.org/r/20230128004827.21371-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-28 00:00:14 -08:00
Jakub Kicinski 0548c5f26a bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY9RCwAAKCRDbK58LschI
 g7drAQDfMPc1Q2CE4LZ9oh2wu1Nt2/85naTDK/WirlCToKs0xwD+NRQOyO3hcoJJ
 rOCwfjOlAm+7uqtiwwodBvWgTlDgVAM=
 =0fC+
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
bpf 2023-01-27

We've added 10 non-merge commits during the last 9 day(s) which contain
a total of 10 files changed, 170 insertions(+), 59 deletions(-).

The main changes are:

1) Fix preservation of register's parent/live fields when copying
   range-info, from Eduard Zingerman.

2) Fix an off-by-one bug in bpf_mem_cache_idx() to select the right
   cache, from Hou Tao.

3) Fix stack overflow from infinite recursion in sock_map_close(),
   from Jakub Sitnicki.

4) Fix missing btf_put() in register_btf_id_dtor_kfuncs()'s error path,
   from Jiri Olsa.

5) Fix a splat from bpf_setsockopt() via lsm_cgroup/socket_sock_rcv_skb,
   from Kui-Feng Lee.

6) Fix bpf_send_signal[_thread]() helpers to hold a reference on the task,
   from Yonghong Song.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix the kernel crash caused by bpf_setsockopt().
  selftests/bpf: Cover listener cloning with progs attached to sockmap
  selftests/bpf: Pass BPF skeleton to sockmap_listen ops tests
  bpf, sockmap: Check for any of tcp_bpf_prots when cloning a listener
  bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself
  bpf: Add missing btf_put to register_btf_id_dtor_kfuncs
  selftests/bpf: Verify copy_register_state() preserves parent/live fields
  bpf: Fix to preserve reg parent/live fields when copying range info
  bpf: Fix a possible task gone issue with bpf_send_signal[_thread]() helpers
  bpf: Fix off-by-one error in bpf_mem_cache_idx()
====================

Link: https://lore.kernel.org/r/20230127215820.4993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-27 23:32:03 -08:00
Jakub Kicinski b568d3072a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/intel/ice/ice_main.c
  418e53401e ("ice: move devlink port creation/deletion")
  643ef23bd9 ("ice: Introduce local var for readability")
https://lore.kernel.org/all/20230127124025.0dacef40@canb.auug.org.au/
https://lore.kernel.org/all/20230124005714.3996270-1-anthony.l.nguyen@intel.com/

drivers/net/ethernet/engleder/tsnep_main.c
  3d53aaef43 ("tsnep: Fix TX queue stop/wake for multiple queues")
  25faa6a4c5 ("tsnep: Replace TX spin_lock with __netif_tx_lock")
https://lore.kernel.org/all/20230127123604.36bb3e99@canb.auug.org.au/

net/netfilter/nf_conntrack_proto_sctp.c
  13bd9b31a9 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"")
  a44b765148 ("netfilter: conntrack: unify established states for SCTP paths")
  f71cb8f45d ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets")
https://lore.kernel.org/all/20230127125052.674281f9@canb.auug.org.au/
https://lore.kernel.org/all/d36076f3-6add-a442-6d4b-ead9f7ffff86@tessares.net/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-27 22:56:18 -08:00
Stanislav Fomichev 16809afdcb selftest/bpf: Make crashes more debuggable in test_progs
Reset stdio before printing verbose log of the SIGSEGV'ed test.
Otherwise, it's hard to understand what's going on in the cases like [0].

With the following patch applied:

	--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
	+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
	@@ -392,6 +392,11 @@ void test_xdp_metadata(void)
	 		       "generate freplace packet"))
	 		goto out;

	+
	+	ASSERT_EQ(1, 2, "oops");
	+	int *x = 0;
	+	*x = 1; /* die */
	+
	 	while (!retries--) {
	 		if (bpf_obj2->bss->called)
	 			break;

Before:

 #281     xdp_metadata:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x1f)[0x55c919d98bcf]
/lib/x86_64-linux-gnu/libc.so.6(+0x3bf90)[0x7f36aea5df90]
./test_progs(test_xdp_metadata+0x1db0)[0x55c919d8c6d0]
./test_progs(+0x23b438)[0x55c919d9a438]
./test_progs(main+0x534)[0x55c919d99454]
/lib/x86_64-linux-gnu/libc.so.6(+0x2718a)[0x7f36aea4918a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f36aea49245]
./test_progs(_start+0x21)[0x55c919b82ef1]

After:

test_xdp_metadata:PASS:ip netns add xdp_metadata 0 nsec
open_netns:PASS:malloc token 0 nsec
open_netns:PASS:open /proc/self/ns/net 0 nsec
open_netns:PASS:open netns fd 0 nsec
open_netns:PASS:setns 0 nsec
..
test_xdp_metadata:FAIL:oops unexpected oops: actual 1 != expected 2
 #281     xdp_metadata:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x1f)[0x562714a76bcf]
/lib/x86_64-linux-gnu/libc.so.6(+0x3bf90)[0x7fa663f9cf90]
./test_progs(test_xdp_metadata+0x1db0)[0x562714a6a6d0]
./test_progs(+0x23b438)[0x562714a78438]
./test_progs(main+0x534)[0x562714a77454]
/lib/x86_64-linux-gnu/libc.so.6(+0x2718a)[0x7fa663f8818a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7fa663f88245]
./test_progs(_start+0x21)[0x562714860ef1]

0: https://github.com/kernel-patches/bpf/actions/runs/4019879316/jobs/6907358876

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230127215705.1254316-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-27 15:22:00 -08:00
Grant Seltzer e4ce876f10 libbpf: Add documentation to map pinning API functions
This adds documentation for the following API functions:

- bpf_map__set_pin_path()
- bpf_map__pin_path()
- bpf_map__is_pinned()
- bpf_map__pin()
- bpf_map__unpin()
- bpf_object__pin_maps()
- bpf_object__unpin_maps()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230126024225.520685-1-grantseltzer@gmail.com
2023-01-27 23:03:19 +01:00
Grant Seltzer 139df64d26 libbpf: Fix malformed documentation formatting
This fixes the doxygen format documentation above the
user_ring_buffer__* APIs. There has to be a newline
before the @brief, otherwise doxygen won't render them
for libbpf.readthedocs.org.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230126024749.522278-1-grantseltzer@gmail.com
2023-01-27 23:02:03 +01:00
Linus Torvalds 37d0be6a7d gpio fixes for v6.2-rc6
- fix the -c option in the gpio-event-mode user-space example program
 - fix the irq number translation in gpio-ep93xx and make its irqchip immutable
 - add a missing spin_unlock in error path in gpio-mxc
 - fix a suspend breakage on System76 and Lenovo Gen2a introduced in GPIO ACPI
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmPUKckACgkQEacuoBRx
 13KjuQ//fL9MagRQVsE85SVhhMsjmTTbXnvGJOIT0iTTvqgh9DACX9m6EDQQ/8Kd
 bKuciI0h2zO5xt/fZPAyJWgGw0w3qvld7f3f6KQE60X4MRyckyZB9xEyHnvuUs+X
 v0SiH8rLm+znezFXFiIB3lq004T9DU+wxUCaSq4WH1eE7eqKP8ZXhg2sLUgiskiY
 c1Albv2X0xM4duLmPBK4vsXV2x35yj5qS0qkIymezff2sQekzb3TOZsaS7Doznex
 cKr2OvJR/VuH4SMv+uIoKjsVKzdvtP/6hBKJtTH5i1krzNsBW/S+/7odwxzdxPVU
 EgQc5I7/ubV7FP3Jo3ThO3exTFyd50hl4jKmrkrzzFpeh2xZ2eKU/AD8/MNsPVuA
 KAoBUxHnoJfdZuUNL0Cmf1pTmzZbS80qntznzO700QtBaIxS/6redfJ8CKUK4Iy8
 f491K3SbzpXzRAjVMBv8ua7JrB8MT82IKxbeXd1bellGrLaNuMzj1Yd8XhWlepnQ
 wZxMSo3457FpOghiiMM3YnLSCcpbs4WU8VV/NMCc8jfS14oq7gA26EalQLPwKeCF
 5c+UMAZ95gafJwPjuZcDJcRCGpDlhqPawRBo5cDVUlSokh1PSmPhcGNtE8XKUKlE
 fHyd6WEYn8AzXnf3PMceiuJYCADs8wtg87GHKuLF1PPXmcqM4C8=
 =wh3L
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix the -c option in the gpio-event-mode user-space example program

 - fix the irq number translation in gpio-ep93xx and make its irqchip
   immutable

 - add a missing spin_unlock in error path in gpio-mxc

 - fix a suspend breakage on System76 and Lenovo Gen2a introduced in
   GPIO ACPI

* tag 'gpio-fixes-for-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  tools: gpio: fix -c option of gpio-event-mon
  gpio: ep93xx: remove unused variable
  gpio: ep93xx: Make irqchip immutable
  gpio: ep93xx: Fix port F hwirq numbers in handler
  gpio: mxc: Unlock on error path in mxc_flip_edge()
  gpiolib-acpi: Don't set GPIOs for wakeup in S3 mode
2023-01-27 13:47:40 -08:00
Linus Torvalds 9f4d0bd24e linux-kselftest-fixes-6.2-rc6
This Kselftest fixes update for Linux 6.2-rc6 consists of a single
 fix to a amd-pstate test Makefile bug that deletes source files
 during make clean run.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmPS4awACgkQCwJExA0N
 QxyxIA/+LSuXceaj895VX/wwCeB9SFarG5oZesX5sg4Z8RID91w505hY+ekuKyUT
 6Wt2sjc9XJ3RJfPZ5L9fK1TncIjkkQbKoDIJjKZ9nM10zT4wj0fK+vjnKY96Da9w
 s70stRKr0zyVE+uPV5oEuj89LsZITuOsDJxbxcF7RWHOCid+nYaO8RdoANAHMPnF
 f70ziaAAb8zNqv55Lnh05N+gd7VptaJ6CuS3d/WxRABhNgqSHheEMWymY51G0aO1
 3Om3/HLVFWLr7nqdt7Vnc2EjcllzwxaRnIACctmLIzVlabJSuDm4FtDpOU2D+sG4
 pGW/NVc+ydrM1PJuugery7tMMP1vV+frhhDvfnuRdLaPFl5Wxc0W8D7WOgMce0fo
 hK+wahfIrcNrXX8mQQi54P0K5imIZcgPxpZz3rgZHrbP+hJmDdStMh93zCLLfL/t
 zmVYf9xYJzb32OOkmjZdN0AKNDlDowgZQcbesmH22bDVMuKORb6YG920J6Qa2pmK
 5A168cLYS5yDHtbouZlcDK5XAroydcAYWGLXmVMoLKpS2Sa58DsKlVgq7BeHcUgX
 dQLhfS18YsW9XKiKYakWb6z67B+MQ0SulnOdOLhOO3ekds04hWFSASR4nFvDKHce
 h9MZKBk2osHYfX//RCfmyCVqE4SleMXBWNzFOfrfm0de0d+NILQ=
 =Wb/P
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "A single fix to a amd-pstate test Makefile bug that deletes source
  files during make clean run"

* tag 'linux-kselftest-fixes-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: amd-pstate: Don't delete source files via Makefile
2023-01-27 12:41:09 -08:00
Ian Rogers 22e06e6825 perf buildid: Avoid copy of uninitialized memory
build_id__init() only copies the buildid data up to size leaving the
rest of the data array uninitialized. Copying the full array during
synthesis means the written event contains uninitialized memory.

Ensure the size is less that the buffer size and only copy the bytes
that were initialized. This was detected by the Clang/LLVM memory
sanitizer.

v2. Avoids the potential for copying too much as suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230120185828.43231-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:35 -03:00
James Clark 86569c0ab1 perf mem/c2c: Document that SPE is used for mem and c2c on ARM
Setup is non-trivial so also link to the full SPE docs.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-perf-users@vger.kernel.or
Link: https://lore.kernel.org/r/20230124145929.557891-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
James Clark 6bc75b4c90 perf cs-etm: Improve missing sink warning message
Make the sink error message more similar to the event error message that
reminds about missing kernel support. The available sinks are also
determined by the hardware so mention that too.

Also, usually it's not necessary to specify the sink, so add that as a
hint.

Now the error for a made up sink looks like this:

  $ perf record -e cs_etm/@abc/
  Couldn't find sink "abc" on event cs_etm/@abc/.
  Missing kernel or device support?

  Hint: An appropriate sink will be picked automatically if one isn't is specified.

For any error other than ENOENT, the same message as before is
displayed.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/ec7502e6-b406-3997-c2a5-24f98e5c4854@arm.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230124110220.460551-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
Jeff Xu 8677e555f1
selftests/landlock: Test ptrace as much as possible with Yama
Update ptrace tests according to all potential Yama security policies.
This is required to make such tests pass even if Yama is enabled.

Tests are not skipped but they now check both Landlock and Yama boundary
restrictions at run time to keep a maximum test coverage (i.e. positive
and negative testing).

Signed-off-by: Jeff Xu <jeffxu@google.com>
Link: https://lore.kernel.org/r/20230114020306.1407195-2-jeffxu@google.com
Cc: stable@vger.kernel.org
[mic: Add curly braces around EXPECT_EQ() to make it build, and improve
commit message]
Co-developed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2023-01-27 18:53:55 +01:00
Ivo Borisov Shopov 677d85e1a1 tools: gpio: fix -c option of gpio-event-mon
Following line should listen for a rising edge and exit after the first
one since '-c 1' is provided.

    # gpio-event-mon -n gpiochip1 -o 0 -r -c 1

It works with kernel 4.19 but it doesn't work with 5.10. In 5.10 the
above command doesn't exit after the first rising edge it keep listening
for an event forever. The '-c 1' is not taken into an account.
The problem is in commit 62757c32d5 ("tools: gpio: add multi-line
monitoring to gpio-event-mon").
Before this commit the iterator 'i' in monitor_device() is used for
counting of the events (loops). In the case of the above command (-c 1)
we should start from 0 and increment 'i' only ones and hit the 'break'
statement and exit the process. But after the above commit counting
doesn't start from 0, it start from 1 when we listen on one line.
It is because 'i' is used from one more purpose, counting of lines
(num_lines) and it isn't restore to 0 after following code

    for (i = 0; i < num_lines; i++)
        gpiotools_set_bit(&values.mask, i);

Restore the initial value of the iterator to 0 in order to allow counting
of loops to work for any cases.

Fixes: 62757c32d5 ("tools: gpio: add multi-line monitoring to gpio-event-mon")
Signed-off-by: Ivo Borisov Shopov <ivoshopov@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
[Bartosz: tweak the commit message]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2023-01-27 14:05:46 +01:00
Shunsuke Mie 3f7b75abf4 tools/virtio: fix the vringh test for virtio ring changes
Fix the build caused by missing kmsan_handle_dma() and is_power_of_2() that
are used in drivers/virtio/virtio_ring.c.

Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Message-Id: <20230110034310.779744-1-mie@igel.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-27 06:18:41 -05:00
Stanislav Fomichev a5f3a3f7c1 selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata
The existing timestamping_enable() is a no-op because it applies
to the socket-related path that we are not verifying here
anymore. (but still leaving the code around hoping we can
have xdp->skb path verified here as well)

  poll: 1 (0)
  xsk_ring_cons__peek: 1
  0xf64788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
  rx_hash: 3697961069
  rx_timestamp:  1674657672142214773 (sec:1674657672.1422)
  XDP RX-time:   1674657709561774876 (sec:1674657709.5618) delta sec:37.4196
  AF_XDP time:   1674657709561871034 (sec:1674657709.5619) delta
sec:0.0001 (96.158 usec)
  0xf64788: complete idx=8 addr=8000

Also, maybe something to archive here, see [0] for Jesper's note
about NIC vs host clock delta.

0: https://lore.kernel.org/bpf/f3a116dc-1b14-3432-ad20-a36179ef0608@redhat.com/

v2:
- Restore original value (Martin)

Fixes: 297a3f1241 ("selftests/bpf: Simple program to dump XDP RX metadata")
Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230126225030.510629-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-26 22:10:31 -08:00
Ira Weiny bab2a5e6fe cxl/test: Simulate event log overflow
Log overflow is marked by a separate trace message.

Simulate a log with lots of messages and flag overflow until space is
cleared.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-8-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26 16:51:08 -08:00
Ira Weiny 0092f62acc cxl/test: Add specific events
Each type of event has different trace point outputs.

Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-7-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26 16:51:07 -08:00
Ira Weiny d1dca858f0 cxl/test: Add generic mock events
Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made up UUID's.

Data is completely made up with data patterns which should be easy to
spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Test traces are easy to obtain with a small script such as this:

	#!/bin/bash -x

	devices=`find /sys/devices/platform -name cxl_mem*`

	# Turn on tracing
	echo "" > /sys/kernel/tracing/trace
	echo 1 > /sys/kernel/tracing/events/cxl/enable
	echo 1 > /sys/kernel/tracing/tracing_on

	# Generate fake interrupt
	for device in $devices; do
	        echo 1 > $device/event_trigger
	done

	# Turn off tracing and report events
	echo 0 > /sys/kernel/tracing/tracing_on
	cat /sys/kernel/tracing/trace

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20221216-cxl-ev-log-v7-6-2316a5c8f7d8@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26 16:51:07 -08:00
Jakub Kicinski 3a43ded081 tools: ynl: store ops in ordered dict to avoid random ordering
When rendering code we should walk the ops in the order in which
they are declared in the spec. This is both more intuitive and
prevents code from jumping around when hashing in the dict changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
Jakub Kicinski b49c34e217 tools: ynl: rename ops_list -> msg_list
ops_list contains all the operations, but the main iteration use
case is to walk only ops which define attrs. Rename ops_list to
msg_list, because now it looks like the contents are the same,
just the format is different. While at it convert from tuple
to just keys, none of the users care about the name of the op.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
Jakub Kicinski 66fa34b9c2 tools: ynl: support kdocs for flags in code generation
Lorenzo reports that after switching from enum to flags netdev
family lost ability to render kdoc (and the enum contents got
generally garbled).

Combine the flags and enum handling in uAPI handling.

Reported-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
Alison Schofield 66f3cb7993 tools/testing/cxl: Remove cxl_test module math loading message
Commit "tools/testing/cxl: Add XOR Math support to cxl_test" added
a module parameter to cxl_test for the interleave_arithmetic option.

In doing so, it also added this dev_dbg() message describing which
option cxl_test used during load:
"[  111.743246] (NULL device *): cxl_test loading modulo math option"
That "(NULL device *)" has raised needless user concern.

Remove the dev_dbg() message and make the module_param readable via
sysfs for users that need to know which math option is active.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20230126170555.701240-1-alison.schofield@intel.com
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-26 15:57:42 -08:00
Jakub Kicinski 65177e47d3 testing: kselftest_harness: add filtering and enumerating tests
As the number of test cases and length of execution grows it's
useful to select only a subset of tests. In TLS for instance we
have a matrix of variants for different crypto protocols and
during development mostly care about testing a handful.
This is quicker and makes reading output easier.

This patch adds argument parsing to kselftest_harness.

It supports a couple of ways to filter things, I could not come
up with one way which will cover all cases.

The first and simplest switch is -r which takes the name of
a test to run (can be specified multiple times). For example:

  $ ./my_test -r some.test.name -r some.other.name

will run tests some.test.name and some.other.name (where "some"
is the fixture, "test" and "other" and "name is the test.)

Then there is a handful of group filtering options. f/v/t for
filtering by fixture/variant/test. They have both positive
(match -> run) and negative versions (match -> skip).
If user specifies any positive option we assume the default
is not to run the tests. If only negative options are set
we assume the tests are supposed to be run by default.

  Usage: ./tools/testing/selftests/net/tls [-h|-l] [-t|-T|-v|-V|-f|-F|-r name]
	-h       print help
	-l       list all tests

	-t name  include test
	-T name  exclude test
	-v name  include variant
	-V name  exclude variant
	-f name  include fixture
	-F name  exclude fixture
	-r name  run specified test

  Test filter options can be specified multiple times. The filtering stops
  at the first match. For example to include all tests from variant 'bla'
  but not test 'foo' specify '-T foo -v bla'.

Here we can request for example all tests from fixture "foo" to run:

 ./my_test -f foo

or to skip variants var1 and var2:

 ./my_test -V var1 -V var2

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2023-01-26 16:00:41 -07:00
Matthieu Baerts 8dbdf24f4e selftests: mptcp: userspace: avoid read errors
During the cleanup phase, the server pids were killed with a SIGTERM
directly, not using a SIGUSR1 first to quit safely. As a result, this
test was often ending with two error messages:

  read: Connection reset by peer

While at it, use a for-loop to terminate all the PIDs the same way.

Also the different files are now removed after having killed the PIDs
using them. It makes more sense to do that in this order.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts 10d4273411 selftests: mptcp: userspace: print error details if any
Before, only '[FAIL]' was printed in case of error during the validation
phase.

Now, in case of failure, the variable name, its value and expected one
are displayed to help understand what was wrong.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts 1c0b0ee264 selftests: mptcp: userspace: refactor asserts
Instead of having a long list of conditions to check, it is possible to
give a list of variable names to compare with their 'e_XXX' version.

This will ease the introduction of the following commit which will print
which condition has failed (if any).

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts f790ae03db selftests: mptcp: userspace: print titles
This script is running a few tests after having setup the environment.

Printing titles helps understand what is being tested.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Paolo Abeni ad3493746e selftests: mptcp: add test-cases for mixed v4/v6 subflows
Note that we can't guess the listener family anymore based on the client
target address: always use IPv6.

The fullmesh flag with endpoints from different families is also
validated here.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Jakub Sitnicki ae5439658c selftests/net: Cover the IP_LOCAL_PORT_RANGE socket option
Exercise IP_LOCAL_PORT_RANGE socket option in various scenarios:

1. pass invalid values to setsockopt
2. pass a range outside of the per-netns port range
3. configure a single-port range
4. exhaust a configured multi-port range
5. check interaction with late-bind (IP_BIND_ADDRESS_NO_PORT)
6. set then get the per-socket port range

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-25 22:45:00 -08:00
Kui-Feng Lee d1246f9360 selftests/bpf: Calls bpf_setsockopt() on a ktls enabled socket.
Ensures that whenever bpf_setsockopt() is called with the SOL_TCP
option on a ktls enabled socket, the call will be accepted by the
system. The provided test makes sure of this by performing an
examination when the server side socket is in the CLOSE_WAIT state. At
this stage, ktls is still enabled on the server socket and can be used
to test if bpf_setsockopt() works correctly with linux.

Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Link: https://lore.kernel.org/r/20230125201608.908230-3-kuifeng@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-25 15:10:34 -08:00
Luis Chamberlain f45d63c121 tools/testing/cxl: require 64-bit
size_t is limited to 32-bits and so the gen_pool_alloc() using
the size of SZ_64G would map to 0, triggering a low allocation
which is not expected. Force the dependency on 64-bit for cxl_test
as that is what it was designed for.

This issue was found by build test reports when converting this
driver as a proper upstream driver.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20221219195050.325959-1-mcgrof@kernel.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-25 13:04:39 -08:00
David Vernet 7dd880592a bpf/selftests: Verify struct_ops prog sleepable behavior
In a set of prior changes, we added the ability for struct_ops programs
to be sleepable. This patch enhances the dummy_st_ops selftest suite to
validate this behavior by adding a new sleepable struct_ops entry to
dummy_st_ops.

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125164735.785732-5-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-01-25 10:25:57 -08:00