OpenCloudOS-Kernel/tools
Feng Tang 1470a108a6 perf c2c: Add report option to show false sharing in adjacent cachelines
Many platforms support adjacent cacheline prefetch. When it is
enabled, data in RAM is handled in pairs of cachelines (2N and 2N+1):
if one is fetched into the cache, the other is likely to be fetched
too. This in effect doubles the cacheline size, so false sharing can
also happen across adjacent cachelines.

0Day has captured performance changes related to this [1], and some
commercial software explicitly aligns its hot global variables to 128
bytes (2 cache lines) to avoid this kind of extended false sharing.
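
The 128-byte alignment mentioned above can be sketched in C11 as follows.
This is only an illustration: the 64-byte cacheline size is an assumption
about the platform, and the names hot_counter, counter_a, counter_b and
double_cl_index are hypothetical, not taken from any real code base.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, so an adjacent pair spans 128 bytes. */
#define CACHELINE_SIZE        64
#define DOUBLE_CACHELINE_SIZE (2 * CACHELINE_SIZE)

/*
 * Hypothetical hot variable. Aligning the member to 128 bytes also pads
 * the struct to 128 bytes, so each instance owns a full cacheline pair.
 */
struct hot_counter {
	alignas(DOUBLE_CACHELINE_SIZE) unsigned long value;
};

/* Compile-time check that the padding really happened. */
static_assert(sizeof(struct hot_counter) == DOUBLE_CACHELINE_SIZE,
	      "hot_counter must fill one cacheline pair");

/* Two hot globals, presumably written by different CPUs. */
static struct hot_counter counter_a;
static struct hot_counter counter_b;

/* Index of the 128-byte block (cacheline pair) that contains p. */
static uintptr_t double_cl_index(const void *p)
{
	return (uintptr_t)p / DOUBLE_CACHELINE_SIZE;
}
```

With this layout, counter_a and counter_b can never land in the same
2N/2N+1 pair, at the cost of 128 bytes per variable.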

So add an option "--double-cl" for 'perf c2c report' to show false
sharing in double cache line granularity, which acts as if the
cacheline size were doubled. There is no change to c2c record. The
hardware events of shared cachelines are still per cacheline, and this
option just changes the granularity at which events are grouped and
displayed.

In the 'perf c2c report' output below (will-it-scale's 'pagefault2' case
on an old kernel):

  ----------------------------------------------------------------------
     26       31        2        0        0        0  0xffff888103ec6000
  ----------------------------------------------------------------------
   35.48%   50.00%    0.00%    0.00%    0.00%   0x10     0       1  0xffffffff8133148b   1153   66    971   3748   74  [k] get_mem_cgroup_from_mm
    6.45%    0.00%    0.00%    0.00%    0.00%   0x10     0       1  0xffffffff813396e4    570    0   1531    879   75  [k] mem_cgroup_charge
   25.81%   50.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81331472    949   70    593   3359   74  [k] get_mem_cgroup_from_mm
   19.35%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81339686   1352    0   1073   1022   74  [k] mem_cgroup_charge
    9.68%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff813396d6   1401    0    863    768   74  [k] mem_cgroup_charge
    3.23%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81333106    618    0    804     11    9  [k] uncharge_batch

The offsets 0x10 and 0x54 used to be displayed in 2 groups, and now they
are listed together to give users a hint of extended false sharing.
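
The regrouping can be checked by hand with a little address arithmetic.
The sketch below is not the code from this patch: it assumes 64-byte
cachelines, and the helper names group_base and group_offset are
hypothetical. It only shows how masking with a doubled size changes which
group an address falls into.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHELINE_SIZE 64ULL

/* Size of one display group: one cacheline, or two when double_cl is set. */
static uint64_t group_size(int double_cl)
{
	return double_cl ? 2 * CACHELINE_SIZE : CACHELINE_SIZE;
}

/* Base address of the group that contains addr. */
static uint64_t group_base(uint64_t addr, int double_cl)
{
	return addr & ~(group_size(double_cl) - 1);
}

/* Offset of addr inside its group. */
static uint64_t group_offset(uint64_t addr, int double_cl)
{
	return addr & (group_size(double_cl) - 1);
}
```

Taking the address 0xffff888103ec6000 from the output above, offset 0x10
lies in the first 64-byte line and offset 0x54 in the second, so
per-cacheline grouping gives the bases 0xffff888103ec6000 and
0xffff888103ec6040. With the doubled size both map to
0xffff888103ec6000 and keep their offsets 0x10 and 0x54.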

[1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/

Committer notes:

Link: https://lore.kernel.org/r/Y+wvVNWqXb70l4uy@feng-clx

Removed -a, leaving just --double-cl, as this will probably not be used
very frequently and may even be auto-detected if we manage to record
the MSR where this is configured.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@intel.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230214075823.246414-1-feng.tang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-16 09:33:45 -03:00
accounting tools/accounting/procacct: remove some unused variables 2022-11-18 13:55:09 -08:00
arch perf bench syscall: Add execve syscall benchmark 2023-02-02 16:32:19 -03:00
bootconfig
bpf bpftool: Fix linkage with statically built libllvm 2022-12-22 20:09:43 +01:00
build tools build: Add test echo-cmd 2023-02-03 13:54:22 -03:00
certs
cgroup
counter
debugging
edid
firewire
firmware
gpio tools: gpio: fix -c option of gpio-event-mon 2023-01-27 14:05:46 +01:00
hv
iio tools: iio: iio_generic_buffer: Fix read size 2022-11-01 08:48:13 +00:00
include tools headers: Syncronize linux/build_bug.h with the kernel sources 2023-01-18 10:31:11 -03:00
io_uring
kvm/kvm_stat tools/kvm_stat: update exit reasons for vmx/svm/aarch64/userspace 2022-11-09 12:26:52 -05:00
laptop
leds
lib libperf: Fix install_pkgconfig target 2022-12-16 10:04:06 -03:00
memory-model tools/memory-model: Weaken ctrl dependency definition in explanation.txt 2022-10-18 15:14:52 -07:00
objtool objtool: Tolerate STT_NOTYPE symbols at end of section 2023-01-09 17:53:46 +01:00
pci
pcmcia
perf perf c2c: Add report option to show false sharing in adjacent cachelines 2023-02-16 09:33:45 -03:00
power ACPI updates for 6.2-rc1 2022-12-12 13:38:17 -08:00
rcu
scripts
spi
testing ARM64: 2023-02-04 11:21:27 -08:00
thermal
time
tracing rtla: Fix exit status when returning from calls to usage() 2022-12-09 18:06:24 -05:00
usb tools: usb: ffs-aio-example: Fix build error with aarch64-*-gnu-gcc toolchain(s) 2022-11-09 12:37:56 +01:00
verification Tracing fix for 6.2: 2022-12-21 19:03:42 -08:00
virtio tools/virtio: fix the vringh test for virtio ring changes 2023-01-27 06:18:41 -05:00
vm MM patches for 6.2-rc1. 2022-12-13 19:29:45 -08:00
wmi
Makefile