perf annotate: Add fusion logic for AMD microarchs
AMD family 15h and above microarchs fuse a subset of cmp/test/ALU
instructions with branch instructions[1][2]. Add perf annotate
fused instruction support for these microarchs.
Before:
│ testb $0x80,0x51(%rax)
│ ┌──jne 5b3
0.78 │ │ mov %r13,%rdi
│ │→ callq mark_page_accessed
1.08 │5b3:└─→mov 0x8(%r13),%rax
After:
│ ┌──testb $0x80,0x51(%rax)
│ ├──jne 5b3
0.78 │ │ mov %r13,%rdi
│ │→ callq mark_page_accessed
1.08 │5b3:└─→mov 0x8(%r13),%rax
[1] https://bugzilla.kernel.org/attachment.cgi?id=298553
[2] https://bugzilla.kernel.org/attachment.cgi?id=298555
Committer testing:
On a:
$ grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 9 3900X 12-Core Processor
$
Samples: 44K of event 'cycles', 4000 Hz, Event count (approx.): 7533249650
_int_malloc /usr/lib64/libc-2.33.so [Percent: local period]
Percent│ ┌──test %eax,%eax
│ ├──jne 884
│ │↓ jmpq 943
│ │ nop
│878:│ add $0x10,%rdx
0.64 │ │ add %eax,%eax
0.57 │ │↓ je cc9
0.77 │884:└─→test %esi,%eax
│ ↑ je 878
│ mov 0x18(%rdx),%r15
Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https //lore.kernel.org/r/20210911043854.8373-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>