Currently we can hit following assert when running numa bench:
$ perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZ0cm --thp 1
perf: bench/numa.c:1577: __bench_numa: Assertion `!(!(((wait_stat) & 0x7f) == 0))' failed.
The assertion is correct, because we hit the SIGFPE in following line:
Thread 2.2 "thread 0/0" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffd28c6700 (LWP 11750)]
0x000.. in worker_thread (__tdata=0x7.. ) at bench/numa.c:1257
1257 td->speed_gbs = bytes_done / (td->runtime_ns / NSEC_PER_SEC) / 1e9;
We don't check if the runtime is actually bigger than 1 second,
and thus this might end up with zero division within FPU.
Adding the check to prevent this.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180620094036.17278-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf stat' shows a mismatch in perf stat regarding counter names on
s390:
Run command:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v --
~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 573146 573146
tx_nc_tend: 1 573146 573146
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.001037252 seconds time elapsed
[root@s35lp76 perf]#
shows transaction counter tx_nc_tend with value 3 but it was triggered
only once as seen by the output of mytesttx.
When looking up the event name tx_nc_tend the following function
sequence is called:
parse_events_multi_pmu_add()
+--> perf_pmu__scan() being called with NULL argument
+--> pmu_read_sysfs() scans directory ../devices/ for
all PMUs
+--> perf_pmu__find() tries to find a PMU in the
global pmu list.
+--> pmu_lookup() called to read all file
entries when not in global
list.
pmu_lookup() causes the issue. It calls
+---> pmu_aliases() to read all the entries in the PMU directory.
On s390 this is named
/sys/devices/cpum_cf/events.
+--> pmu_aliases_parse() reads all files and creates an
alias for each file name.
So we end up with first entry created by
reading the sysfs file
[root@s35lp76 perf]# cat /sys/devices/cpum_cf
/events/TX_NC_TEND
event=0x008d
[root@s35lp76 perf]#
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x008d
'/
After all files in this directory have been
read and aliases created this function is called:
+--> pmu_add_cpu_aliases()
This function looks up the CPU tables
created by the json files.
With json files for s390 now available all
the aliases are added to
the PMU alias list a second time.
The second entry is added by
reading the json file converted by jevent
resulting in file pmu-events/pmu-events.c:
{
.name = "tx_nc_tend",
.event = "event=0x8d",
.desc = "Unit: cpum_cf Completed TEND \
instructions \
in non-constrained TX mode",
.topic = "extended",
.long_desc = "A TEND instruction has \
completed in a \
non-constrained \
transactional-execution mode",
.pmu = "cpum_cf",
},
Debug output shows this entry
tx_nc_tend -> 'cpum_cf'/'event=0x8d'/
Function pmu_aliases_parse() and pmu_add_cpu_aliases() both use
__perf_pmu__new_alias() to add an alias to the PMU alias list. There is
no check if an alias already exist
So we end up with 2 entries for tx_nc_tend in the PMU alias list.
Having set up the PMU alias list for this PMU now
parse_events_multi_add_pmu() reads the complete alias list and adds each
alias with parse_events_add_pmu() to the global perfev_list. This
causes the alias to be added multiple times to the event list.
Fix this by making __perf_pmu__new_alias() to merge alias definitions if
an alias is already on the alias list. Also print a debug message when
the alias has mismatches in some fields.
Output before:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
3 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Output after:
[root@s35lp76 perf]# ./perf stat -e tx_nc_tend -v \
-- ~/mytesttx 1 >/tmp/111
tx_nc_tend: 1 551446 551446
Performance counter stats for '/root/mytesttx 1':
1 tx_nc_tend
0.000961134 seconds time elapsed
[root@s35lp76 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PMU alias definitions in sysfs files may have spaces, newlines and
numbers with leading zeroes. Some alias definitions may also appear in
JSON files without spaces, etc.
Scan alias definitions and remove leading zeroes, spaces, newlines, etc
and rebuild string to make alias->str member comparable.
s390 for example has terms specified as event=0x0091 (read from files
../<PMU>/events/<FILE> and terms specified as event=0x91 (read from JSON
files).
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove a trailing newline when reading sysfs file contents such as
/sys/devices/cpum_cf/events/TX_NC_TEND. This shows when verbose option
-v is used.
Output before:
tx_nc_tend -> 'cpum_cf'/'event=0x008d
'/
Output after:
tx_nc_tend -> 'cpum_cf'/'event=0x8d'/
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180615101105.47047-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo reported the perf build failure with latest llvm/clang compiler
(7.0).
$ make LIBCLANGLLVM=1 -C tools/perf/
<SNIP>
CC /tmp/tmp.t53Qo38zci/tests/kmod-path.o
util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::SmallVectorImpl<char> >
perf::getBPFObjectFromModule(llvm::Module*)’:
util/c++/clang.cpp:150:43: error: no matching function for call to
‘llvm::TargetMachine::addPassesToEmitFile(llvm::legacy::PassManager&,
llvm::raw_svector_ostream&, llvm::TargetMachine::CodeGenFileType)’
TargetMachine::CGFT_ObjectFile)) {
^
In file included from util/c++/clang.cpp:25:0:
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note: candidate:
virtual bool llvm::TargetMachine::addPassesToEmitFile(
llvm::legacy::PassManagerBase&, llvm::raw_pwrite_stream&,
llvm::raw_pwrite_stream*, llvm::TargetMachine::CodeGenFileType, bool,
llvm::MachineModuleInfo*)
virtual bool addPassesToEmitFile(PassManagerBase &, raw_pwrite_stream &,
^~~~~~~~~~~~~~~~~~~
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note:
candidate expects 6 arguments, 3 provided
mv: cannot stat '/tmp/tmp.t53Qo38zci/util/c++/.clang.o.tmp': No such file or directory
make[7]: *** [/home/acme/git/perf/tools/build/Makefile.build:101:
/tmp/tmp.t53Qo38zci/util/c++/clang.o] Error 1
make[6]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: c++] Error 2
make[5]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[5]: *** Waiting for unfinished jobs....
CC /tmp/tmp.t53Qo38zci/tests/thread-map.o
The function addPassesToEmitFile signature changed in llvm 7.0 and such
a change caused the failure. This patch fixed the issue with using
proper function signatures under different compiler versions.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180616174739.1076733-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the rename in:
bd3a08aaa9 ("bpf: flowlabel in bpf_fib_lookup should be flowinfo")
Silencing this build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zd1sgtbybtjrrt7bqdybu0s0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The IFLA_BRPORT_ISOLATED and IFLA_VXLAN_TTL_INHERIT defines were added in:
7d850abd5f ("net: bridge: add support for port isolation")
72f6d71e49 ("vxlan: add ttl inherit support")
Pick them, silencing this build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h'
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Leblond <eric@regit.org>
Cc: Hangbin Liu <liuhangbin@gmail.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Link: https://lkml.kernel.org/n/tip-ezi5u0mmdqm0wfm0y2y8176r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the tools/perf/ copy of the powerpc file used to generate
the syscall table file used to make 'perf trace' become aware of the new
'rseq' syscall, no matter in which system it gets built, i.e. older
systems where the syscalls are not available in the running kernel (via
tracefs) or in the system headers will still be aware of these
syscalls/.
From this commit:
bb862b021d ("powerpc: Wire up restartable sequences system call")
Silencing this tools/perf build warning:
Warning: Kernel ABI header at 'tools/arch/powerpc/include/uapi/asm/unistd.h' differs from latest version at 'arch/powerpc/include/uapi/asm/unistd.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-adtgz6u3apd76tghiu9w0k19@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This updates the tools/perf/ copy of the system call table for x86 which makes
'perf trace' become aware of the new 'io_pgetevents' and 'rseq' syscalls, no
matter in which system it gets built, i.e. older systems where the syscalls are
not available in the running kernel (via tracefs) or in the system headers will
still be aware of these syscalls/.
These are the csets introducing the source drift:
05c17cedf8 ("x86: Wire up restartable sequence system call")
7a074e96de ("aio: implement io_pgetevents")
This results in this build time change:
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.old /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
--- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.old 2018-06-15 11:48:17.648948094 -0300
+++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2018-06-15 11:48:22.133942480 -0300
@@ -332,5 +332,7 @@
[330] = "pkey_alloc",
[331] = "pkey_free",
[332] = "statx",
+ [333] = "io_pgetevents",
+ [334] = "rseq",
};
-#define SYSCALLTBL_x86_64_MAX_ID 332
+#define SYSCALLTBL_x86_64_MAX_ID 334
$
This silences the following tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tfvyz51sabuzemrszbrhzxni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the new ioctls added in these csets:
7595bda2fb ("drm: Add DRM client cap for aspect-ratio")
The DRM caps are not yet being decoded in 'perf trace', so this sync
doesn't incur in any change in behaviour in any tools, just silencing
this tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-atwz0arwanq1npu8pptwkoxt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding optional 'valid' callback for events tests in parse-events
object, so we don't try to parse PMUs, which are not supported.
Following line is displayed for skipped test:
running test 52 'intel_pt//u'... SKIP
Committer note:
Use named initializers in the struct evlist_test variable to avoid
breaking the build on centos:5, 6 and others with a similar gcc:
cc1: warnings being treated as errors
tests/parse-events.c: In function 'test_pmu_events':
tests/parse-events.c:1817: error: missing initializer
tests/parse-events.c:1817: error: (near initialization for 'e.type')
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180611093422.1005-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add missing error handling for parse_events calls in test_event function
that led to following segfault on s390:
running test 52 'intel_pt//u'
perf: Segmentation fault
...
/lib64/libc.so.6(vasprintf+0xe6) [0x3fffca3f106]
/lib64/libc.so.6(asprintf+0x46) [0x3fffca1aa96]
./perf(parse_events_add_pmu+0xb8) [0x80132088]
./perf(parse_events_parse+0xc62) [0x8019529a]
./perf(parse_events+0x98) [0x801341c0]
./perf(test__parse_events+0x48) [0x800cd140]
./perf(cmd_test+0x26a) [0x800bd44a]
test child interrupted
Adding the struct parse_events_error argument to parse_events call. Also
adding parse_events_print_error to get more details on the parsing
failures, like:
# perf test 6 -v
running test 52 'intel_pt//u'failed to parse event 'intel_pt//u', err 1, str 'Cannot find PMU `intel_pt'. Missing kernel support?'
event syntax error: 'intel_pt//u'
\___ Cannot find PMU `intel_pt'. Missing kernel support?
Committer note:
Use named initializers in the struct parse_events_error variable to
avoid breaking the build on centos5, 6 and others with a similar gcc:
cc1: warnings being treated as errors
tests/parse-events.c: In function 'test_event':
tests/parse-events.c:1696: error: missing initializer
tests/parse-events.c:1696: error: (near initialization for 'err.str')
Reported-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180611093422.1005-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some cases, the callchain provided by the kernel may be empty. So,
the callchain ip filtering code will cause a crash if we do not check
whether the struct ip_callchain pointer is NULL before accessing any
members.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
# perf record -b -e cycles:u ls
Before:
# perf report --branch-history
perf: Segmentation fault
-------- backtrace --------
perf[0x1027615c]
linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x7fff856304d8]
perf(arch_skip_callchain_idx+0x44)[0x10257c58]
perf[0x1017f2e4]
perf(thread__resolve_callchain+0x124)[0x1017ff5c]
perf(sample__resolve_callchain+0xf0)[0x10172788]
...
After:
# perf report --branch-history
Samples: 25 of event 'cycles:u', Event count (approx.): 2306870
Overhead Source:Line Symbol Shared Object
+ 11.60% _init+35736 [.] _init ls
+ 9.84% strcoll_l.c:137 [.] __strcoll_l libc-2.26.so
+ 9.16% memcpy.S:175 [.] __memcpy_power7 libc-2.26.so
+ 9.01% gconv_charset.h:54 [.] _nl_find_locale libc-2.26.so
+ 8.87% dl-addr.c:52 [.] _dl_addr libc-2.26.so
+ 8.83% _init+236 [.] _init ls
...
Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180611104049.11048-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 this test case fails because the socket identifiction numbers
assigned to the CPU are higher than the CPU identification numbers.
F/ix this by adding the platform architecture into the perf data header
flag information. This helps identifiing the test platform and handles
s390 specifics in process_cpu_topology().
Before:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-iUv755
socket_id number is too big.You may need to upgrade the perf tool.
---- end ----
Session topology: Skip
[root@p23lp27 perf]#
After:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-8X8VTs
CPU 0, core 0, socket 6
CPU 1, core 1, socket 3
---- end ----
Session topology: Ok
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: c84974ed9f ("perf test: Add entry to test cpu topology")
Link: http://lkml.kernel.org/r/20180611073153.15592-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 the socket identifier assigned to a CPU identifier is random and
(depending on the configuration of the LPAR) may be higher than the CPU
identifier. This is currently not supported.
Fix this by allowing arbitrary socket identifiers being assigned to
CPU id.
Output before:
[root@p23lp27 perf]# ./perf report --header -I -v
...
socket_id number is too big.You may need to upgrade the perf tool.
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# Core ID and Socket ID information is not available
...
[root@p23lp27 perf]#
Output after:
[root@p23lp27 perf]# ./perf report --header -I -v
...
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# CPU 0: Core ID 0, Socket ID 6
# CPU 1: Core ID 1, Socket ID 3
# CPU 2: Core ID -1, Socket ID -1
...
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180611073153.15592-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat:
. Add --interval-clear option, to provide a 'watch' like printing (Jiri Olsa)
. Fix metric column header display alignment (Jiri Olsa)
. Improve error messages for default attributes, providing better output
for error in command lines such as:
$ perf stat -T
Cannot set up transaction events
event syntax error: '..cycles,cpu/cycles-t/,cpu/tx-start/,cpu/el-start/,cpu/cycles-ct/}'
\___ unknown term
Where the "event syntax error" line now appears (Jiri Olsa)
perf script:
. Show hw-cache events too (Seeteena Thoufeek)
perf c2c:
. Fix data dependency problem in layout of 'struct c2c_hist_entry', where
its member 'struct hist_entry' must be at the end because it has a ZLA
as its last member, that gets space when handling callchains (Jiri Olsa)
Core:
- We cannot assume that a 'struct perf_evsel' is to be obtained from a
container_of operation on a 'struct hists' as there are tools, such as
'perf c2c' that uses 'struct hist' instances without having them in
container structs that also have 'struct perf_evsel' in a particular
layout, so provide a different way of figuring out if a 'struct hists'
and 'struct hist_entry' have callchains (Arnaldo Carvalho de Melo)
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlsesIYACgkQ1lAW81NS
qkCvJA//YInlTK5uSKDTSvn7jnw5qz+aOawOi/ggSC2y1QeDh/98GhKyAb4UF23u
Jfe/DA/h8T9NEbPkCjbDwDGsMAiFswZgiZ535f1BklvOriuY7k0Z2s0oYlXjgruy
vn1rMqy/zIVVFUYfoQe9wpilr0hioCLvMiN3kFZPIWUeZlijs6oapagd4wgfdQgK
yFdplNhLA3siLj3j4SKfGcDyMya5Uy6P37Wffh2S4GbOn4kr302YKhNpBAxsezKa
hoZe9GWAM7MKTo5pJD9IigRcRYWKjs7qnR7aewgwFbITTDf3M+gs8NlRqWMCziN1
/kfH+Xi6Rw0XUwSxoo5t74u5vziwYU5Tif+6Wj2bVs7Djj4XAboFMr/ovT2u56s3
a4y+hy38jowgtanH0s1GR86Xh44t1i4d1dUgkJawA3Z+CsK0P/2sakcgPFfpvUIh
vUkj/Zf5OPht4y1bqnxLsKBuF11WvBJpWxNFTNWVXYlWlXdupjCa0yeFYCWutpwL
gmOI6PDphXO0oAe476inDKJYrFHIL1MlSXprNqz3y4A+U86vIlDUchq7hW1WFdex
vuOFwP5bSlJ84ch6C3N7BAKD6X3fT+MTYmmnmLXKMutg2A/u/nOSeed+3nrLNgN0
xxQ0uB1VuR1wszsP6HOY0Ib3HfrLLdhm/0Dq45HVmKWXSZ775pw=
=rF+V
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-4.18-20180611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo:
perf stat:
- Fix metric column header display alignment (Jiri Olsa)
- Improve error messages for default attributes, providing better output
for error in command lines such as:
$ perf stat -T
Cannot set up transaction events
event syntax error: '..cycles,cpu/cycles-t/,cpu/tx-start/,cpu/el-start/,cpu/cycles-ct/}'
\___ unknown term
Where the "event syntax error" line now appears (Jiri Olsa)
- Add --interval-clear option, to provide a 'watch' like printing (Jiri Olsa)
perf script:
- Show hw-cache events too (Seeteena Thoufeek)
perf c2c:
- Fix data dependency problem in layout of 'struct c2c_hist_entry', where
its member 'struct hist_entry' must be at the end because it has a ZLA
as its last member, that gets space when handling callchains (Jiri Olsa)
Core:
- We cannot assume that a 'struct perf_evsel' is to be obtained from a
container_of operation on a 'struct hists' as there are tools, such as
'perf c2c' that uses 'struct hist' instances without having them in
container structs that also have 'struct perf_evsel' in a particular
layout, so provide a different way of figuring out if a 'struct hists'
and 'struct hist_entry' have callchains (Arnaldo Carvalho de Melo)
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull restartable sequence support from Thomas Gleixner:
"The restartable sequences syscall (finally):
After a lot of back and forth discussion and massive delays caused by
the speculative distraction of maintainers, the core set of
restartable sequences has finally reached a consensus.
It comes with the basic non disputed core implementation along with
support for arm, powerpc and x86 and a full set of selftests
It was exposed to linux-next earlier this week, so it does not fully
comply with the merge window requirements, but there is really no
point to drag it out for yet another cycle"
* 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq/selftests: Provide Makefile, scripts, gitignore
rseq/selftests: Provide parametrized tests
rseq/selftests: Provide basic percpu ops test
rseq/selftests: Provide basic test
rseq/selftests: Provide rseq library
selftests/lib.mk: Introduce OVERRIDE_TARGETS
powerpc: Wire up restartable sequences system call
powerpc: Add syscall detection for restartable sequences
powerpc: Add support for restartable sequences
x86: Wire up restartable sequence system call
x86: Add support for restartable sequences
arm: Wire up restartable sequences system call
arm: Add syscall detection for restartable sequences
arm: Add restartable sequences support
rseq: Introduce restartable sequences system call
uapi/headers: Provide types_32_64.h
Pull more perf tooling updates from Thomas Gleixner:
"Perf tool updates and fixes:
perf stat:
- Display user and system time for workload targets (Jiri Olsa)
perf record:
- Enable arbitrary event names thru name= modifier (Alexey Budankov)
PowerPC:
- Add a python script for hypervisor call statistics (Ravi Bangoria)
Intel PT: (Adrian Hunter)
- Fix sync_switch INTEL_PT_SS_NOT_TRACING
- Fix decoding to accept CBR between FUP and corresponding TIP
- Fix MTC timing after overflow
- Fix "Unexpected indirect branch" error
perf test:
- record+probe_libc_inet_pton:
- To get the symbol table for dynamic shared objects on ubuntu we
need to pass the -D/--dynamic command line option, unlike with
the fedora distros (Arnaldo Carvalho de Melo)
- code-reading:
- Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
- kmod-path:
- Add tests for vdso32 and vdsox32 (Adrian Hunter)
- Use header file util/debug.h (Thomas Richter)
perf annotate:
- Make the various UI backends (stdio, TUI, gtk) use more
consistently structs with annotation options as specified by the
user (Arnaldo Carvalho de Melo)
- Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)
perf script:
- Add more PMU fields to python scripts event handler dict (Jin Yao)
Core:
- Fix misleading error for some unparsable events mentioning PMUs
when those are not involved in the problem (Jiri Olsa)
- Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
(Arnaldo Carvalho de Melo)
- Be more robust when trying to use per-symbol histograms, checking
for unlikely but possible cases where the space for the histograms
wasn't allocated, print a debug message for such cases (Arnaldo
Carvalho de Melo)
- Fix symbol and object code resolution for vdso32 and vdsox32
(Adrian Hunter)
- No need to check for null when passing pointers to foo__get() style
refcount grabbing helpers, just like in the kernel and with free(),
its safe to pass a NULL pointer to avoid having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)
- Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
- Remove some needless globals, making them local (Arnaldo Carvalho
de Melo)
- Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific
events, as we evolved this codebase to allow requesting callchains
for just a subset of the monitored events. In time it will help
polish recording and showing mixed sets accross the various tools:
perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'
(Arnaldo Carvalho de Melo)
- Consider PTI entry trampolines in map__rip_2objdump() (Adrian
Hunter)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
perf script python: Add dict fields introduction to Documentation
perf script python: Add more PMU fields to event handler dict
perf script python: Move dsoname code to a new function
perf symbols: Add BSS symbols when reading from /proc/kallsyms
perf annnotate: Make __symbol__inc_addr_samples handle src->histograms == NULL
perf intel-pt: Fix "Unexpected indirect branch" error
perf intel-pt: Fix MTC timing after overflow
perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
perf script powerpc: Python script for hypervisor call statistics
perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
perf map: Consider PTI entry trampolines in rip_2objdump()
perf test code-reading: Fix perf_env setup for PTI entry trampolines
perf tools: Fix pmu events parsing rule
perf stat: Display user and system time
perf record: Enable arbitrary event names thru name= modifier
perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
perf tests kmod-path: Add tests for vdso32 and vdsox32
perf hists: Check if a hist_entry has callchains before using them
perf hists: Introduce hist_entry__has_callchain() method
...
Pull core fixes from Thomas Gleixner:
"A small set of core updates:
- Make objtool cope with GCC8 oddities some more
- Remove a stale local_irq_save/restore sequence in the signal code
along with the stale comment in the RCU code. The underlying issue
which led to this has been solved long time ago, but nobody cared
to cleanup the hackarounds"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
signal: Remove no longer required irqsave/restore
rcu: Update documentation of rcu_read_unlock()
objtool: Fix GCC 8 cold subfunction detection for aliased functions
Pull sparc updates from David Miller:
- a FPE signal fix that was also merged upstream
- privileged ADI driver from Tom Hromatka
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: fix compat siginfo ABI regression
selftests: sparc64: char: Selftest for privileged ADI driver
char: sparc64: Add privileged ADI driver
Here is the big staging and IIO driver update for 4.18-rc1.
It was delayed as I wanted to make sure the final driver deletions did
not cause any major merge issues, and all now looks good.
There are a lot of patches here, just over 1000. The diffstat summary
shows the major changes here:
1007 files changed, 16828 insertions(+), 227770 deletions(-)
Because of this, we might be close to shrinking the overall kernel
source code size for two releases in a row.
There was loads of work in this release cycle, primarily:
- tons of ks7010 driver cleanups
- lots of mt7621 driver fixes and cleanups
- most driver cleanups
- wilc1000 fixes and cleanups
- lots and lots of IIO driver cleanups and new additions
- debugfs cleanups for all staging drivers
- lots of other staging driver cleanups and fixes, the shortlog
has the full details.
but the big user-visable things here are the removal of 3 chunks of
code:
- ncpfs and ipx were removed on schedule, no one has cared about
this code since it moved to staging last year, and if it needs
to come back, it can be reverted.
- lustre file system is removed. I've ranted at the lustre
developers about once a year for the past 5 years, with no
real forward progress at all to clean things up and get the
code into the "real" part of the kernel. Given that the
lustre developers continue to work on an external tree and try
to port those changes to the in-kernel tree every once in a
while, this whole thing really really is not working out at
all. So I'm deleting it so that the developers can spend the
time working in their out-of-tree location and get things
cleaned up properly to get merged into the tree correctly at a
later date.
Because of these file removals, you will have merge issues on some of
these files (2 in the ipx code, 1 in the ncpfs code, and 1 in the
atomisp driver). Just delete those files, it's a simple merge :)
All of this has been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWxvjGQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymoEwCbBYnyUl3cwCszIJ3L3/zvUWpmqIgAn1DDsAim
dM4lmKg6HX/JBSV4GAN0
=zdta
-----END PGP SIGNATURE-----
Merge tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO updates from Greg KH:
"Here is the big staging and IIO driver update for 4.18-rc1.
It was delayed as I wanted to make sure the final driver deletions did
not cause any major merge issues, and all now looks good.
There are a lot of patches here, just over 1000. The diffstat summary
shows the major changes here:
1007 files changed, 16828 insertions(+), 227770 deletions(-)
Because of this, we might be close to shrinking the overall kernel
source code size for two releases in a row.
There was loads of work in this release cycle, primarily:
- tons of ks7010 driver cleanups
- lots of mt7621 driver fixes and cleanups
- most driver cleanups
- wilc1000 fixes and cleanups
- lots and lots of IIO driver cleanups and new additions
- debugfs cleanups for all staging drivers
- lots of other staging driver cleanups and fixes, the shortlog has
the full details.
but the big user-visable things here are the removal of 3 chunks of
code:
- ncpfs and ipx were removed on schedule, no one has cared about this
code since it moved to staging last year, and if it needs to come
back, it can be reverted.
- lustre file system is removed.
I've ranted at the lustre developers about once a year for the past
5 years, with no real forward progress at all to clean things up
and get the code into the "real" part of the kernel.
Given that the lustre developers continue to work on an external
tree and try to port those changes to the in-kernel tree every once
in a while, this whole thing really really is not working out at
all. So I'm deleting it so that the developers can spend the time
working in their out-of-tree location and get things cleaned up
properly to get merged into the tree correctly at a later date.
Because of these file removals, you will have merge issues on some of
these files (2 in the ipx code, 1 in the ncpfs code, and 1 in the
atomisp driver). Just delete those files, it's a simple merge :)
All of this has been in linux-next for a while with no reported
problems"
* tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1011 commits)
staging: ipx: delete it from the tree
ncpfs: remove uapi .h files
ncpfs: remove Documentation
ncpfs: remove compat functionality
staging: ncpfs: delete it
staging: lustre: delete the filesystem from the tree.
staging: vc04_services: no need to save the log debufs dentries
staging: vc04_services: vchiq_debugfs_log_entry can be a void *
staging: vc04_services: remove struct vchiq_debugfs_info
staging: vc04_services: move client dbg directory into static variable
staging: vc04_services: remove odd vchiq_debugfs_top() wrapper
staging: vc04_services: no need to check debugfs return values
staging: mt7621-gpio: reorder includes alphabetically
staging: mt7621-gpio: change gc_map to don't use pointers
staging: mt7621-gpio: use GPIOF_DIR_OUT and GPIOF_DIR_IN macros instead of custom values
staging: mt7621-gpio: change 'to_mediatek_gpio' to make just a one line return
staging: mt7621-gpio: dt-bindings: update documentation for #interrupt-cells property
staging: mt7621-gpio: update #interrupt-cells for the gpio node
staging: mt7621-gpio: dt-bindings: complete documentation for the gpio
staging: mt7621-dts: add missing properties to gpio node
...
* DAX broke a fundamental assumption of truncate of file mapped pages.
The truncate path assumed that it is safe to disconnect a pinned page
from a file and let the filesystem reclaim the physical block. With DAX
the page is equivalent to the filesystem block. Introduce
dax_layout_busy_page() to enable filesystems to wait for pinned DAX
pages to be released. Without this wait a filesystem could allocate
blocks under active device-DMA to a new file.
* DAX arranges for the block layer to be bypassed and uses
dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
However, the memcpy_mcsafe() facility is available through the pmem
block driver. In order to safely handle media errors, via the DAX
block-layer bypass, introduce copy_to_iter_mcsafe().
* Fix cache management policy relative to the ACPI NFIT Platform
Capabilities Structure to properly elide cache flushes when they are not
necessary. The table indicates whether CPU caches are power-fail
protected. Clarify that a deep flush is always performed on
REQ_{FUA,PREFLUSH} requests.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbGxI7AAoJEB7SkWpmfYgCDjsP/2Lcibu9Kf4tKIzuInsle6iE
6qP29qlkpHVTpDKbhvIxTYTYL9sMU0DNUrpPCJR/EYdeyztLWDFC5EAT1wF240vf
maV37s/uP331jSC/2VJnKWzBs2ztQxmKLEIQCxh6aT0qs9cbaOvJgB/WlVu+qtsl
aGJFLmb6vdQacp31noU5plKrMgMA1pADyF5qx9I9K2HwowHE7T368ZEFS/3S//c3
LXmpx/Nfq52sGu/qbRbu6B1CTJhIGhmarObyQnvBYoKntK1Ov4e8DS95wD3EhNDe
FuRkOCUKhjl6cFy7QVWh1ct1bFm84ny+b4/AtbpOmv9l/+0mveJ7e+5mu8HQTifT
wYiEe2xzXJ+OG/xntv8SvlZKMpjP3BqI0jYsTutsjT4oHrciiXdXM186cyS+BiGp
KtFmWyncQJgfiTq6+Hj5XpP9BapNS+OYdYgUagw9ZwzdzptuGFYUMSVOBrYrn6c/
fwqtxjubykJoW0P3pkIoT91arFSea7nxOKnGwft06imQ7TwR4ARsI308feQ9itJq
2P2e7/20nYMsw2aRaUDDA70Yu+Lagn1m8WL87IybUGeUDLb1BAkjphAlWa6COJ+u
PhvAD2tvyM9m0c7O5Mytvz7iWKG6SVgatoAyOPkaeplQK8khZ+wEpuK58sO6C1w8
4GBvt9ri9i/Ww/A+ppWs
=4bfw
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This adds a user for the new 'bytes-remaining' updates to
memcpy_mcsafe() that you already received through Ingo via the
x86-dax- for-linus pull.
Not included here, but still targeting this cycle, is support for
handling memory media errors (poison) consumed via userspace dax
mappings.
Summary:
- DAX broke a fundamental assumption of truncate of file mapped
pages. The truncate path assumed that it is safe to disconnect a
pinned page from a file and let the filesystem reclaim the physical
block. With DAX the page is equivalent to the filesystem block.
Introduce dax_layout_busy_page() to enable filesystems to wait for
pinned DAX pages to be released. Without this wait a filesystem
could allocate blocks under active device-DMA to a new file.
- DAX arranges for the block layer to be bypassed and uses
dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
However, the memcpy_mcsafe() facility is available through the pmem
block driver. In order to safely handle media errors, via the DAX
block-layer bypass, introduce copy_to_iter_mcsafe().
- Fix cache management policy relative to the ACPI NFIT Platform
Capabilities Structure to properly elide cache flushes when they
are not necessary. The table indicates whether CPU caches are
power-fail protected. Clarify that a deep flush is always performed
on REQ_{FUA,PREFLUSH} requests"
* tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
dax: Use dax_write_cache* helpers
libnvdimm, pmem: Do not flush power-fail protected CPU caches
libnvdimm, pmem: Unconditionally deep flush on *sync
libnvdimm, pmem: Complete REQ_FLUSH => REQ_PREFLUSH
acpi, nfit: Remove ecc_unit_size
dax: dax_insert_mapping_entry always succeeds
libnvdimm, e820: Register all pmem resources
libnvdimm: Debug probe times
linvdimm, pmem: Preserve read-only setting for pmem devices
x86, nfit_test: Add unit test for memcpy_mcsafe()
pmem: Switch to copy_to_iter_mcsafe()
dax: Report bytes remaining in dax_iomap_actor()
dax: Introduce a ->copy_to_iter dax operation
uio, lib: Fix CONFIG_ARCH_HAS_UACCESS_MCSAFE compilation
xfs, dax: introduce xfs_break_dax_layouts()
xfs: prepare xfs_break_layouts() for another layout type
xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
mm, fs, dax: handle layout changes to pinned dax mappings
mm: fix __gup_device_huge vs unmap
mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
...
Exactly as the comment just before 'struct c2c_hist_entry" says, i.e.
the last entry in struct hist_entry is a zero length array, that when
allocating space for hist_entry gets extra space if callchains are in
use, which, if hist_entry is not at the end of c2c_hist_entry, the
members after it gets corrupted when callchains get added to the rb
trees collecting them, etc.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Fixes: 7f834c2e84 ("perf c2c report: Display node for cacheline address")
Link: http://lkml.kernel.org/n/tip-bh0ke4fh2ygpj3yowna7o1di@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Test lookup in /proc/self/fd.
"map_files" lookup story showed that lookup is not that simple.
* Test that all those symlinks open the same file.
Check with (st_dev, st_info).
* Test that kernel threads do not have anything in their /proc/*/fd/
directory.
Now this is where things get interesting.
First, kernel threads aren't pinned by /proc/self or equivalent,
thus some "atomicity" is required.
Second, ->comm can contain whitespace and ')'.
No, they are not escaped.
Third, the only reliable way to check if process is kernel thread
appears to be field #9 in /proc/*/stat.
This field is struct task_struct::flags in decimal!
Check is done by testing PF_KTHREAD flags like we do in kernel.
PF_KTREAD value is a part of userspace ABI !!!
Other methods for determining kernel threadness are not reliable:
* RSS can be 0 if everything is swapped, even while reading
from /proc/self.
* ->total_vm CAN BE ZERO if process is finishing
munmap(NULL, whole address space);
* /proc/*/maps and similar files can be empty because unmapping
everything works. Read returning 0 can't distinguish between
kernel thread and such suicide process.
Link: http://lkml.kernel.org/r/20180505000414.GA15090@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Define a new PageTable bit in the page_type and use it to mark pages in
use as page tables. This can be helpful when debugging crashdumps or
analysing memory fragmentation. Add a KPF flag to report these pages to
userspace and update page-types.c to interpret that flag.
Note that only pages currently accounted as NR_PAGETABLES are tracked as
PageTable; this does not include pgd/p4d/pud/pmd pages. Those will be the
subject of a later patch.
Link: http://lkml.kernel.org/r/20180518194519.3820-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The following change will introduce new metrics, that doesn't need such
wide hard coded spacing. Switch METRIC_ONLY_LEN macro usage with
metric_only_len variable.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180606221513.11302-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can call color_fprintf also for non color case, it's handled
properly. This change simplifies following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180606221513.11302-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding --interval-clear option to clear the screen before next interval.
Committer testing:
# perf stat -I 1000 --interval-clear
And, as expected, it behaves almost like:
# watch -n 0 perf stat -a sleep 1
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20180606221513.11302-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are places where we have only access to struct hists and need to
know if any of its hist_entries has callchains, like when drawing
headers for the various output modes (stdio, TUI, etc), so, when adding
a new hist_entry, check if it has callchains, storing this info for
later use by hists__has_callchains().
This reimplementation is necessary because not always a 'struct hists'
is allocated together with a 'struct perf evsel', so we can't go from
'hists' to 'perf_event_attr.sample_type & PERF_SAMPLE_CALLCHAIN'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hg5g7yddjio3ljwyqnnaj5dt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we can't go from struct hists to struct evsel for all cases (c2c
is an exception) and we have access to the hist_entry, use
hist_entry__has_callchains() in the GTK+ hists browser to figure out
if callchains are available.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8owkgrruzzi5emvblwh4e6le@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since 'perf c2c' uses 'struct hists' not allocated together with a
'struct perf_evsel' instance, we can't go from a 'struct hist_entry'
pointer to a 'struct perf_evsel' via he->hists, so, instead, check if
space was set aside for hist_entry->callchain[0] at hist_entry__new()
time.
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Reported-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: fabd37b837 ("perf hists: Check if a hist_entry has callchains before using them")
Link: https://lkml.kernel.org/n/tip-e8ife8djvvvwmeze3s4yodii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Notable changes:
- Support for split PMD page table lock on 64-bit Book3S (Power8/9).
- Add support for HAVE_RELIABLE_STACKTRACE, so we properly support live
patching again.
- Add support for patching barrier_nospec in copy_from_user() and syscall entry.
- A couple of fixes for our data breakpoints on Book3S.
- A series from Nick optimising TLB/mm handling with the Radix MMU.
- Numerous small cleanups to squash sparse/gcc warnings from Mathieu Malaterre.
- Several series optimising various parts of the 32-bit code from Christophe Leroy.
- Removal of support for two old machines, "SBC834xE" and "C2K" ("GEFanuc,C2K"),
which is why the diffstat has so many deletions.
And many other small improvements & fixes.
There's a few out-of-area changes. Some minor ftrace changes OK'ed by Steve, and
a fix to our powernv cpuidle driver. Then there's a series touching mm, x86 and
fs/proc/task_mmu.c, which cleans up some details around pkey support. It was
ack'ed/reviewed by Ingo & Dave and has been in next for several weeks.
Thanks to:
Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al Viro, Andrew
Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Balbir Singh,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Dave
Hansen, Fabio Estevam, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Haren
Myneni, Hari Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf,
Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu
Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao,
Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul
Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica Gupta, Ravi
Bangoria, Russell Currey, Sam Bobroff, Samuel Mendoza-Jonas, Segher
Boessenkool, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stewart Smith,
Thiago Jung Bauermann, Torsten Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang,
Yisheng Xie, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJbGQKBExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYBq
TRAAioK7rz5xYMkxaM3Ng3ybobEeNAwQqOolz98xvmnB9SfDWNuc99vf8cGu0/fQ
zc8AKZ5RcnwipOjyGlxW9oa1ZhVq0xtYnQPiYLEKMdLQmh5D+C7+KpvAd1UElweg
ub40/xDySWfMujfuMSF9JDCWPIXyojt4Xg5nJKIVRrAm/3YMe/+i5Am7NWHuMCEb
aQmZtlYW5Mz81XY0968hjpUO6eKFRmsaM7yFAhGTXx6+oLRpGj1PZB4AwdRIKS2L
Ak7q/VgxtE4W+s3a0GK2s+eXIhGKeFuX9AVnx3nti+8/K1OqrqhDcLMUC/9JpCpv
EvOtO7dxPnZujHjdu4Eai/xNoo4h6zRy7bWqve9LoBM40CP5jljKzu1lwqqb5yO0
jC7/aXhgiSIxxcRJLjoI/TYpZPu40MifrkydmczykdPyPCnMIWEJDcj4KsRL/9Y8
9SSbJzRNC/SgQNTbUYPZFFi6G0QaMmlcbCb628k8QT+Gn3Xkdf/ZtxzqEyoF4Irq
46kFBsiSSK4Bu0rVlcUtJQLgdqytWULO6NKEYnD67laxYcgQd8pGFQ8SjZhRZLgU
q5LA3HIWhoAI4M0wZhOnKXO6JfiQ1UbO8gUJLsWsfF0Fk5KAcdm+4kb4jbI1H4Qk
Vol9WNRZwEllyaiqScZN9RuVVuH0GPOZeEH1dtWK+uWi0lM=
=ZlBf
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Support for split PMD page table lock on 64-bit Book3S (Power8/9).
- Add support for HAVE_RELIABLE_STACKTRACE, so we properly support
live patching again.
- Add support for patching barrier_nospec in copy_from_user() and
syscall entry.
- A couple of fixes for our data breakpoints on Book3S.
- A series from Nick optimising TLB/mm handling with the Radix MMU.
- Numerous small cleanups to squash sparse/gcc warnings from Mathieu
Malaterre.
- Several series optimising various parts of the 32-bit code from
Christophe Leroy.
- Removal of support for two old machines, "SBC834xE" and "C2K"
("GEFanuc,C2K"), which is why the diffstat has so many deletions.
And many other small improvements & fixes.
There's a few out-of-area changes. Some minor ftrace changes OK'ed by
Steve, and a fix to our powernv cpuidle driver. Then there's a series
touching mm, x86 and fs/proc/task_mmu.c, which cleans up some details
around pkey support. It was ack'ed/reviewed by Ingo & Dave and has
been in next for several weeks.
Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al
Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd
Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy, Christophe
Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn Thain,
Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari Bathini, Ingo
Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh Babulal,
Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu Malaterre,
Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao,
Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul
Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica
Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff, Samuel
Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon Guo,
Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten Duwe,
Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie, YueHaibing"
* tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits)
powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap
cpuidle: powernv: Fix promotion from snooze if next state disabled
powerpc: fix build failure by disabling attribute-alias warning in pci_32
ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted"
powerpc: fix spelling mistake: "Usupported" -> "Unsupported"
powerpc/pkeys: Detach execute_only key on !PROT_EXEC
powerpc/powernv: copy/paste - Mask SO bit in CR
powerpc: Remove core support for Marvell mv64x60 hostbridges
powerpc/boot: Remove core support for Marvell mv64x60 hostbridges
powerpc/boot: Remove support for Marvell mv64x60 i2c controller
powerpc/boot: Remove support for Marvell MPSC serial controller
powerpc/embedded6xx: Remove C2K board support
powerpc/lib: optimise PPC32 memcmp
powerpc/lib: optimise 32 bits __clear_user()
powerpc/time: inline arch_vtime_task_switch()
powerpc/Makefile: set -mcpu=860 flag for the 8xx
powerpc: Implement csum_ipv6_magic in assembly
powerpc/32: Optimise __csum_partial()
powerpc/lib: Adjust .balign inside string functions for PPC32
...
So that we can figure out the real size of the struct and also be able
to tell if callchains may be present in this histogram entry.
Since we can't always guarantee that from hist_entry->hists we can use
hists_to_evsel, to then look at evsel->attr.sample_type for
PERF_SAMPLE_CALLCHAIN, like with the 'perf c2c' tool, that uses plain
'struct hists' instances, we need another way of deciding if a specific
hist_entry instance has callchains associated with it, i.e. if its
hist_entry->callchain[0] has space allocated for.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ptvndealxs1k7myluvu9flnq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat:
. Display user and system time for workload targets (Jiri Olsa)
perf record:
. Enable arbitrary event names thru name= modifier (Alexey Budankov)
PowerPC:
. Add a python script for hypervisor call statistics (Ravi Bangoria)
Intel PT: (Adrian Hunter)
. Fix sync_switch INTEL_PT_SS_NOT_TRACING
. Fix decoding to accept CBR between FUP and corresponding TIP
. Fix MTC timing after overflow
. Fix "Unexpected indirect branch" error
perf test:
. record+probe_libc_inet_pton:
. To get the symbol table for dynamic
shared objects on ubuntu we need to pass the -D/--dynamic command line
option, unlike with the fedora distros (Arnaldo Carvalho de Melo)
. code-reading:
. Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
. kmod-path:
. Add tests for vdso32 and vdsox32 (Adrian Hunter)
. Use header file util/debug.h (Thomas Richter)
perf annotate:
. Make the various UI backends (stdio, TUI, gtk) use more consistently
structs with annotation options as specified by the user (Arnaldo Carvalho de Melo)
. Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)
perf script:
. Add more PMU fields to python scripts event handler dict (Jin Yao)
Core:
. Fix misleading error for some unparsable events mentioning PMUs when
those are not involved in the problem (Jiri Olsa)
. Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
(Arnaldo Carvalho de Melo)
- Be more robust when trying to use per-symbol histograms, checking for
unlikely but possible cases where the space for the histograms wasn't
allocated, print a debug message for such cases (Arnaldo Carvalho de Melo)
- Fix symbol and object code resolution for vdso32 and vdsox32 (Adrian Hunter)
. No need to check for null when passing pointers to foo__get() style
refcount grabbing helpers, just like in the kernel and with free(),
its safe to pass a NULL pointer to avoid having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)
. Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
. Remove some needless globals, making them local (Arnaldo Carvalho de Melo)
. Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific events,
as we evolved this codebase to allow requesting callchains for just
a subset of the monitored events. In time it will help polish
recording and showing mixed sets accross the various tools:
perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'
(Arnaldo Carvalho de Melo)
. Consider PTI entry trampolines in map__rip_2objdump() (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlsYQTAACgkQ1lAW81NS
qkCjoA//WKjqox7hb4JZiSwvOulA8REs1UvTHzusm48spNRXUXezFWBRgQrvuh3E
4C5DUnhOC4Tonm/Je4xzu80xB4w4yjd6QTXicQ0sajEEZXSoBS5XtjhWdZCpXb/v
cIBY+7wEHkMZE3p89rUKih1tbBrKkjLD1QykD5QoNOB/HsektBbDTsmsI2kgBetg
jjwhkh1CJqBEvd9rDeMyWTDHI5EUruINVzqIoF4U8gUu/TaowWF5d+byEM5x0JLC
yRB3mIfyWxQug+VONjvs4Z/f/LhRuLbayz0l4lhqmfqta0b7aDl+VMdcwXcFQMeS
wotXPu9fJKQmXVtGMqoLRHWDBDDtpHPGZb2F+cHDNsskzMtPiGmjH+YW1tant0Q8
5c4ZqJvkxViED1Xmn6dWgqYwc3hjxVvZ1pEu3hR9/GWdrfwq/4NXaumGmf1BVILf
+IACKOjb6XdSxo9BXgdQb5a5uxcZqF41vrJJqfZE7jtEvSkVZrewX7ugbX5c+ANX
BWGrXOZx4aT4W4qSWjSYk36mQ60gXeB8G238xnlzV36UY9/svF9eLtXa9joq1Xaq
2DQ3abA0iw9Mv1eh4ZAkYZo69Iy9zgZKJ+yDJt96UCQqz88lFirYRdda7XTf6WFR
wWAfrhTgAuVTmC/aksQvIu3G23lQkgdVhT3SaXD+/QVs9B2c90Y=
=9fWW
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-4.18-20180606' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf stat:
- Display user and system time for workload targets (Jiri Olsa)
perf record:
- Enable arbitrary event names thru name= modifier (Alexey Budankov)
PowerPC:
- Add a python script for hypervisor call statistics (Ravi Bangoria)
Intel PT: (Adrian Hunter)
- Fix sync_switch INTEL_PT_SS_NOT_TRACING
- Fix decoding to accept CBR between FUP and corresponding TIP
- Fix MTC timing after overflow
- Fix "Unexpected indirect branch" error
perf test:
- record+probe_libc_inet_pton:
- To get the symbol table for dynamic
shared objects on ubuntu we need to pass the -D/--dynamic command line
option, unlike with the fedora distros (Arnaldo Carvalho de Melo)
- code-reading:
- Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
- kmod-path:
- Add tests for vdso32 and vdsox32 (Adrian Hunter)
- Use header file util/debug.h (Thomas Richter)
perf annotate:
- Make the various UI backends (stdio, TUI, gtk) use more consistently
structs with annotation options as specified by the user (Arnaldo Carvalho de Melo)
- Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)
perf script:
- Add more PMU fields to python scripts event handler dict (Jin Yao)
Core:
- Fix misleading error for some unparsable events mentioning PMUs when
those are not involved in the problem (Jiri Olsa)
- Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
(Arnaldo Carvalho de Melo)
- Be more robust when trying to use per-symbol histograms, checking for
unlikely but possible cases where the space for the histograms wasn't
allocated, print a debug message for such cases (Arnaldo Carvalho de Melo)
- Fix symbol and object code resolution for vdso32 and vdsox32 (Adrian Hunter)
- No need to check for null when passing pointers to foo__get() style
refcount grabbing helpers, just like in the kernel and with free(),
its safe to pass a NULL pointer to avoid having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)
- Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
- Remove some needless globals, making them local (Arnaldo Carvalho de Melo)
- Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific events,
as we evolved this codebase to allow requesting callchains for just
a subset of the monitored events. In time it will help polish
recording and showing mixed sets accross the various tools:
perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'
(Arnaldo Carvalho de Melo)
- Consider PTI entry trampolines in map__rip_2objdump() (Adrian Hunter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking updates from David Miller:
1) Add Maglev hashing scheduler to IPVS, from Inju Song.
2) Lots of new TC subsystem tests from Roman Mashak.
3) Add TCP zero copy receive and fix delayed acks and autotuning with
SO_RCVLOWAT, from Eric Dumazet.
4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard
Brouer.
5) Add ttl inherit support to vxlan, from Hangbin Liu.
6) Properly separate ipv6 routes into their logically independant
components. fib6_info for the routing table, and fib6_nh for sets of
nexthops, which thus can be shared. From David Ahern.
7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP
messages from XDP programs. From Nikita V. Shirokov.
8) Lots of long overdue cleanups to the r8169 driver, from Heiner
Kallweit.
9) Add BTF ("BPF Type Format"), from Martin KaFai Lau.
10) Add traffic condition monitoring to iwlwifi, from Luca Coelho.
11) Plumb extack down into fib_rules, from Roopa Prabhu.
12) Add Flower classifier offload support to igb, from Vinicius Costa
Gomes.
13) Add UDP GSO support, from Willem de Bruijn.
14) Add documentation for eBPF helpers, from Quentin Monnet.
15) Add TLS tx offload to mlx5, from Ilya Lesokhin.
16) Allow applications to be given the number of bytes available to read
on a socket via a control message returned from recvmsg(), from
Soheil Hassas Yeganeh.
17) Add x86_32 eBPF JIT compiler, from Wang YanQing.
18) Add AF_XDP sockets, with zerocopy support infrastructure as well.
From Björn Töpel.
19) Remove indirect load support from all of the BPF JITs and handle
these operations in the verifier by translating them into native BPF
instead. From Daniel Borkmann.
20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha.
21) Allow XDP programs to do lookups in the main kernel routing tables
for forwarding. From David Ahern.
22) Allow drivers to store hardware state into an ELF section of kernel
dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy.
23) Various RACK and loss detection improvements in TCP, from Yuchung
Cheng.
24) Add TCP SACK compression, from Eric Dumazet.
25) Add User Mode Helper support and basic bpfilter infrastructure, from
Alexei Starovoitov.
26) Support ports and protocol values in RTM_GETROUTE, from Roopa
Prabhu.
27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard
Brouer.
28) Add lots of forwarding selftests, from Petr Machata.
29) Add generic network device failover driver, from Sridhar Samudrala.
* ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits)
strparser: Add __strp_unpause and use it in ktls.
rxrpc: Fix terminal retransmission connection ID to include the channel
net: hns3: Optimize PF CMDQ interrupt switching process
net: hns3: Fix for VF mailbox receiving unknown message
net: hns3: Fix for VF mailbox cannot receiving PF response
bnx2x: use the right constant
Revert "net: sched: cls: Fix offloading when ingress dev is vxlan"
net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
enic: fix UDP rss bits
netdev-FAQ: clarify DaveM's position for stable backports
rtnetlink: validate attributes in do_setlink()
mlxsw: Add extack messages for port_{un, }split failures
netdevsim: Add extack error message for devlink reload
devlink: Add extack to reload and port_{un, }split operations
net: metrics: add proper netlink validation
ipmr: fix error path when ipmr_new_table fails
ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds
net: hns3: remove unused hclgevf_cfg_func_mta_filter
netfilter: provide udp*_lib_lookup for nf_tproxy
qed*: Utilize FW 8.37.2.0
...
triggers. For example:
# cd /sys/kernel/debug/tracing
# echo 'snapshot' > events/ftrace/print/trigger
# echo 'cause snapshot' > trace_marker
The rest of the changes are various clean ups and also one stable fix that
was added late in the cycle.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCWxftjBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qosyAQCgmhb+/HgRYIbsKa8Q70E2pMx5A/za
4rQ37Q9PhDUJlAD/aWXfrqcMm+buAU0yY4K+eb+f2feKXWXxdZ/fuh7+nA4=
=lpXM
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"One new feature was added to ftrace, which is the trace_marker now
supports triggers. For example:
# cd /sys/kernel/debug/tracing
# echo 'snapshot' > events/ftrace/print/trigger
# echo 'cause snapshot' > trace_marker
The rest of the changes are various clean ups and also one stable fix
that was added late in the cycle"
* tag 'trace-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (21 commits)
tracing: Use match_string() instead of open coding it in trace_set_options()
branch-check: fix long->int truncation when profiling branches
ring-buffer: Fix typo in comment
ring-buffer: Fix a bunch of typos in comments
tracing/selftest: Add test to test simple snapshot trigger for trace_marker
tracing/selftest: Add test to test hist trigger between kernel event and trace_marker
tracing/selftest: Add selftests to test trace_marker histogram triggers
ftrace/selftest: Fix reset_trigger() to handle triggers with filters
ftrace/selftest: Have the reset_trigger code be a bit more careful
tracing: Document trace_marker triggers
tracing: Allow histogram triggers to access ftrace internal events
tracing: Prevent further users of zero size static arrays in trace events
tracing: Have zero size length in filter logic be full string
tracing: Add trigger file for trace_markers tracefs/ftrace/print
tracing: Do not show filter file for ftrace internal events
tracing: Add brackets in ftrace event dynamic arrays
tracing: Have event_trace_init() called by trace_init_tracefs()
tracing: Add __find_event_file() to find event files without restrictions
tracing: Do not reference event data in post call triggers
tracepoints: Fix the descriptions of tracepoint_probe_register{_prio}
...
This Kselftest update for 4.18-rc1 consists of:
- Work to restructure timers test suite to move PIE out of rtctest from
Alexandre Belloni.
- Several minor spelling and bug fixes.
- New cgroup tests from Roman Gushchin and Mike Rapoport.
- Kselftest framework changes to handle and report skipped tests correctly.
Prior to these changes, framework treated all non-zero return codes from
tests as failures. When tests are skipped with non-zero return code, due
to unmet dependencies and/or unsupported configuration, reporting them as
failed lead to false negatives on the tests that couldn't be run.
- Fixes to test Makefiles to remove unnecessary RUN_TESTS and EMIT_TESTS
overrides and use common defines from lib.mk.
- Fixes to several tests to return correct Kselftest skip code.
- Changes to improve test output.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJbFaYgAAoJEAsCRMQNDUMccHkQAK3Bx9dw7MBR9z4LsT+/0Qr8
sgQp2WivuricX71T0JjqXImWsrloYS+4ChvaFCmvy1xIK29/4TtiXJR/pCcJrsQr
JA3AzurtLVjNGLxKOJqP+ALxy/oc7JYXcmooxL332fmQafNsfazJm4HKWHwNuSJM
2fQ2xdbqWo3LT+pNHX1i7H0JRrudGbPmCufUO0UZra/9oTrlPgL6oC7q7+KG6hLL
wLSQlF88MYXUm/lnNZyHc9aWGkOLmHcaHPXxMiCo/indI7UN+COUix2lOQ7dZ9a1
Paoo7f9GaS0pKfBGEhRc1aKzNOjDZgUPM+VWgKCkx9B2U8iUNwogt/jDh3feqzwJ
7yhha/g5vSLdhoWiFINmc0R+DEJ+OnG6v+G50M+XIYyriQ7zCoe209EOjgw4yhE4
epJL9CofiOt0/n7UGSNvvCzoEOtlBwhpmKzhnBMpYEzD8cykAJKOrUGjRs/URHyC
MOkGxz4ZDOqsJSfsDd7aJ122SfeatwYey5H6GQgCgoP1D//qXRv3T7CVdIcPcFKs
mhZEn0pekreMyz0dSpPHHfKFaNQBvUJ35taDq5r948mL6QN+TUev4mNjpnHTYgSy
U7QdLiRKft3a01GCZedL/+NdMnege2kvtjpQka9Sa08/HN3aqMmYQHiQM101egUD
/jUd/WF8O5/fNMkGOExk
=eVzz
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest update from Shuah Khan:
- Work to restructure timers test suite to move PIE out of rtctest from
Alexandre Belloni.
- Several minor spelling and bug fixes.
- New cgroup tests from Roman Gushchin and Mike Rapoport.
- Kselftest framework changes to handle and report skipped tests
correctly.
Prior to these changes, framework treated all non-zero return codes
from tests as failures. When tests are skipped with non-zero return
code, due to unmet dependencies and/or unsupported configuration,
reporting them as failed lead to false negatives on the tests that
couldn't be run.
- Fixes to test Makefiles to remove unnecessary RUN_TESTS and
EMIT_TESTS overrides and use common defines from lib.mk.
- Fixes to several tests to return correct Kselftest skip code.
- Changes to improve test output.
* tag 'linux-kselftest-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (55 commits)
selftests: lib: fix prime_numbers module search and skip logic
selftests: intel_pstate: notification about privilege required to run intel_pstate testing script
selftests: cgroup/memcontrol: add basic test for socket accounting
selftest: intel_pstate: debug support message from aperf.c and return value
kselftest/cgroup: fix variable dereferenced before check warning
selftests/intel_pstate: Enhance table printing
selftests/intel_pstate: Improve test, minor fixes
selftests: cgroup/memcontrol: add basic test for swap controls
selftests: cgroup: add memory controller self-tests
selftests: memfd: split regular and hugetlbfs tests
selftests: net: return Kselftest Skip code for skipped tests
selftests: mqueue: return Kselftest Skip code for skipped tests
selftests: memory-hotplug: return Kselftest Skip code for skipped tests
selftests: memfd: return Kselftest Skip code for skipped tests
selftests: membarrier: return Kselftest Skip code for skipped tests
selftests: media_tests: return Kselftest Skip code for skipped tests
selftests: locking: return Kselftest Skip code for skipped tests
selftests: locking: add Makefile for locking test
selftests: lib: return Kselftest Skip code for skipped tests
selftests: lib: add prime_numbers.sh test to Makefile
...
Add a brief introduction about fields to perf-script-python.txt.
It should help python script developers in easily finding what fields
are supported.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1527843663-32288-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When doing pmu sampling and then running a script with perf script -s
script.py, the process_event function gets dictionary with some fields
from the perf ring buffer (like ip, sym, callchain etc).
But we miss quite a few fields we report now, for example, LBRs, data
source, weight, transaction, iregs, uregs, etc.
This patch reports these fields for perf script python processing.
New keys/items:
---------------
key : brstack
items: from, to, from_dsoname, to_dsoname, mispred,
predicted, in_tx, abort, cycles.
key : brstacksym
items: from, to, pred, in_tx, abort (converted string)
key : datasrc
key : datasrc_decode (decoded string)
key : iregs
key : uregs
key : weight
key : transaction
v2:
---
Add new fields for dso.
Use PyBool_FromLong() for mispred/predicted/in_tx/abort
Committer notes:
!sym->name isn't valid, as its not a pointer, its a [0] array, use
!sym->name[0] instead, guaranteed to be the case by symbol__new.
This was caught by just one of the containers:
52 54.22 ubuntu:17.04 : FAIL gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406
CC /tmp/build/perf/util/scripting-engines/trace-event-python.o
util/scripting-engines/trace-event-python.c:534:20: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (!sym || !sym->name)
~~~~~~^~~~
1 error generated.
mv: cannot stat '/tmp/build/perf/util/scripting-engines/.trace-event-python.o.tmp': No such file or directory
/git/linux/tools/build/Makefile.build:96: recipe for target '/tmp/build/perf/util/scripting-engines/trace-event-python.o' failed
make[5]: *** [/tmp/build/perf/util/scripting-engines/trace-event-python.o] Error 1
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1527843663-32288-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch creates a new function get_dsoname() and move the code which
gets the dsoname string to this function.
That's because in next patch, when we process LBR data, we will also
need get_dsoname() to return dsoname for branch from/to.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1527843663-32288-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were not considering 'B' and 'b' (BSS, uninitialized data objects,
that gets set to zero at program start), do it so that we can resolve
more symbols in tools doing resolution of data operands, like 'perf c2c'.
When using vmlinux, i.e. an ELF symbol table, those were already
considered, as the decision was about STT_FUNC or STT_OBJECT, and the
later covers BSS symbols.
# grep -i ' b ' /proc/kallsyms | head -20 | tail -5
ffffffffa789d030 b execute_command
ffffffffa789d038 b initcall_command_line
ffffffffa789d040 b static_command_line
ffffffffa789d048 B ROOT_DEV
ffffffffa789d050 b once.73786
#
# readelf -s /lib/modules/`uname -r`/build/vmlinux | grep ROOT_DEV
79219: ffffffff8289d048 4 OBJECT GLOBAL DEFAULT 58 ROOT_DEV
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-z960xobig39ca1pmp5brl2fr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Making it a bit more robust, this took place here when a sample appeared
right after:
ffffffff8a925000 D __nosave_end
And before the next considered symbol, which, using kallsyms make us
over guess the size of __nosave_end, and then the sequence:
hist_entry__inc_addr_samples ->
symbol__inc_addr_samples ->
symbol__hists ->
annotated_source__alloc_histograms
Ends up not liking to allocate gigabytes of ram for annotation...
This will be alleviated by considering BSS symbols, which we should but
don't so far, and then we should investigate those samples further.
The testcase was to have:
perf top -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions
Running for a while till it segfaulted trying to access NULL notes->src->histograms.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ndfjtpiop3tdcnyjgp320ra8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some Atom CPUs can produce FUP packets that contain NLIP (next linear
instruction pointer) instead of CLIP (current linear instruction
pointer). That will result in "Unexpected indirect branch" errors. Fix
by comparing IP to NLIP in that case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1527762225-26024-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some platforms, overflows will clear before MTC wraparound, and there
is no following TSC/TMA packet. In that case the previous TMA is valid.
Since there will be a valid TMA either way, stop setting 'have_tma' to
false upon overflow.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1527762225-26024-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>