linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	65f2ed2b2f	perf report: Print the map table just after samples for which no map was found If -vv is used just the map table will be printed, -vvv will print the symbol table too, with it we can see that we have a bug where some samples are not being resolved to a map when we get them in the perf.data stream, but after we have it all processed, we can find the right map, some reordering probably is happening. Upcoming patches will provide ways to ask for most PERF_SAMPLE_ conditional samples to be taken for !PERF_RECORD_SAMPLE events too, then we'll be able to ask for PERF_SAMPLE_TIME and PERF_SAMPLE_CPU to help diagnose this. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1268161097-17761-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-03-10 13:53:52 +01:00
Arnaldo Carvalho de Melo	faa5c5c36e	perf tools: Don't use parent comm if not set at fork time As the parent comm then is worthless, confusing users about the thread where the sample really happened, leading to think that the sample happened in the parent, not where it really happened, in the children of a thread for which a PERF_RECORD_COMM event was not received. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1266627727-19715-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-21 17:48:24 +01:00
Arnaldo Carvalho de Melo	9de89fe7c5	perf symbols: Remove perf_session usage in symbols layer I noticed while writing the first test in 'perf regtest' that to just test the symbol handling routines one needs to create a perf session, that is a layer centered on a perf.data file, events, etc, so I untied these layers. This reduces the complexity for the users as the number of parameters to most of the symbols and session APIs now was reduced while not adding more state to all the map instances by only having data that is needed to split the kernel (kallsyms and ELF symtab sections) maps and do vmlinux relocation on the main kernel map. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1265223128-11786-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-04 09:33:24 +01:00
Arnaldo Carvalho de Melo	59ee68ecd1	perf symbols: Create thread__find_addr_map from thread__find_addr_location Because some tools will only want to know with maps had hits, not needing the full symbol resolution done by thread__find_addr_location. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1263519930-22803-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-01-16 10:58:48 +01:00
Arnaldo Carvalho de Melo	b7cece7678	perf tools: Encode kernel module mappings in perf.data We were always looking at the running machine /proc/modules, even when processing a perf.data file, which only makes sense when we're doing 'perf record' and 'perf report' on the same machine, and in close sucession, or if we don't use modules at all, right Peter? ;-) Now, at 'perf record' time we read /proc/modules, find the long path for modules, and put them as PERF_MMAP events, just like we did to encode the reloc reference symbol for vmlinux. Talking about that now it is encoded in .pgoff, so that we can use .{start,len} to store the address boundaries for the kernel so that when we reconstruct the kmaps tree we can do lookups right away, without having to fixup the end of the kernel maps like we did in the past (and now only in perf record). One more step in the 'perf archive' direction when we'll finally be able to collect data in one machine and analyse in another. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1263396139-4798-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-01-13 17:39:43 +01:00
Arnaldo Carvalho de Melo	4aa6563641	perf session: Move kmaps to perf_session There is still some more work to do to disentangle map creation from DSO loading, but this happens only for the kernel, and for the early adopters of perf diff, where this disentanglement matters most, we'll be testing different kernels, so no problem here. Further clarification: right now we create the kernel maps for the various modules and discontiguous kernel text maps when loading the DSO, we should do it as a two step process, first creating the maps, for multiple mappings with the same DSO store, then doing the dso load just once, for the first hit on one of the maps sharing this DSO backing store. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260741029-4430-6-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-14 16:57:17 +01:00
Arnaldo Carvalho de Melo	b3165f4144	perf session: Move the global threads list to perf_session So that we can process two perf.data files. We still need to add a O_MMAP mode for perf_session so that we can do all the mmap stuff in it. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260741029-4430-5-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-14 16:57:16 +01:00
Arnaldo Carvalho de Melo	13df45ca1c	perf session: Register the idle thread in perf_session__process_events No need for all tools to register it and then immediately call perf_session__process_events. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260741029-4430-3-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-14 16:57:15 +01:00
Arnaldo Carvalho de Melo	79406cd789	perf symbols: Allow lookups by symbol name too Configurable via symbol_conf.sort_by_name, so that the cost of an extra rb_node on all 'struct symbol' instances is not paid by tools that only want to decode addresses. How to use it: symbol_conf.sort_by_name = true; symbol_init(&symbol_conf); struct map map = map_groups__find_by_name(kmaps, MAP__VARIABLE, "[kernel.kallsyms]"); if (map == NULL) { pr_err("couldn't find map!\n"); kernel_maps__fprintf(stdout); } else { struct symbol sym = map__find_symbol_by_name(map, sym_filter, NULL); if (sym == NULL) pr_err("couldn't find symbol %s!\n", sym_filter); else pr_info("symbol %s: %#Lx-%#Lx \n", sym_filter, sym->start, sym->end); } Looking over the vmlinux/kallsyms is common enough that I'll add a variable to the upcoming struct perf_session to avoid the need to use map_groups__find_by_name to get the main vmlinux/kallsyms map. The above example looks on the 'variable' symtab, but it is just like that for the functions one. Also the sort operation is done when we first use map__find_symbol_by_name, in a lazy way. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260564622-12392-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-12 07:42:11 +01:00
Arnaldo Carvalho de Melo	9958e1f0ae	perf symbols: Rename kthreads to kmaps, using another abstraction for it Using a struct thread instance just to hold the kernel space maps (vmlinux + modules) is overkill and confuses people trying to understand the perf symbols abstractions. The kernel maps are really present in all threads, i.e. the kernel is a library, not a separate thread. So introduce the 'map_groups' abstraction and use it for the kernel maps, now in the kmaps global variable. It, in turn, will move, together with the threads list to the perf_file abstraction, so that we can support multiple perf_file instances, needed by perf diff. Brainstormed-with: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1260550239-5372-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-12 07:42:09 +01:00
Arnaldo Carvalho de Melo	1ed091c45a	perf tools: Consolidate symbol resolving across all tools Now we have a very high level routine for simple tools to process IP sample events: int event__preprocess_sample(const event_t self, struct addr_location al, symbol_filter_t filter) It receives the event itself and will insert new threads in the global threads list and resolve the map and symbol, filling all this info into the new addr_location struct, so that tools like annotate and report can further process the event by creating hist_entries in their specific way (with or without callgraphs, etc). It in turn uses the new next layer function: void thread__find_addr_location(struct thread self, u8 cpumode, enum map_type type, u64 addr, struct addr_location al, symbol_filter_t filter) This one will, given a thread (userspace or the kernel kthread one), will find the given type (MAP__FUNCTION now, MAP__VARIABLE too in the near future) at the given cpumode, taking vdsos into account (userspace hit, but kernel symbol) and will fill all these details in the addr_location given. Tools that need a more compact API for plain function resolution, like 'kmem', can use this other one: struct symbol thread__find_function(struct thread self, u64 addr, symbol_filter_t filter) So, to resolve a kernel symbol, that is all the 'kmem' tool needs, its just a matter of calling: sym = thread__find_function(kthread, addr, NULL); The 'filter' parameter is needed because we do lazy parsing/loading of ELF symtabs or /proc/kallsyms. With this we remove more code duplication all around, which is always good, huh? :-) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: John Kacur <jkacur@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259346563-12568-12-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-27 20:22:02 +01:00
Arnaldo Carvalho de Melo	1de8e24520	perf symbols: When not using modules, discard its symbols Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259346563-12568-10-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-27 20:22:01 +01:00
Arnaldo Carvalho de Melo	95011c6007	perf symbols: Support multiple symtabs in struct thread Making the routines that were so far specific to the kernel maps useful for all threads. This is done by making the kernel maps be contained in a kernel "thread". This gets the kernel specific routines closer to the userspace counterparts, which will help in reducing the boilerplate for resolving a symbol, as will be demonstrated in the next patches. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259346563-12568-9-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-27 20:22:00 +01:00
Arnaldo Carvalho de Melo	23ea4a3fad	perf symbols: Kernel_maps should be an array of MAP__NR_TYPES entries So that we can support multiple symbol table types. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259346563-12568-8-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-27 20:22:00 +01:00
Arnaldo Carvalho de Melo	fcf1203a91	perf symbols: Rename find_symbol routines to find_function Paving the way for supporting variable in adition to function symbols. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1259074912-5924-1-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-24 16:37:03 +01:00
Arnaldo Carvalho de Melo	c338aee853	perf symbols: Do lazy symtab loading for the kernel & modules too Just like we do with the other DSOs. This also simplifies the kernel_maps setup process, now all that the tools need to do is to call kernel_maps__init and the maps for the modules and kernel will be created, then, later, when kernel_maps__find_symbol() is used, it will also call maps__find_symbol that already checks if the symtab was loaded, loading it if needed. Now if one does 'perf top --hide_kernel_symbols' we won't pay the price of loading the (many) symbols in /proc/kallsyms or vmlinux. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1258757489-5978-4-git-send-email-acme@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-21 14:11:33 +01:00
Frederic Weisbecker	a4fb581b15	perf tools: Bind callchains to the first sort dimension column Currently, the callchains are displayed using a constant left margin. So depending on the current sort dimension configuration, callchains may appear to be well attached to the first sort dimension column field which is mostly the case, except when the first dimension of sorting is done by comm, because these are right aligned. This patch binds the callchain to the first letter in the first column, whatever type of column it is (dso, comm, symbol). Before: 0.80% perf [k] __lock_acquire __lock_acquire lock_acquire \| \|--58.33%-- _spin_lock \| \| \| \|--28.57%-- inotify_should_send_event \| \| fsnotify \| \| __fsnotify_parent After: 0.80% perf [k] __lock_acquire __lock_acquire lock_acquire \| \|--58.33%-- _spin_lock \| \| \| \|--28.57%-- inotify_should_send_event \| \| fsnotify \| \| __fsnotify_parent Also, for clarity, we don't put anymore the callchain as is but: - If we have a top level ancestor in the callchain, start it with a first ascii hook. Before: 0.80% perf [kernel] [k] __lock_acquire __lock_acquire lock_acquire \| \|--58.33%-- _spin_lock \| \| \| \|--28.57%-- inotify_should_send_event \| \| fsnotify [..] [..] After: 0.80% perf [kernel] [k] __lock_acquire \| --- __lock_acquire lock_acquire \| \|--58.33%-- _spin_lock \| \| \| \|--28.57%-- inotify_should_send_event \| \| fsnotify [..] [..] - Otherwise, if we have several top level ancestors, then display these like we did before: 1.69% Xorg \| \|--21.21%-- vread_hpet \| 0x7fffd85b46fc \| 0x7fffd85b494d \| 0x7f4fafb4e54d \| \|--15.15%-- exaOffscreenAlloc \| \|--9.09%-- I830WaitLpRing Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> LKML-Reference: <1256246604-17156-2-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-23 07:55:18 +02:00
Arnaldo Carvalho de Melo	d5b889f2ec	perf tools: Move threads & last_match to threads.c This was just being copy'n'pasted all over. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091013141629.GD21809@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-13 17:12:18 +02:00
Frederic Weisbecker	97ea1a7fa6	perf tools: Fix thread comm resolution in perf sched This reverts commit `9a92b479b2` ("perf tools: Improve thread comm resolution in perf sched") and fixes the real bug. The bug was elsewhere: We are failing to resolve thread names in perf sched because the table of threads we are building, on top of comm events, has a per process granularity. But perf sched, unlike the other perf tools, needs a per thread granularity as we are profiling every tasks individually. So fix it by building our threads table using the tid instead of the pid as the thread identifier. v2: Revert the previous fix - it is not really needed Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-08 21:10:21 +02:00
Frederic Weisbecker	9a92b479b2	perf tools: Improve thread comm resolution in perf sched When we get sched traces that involve a task that was already created before opening the event, we won't have the comm event for it. So if we can't find the comm event for a given thread, we look at the traces that may contain these informations. Before: ata/1:371 \| 0.000 ms \| 1 \| avg: 3988.693 ms \| max: 3988.693 ms \| kondemand/1:421 \| 0.096 ms \| 3 \| avg: 345.346 ms \| max: 1035.989 ms \| kondemand/0:420 \| 0.025 ms \| 3 \| avg: 421.332 ms \| max: 964.014 ms \| :5124:5124 \| 0.103 ms \| 5 \| avg: 74.082 ms \| max: 277.194 ms \| :6244:6244 \| 0.691 ms \| 9 \| avg: 125.655 ms \| max: 271.306 ms \| firefox:5080 \| 0.924 ms \| 5 \| avg: 53.833 ms \| max: 257.828 ms \| npviewer.bin:6225 \| 21.871 ms \| 53 \| avg: 22.462 ms \| max: 220.835 ms \| :6245:6245 \| 9.631 ms \| 21 \| avg: 41.864 ms \| max: 213.349 ms \| After: ata/1:371 \| 0.000 ms \| 1 \| avg: 3988.693 ms \| max: 3988.693 ms \| kondemand/1:421 \| 0.096 ms \| 3 \| avg: 345.346 ms \| max: 1035.989 ms \| kondemand/0:420 \| 0.025 ms \| 3 \| avg: 421.332 ms \| max: 964.014 ms \| firefox:5124 \| 0.103 ms \| 5 \| avg: 74.082 ms \| max: 277.194 ms \| npviewer.bin:6244 \| 0.691 ms \| 9 \| avg: 125.655 ms \| max: 271.306 ms \| firefox:5080 \| 0.924 ms \| 5 \| avg: 53.833 ms \| max: 257.828 ms \| npviewer.bin:6225 \| 21.871 ms \| 53 \| avg: 22.462 ms \| max: 220.835 ms \| npviewer.bin:6245 \| 9.631 ms \| 21 \| avg: 41.864 ms \| max: 213.349 ms \| Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-08 16:56:33 +02:00
Arnaldo Carvalho de Melo	439d473b47	perf tools: Rewrite and improve support for kernel modules Representing modules as struct map entries, backed by a DSO, etc, using /proc/modules to find where the module is loaded. DSOs now can have a short and long name, so that in verbose mode we can show exactly which .ko or vmlinux image was used. As kernel modules now are a DSO separate from the kernel, we can ask for just the hits for a particular set of kernel modules, just like we can do with shared libraries: [root@doppio linux-2.6-tip]# perf report -n --vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] \| head -15 84.58% 13266 Xorg [k] drm_clflush_pages 4.02% 630 Xorg [k] trace_kmalloc.clone.0 3.95% 619 Xorg [k] drm_ioctl 2.07% 324 Xorg [k] drm_addbufs 1.68% 263 Xorg [k] drm_gem_close_ioctl 0.77% 120 Xorg [k] drm_setmaster_ioctl 0.70% 110 Xorg [k] drm_lastclose 0.68% 106 Xorg [k] drm_open 0.54% 85 Xorg [k] drm_mm_search_free [root@doppio linux-2.6-tip]# Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko would have the same effect. Allowing specifying just 'drm.ko' is left for another patch. Processing kallsyms so that per kernel module struct map are instantiated was also left for another patch. That will allow removing the module name from each of its symbols. struct symbol was reduced by removing the ->module backpointer and moving it (well now the map) to struct symbol_entry in perf top, that is its only user right now. The total linecount went down by ~500 lines. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-02 10:48:42 +02:00
Arnaldo Carvalho de Melo	1b46cddfcc	perf tools: Use rb_tree for maps Threads can have many and kernel modules will be represented as a tree of maps as well. Ah, and for a perf.data with 146607 samples: Before: [root@doppio ~]# perf stat -r 5 perf report > /dev/null Performance counter stats for 'perf report' (5 runs): 699.823680 task-clock-msecs # 0.991 CPUs ( +- 0.454% ) 74 context-switches # 0.000 M/sec ( +- 1.709% ) 2 CPU-migrations # 0.000 M/sec ( +- 17.008% ) 23114 page-faults # 0.033 M/sec ( +- 0.000% ) 1381257019 cycles # 1973.721 M/sec ( +- 0.290% ) 1456894438 instructions # 1.055 IPC ( +- 0.007% ) 18779818 cache-references # 26.835 M/sec ( +- 0.380% ) 641799 cache-misses # 0.917 M/sec ( +- 1.200% ) 0.705972729 seconds time elapsed ( +- 0.501% ) [root@doppio ~]# After Performance counter stats for 'perf report' (5 runs): 691.261451 task-clock-msecs # 0.993 CPUs ( +- 0.307% ) 72 context-switches # 0.000 M/sec ( +- 0.829% ) 6 CPU-migrations # 0.000 M/sec ( +- 18.409% ) 23127 page-faults # 0.033 M/sec ( +- 0.000% ) 1366395876 cycles # 1976.670 M/sec ( +- 0.153% ) 1443136016 instructions # 1.056 IPC ( +- 0.012% ) 17956402 cache-references # 25.976 M/sec ( +- 0.325% ) 661924 cache-misses # 0.958 M/sec ( +- 1.335% ) 0.696127275 seconds time elapsed ( +- 0.377% ) I.e. we see some speedup too. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <20090928174846.GA3361@ghostprotocols.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-30 13:57:56 +02:00
John Kacur	8b40f521cf	perf tools: Protect header files with a consistent style There was a colorful mix of header guards - standardize them. Signed-off-by: John Kacur <jkacur@redhat.com> LKML-Reference: <alpine.LFD.2.00.0909241756530.11383@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 21:27:51 +02:00
Ingo Molnar	0ec04e16d0	perf sched: Add 'perf sched map' scheduling event map printout This prints a textual context-switching outline of workload captured via perf sched record. For example, on a 16 CPU box it outputs: N1 O1 . . . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs N1 O1 . Q0 . S1 . . . B0 . I0 C1 . M1 . 23002.773423 secs N1 O1 . Q0 . S1 . . . B0 . R1 C1 . M1 . 23002.773485 secs N1 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773478 secs L0 O1 . Q0 . S1 . S0 . B0 . R1 C1 . M1 . 23002.773523 secs L0 O1 . . . S1 . S0 . B0 . R1 C1 . M1 . 23002.773531 secs L0 O1 . . . S1 . S0 . B0 . R1 C1 T1 M1 . 23002.773547 secs T1 => irqbalance:2089 L0 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773549 secs N1 O1 . . . S1 . S0 . P0 . R1 C1 T1 M1 . 23002.773566 secs N1 O1 . . . J0 . S0 . P0 . R1 C1 T1 M1 . 23002.773571 secs N1 O1 . . . J0 . S0 B0 P0 . R1 C1 T1 M1 . 23002.773592 secs N1 O1 . . . J0 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773582 secs N1 O1 . . . S1 . U0 B0 P0 . R1 C1 T1 M1 . 23002.773604 secs N1 O1 . . . S1 . U0 B0 . . R1 C1 T1 M1 . 23002.773615 secs N1 O1 . . . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773631 secs N1 O1 . M0 . S1 . U0 B0 . . K0 C1 T1 M1 . 23002.773624 secs N1 O1 . M0 . S1 . U0 . . . K0 C1 T1 M1 . 23002.773644 secs N1 O1 . M0 . S1 . U0 . . . R1 C1 T1 M1 . 23002.773662 secs N1 O1 . M0 . S1 . . . . . R1 C1 T1 M1 . 23002.773648 secs N1 O1 . . . S1 . . . . . R1 C1 T1 M1 . 23002.773680 secs N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773717 secs N0 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773709 secs N1 O1 . . . L0 . . . . . R1 C1 T1 M1 . 23002.773747 secs Columns stand for individual CPUs, from CPU0 to CPU15, and the two-letter shortcuts stand for tasks that are running on a CPU. '' denotes the CPU that had the event. A dot signals an idle CPU. New tasks are assigned new two-letter shortcuts - when they occur first they are printed. In the above example 'T1' stood for irqbalance: T1 => irqbalance:2089 Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 16:41:30 +02:00
Ingo Molnar	b5fae128e4	perf sched: Clean up PID sorting logic Use a sort list for thread atoms insertion as well - instead of hardcoded for PID. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-13 10:22:50 +02:00
Frederic Weisbecker	5b447a6a13	perf tools: Librarize idle thread registration Librarize register_idle_thread() used by annotate and report. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1251693921-6579-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-31 10:04:48 +02:00
Frederic Weisbecker	6baa0a5ae0	perf tools: Factorize the thread code in a dedicated file Factorize the thread management code used by perf-annotate and perf-report in dedicated source and header files. v2: pass last_match by address so that it can actually be modified. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1250245313-6995-1-git-send-email-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-15 16:10:19 +02:00

27 Commits