Remove hardware sampler support from oprofile module.
The oprofile user space utilty has been switched to use the kernel
perf interface, for which we also provide hardware sampling support.
In addition the hardware sampling support is also slightly broken: it
supports only 16 bits for the pid and therefore would generate wrong
results on machines which have a pid >64k.
Also the pt_regs structure which was passed to oprofile common code
cannot necessarily be used to generate sane backtraces, since the
task(s) in question may run while the samples are fed to oprofile.
So the result would be more or less random.
However given that the only user space tools switched to the perf
interface already four years ago the hardware sampler code seems to be
unused code, and therefore it should be reasonable to remove it.
The timer based oprofile support continues to work.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Andreas Arnez <arnez@linux.vnet.ibm.com>
Acked-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Acked-by: Robert Richter <rric@kernel.org>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We have four different stack tracers of which three had bugs. So it's
time to merge them to a single stack tracer which allows to specify a
call back function which will be called for each step.
This patch changes behavior a bit:
- the "nosched" and "in_sched_functions" check within
save_stack_trace_tsk did work only for the last stack frame within a
context. Now it considers the check for each stack frame like it
should.
- both the oprofile variant and the perf_events variant did save a
return address twice if a zero back chain was detected, which
indicates an interrupt frame. The new dump_trace function will call
the oprofile and perf_events backends with the psw address that is
contained within the corresponding pt_regs structure instead.
- the original show_trace and save_context_stack functions did already
use the psw address of the pt_regs structure if a zero back chain
was detected. However now we ignore the psw address if it is a user
space address. After all we trace the kernel stack and not the user
space stack. This way we also get rid of the garbage user space
address in case of warnings and / or panic call traces.
So this should make life easier since now there is only one stack
tracer left which we can break.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix these errors when compiling with CONFIG_OPROFILE=y and
CONFIG_PERF_EVENTS=n:
arch/s390/oprofile/init.c: In function ‘oprofile_hwsampler_start’:
arch/s390/oprofile/init.c:93:2: error: implicit declaration of function 'perf_reserve_sampling' [-Werror=implicit-function-declaration]
retval = perf_reserve_sampling();
^
arch/s390/oprofile/init.c:99:3: error: implicit declaration of function 'perf_release_sampling' [-Werror=implicit-function-declaration]
perf_release_sampling();
^
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the 31 bit support in order to reduce maintenance cost and
effectively remove dead code. Since a couple of years there is no
distribution left that comes with a 31 bit kernel.
The 31 bit kernel also has been broken since more than a year before
anybody noticed. In addition I added a removal warning to the kernel
shown at ipl for 5 minutes: a960062e58 ("s390: add 31 bit warning
message") which let everybody know about the plan to remove 31 bit
code. We didn't get any response.
Given that the last 31 bit only machine was introduced in 1999 let's
remove the code.
Anybody with 31 bit user space code can still use the compat mode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce reserve/release functions to share the sampling facility
between perf and oprofile.
Also improve error handling for the sampling facility support in perf.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
it's always equal to ->d_sb of the second argument (parent dentry),
due to either being literally that, or ->d_sb of parent's parent.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Just add the new model number where appropiate.
Cc: stable@vger.kernel.org # v3.10
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds support for the latest release of the
IBM mainframe series - the IBM zEnterprise EC12 (zEC12).
The CPU measurement facility didn't change. So only the new CPU type
has to be tolerated.
Signed-off-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Signed-off-by: Robert Richter <rric@kernel.org>
If oprofilefs_ulong_from_user() is called with count equals zero, *val
remains unchanged. Depending on the implementation it might be
uninitialized. Fixing users of oprofilefs_ulong_ from_user().
We missed these s390 changes with:
913050b oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
Cc: stable@vger.kernel.org # 3.3+
Signed-off-by: Robert Richter <robert.richter@amd.com>
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (106 commits)
perf kvm: Fix copy & paste error in description
perf script: Kill script_spec__delete
perf top: Fix a memory leak
perf stat: Introduce get_ratio_color() helper
perf session: Remove impossible condition check
perf tools: Fix feature-bits rework fallout, remove unused variable
perf script: Add generic perl handler to process events
perf tools: Use for_each_set_bit() to iterate over feature flags
perf tools: Unify handling of features when writing feature section
perf report: Accept fifos as input file
perf tools: Moving code in some files
perf tools: Fix out-of-bound access to struct perf_session
perf tools: Continue processing header on unknown features
perf tools: Improve macros for struct feature_ops
perf: builtin-record: Document and check that mmap_pages must be a power of two.
perf: builtin-record: Provide advice if mmap'ing fails with EPERM.
perf tools: Fix truncated annotation
perf script: look up thread using tid instead of pid
perf tools: Look up thread names for system wide profiling
perf tools: Fix comm for processes with named threads
...
If oprofilefs_ulong_from_user() is called with count equals
zero, *val remains unchanged. Depending on the implementation it
might be uninitialized.
Change oprofilefs_ulong_from_user()'s interface to return count
on success. Thus, we are able to return early if count equals
zero which avoids using *val uninitialized. Fixing all users of
oprofilefs_ulong_ from_user().
This follows write syscall implementation when count is zero:
"If count is zero ... [and if] no errors are detected, 0 will be
returned without causing any other effect." (man 2 write)
Reported-By: Mike Waychison <mikew@google.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Link: http://lkml.kernel.org/r/20111219153830.GH16765@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
With this patch the OProfile Basic Mode Sampling support for System z
is enhanced with a counter file system. That way hardware sampling
can be configured using the user space tools with only little
modifications.
With the patch by default new cpu_types (s390/z10, s390/z196) are
returned in order to indicate that we are running a CPU which provides
the hardware sampling facility. Existing user space tools will
complain about an unknown cpu type. In order to be compatible with
existing user space tools the `cpu_type' module parameter has been
added. Setting the parameter to `timer' will force the module to
return `timer' as cpu_type. The module will still try to use hardware
sampling if available and the hwsampling virtual filesystem will be
also be available for configuration. So this has a different effect
than using the generic oprofile module parameter `timer=1'.
If the basic mode sampling is enabled on the machine and the
cpu_type=timer parameter is not used the kernel module will provide
the following virtual filesystem:
/dev/oprofile/0/enabled
/dev/oprofile/0/event
/dev/oprofile/0/count
/dev/oprofile/0/unit_mask
/dev/oprofile/0/kernel
/dev/oprofile/0/user
In the counter file system only the values of 'enabled', 'count',
'kernel', and 'user' are evaluated by the kernel module. Everything
else must contain fixed values.
The 'event' value only supports a single event - HWSAMPLING with value
0.
The 'count' value specifies the hardware sampling rate as it is passed
to the CPU measurement facility.
The 'kernel' and 'user' flags can now be used to filter for samples
when using hardware sampling.
Additionally also the following file will be created:
/dev/oprofile/timer/enabled
This will always be the inverted value of /dev/oprofile/0/enabled. 0
is not accepted without hardware sampling.
Signed-off-by: Andreas Krebbel <krebbel@linux.vnet.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
The sampling interval for the hardware sampler is specified in cycles.
(see SA23-2260-01 The Load-Program-Parameter and the CPU-Measurement
Facilities)
The current default value will therefore result in millions of samples.
This patch changes the default sampling interval to 4M, which will
result in ~1500 samples per second on a z196 reducing the overhead
of sampling.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On specific configurations with hwsampler opcontrol --start returns an
error on "echo 1 >/dev/oprofile/enable". Turns out that the hw sampling
interval is not checked against the hardware limits.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Many stupid corrections of duplicated includes based on the output of
scripts/checkincludes.pl.
Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
oprofile_min_interval and oprofile_max_interval are unsigned, checking
for negative values doesn't work. Change hwsampler_query_min_interval
and hwsampler_query_max_interval to return an unsigned long and
check for a zero value instead.
Reported-by: Nicolas Kaiser <nikai@nikai.net>
Acked-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Doesn't work and build for CONFIG_32BIT. So disable it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove unused HAVE_HWSAMPLER config option. It is not used anymore,
removing it.
Also make some functions static and some coding style fixes.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Merge the contents of hwsampler_files.c into
arch/s390/oprofile/init.c.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch is a rework of the hwsampler oprofile implementation that
has been applied recently. Now there are less non-architectural
changes. The only changes are:
* introduction of oprofile_add_ext_hw_sample(), and
* removal of section attributes of oprofile_timer_init/_exit().
To setup hwsampler for oprofile we need to modify start()/stop()
callbacks and additional hwsampler control files in oprofilefs. We do
not reinitialize the timer or hwsampler mode by restarting calling
init/exit() anymore, instead hwsampler_running is used to switch the
mode directly in oprofile_hwsampler_start/_stop(). For locking reasons
there is also hwsampler_file that reflects the value in oprofilefs.
The overall diffstat of the oprofile s390 hwsampler implemenation
shows the low impact to non-architectural code:
arch/Kconfig | 3 +
arch/s390/Kconfig | 1 +
arch/s390/oprofile/Makefile | 2 +-
arch/s390/oprofile/hwsampler.c | 1256 ++++++++++++++++++++++++++++++++++
arch/s390/oprofile/hwsampler.h | 113 +++
arch/s390/oprofile/hwsampler_files.c | 162 +++++
arch/s390/oprofile/init.c | 6 +-
drivers/oprofile/cpu_buffer.c | 24 +-
drivers/oprofile/timer_int.c | 4 +-
include/linux/oprofile.h | 7 +
10 files changed, 1567 insertions(+), 11 deletions(-)
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
OProfile is enhanced to export all files for controlling System z's
hardware sampling, and to invoke hwsampler exported functions to
initialize and use System z's hardware sampling.
The patch invokes hwsampler_setup() during oprofile init and exports
following hwsampler files under oprofilefs if hwsampler's setup
succeeded:
A new directory for hardware sampling based files
/dev/oprofile/hwsampling/
The userland daemon must explicitly write to the following files
to disable (or enable) hardware based sampling
/dev/oprofile/hwsampling/hwsampler
to modify the actual sampling rate
/dev/oprofile/hwsampling/hw_interval
to modify the amount of sampling memory (measured in 4K pages)
/dev/oprofile/hwsampling/hw_sdbt_blocks
The following files are read only and show
the possible minimum sampling rate
/dev/oprofile/hwsampling/hw_min_interval
the possible maximum sampling rate
/dev/oprofile/hwsampling/hw_max_interval
The patch splits the oprofile_timer_[init/exit] function so that it
can be also called through user context (oprofilefs) to avoid kernel
oops.
Applied with following changes:
* whitespace changes in Makefile and timer_int.c
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Maran Pakkirisamy <maranp@linux.vnet.ibm.com>
Signed-off-by: Heinz Graalfs <graalfs@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!