OpenCloudOS-Kernel/kernel
Paul Mackerras 3b6f9e5cb2 perf_counter: Add support for pinned and exclusive counter groups
Impact: New perf_counter features

A pinned counter group is one that the user wants to have on the CPU
whenever possible, i.e. whenever the associated task is running, for
a per-task group, or always for a per-cpu group.  If the system
cannot satisfy that, it puts the group into an error state where
it is not scheduled any more and reads from it return EOF (i.e. 0
bytes read).  The group can be released from error state and made
readable again using prctl(PR_TASK_PERF_COUNTERS_ENABLE).  When we
have finer-grained enable/disable controls on counters we'll be able
to reset the error state on individual groups.

An exclusive group is one that the user wants to be the only group
using the CPU performance monitor hardware whenever it is on.  The
counter group scheduler will not schedule an exclusive group if there
are already other groups on the CPU and will not schedule other groups
onto the CPU if there is an exclusive group scheduled (that statement
does not apply to groups containing only software counters, which can
always go on and which do not prevent an exclusive group from going on).
With an exclusive group, we will be able to let users program PMU
registers at a low level without the concern that those settings will
perturb other measurements.

Along the way this reorganizes things a little:
- is_software_counter() is moved to perf_counter.h.
- cpuctx->active_oncpu now records the number of hardware counters on
  the CPU, i.e. it now excludes software counters.  Nothing was reading
  cpuctx->active_oncpu before, so this change is harmless.
- A new cpuctx->exclusive field records whether we currently have an
  exclusive group on the CPU.
- counter_sched_out moves higher up in perf_counter.c and gets called
  from __perf_counter_remove_from_context and __perf_counter_exit_task,
  where we used to have essentially the same code.
- __perf_counter_sched_in now goes through the counter list twice, doing
  the pinned counters in the first loop and the non-pinned counters in
  the second loop, in order to give the pinned counters the best chance
  to be scheduled in.

Note that only a group leader can be exclusive or pinned, and that
attribute applies to the whole group.  This avoids some awkwardness in
some corner cases (e.g. where a group leader is closed and the other
group members get added to the context list).  If we want to relax that
restriction later, we can, and it is easier to relax a restriction than
to apply a new one.

This doesn't yet handle the case where a pinned counter is inherited
and goes into error state in the child - the error state is not
propagated up to the parent when the child exits, and arguably it
should.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-14 21:00:30 +11:00
..
irq async: Asynchronous function calls to speed up kernel boot 2009-01-07 08:45:46 -08:00
power Merge branch 'linus' into release 2009-01-09 03:39:43 -05:00
time Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-01-07 11:31:52 -08:00
trace Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile 2009-01-09 12:43:06 -08:00
.gitignore
Kconfig.freezer container freezer: implement freezer cgroup subsystem 2008-10-20 08:52:34 -07:00
Kconfig.hz sched: fix SCHED_HRTICK dependency 2008-07-28 14:37:38 +02:00
Kconfig.preempt rcu: provide RCU options on non-preempt architectures too 2008-12-25 09:31:28 +01:00
Makefile Merge commit 'v2.6.29-rc1' into perfcounters/core 2009-01-11 02:42:53 +01:00
acct.c CRED: Wrap task credential accesses in the core kernel 2008-11-14 10:39:12 +11:00
async.c async: make async a command line option for now 2009-01-09 13:23:45 -08:00
audit.c [PATCH] fix broken timestamps in AVC generated by kernel threads 2008-12-09 02:27:41 -05:00
audit.h fixing audit rule ordering mess, part 1 2009-01-04 15:14:41 -05:00
audit_tree.c audit: validate comparison operations, store them in sane form 2009-01-04 15:14:42 -05:00
auditfilter.c audit: validate comparison operations, store them in sane form 2009-01-04 15:14:42 -05:00
auditsc.c make sure that filterkey of task,always rules is reported 2009-01-04 15:14:42 -05:00
backtracetest.c backtrace: replace timer with tasklet + completions 2008-06-27 18:09:16 +02:00
bounds.c Add kbuild.h that contains common definitions for kbuild users 2008-04-29 08:06:29 -07:00
capability.c Merge branch 'next' into for-linus 2009-01-07 09:58:22 +11:00
cgroup.c cgroups: add css_tryget() 2009-01-08 08:31:10 -08:00
cgroup_debug.c cgroups: fix probable race with put_css_set[_taskexit] and find_css_set 2008-10-20 08:52:38 -07:00
cgroup_freezer.c freezer_cg: disable writing freezer.state of root cgroup 2008-11-12 17:17:16 -08:00
compat.c Allow times and time system calls to return small negative values 2009-01-06 15:59:13 -08:00
configs.c kernel/configs.c: remove useless comments 2008-10-20 08:52:34 -07:00
cpu.c stop_machine/cpu hotplug: fix disable_nonboot_cpus 2009-01-07 11:36:14 -08:00
cpuset.c cpuset: remove remaining pointers to cpumask_t 2009-01-08 08:31:11 -08:00
cred-internals.h CRED: Inaugurate COW credentials 2008-11-14 10:39:23 +11:00
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 2009-01-09 13:59:25 -08:00
delayacct.c schedstat: consolidate per-task cpu runtime stats 2008-12-18 13:54:01 +01:00
dma-coherent.c dma-coherent: catch oversized requests to dma_alloc_from_coherent() 2009-01-06 15:59:31 -08:00
dma.c kernel/dma.c: remove a CVS keyword 2008-10-16 11:21:30 -07:00
exec_domain.c proc: move /proc/execdomains to kernel/exec_domain.c 2008-10-23 14:30:41 +04:00
exit.c Merge commit 'v2.6.29-rc1' into perfcounters/core 2009-01-11 02:42:53 +01:00
extable.c Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-12-30 16:10:19 -08:00
fork.c Merge commit 'v2.6.29-rc1' into perfcounters/core 2009-01-11 02:42:53 +01:00
freezer.c freezer_cg: use thaw_process() in unfreeze_cgroup() 2008-10-30 11:38:45 -07:00
futex.c Merge branches 'core/futexes', 'core/locking', 'core/rcu' and 'linus' into core/urgent 2009-01-06 09:32:11 +01:00
futex_compat.c CRED: Use RCU to access another task's creds and to release a task's own creds 2008-11-14 10:39:19 +11:00
hrtimer.c hrtimer: splitout peek ahead functionality, fix 2009-01-05 14:11:10 +01:00
itimer.c timers: fix itimer/many thread hang 2008-09-14 16:25:35 +02:00
kallsyms.c allow stripping of generated symbols under CONFIG_KALLSYMS_ALL 2008-12-19 22:47:10 +01:00
kexec.c cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: core 2009-01-01 10:12:15 +10:30
kfifo.c
kgdb.c kgdb: call touch_softlockup_watchdog on resume 2008-10-06 13:50:59 -05:00
kmod.c kmod: fix varargs kernel-doc 2009-01-06 15:59:27 -08:00
kprobes.c kprobes: support probing module __init function 2009-01-06 15:59:21 -08:00
ksysfs.c kernel/ksysfs.c:fix dependence on CONFIG_NET 2009-01-06 10:44:31 -08:00
kthread.c tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() 2008-11-16 09:01:36 +01:00
latencytop.c KSYM_SYMBOL_LEN fixes 2008-12-10 08:01:54 -08:00
lockdep.c Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-12-30 16:10:19 -08:00
lockdep_internals.h lockdep: build fix 2008-08-13 12:55:10 +02:00
lockdep_proc.c lockstat: contend with points 2008-10-20 15:43:10 +02:00
marker.c markers/tracpoints: fix non-modular build 2008-11-16 09:52:03 +01:00
module.c async: Asynchronous function calls to speed up kernel boot 2009-01-07 08:45:46 -08:00
mutex-debug.c mutex-debug: check mutex magic before owner 2008-05-16 16:53:35 +02:00
mutex-debug.h
mutex.c mutex: __used is needed for function referenced only from inline asm 2008-11-24 10:00:28 +01:00
mutex.h
notifier.c Merge commit 'v2.6.28-rc6' into core/debug 2008-11-26 08:22:50 +01:00
ns_cgroup.c ns_cgroup: remove unused spinlock 2009-01-08 08:31:02 -08:00
nsproxy.c User namespaces: set of cleanups (v2) 2008-11-24 18:57:41 -05:00
panic.c oops: increment the oops UUID every time we oops 2009-01-06 15:59:12 -08:00
params.c Fix compile warning in kernel/params.c 2008-10-23 12:09:00 -07:00
perf_counter.c perf_counter: Add support for pinned and exclusive counter groups 2009-01-14 21:00:30 +11:00
pid.c pid: generalize task_active_pid_ns 2009-01-08 08:31:12 -08:00
pid_namespace.c pid_ns: (BUG 11391) change ->child_reaper when init->group_leader exits 2008-09-02 19:21:38 -07:00
pm_qos_params.c pm_qos_requirement might sleep 2008-09-02 19:21:40 -07:00
posix-cpu-timers.c Merge commit 'v2.6.28' into core/core 2008-12-25 13:51:46 +01:00
posix-timers.c Merge branches 'timers/clocksource', 'timers/hpet', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core 2008-12-25 18:02:25 +01:00
printk.c trivial: printk: fix indentation of new_text_line declaration 2009-01-06 11:28:06 +01:00
profile.c profile: don't include <asm/ptrace.h> twice. 2009-01-06 15:59:14 -08:00
ptrace.c Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-12-28 12:21:10 -08:00
rcuclassic.c cpumask: convert RCU implementations, fix 2009-01-03 18:59:25 +01:00
rcupdate.c rcu: eliminate synchronize_rcu_xxx macro 2009-01-05 10:18:08 +01:00
rcupreempt.c rcu: eliminate synchronize_rcu_xxx macro 2009-01-05 10:18:08 +01:00
rcupreempt_trace.c "Tree RCU": scalable classic RCU implementation 2008-12-18 21:56:04 +01:00
rcutorture.c rcu: fix rcutorture bug 2009-01-05 13:09:49 +01:00
rcutree.c rcu: make treercu safe for suspend and resume 2009-01-05 10:12:33 +01:00
rcutree_trace.c "Tree RCU": scalable classic RCU implementation 2008-12-18 21:56:04 +01:00
relay.c relayfs: fix infinite loop with splice() 2008-12-10 08:01:52 -08:00
res_counter.c memcg: memory cgroup resource counters for hierarchy 2009-01-08 08:31:05 -08:00
resource.c resource: allow MMIO exclusivity for device drivers 2009-01-07 11:12:32 -08:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c sysdev: Pass the attribute to the low level sysdev show/store function 2008-07-21 21:55:02 -07:00
rtmutex.c hrtimer: convert kernel/* to the new hrtimer apis 2008-09-05 21:35:13 -07:00
rtmutex.h
rtmutex_common.h
rwsem.c
sched.c Merge commit 'v2.6.29-rc1' into perfcounters/core 2009-01-11 02:42:53 +01:00
sched_clock.c sched_clock: prevent scd->clock from moving backwards, take #2 2008-12-31 09:53:21 +01:00
sched_cpupri.c sched: fix section mismatch 2009-01-06 11:07:15 +01:00
sched_cpupri.h sched: convert struct cpupri_vec cpumask_var_t. 2008-11-24 17:52:22 +01:00
sched_debug.c sched: add uid information to sched_debug for CONFIG_USER_SCHED 2008-12-01 20:39:50 +01:00
sched_fair.c generic swap(): sched: remove local swap() macro 2009-01-08 08:31:15 -08:00
sched_features.h sched: backward looking buddy 2008-11-05 10:30:14 +01:00
sched_idletask.c sched: add CONFIG_SMP consistency 2008-10-22 10:01:52 +02:00
sched_rt.c sched: put back some stack hog changes that were undone in kernel/sched.c 2009-01-03 19:00:09 +01:00
sched_stats.h Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask into merge-rr-cpumask 2009-01-03 18:53:31 +01:00
seccomp.c
semaphore.c semaphore: __down_common: use signal_pending_state() 2008-08-05 14:33:47 -07:00
signal.c SEND_SIG_NOINFO: set si_pid to tgid instead of pid 2009-01-06 15:59:29 -08:00
smp.c cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits.: core 2009-01-01 10:12:15 +10:30
softirq.c Merge branch 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-01-03 12:04:39 -08:00
softlockup.c cpumask: convert rest of files in kernel/ 2009-01-01 10:12:28 +10:30
spinlock.c lockdep: spin_lock_nest_lock(), checkpatch fixes 2008-08-13 13:56:51 +02:00
srcu.c
stacktrace.c stacktrace: provide save_stack_trace_tsk() weak alias 2008-12-25 11:44:43 +01:00
stop_machine.c stop_machine: introduce stop_machine_create/destroy. 2009-01-05 08:40:14 +10:30
sys.c Merge commit 'v2.6.29-rc1' into perfcounters/core 2009-01-11 02:42:53 +01:00
sys_ni.c performance counters: core code 2008-12-08 15:47:03 +01:00
sysctl.c NOMMU: Make mmap allocation page trimming behaviour configurable. 2009-01-08 12:04:47 +00:00
sysctl_check.c [XFS] remove restricted chown parameter from xfs linux 2008-10-30 18:30:09 +11:00
taskstats.c cpumask: convert rest of files in kernel/ 2009-01-01 10:12:28 +10:30
test_kprobes.c kprobes: add tests for register_kprobes 2009-01-06 15:59:20 -08:00
time.c Allow times and time system calls to return small negative values 2009-01-06 15:59:13 -08:00
timeconst.pl Make constants in kernel/timeconst.h fixed 64 bits 2008-05-02 16:18:42 -07:00
timer.c [PATCH] idle cputime accounting 2008-12-31 15:11:46 +01:00
tracepoint.c markers/tracpoints: fix non-modular build 2008-11-16 09:52:03 +01:00
tsacct.c mm: introduce get_mm_hiwater_xxx(), fix taskstats->hiwater_xxx accounting 2009-01-06 15:59:09 -08:00
uid16.c CRED: Wrap current->cred and a few other accessors 2008-11-14 10:39:18 +11:00
user.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2008-12-28 12:27:58 -08:00
user_namespace.c User namespaces: set of cleanups (v2) 2008-11-24 18:57:41 -05:00
utsname.c removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
utsname_sysctl.c sysctl: simplify ->strategy 2008-10-16 11:21:47 -07:00
wait.c wait: kill is_sync_wait() 2008-10-16 11:21:31 -07:00
workqueue.c cpumask: convert kernel/workqueue.c 2009-01-01 10:12:25 +10:30