Commit Graph

190 Commits

Author SHA1 Message Date
Rafael J. Wysocki 1c3d85dd4e cpufreq: Revert incorrect commit 5800043
Commit 5800043 (cpufreq: convert cpufreq_driver to using RCU) causes
the following call trace to be spit on boot:

 BUG: sleeping function called from invalid context at /scratch/rafael/work/linux-pm/mm/slab.c:3179
 in_atomic(): 0, irqs_disabled(): 0, pid: 292, name: systemd-udevd
 2 locks held by systemd-udevd/292:
  #0:  (subsys mutex){+.+.+.}, at: [<ffffffff8146851a>] subsys_interface_register+0x4a/0xe0
  #1:  (rcu_read_lock){.+.+.+}, at: [<ffffffff81538210>] cpufreq_add_dev_interface+0x60/0x5e0
 Pid: 292, comm: systemd-udevd Not tainted 3.9.0-rc8+ #323
 Call Trace:
  [<ffffffff81072c90>] __might_sleep+0x140/0x1f0
  [<ffffffff811581c2>] kmem_cache_alloc+0x42/0x2b0
  [<ffffffff811e7179>] sysfs_new_dirent+0x59/0x130
  [<ffffffff811e63cb>] sysfs_add_file_mode+0x6b/0x110
  [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0
  [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80
  [<ffffffff811e647d>] sysfs_add_file+0xd/0x10
  [<ffffffff811e6541>] sysfs_create_file+0x21/0x30
  [<ffffffff81538280>] cpufreq_add_dev_interface+0xd0/0x5e0
  [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0
  [<ffffffffa000337f>] ? acpi_processor_get_platform_limit+0x32/0xbb [processor]
  [<ffffffffa022f540>] ? do_drv_write+0x70/0x70 [acpi_cpufreq]
  [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80
  [<ffffffff8106c97e>] ? up_read+0x1e/0x40
  [<ffffffff8106e632>] ? __blocking_notifier_call_chain+0x72/0xc0
  [<ffffffff81538dbd>] cpufreq_add_dev+0x62d/0xae0
  [<ffffffff815389b8>] ? cpufreq_add_dev+0x228/0xae0
  [<ffffffff81468569>] subsys_interface_register+0x99/0xe0
  [<ffffffffa014d000>] ? 0xffffffffa014cfff
  [<ffffffff81535d5d>] cpufreq_register_driver+0x9d/0x200
  [<ffffffffa014d000>] ? 0xffffffffa014cfff
  [<ffffffffa014d0e9>] acpi_cpufreq_init+0xe9/0x1000 [acpi_cpufreq]
  [<ffffffff810002fa>] do_one_initcall+0x11a/0x170
  [<ffffffff810b4b87>] load_module+0x1cf7/0x2920
  [<ffffffff81322580>] ? ddebug_proc_open+0xb0/0xb0
  [<ffffffff816baee0>] ? retint_restore_args+0xe/0xe
  [<ffffffff810b5887>] sys_init_module+0xd7/0x120
  [<ffffffff816bb6d2>] system_call_fastpath+0x16/0x1b

which is quite obvious, because that commit put (multiple instances
of) sysfs_create_file() under rcu_read_lock()/rcu_read_unlock(),
although sysfs_create_file() may cause memory to be allocated with
GFP_KERNEL and that may sleep, which is not permitted in RCU read
critical section.

Revert the buggy commit altogether along with some changes on top
of it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-29 00:08:16 +02:00
Viresh Kumar 820c6ca293 cpufreq: Don't call __cpufreq_governor() for drivers without target()
Some cpufreq drivers implement their own governor and so don't need
us to call generic governors interface via __cpufreq_governor(). Few
recent commits haven't obeyed this law well and we saw some
regressions.

This patch is an attempt to fix the above issue.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-22 00:48:03 +02:00
Viresh Kumar e4969ebac8 cpufreq: Call __cpufreq_governor() with correct policy->cpus mask
__cpufreq_governor() must be called with a correct policy->cpus mask.
In __cpufreq_remove_dev() we initially clear policy->cpus with
cpumask_clear_cpu() and then call
__cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). If the governor
is doing some per-cpu stuff in EXIT callback, this can create
uncertain behavior.

Generic governors in drivers/cpufreq/ doesn't do any per-cpu stuff
in EXIT callback and so we don't face any issues currently. But its
better to keep the code clean, so we don't face any issues in future.

Now, we call cpumask_clear_cpu() only when multiple cpus are managed
by policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-11 22:50:09 +02:00
Nathan Zimmer 5800043b24 cpufreq: convert cpufreq_driver to using RCU
We eventually would like to remove the rwlock cpufreq_driver_lock or
convert it back to a spinlock and protect the read sections with RCU.
The first step in that direction is to make cpufreq_driver use RCU.
I don't see an easy wasy to protect the cpufreq_cpu_data structure
with RCU, so I am leaving it with the rwlock for now since under
certain configs __cpufreq_cpu_get is a hot spot with 256+ cores.

[rjw: Subject, changelog, white space]
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-10 13:19:26 +02:00
Viresh Kumar b43a7ffbf3 cpufreq: Notify all policy->cpus in cpufreq_notify_transition()
policy->cpus contains all online cpus that have single shared clock line. And
their frequencies are always updated together.

Many SMP system's cpufreq drivers take care of this in individual drivers but
the best place for this code is in cpufreq core.

This patch modifies cpufreq_notify_transition() to notify frequency change for
all cpus in policy->cpus and hence updates all users of this API.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-02 15:24:00 +02:00
Viresh Kumar 4d5dcc4211 cpufreq: governor: Implement per policy instances of governors
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.

Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.

This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.

This patch uses the infrastructure provided by earlier patch and
implements init/exit routines for ondemand and conservative
governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Viresh Kumar 7bd353a995 cpufreq: Add per policy governor-init/exit infrastructure
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.

Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.

This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.

This patch is inclined towards providing this infrastructure. Because
we are required to allocate governor's resources dynamically now, we
must do it at policy creation and end. And so got
CPUFREQ_GOV_POLICY_INIT/EXIT.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Nathan Zimmer 0d1857a1b9 cpufreq: Convert the cpufreq_driver_lock to a rwlock
This eliminates the contention I am seeing in __cpufreq_cpu_get.
It also nicely stages the lock to be replaced by the rcu.

Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Rafael J. Wysocki 4419fbd4b4 Merge branch 'pm-cpufreq'
* pm-cpufreq: (55 commits)
  cpufreq / intel_pstate: Fix 32 bit build
  cpufreq: conservative: Fix typos in comments
  cpufreq: ondemand: Fix typos in comments
  cpufreq: exynos: simplify .init() for setting policy->cpus
  cpufreq: kirkwood: Add a cpufreq driver for Marvell Kirkwood SoCs
  cpufreq/x86: Add P-state driver for sandy bridge.
  cpufreq_stats: do not remove sysfs files if frequency table is not present
  cpufreq: Do not track governor name for scaling drivers with internal governors.
  cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target()
  cpufreq: Retrieve current frequency from scaling drivers with internal governors
  cpufreq: Fix locking issues
  cpufreq: Create a macro for unlock_policy_rwsem{read,write}
  cpufreq: Remove unused HOTPLUG_CPU code
  cpufreq: governors: Fix WARN_ON() for multi-policy platforms
  cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold
  cpufreq / stats: Get rid of CPUFREQ_STATDEVICE_ATTR
  cpufreq: Don't check cpu_online(policy->cpu)
  cpufreq: add imx6q-cpufreq driver
  cpufreq: Don't remove sysfs link for policy->cpu
  cpufreq: Remove unnecessary use of policy->shared_type
  ...
2013-02-15 13:59:07 +01:00
Dirk Brandewie fa69e33f7d cpufreq: Do not track governor name for scaling drivers with internal governors.
Scaling drivers that implement internal governors do not have governor
structures assocaited with them.  Only track the name of the governor
associated with the CPU if the driver does not implement
cpufreq_driver.setpolicy()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 12:55:53 +01:00
Dirk Brandewie f6b0515b07 cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target()
Scaling drivers that implement cpufreq_driver.setpolicy() have
internal governors that do not signal changes via
cpufreq_notify_transition() so the frequncy in the policy will almost
certainly be different than the current frequncy.  Only call
cpufreq_out_of_sync() when the underlying driver implements
cpufreq_driver.target()

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 12:55:47 +01:00
Dirk Brandewie 9e21ba8bd8 cpufreq: Retrieve current frequency from scaling drivers with internal governors
Scaling drivers that implement the cpufreq_driver.setpolicy() versus
the cpufreq_driver.target() interface do not set policy->cur.

Normally policy->cur is set during the call to cpufreq_driver.target()
when the frequnecy request is made by the governor.

If the scaling driver implements cpufreq_driver.setpolicy() and
cpufreq_driver.get() interfaces use cpufreq_driver.get() to retrieve
the current frequency.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 12:55:03 +01:00
Viresh Kumar 2eaa3e2df1 cpufreq: Fix locking issues
cpufreq core uses two locks:
- cpufreq_driver_lock: General lock for driver and cpufreq_cpu_data array.
- cpu_policy_rwsemfix locking: per CPU reader-writer semaphore designed to cure
  all cpufreq/hotplug/workqueue/etc related lock issues.

These locks were not used properly and are placed against their principle
(present before their definition) at various places. This patch is an attempt to
fix their use.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 01:22:57 +01:00
Viresh Kumar fa1d8af47f cpufreq: Create a macro for unlock_policy_rwsem{read,write}
On the lines of macro: lock_policy_rwsem, we can create another macro for
unlock_policy_rwsem. Lets do it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 01:22:06 +01:00
Viresh Kumar 65922465b5 cpufreq: Remove unused HOTPLUG_CPU code
Because the sibling cpu of any online cpu is identified very early in
cpufreq_add_dev(), below code is never executed. And so can be removed.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 01:21:37 +01:00
Viresh Kumar 8e53695f7f cpufreq: governors: Fix WARN_ON() for multi-policy platforms
On multi-policy systems there is a single instance of governor for both the
policies (if same governor is chosen for both policies). With the code update
from following patches:

8eeed09 cpufreq: governors: Get rid of dbs_data->enable field
b394058 cpufreq: governors: Reset tunables only for cpufreq_unregister_governor()

We are creating/removing sysfs directory of governor for for every call to
GOV_START and STOP. This would fail for multi-policy system as there is a
per-policy call to START/STOP.

This patch reuses the governor->initialized variable to detect total users of
governor.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 01:21:13 +01:00
Viresh Kumar 3361b7b173 cpufreq: Don't check cpu_online(policy->cpu)
policy->cpu or cpus in policy->cpus can't be offline anymore. And so we don't
need to check if they are online or not.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 01:18:34 +01:00
Viresh Kumar 73bf0fc2b0 cpufreq: Don't remove sysfs link for policy->cpu
"cpufreq" directory in policy->cpu is never created using
sysfs_create_link(), but using kobject_init_and_add(). And so we
shouldn't call sysfs_remove_link() for policy->cpu().  sysfs stuff
for policy->cpu is automatically removed when we call kobject_put()
for dying policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-05 22:21:14 +01:00
Viresh Kumar b394058f06 cpufreq: governors: Reset tunables only for cpufreq_unregister_governor()
Currently, whenever governor->governor() is called for CPUFRREQ_GOV_START event
we reset few tunables of governor. Which isn't correct, as this routine is
called for every cpu hot-[un]plugging event. We should actually be resetting
these only when the governor module is removed and re-installed.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 01:29:31 +01:00
Viresh Kumar fcf8058296 cpufreq: Simplify cpufreq_add_dev()
Currently cpufreq_add_dev() firsts allocates policy, calls
driver->init() and then checks if this CPU is already managed or not.
And if it is already managed, its policy is freed.

We can save all this if we somehow know that CPU is managed or not in
advance.  policy->related_cpus contains the list of all valid sibling
CPUs of policy->cpu. We can check this to see if the current CPU is
already managed.

From now on, platforms don't really need to set related_cpus from
their init() routines, as the same work is done by core too.

If a platform driver needs to set the related_cpus mask with some
additional CPUs, other than CPUs present in policy->cpus, they are
free to do it, though, as we don't override anything.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Viresh Kumar b26f72042e cpufreq: Revert "cpufreq: Don't use cpu removed during cpufreq_driver_unregister"
This reverts commit 956f339 "cpufreq: Don't use cpu removed during
cpufreq_driver_unregister".

With the addition of the following commit, this change/variable is not
required any more:

commit b9ba2725343ae57add3f324dfa5074167f48de96
Author: Viresh Kumar <viresh.kumar@linaro.org>
Date:   Mon Jan 14 13:23:03 2013 +0000

    cpufreq: Simplify __cpufreq_remove_dev()

[rjw: Subject and changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Borislav Petkov 9d95046e5d cpufreq: Add a get_current_driver helper
Add a helper function to return cpufreq_driver->name.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:15 +01:00
Dirk Brandewie d5aaffa9dd cpufreq: handle cpufreq being disabled for all exported function.
When disable_cpufreq() is called some exported functions are still
being used that do not have a check for cpufreq being disabled.

Add a disabled check into cpufreq_cpu_get() to return NULL if
cpufreq is disabled this covers most of the exported functions. For
the exported functions that do not call cpufreq_cpu_get() add an
explicit check.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Viresh Kumar b8eed8af94 cpufreq: Simplify __cpufreq_remove_dev()
__cpufreq_remove_dev() is called on multiple occasions: cpufreq_driver
unregister and cpu removals.

Current implementation of this routine is overly complex without much need. If
the cpu to be removed is the policy->cpu, we remove the policy first and add all
other cpus again from policy->cpus and then finally call __cpufreq_remove_dev()
again to remove the cpu to be deleted. Haahhhh..

There exist a simple solution to removal of a cpu:
- Simply use the old policy structure
- update its fields like: policy->cpu, etc.
- notify any users of cpufreq, which depend on changing policy->cpu

Hence this patch, which tries to implement the above theory. It is tested well
by myself on ARM big.LITTLE TC2 SoC, which has 5 cores (2 A15 and 3 A7). Both
A15's share same struct policy and all A7's share same policy structure.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Viresh Kumar 6954ca9c8b cpufreq: Don't use cpu removed during cpufreq_driver_unregister
This is how the core works:
cpufreq_driver_unregister()
 - subsys_interface_unregister()
   - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we
     unregister.

cpufreq_remove_dev():
 - Remove policy node
 - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu.
   i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2,
   we call it for cpu 3.
   - cpufreq_add_dev() would call cpufreq_driver->init()
   - init would return mask as AND of 2, 3 and 4 for cluster A7.
   - cpufreq core would do online_cpu && policy->cpus
     Here is the BUG(). Because cpu hasn't died but we have just unregistered
     the cpufreq driver, online cpu would still have cpu 2 in it. And so thing
     go bad again.

Solution: Keep cpumask of cpus that are registered with cpufreq core and clear
	  cpus when we get a call from subsys_interface_unregister() via
	  cpufreq_remove_dev().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Viresh Kumar f6a7409cab cpufreq: Notify governors when cpus are hot-[un]plugged
Because cpufreq core and governors worry only about the online cpus, if a cpu is
hot [un]plugged, we must notify governors about it, otherwise be ready to expect
something unexpected.

We already have notifiers in the form of CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we
just need to call them now.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Viresh Kumar 643ae6e81d cpufreq: Manage only online cpus
cpufreq core doesn't manage offline cpus and if driver->init() has returned
mask including offline cpus, it may result in unwanted behavior by cpufreq core
or governors.

We need to get only online cpus in this mask. There are two places to fix this
mask, cpufreq core and cpufreq driver. It makes sense to do this at common place
and hence is done in core.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:14 +01:00
Paul Gortmaker 43720bd601 PM / tracing: remove deprecated power trace API
The text in Documentation said it would be removed in 2.6.41;
the text in the Kconfig said removal in the 3.1 release.  Either
way you look at it, we are well past both, so push it off a cliff.

Note that the POWER_CSTATE and the POWER_PSTATE are part of the
legacy tracing API.  Remove all tracepoints which use these flags.
As can be seen from context, most already have a trace entry via
trace_cpu_idle anyways.

Also, the cpufreq/cpufreq.c PSTATE one is actually unpaired, as
compared to the CSTATE ones which all have a clear start/stop.
As part of this, the trace_power_frequency also becomes orphaned,
so it too is deleted.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-26 00:39:12 +01:00
Jingoo Han f55c9c2627 cpufreq: Remove unnecessary initialization of a local variable
Remove an unnecessary initializer for the 'ret' variable in
__cpufreq_set_policy().

[rjw: Modified the subject and changelog slightly.]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:09 +01:00
Viresh Kumar 7249924e53 cpufreq: Make sure target freq is within limits
__cpufreq_driver_target() must not pass target frequency beyond the
limits of current policy.

Today most of cpufreq platform drivers are doing this check in their
target routines. Why not move it to __cpufreq_driver_target()?

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:09 +01:00
Viresh Kumar 5a1c022850 cpufreq: Avoid calling cpufreq driver's target() routine if target_freq == policy->cur
Avoid calling cpufreq driver's target() routine if new frequency is same as
policies current frequency.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:09 +01:00
Viresh Kumar da58445570 cpufreq: Fix sparse warning by making local function static
cpufreq_disabled() is a local function, so should be marked static.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:08 +01:00
Viresh Kumar 0676f7f2e7 cpufreq: return early from __cpufreq_driver_getavg()
There is no need to do cpufreq_get_cpu() and cpufreq_put_cpu() for drivers that
don't support getavg() routine.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:07 +01:00
Viresh Kumar db7011516c cpufreq: Improve debug prints
With debug options on, it is difficult to locate cpufreq core's debug prints.
Fix this by prefixing debug prints with KBUILD_MODNAME.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:06 +01:00
viresh kumar 4b972f0b04 cpufreq / core: Fix printing of governor and driver name
Arrays for governer and driver name are of size CPUFREQ_NAME_LEN or 16.
i.e. 15 bytes for name and 1 for trailing '\0'.

When cpufreq driver print these names (for sysfs), it includes '\n' or ' ' in
the fmt string and still passes length as CPUFREQ_NAME_LEN. If the driver or
governor names are using all 15 fields allocated to them, then the trailing '\n'
or ' ' will never be printed. And so commands like:

root@linaro-developer# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver

will print something like:

cpufreq_foodrvroot@linaro-developer#

Fix this by increasing print length by one character.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:06 +01:00
viresh kumar 8bf1ac7236 cpufreq / core: Fix typo in comment describing show_bios_limit()
show_bios_limit is mistakenly written as show_scaling_driver in a comment
describing purpose of show_bios_limit() routine.

Fix it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:05 +01:00
Stephen Boyd a914443627 cpufreq: Fix sysfs deadlock with concurrent hotplug/frequency switch
Running one program that continuously hotplugs and replugs a cpu
concurrently with another program that continuously writes to the
scaling_setspeed node eventually deadlocks with:

=============================================
[ INFO: possible recursive locking detected ]
3.4.0 #37 Tainted: G        W
---------------------------------------------
filemonkey/122 is trying to acquire lock:
 (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4

but task is already holding lock:
 (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(s_active#13);
  lock(s_active#13);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by filemonkey/122:
 #0:  (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140
 #1:  (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140

stack backtrace:
[<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054)
[<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8)
[<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8)
[<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180)
[<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4)
[<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38)
[<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194)
[<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24)
[<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74)
[<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140)
[<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128)
[<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68)
[<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c)

This is because store() in cpufreq.c indirectly calls
kobject_get() via cpufreq_cpu_get() and is the last one to call
kobject_put() via cpufreq_cpu_put(). Sysfs code should not call
kobject_get() or kobject_put() directly (see the comment around
sysfs_schedule_callback() for more information).

Fix this deadlock by introducing two new functions:

	struct cpufreq_policy *cpufreq_cpu_get_sysfs(unsigned int cpu)
	void cpufreq_cpu_put_sysfs(struct cpufreq_policy *data)

which do the same thing as cpufreq_cpu_{get,put}() but don't call
kobject functions.

To easily trigger this deadlock you can insert an msleep() with a
reasonably large value right after the fail label at the bottom
of the store() function in cpufreq.c and then write
scaling_setspeed in one task and offline the cpu in another. The
first task will hang and be detected by the hung task detector.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-20 21:39:25 +02:00
Konrad Rzeszutek Wilk a7b422cda5 provide disable_cpufreq() function to disable the API.
useful for disabling cpufreq altogether. The cpu frequency
scaling drivers and cpu frequency governors will fail to register.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2012-03-14 14:45:03 -04:00
Linus Torvalds 02d929502c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits)
  [CPUFREQ] EXYNOS: Removed useless headers and codes
  [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver
  [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information
  [CPUFREQ] powernow-k8: Fix indexing issue
  [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB
  [CPUFREQ] update lpj only if frequency has changed
  [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation
  [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init()
  [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working
  [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider
  [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq
  [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages
  cpufreq: OMAP: fixup for omap_device changes, include <linux/module.h>
  cpufreq: OMAP: fix freq_table leak
  cpufreq: OMAP: put clk if cpu_init failed
  cpufreq: OMAP: only supports OPP library
  cpufreq: OMAP: dont support !freq_table
  cpufreq: OMAP: deny initialization if no mpudev
  cpufreq: OMAP: move clk name decision to init
  cpufreq: OMAP: notify even with bad boot frequency
  ...
2012-01-11 18:53:33 -08:00
Afzal Mohammed d08de0c19c [CPUFREQ] update lpj only if frequency has changed
During scaling up of cpu frequency, loops_per_jiffy
is updated upon invoking PRECHANGE notifier.
If setting to new frequency fails in cpufreq driver,
lpj is left at incorrect value.

Hence update lpj only if cpu frequency is changed,
i.e. upon invoking POSTCHANGE notifier.

Penalty would be that during time period between
changing cpu frequency & invocation of POSTCHANGE
notifier, udelay(x) may not gurantee minimal delay
of 'x' us for frequency scaling up operation.

Perhaps a better solution would be to define
CPUFREQ_ABORTCHANGE & handle accordingly, but then
it would be more intrusive (using ABORTCHANGE may
help drivers also; if any has registered notifier
and expect POST for a PRECHANGE, their needs can
be taken care using ABORT)

Signed-off-by: Afzal Mohammed <afzal@ti.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2012-01-06 10:10:53 -05:00
Kay Sievers 8a25a2fd12 cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:29:42 -08:00
Jesse Barnes 3d73710880 cpufreq: expose a cpufreq_quick_get_max routine
This allows drivers and other code to get the max reported CPU frequency.
Initial use is for scaling ring frequency with GPU frequency in the i915
driver.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-06-28 13:54:26 -07:00
Kees Cook 1a8e1463a4 [CPUFREQ] remove redundant sprintf from request_module call.
Since format string handling is part of request_module, there is no
need to construct the module name. As such, drop the redundant sprintf
and heap usage.

Signed-off-by: Kees Cook <kees.cook@canonical.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2011-05-04 11:50:58 -04:00
Dominik Brodowski 2d06d8c49a [CPUFREQ] use dynamic debug instead of custom infrastructure
With dynamic debug having gained the capability to report debug messages
also during the boot process, it offers a far superior interface for
debug messages than the custom cpufreq infrastructure. As a first step,
remove the old cpufreq_debug_printk() function and replace it with a call
to the generic pr_debug() function.

How can dynamic debug be used on cpufreq? You need a kernel which has
CONFIG_DYNAMIC_DEBUG enabled.

To enabled debugging during runtime, mount debugfs and

$ echo -n 'module cpufreq +p' > /sys/kernel/debug/dynamic_debug/control

for debugging the complete "cpufreq" module. To achieve the same goal during
boot, append

	ddebug_query="module cpufreq +p"

as a boot parameter to the kernel of your choice.

For more detailled instructions, please see
Documentation/dynamic-debug-howto.txt

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2011-05-04 11:50:57 -04:00
Jacob Shin 27ecddc2a9 [CPUFREQ] CPU hotplug, re-create sysfs directory and symlinks
When we discover CPUs that are affected by each other's
frequency/voltage transitions, the first CPU gets a sysfs directory
created, and rest of the siblings get symlinks. Currently, when we
hotplug off only the first CPU, all of the symlinks and the sysfs
directory gets removed. Even though rest of the siblings are still
online and functional, they are orphaned, and no longer governed by
cpufreq.

This patch, given the above scenario, creates a sysfs directory for
the first sibling and symlinks for the rest of the siblings.

Please note the recursive call, it was rather too ugly to roll it
out. And the removal of redundant NULL setting (it is already taken
care of near the top of the function).

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: stable@kernel.org
2011-05-04 11:50:57 -04:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Rafael J. Wysocki e00e56dfd3 cpufreq: Use syscore_ops for boot CPU suspend/resume (v2)
The cpufreq subsystem uses sysdev suspend and resume for
executing cpufreq_suspend() and cpufreq_resume(), respectively,
during system suspend, after interrupts have been switched off on the
boot CPU, and during system resume, while interrupts are still off on
the boot CPU.  In both cases the other CPUs are off-line at the
relevant point (either they have been switched off via CPU hotplug
during suspend, or they haven't been switched on yet during resume).
For this reason, although it may seem that cpufreq_suspend() and
cpufreq_resume() are executed for all CPUs in the system, they are
only called for the boot CPU in fact, which is quite confusing.

To remove the confusion and to prepare for elimiating sysdev
suspend and resume operations from the kernel enirely, convernt
cpufreq to using a struct syscore_ops object for the boot CPU
suspend and resume and rename the callbacks so that their names
reflect their purpose.  In addition, put some explanatory remarks
into their kerneldoc comments.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-03-23 22:16:32 +01:00
Rafael J. Wysocki 7ca64e2d28 [CPUFREQ] Remove the pm_message_t argument from driver suspend
None of the existing cpufreq drivers uses the second argument of
its .suspend() callback (which isn't useful anyway), so remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-16 17:54:33 -04:00
Jiri Slaby 8f5bc2abfd [CPUFREQ] fix BUG on cpufreq policy init failure
cpufreq_register_driver sets cpufreq_driver to a structure owned (and
placed) in the caller's memory. If cpufreq policy fails in its ->init
function, sysdev_driver_register returns nonzero in
cpufreq_register_driver. Now, cpufreq_register_driver returns an error
without setting cpufreq_driver back to NULL.

Usually cpufreq policy modules are unloaded because they propagate the
error to the module init function and return that.

So a later access to any member of cpufreq_driver causes bugs like:
BUG: unable to handle kernel paging request at ffffffffa00270a0
IP: [<ffffffff8145eca3>] cpufreq_cpu_get+0x53/0xe0
PGD 1805067 PUD 1809063 PMD 1c3f90067 PTE 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/tun0/statistics/collisions
CPU 0
Modules linked in: ...
Pid: 5677, comm: thunderbird-bin Tainted: G        W   2.6.38-rc4-mm1_64+ #1389 To be filled by O.E.M./To Be Filled By O.E.M.
RIP: 0010:[<ffffffff8145eca3>]  [<ffffffff8145eca3>] cpufreq_cpu_get+0x53/0xe0
RSP: 0018:ffff8801aec37d98  EFLAGS: 00010086
RAX: 0000000000000202 RBX: 0000000000000000 RCX: 0000000000000001
RDX: ffffffffa00270a0 RSI: 0000000000001000 RDI: ffffffff8199ece8
...
Call Trace:
 [<ffffffff8145f490>] cpufreq_quick_get+0x10/0x30
 [<ffffffff8103f12b>] show_cpuinfo+0x2ab/0x300
 [<ffffffff81136292>] seq_read+0xf2/0x3f0
 [<ffffffff8126c5d3>] ? __strncpy_from_user+0x33/0x60
 [<ffffffff8116850d>] proc_reg_read+0x6d/0xa0
 [<ffffffff81116e53>] vfs_read+0xc3/0x180
 [<ffffffff81116f5c>] sys_read+0x4c/0x90
 [<ffffffff81030dbb>] system_call_fastpath+0x16/0x1b
...

It's all cause by weird fail path handling in cpufreq_register_driver.
To fix that, shuffle the code to do proper handling with gotos.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-01 18:49:44 -05:00
Thomas Renninger 25e41933b5 perf: Clean up power events by introducing new, more generic ones
Add these new power trace events:

 power:cpu_idle
 power:cpu_frequency
 power:machine_suspend

The old C-state/idle accounting events:
  power:power_start
  power:power_end

Have now a replacement (but we are still keeping the old
tracepoints for compatibility):

  power:cpu_idle

and
  power:power_frequency

is replaced with:
  power:cpu_frequency

power:machine_suspend is newly introduced.

Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.

the type= field got removed from both, it was never
used and the type is differed by the event type itself.

perf timechart userspace tool gets adjusted in a separate patch.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rjw@sisk.pl
LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
2011-01-04 08:16:54 +01:00