Commit Graph

368 Commits

Author SHA1 Message Date
Saravana Kannan 9d16f20711 cpufreq: Track cpu managing sysfs kobjects separately
In order to prepare for the next few commits, that will stop migrating
sysfs files on cpu hotplug, this patch starts managing sysfs-cpu
separately.

The behavior is still the same as we are still migrating sysfs files on
hotplug, later commits would change that.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-23 00:49:04 +02:00
Shailendra Verma 58405af632 cpufreq: Fix for typos in two comments
Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-22 23:59:44 +02:00
Viresh Kumar 18bf3a124e cpufreq: Mark policy->governor = NULL for inactive policies
Later commits would change the way policies are managed today. Policies
wouldn't be freed on cpu hotplug (currently they aren't freed on
suspend), and while the CPU is offline, the sysfs cpufreq files would
still be present.

Because we don't mark policy->governor as NULL, it still contains
pointer of the last used governor. And if the governor is removed, while
all the CPUs of a policy are hotplugged out, this pointer wouldn't be
valid anymore. And if we try to read the 'scaling_governor', etc.  from
sysfs, it will result in kernel OOPs.

To prevent this, mark policy->governor as NULL for all inactive policies
while the governor is removed from kernel.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:46:45 +02:00
Viresh Kumar 4573237b01 cpufreq: Manage governor usage history with 'policy->last_governor'
History of which governor was used last is common to all CPUs within a
policy and maintaining it per-cpu isn't the best approach for sure.

Apart from wasting memory, this also increases the complexity of
managing this data structure as it has to be updated for all CPUs.

To make that somewhat simpler, lets store this information in a new
field 'last_governor' in struct cpufreq_policy and update it on removal
of last cpu of a policy.

As a side-effect it also solves an old problem, consider a system with
two clusters 0 & 1. And there is one policy per cluster.

Cluster 0: CPU0 and 1.
Cluster 1: CPU2 and 3.

 - CPU2 is first brought online, and governor is set to performance
   (default as cpufreq_cpu_governor wasn't set).
 - Governor is changed to ondemand.
 - CPU2 is taken offline and cpufreq_cpu_governor is updated for CPU2.
 - CPU3 is brought online.
 - Because cpufreq_cpu_governor wasn't set for CPU3, the default governor
   performance is picked for CPU3.

This patch fixes the bug as we now have a single variable to update for
policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:44:17 +02:00
Viresh Kumar 9104bb26c7 cpufreq: Don't traverse all active policies to find policy for a cpu
We reach here while adding policy for a CPU and enter into the 'if'
block only if a policy already exists for the CPU.

As cpufreq_cpu_data is set for all policy->related_cpus now, when the
policy is first added, we can use that to find the CPU's policy instead
of traversing the list of all active policies.

Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:38:18 +02:00
Viresh Kumar 3914d37910 cpufreq: Get rid of cpufreq_cpu_data_fallback
We can extract the same information from cpufreq_cpu_data as it is also
available for inactive policies now. And so don't need
cpufreq_cpu_data_fallback anymore.

Also add a WARN_ON() for the case where we try to restore from an active
policy.

Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:35:57 +02:00
Viresh Kumar 988bed09d3 cpufreq: Don't clear cpufreq_cpu_data and policy list for inactive policies
Now that we can check policy->cpus to find if policy is active or not,
we don't need to clean cpufreq_cpu_data and delete policy from the list
on light weight tear down of policies (like in suspend).

To make it consistent and clean, set cpufreq_cpu_data for all related
CPUs when the policy is first created and clean it only while it is
freed.

Also update cpufreq_cpu_get_raw() to check if cpu is part of
policy->cpus mask, so that we don't end up getting policies for offline
CPUs.

In order to make sure that no users of 'policy' are using an inactive
policy, use cpufreq_cpu_get_raw() instead of directly accessing
cpufreq_cpu_data.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:32:46 +02:00
Viresh Kumar f963735a3c cpufreq: Create for_each_{in}active_policy()
policy->cpus is cleared unconditionally now on hotplug-out of a CPU and
it can be checked to know if a policy is active or not. Create helper
routines to iterate over all active/inactive policies, based on
policy->cpus field.

Replace all instances of for_each_policy() with for_each_active_policy()
to make them iterate only for active policies. (We haven't made changes
yet to keep inactive policies in the same list, but that will be
followed in a later patch).

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 02:26:07 +02:00
Viresh Kumar 303ae72307 cpufreq: Clear policy->cpus even for the last CPU
We clear policy->cpus mask while CPUs are hotplugged out. We do it for all CPUs
except the last CPU of the policy. I don't remember what the rationale behind
that was, but I couldn't think of anything that will break if we remove this
conditional clearing and always clear policy->cpus.

The benefit we get out of it is, we can know if a policy is active or not by
checking if this field is empty or not. That will be used by later commits.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-07 23:38:35 +02:00
Viresh Kumar bb29ae152e cpufreq: Keep a single path for adding managed CPUs
There are two cases when we may try to add CPUs we're already handling:
 - On boot, the first cpu has marked all policy->cpus managed and so we
   will find policy for all other policy->cpus later on.
 - When a managed cpu is hotplugged out and later brought back in.

Currently, separate paths and checks take care of the two.  While the
first one is detected by testing cpu against 'policy->cpus', the other
one is detected by testing cpu against 'policy->related_cpus'.

We can handle them both via a single path and there is no need to do
special checking for the first one.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-07 23:36:41 +02:00
Viresh Kumar 1b947c904c cpufreq: Throw warning when we try to get policy for an invalid CPU
Simply returning here with an error is not enough. It shouldn't be allowed at
all to try calling cpufreq_cpu_get() for an invalid CPU.

Add a WARN here to make it clear that it wouldn't be acceptable at all.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-07 23:29:57 +02:00
Viresh Kumar 23faf0b743 cpufreq: Merge __cpufreq_add_dev() and cpufreq_add_dev()
cpufreq_add_dev() is an unnecessary wrapper over __cpufreq_add_dev(). Merge
them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-07 23:28:28 +02:00
Viresh Kumar 50e9c85213 cpufreq: Add doc style comment about cpufreq_cpu_{get|put}()
This clearly states what the code inside these routines is doing and how these
must be used.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-07 23:27:20 +02:00
Viresh Kumar c75de0ac07 cpufreq: Schedule work for the first-online CPU on resume
All CPUs leaving the first-online CPU are hotplugged out on suspend and
and cpufreq core stops managing them.

On resume, we need to call cpufreq_update_policy() for this CPU's policy
to make sure its frequency is in sync with cpufreq's cached value, as it
might have got updated by hardware during suspend/resume.

The policies are always added to the top of the policy-list. So, in
normal circumstances, CPU 0's policy will be the last one in the list.
And so the code checks for the last policy.

But there are cases where it will fail. Consider quad-core system, with
policy-per core. If CPU0 is hotplugged out and added back again, the
last policy will be on CPU1 :(

To fix this in a proper way, always look for the policy of the first
online CPU. That way we will be sure that we are calling
cpufreq_update_policy() for the only CPU that wasn't hotplugged out.

Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Fixes: 2f0aea9363 ("cpufreq: suspend governors on system suspend/hibernate")
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-03 12:59:47 +02:00
Viresh Kumar f7b2706117 cpufreq: Create for_each_governor()
To make code more readable and less error prone, lets create a helper macro for
iterating over all available governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-03 23:28:01 +01:00
Viresh Kumar b4f0676fe2 cpufreq: Create for_each_policy()
To make code more readable and less error prone, lets create a helper macro for
iterating over all active policies.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-03 23:27:45 +01:00
Viresh Kumar 1e63eaf0c4 cpufreq: Drop cpufreq_disabled() check from cpufreq_cpu_{get|put}()
When cpufreq is disabled, the per-cpu variable would have been set to
NULL. Remove this unnecessary check.

[ Changelog from Saravana Kannan. ]

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-03 23:26:02 +01:00
Viresh Kumar 6ffae8c06f cpufreq: Set cpufreq_cpu_data to NULL before putting kobject
In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs
to be cleared before calling kobject_put(&policy->kobj) and under
cpufreq_driver_lock. Otherwise, if someone else calls cpufreq_cpu_get()
in parallel with it, they can obtain a non-NULL policy from that after
kobject_put(&policy->kobj) was executed.

Consider this case:

Thread A				Thread B
cpufreq_cpu_get()
  acquire cpufreq_driver_lock
  read-per-cpu cpufreq_cpu_data
					kobject_put(&policy->kobj);
  kobject_get(&policy->kobj);
					...
					per_cpu(&cpufreq_cpu_data, cpu) = NULL

And this will result in a warning like this one:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47
 kobject_get+0x41/0x50()
 Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl
 lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
 ...
 Call Trace:
  [<ffffffff81661b14>] dump_stack+0x46/0x58
  [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0
  [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff812e16d1>] kobject_get+0x41/0x50
  [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0
  [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0
  [<ffffffff810b8cb2>] ? up+0x32/0x50
  [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2
  [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252
  [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0
  [<ffffffff81360967>] ? acpi_has_method+0x25/0x40
  [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82
  [<ffffffff81089566>] ? move_linked_works+0x66/0x90
  [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7
  [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c
  [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22
  [<ffffffff8108c910>] process_one_work+0x160/0x410
  [<ffffffff8108d05b>] worker_thread+0x11b/0x520
  [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380
  [<ffffffff81092421>] kthread+0xe1/0x100
  [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
  [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
 ---[ end trace 89e66eb9795efdf7 ]---

The actual code flow is as follows:

 Thread A: Workqueue: kacpi_notify

 acpi_processor_notify()
   acpi_processor_ppc_has_changed()
         cpufreq_update_policy()
           cpufreq_cpu_get()
             kobject_get()

 Thread B: xenbus_thread()

 xenbus_thread()
   msg->u.watch.handle->callback()
     handle_vcpu_hotplug_event()
       vcpu_hotplug()
         cpu_down()
           __cpu_notify(CPU_POST_DEAD..)
             cpufreq_cpu_callback()
               __cpufreq_remove_dev_finish()
                 cpufreq_policy_put_kobj()
                   kobject_put()

cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data
under cpufreq_driver_lock, and once it gets a valid policy it expects it
to not be freed until cpufreq_cpu_put() is called.

But the race happens when another thread puts the kobject first and updates
cpufreq_cpu_data before or later. And so the first thread gets a valid policy
structure and before it does kobject_get() on it, the second one has already
done kobject_put().

Fix this by setting cpufreq_cpu_data to NULL before putting the kobject and that
too under locks.

Reported-by: Ethan Zhao <ethan.zhao@oracle.com>
Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-03 00:59:29 +01:00
Viresh Kumar d9f354460d cpufreq: remove CPUFREQ_UPDATE_POLICY_CPU notifications
CPUFREQ_UPDATE_POLICY_CPU notifications were used only from cpufreq-stats which
doesn't use it anymore. Remove them.

This also decrements values of other notification macros defined after
CPUFREQ_UPDATE_POLICY_CPU by 1 to remove gaps. Hopefully all users are using
macro's instead of direct numbers and so they wouldn't break as macro values are
changed now.

Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 23:06:44 +01:00
Viresh Kumar 7c418ff099 cpufreq: Remove (now) unused 'last_cpu' from struct cpufreq_policy
'last_cpu' was used only from cpufreq-stats and isn't used anymore. Get rid of
it.

Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 23:06:44 +01:00
Viresh Kumar 818c57126e cpufreq: move some initialization stuff to cpufreq_policy_alloc()
We need to initialize completion and work only on policy allocation and not
really on the policy restore side and so we better move this piece of code to
cpufreq_policy_alloc().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:35 +01:00
Viresh Kumar ce1bcfe94d cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs
CPUFREQ_STICKY flag is set by drivers which don't want to get unregistered
even if cpufreq-core isn't able to initialize policy for any CPU.

When this flag isn't set, we try to unregister the driver. To find out
which CPUs are registered and which are not, we try to check per_cpu
cpufreq_cpu_data for all CPUs. Because we have a list of valid policies
available now, we better check if the list is empty or not instead of
the 'for' loop. That will be much more efficient.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:35 +01:00
Viresh Kumar 39c132eebd cpufreq: limit the scope of l_p_j variables
These variables are just used within adjust_jiffies() and so must be
local to it. Also there is no need of a dummy routine for CONFIG_SMP
case as we can take care of all that with help of macros in the same
routine. It doesn't look that ugly.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar d7a9771c1a cpufreq: use light-weight cpufreq_cpu_get_raw() in __cpufreq_add_dev()
We just need to check if a 'policy' is already present for the cpu we are
adding. We don't need to take all the locks and do kobject usage updates. Use
the light-weight cpufreq_cpu_get_raw() routine instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar 7f0c020ab6 cpufreq: get rid of 'tpolicy' from __cpufreq_add_dev()
There is no need of this separate variable, use 'policy' instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar 22a7cfb014 cpufreq: get rid of CONFIG_{HOTPLUG_CPU|SMP} mess
These are messing up more than the benefit they provide. It isn't
a lot of code anyway, that we will compile without them.

Kill them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar bc68b7dfda cpufreq: update driver_data->flags only if we are registering driver
We should first check if a cpufreq driver is already registered or not
before updating driver_data->flags.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar d92d50a462 cpufreq: pass policy to __cpufreq_get()
There is no point finding out the 'policy' again within __cpufreq_get()
when all the callers already have it. Just make them pass policy instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar a1e1dc41c4 cpufreq: pass policy to cpufreq_out_of_sync
There is no point finding out the 'policy' again within cpufreq_out_of_sync()
when all the callers already have it. Just make them pass policy instead.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar 2e1cc3a5d7 cpufreq: No need to check for has_target()
Either we can be setpolicy or target type, nothing else. And so the
else part of setpolicy will automatically be of has_target() type.
And so we don't need to check it again.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:34 +01:00
Viresh Kumar 42f91fa116 cpufreq: s/__find_governor/find_governor
Remove unnecessary from find_governor's name.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:33 +01:00
Viresh Kumar db5f299574 cpufreq: merge 'if' blocks in __cpufreq_remove_dev_prepare()
There are two 'if' blocks here, checking for !cpufreq_driver->setpolicy and
has_target(). Both are actually doing the same thing, merge them.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:33 +01:00
Viresh Kumar 09347b2905 cpufreq: don't need line break in show_scaling_cur_freq()
No need of an unnecessary line break.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:33 +01:00
Viresh Kumar f13f1184a1 cpufreq: remove extra parenthesis
We can live without it and so we should.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:33 +01:00
Viresh Kumar e673f1639b cpufreq: remove dangling comment
It doesn't make any sense at all and is a leftover of some earlier commit.
Remove it.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:49:33 +01:00
Doug Anderson 90de2a4aa9 cpufreq: suspend cpufreq governors on shutdown
We should stop cpufreq governors when we shut down the system.  If we
don't do this, we can end up with this deadlock:

1. cpufreq governor may be running on a CPU other than CPU0.
2. In machine_restart() we call smp_send_stop() which stops CPUs.
   If one of these CPUs was actively running a cpufreq governor
   then it may have the mutex / spinlock needed to access the main
   PMIC in the system (perhaps over I2C)
3. If a machine needs access to the main PMIC in order to shutdown
   then it will never get it since the mutex was lost when the other
   CPU stopped.
4. We'll hang (possibly eventually hitting the hard lockup detector).

Let's avoid the problem by stopping the cpufreq governor at shutdown,
which is a sensible thing to do anyway.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-23 22:20:30 +01:00
Ethan Zhao cb57720bf7 cpufreq: fix a NULL pointer dereference in __cpufreq_governor()
If ACPI _PPC changed notification happens before governor was initiated
while kernel is booting, a NULL pointer dereference will be triggered:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
 IP: [<ffffffff81470453>] __cpufreq_governor+0x23/0x1e0
 PGD 0
 Oops: 0000 [#1] SMP
 ... ...
 RIP: 0010:[<ffffffff81470453>]  [<ffffffff81470453>]
 __cpufreq_governor+0x23/0x1e0
 RSP: 0018:ffff881fcfbcfbb8  EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff881fd11b3980 RCX: ffff88407fc20000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff881fd11b3980
 RBP: ffff881fcfbcfbd8 R08: 0000000000000000 R09: 000000000000000f
 R10: ffffffff818068d0 R11: 0000000000000043 R12: 0000000000000004
 R13: 0000000000000000 R14: ffffffff8196cae0 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff881fffc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000030 CR3: 00000000018ae000 CR4: 00000000000407f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process kworker/0:3 (pid: 750, threadinfo ffff881fcfbce000, task
 ffff881fcf556400)
 Stack:
  ffff881fffc17d00 ffff881fcfbcfc18 ffff881fd11b3980 0000000000000000
  ffff881fcfbcfc08 ffffffff81470d08 ffff881fd11b3980 0000000000000007
  ffff881fcfbcfc18 ffff881fffc17d00 ffff881fcfbcfd28 ffffffff81472e9a
 Call Trace:
  [<ffffffff81470d08>] __cpufreq_set_policy+0x1b8/0x2e0
  [<ffffffff81472e9a>] cpufreq_update_policy+0xca/0x150
  [<ffffffff81472f20>] ? cpufreq_update_policy+0x150/0x150
  [<ffffffff81324a96>] acpi_processor_ppc_has_changed+0x71/0x7b
  [<ffffffff81320bcd>] acpi_processor_notify+0x55/0x115
  [<ffffffff812f9c29>] acpi_device_notify+0x19/0x1b
  [<ffffffff813084ca>] acpi_ev_notify_dispatch+0x41/0x5f
  [<ffffffff812f64a4>] acpi_os_execute_deferred+0x27/0x34

The root cause is a race conditon -- cpufreq core and acpi-cpufreq driver
were initiated, but cpufreq_governor wasn't and _PPC changed notification
happened, __cpufreq_governor() was called within acpi_os_execute_deferred
kernel thread context.

To fix this panic issue, add pointer checking code in __cpufreq_governor()
before pointer policy->governor is to be dereferenced.

Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-19 22:49:07 +01:00
Viresh Kumar 7c45cf31b3 cpufreq: Introduce ->ready() callback for cpufreq drivers
Currently there is no callback for cpufreq drivers which is called once the
policy is ready to be used. There are some requirements where such a callback is
required.

One of them is registering a cooling device with the help of
of_cpufreq_cooling_register(). This routine tries to get 'struct cpufreq_policy'
for CPUs which isn't yet initialed at the time ->init() is called and so we face
issues while registering the cooling device.

Because we can't register cooling device from ->init(), we need a callback that
is called after the policy is ready to be used and hence we introduce ->ready()
callback.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Eduardo Valentin <edubezval@gmail.com>
Tested-by: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-29 23:38:38 +01:00
Tomeu Vizoso 6d4e81ed89 cpufreq: Ref the policy object sooner
Do it before it's assigned to cpufreq_cpu_data, otherwise when a driver
tries to get the cpu frequency during initialization the policy kobj is
referenced and we get this warning:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 64 at include/linux/kref.h:47 kobject_get+0x64/0x70()
Modules linked in:
CPU: 1 PID: 64 Comm: irq/77-tegra-ac Not tainted 3.18.0-rc4-next-20141114ccu-00050-g3eff942 #326
[<c0016fac>] (unwind_backtrace) from [<c001272c>] (show_stack+0x10/0x14)
[<c001272c>] (show_stack) from [<c06085d8>] (dump_stack+0x98/0xd8)
[<c06085d8>] (dump_stack) from [<c002892c>] (warn_slowpath_common+0x84/0xb4)
[<c002892c>] (warn_slowpath_common) from [<c00289f8>] (warn_slowpath_null+0x1c/0x24)
[<c00289f8>] (warn_slowpath_null) from [<c0220290>] (kobject_get+0x64/0x70)
[<c0220290>] (kobject_get) from [<c03e944c>] (cpufreq_cpu_get+0x88/0xc8)
[<c03e944c>] (cpufreq_cpu_get) from [<c03e9500>] (cpufreq_get+0xc/0x64)
[<c03e9500>] (cpufreq_get) from [<c0285288>] (actmon_thread_isr+0x134/0x198)
[<c0285288>] (actmon_thread_isr) from [<c0069008>] (irq_thread_fn+0x1c/0x40)
[<c0069008>] (irq_thread_fn) from [<c0069324>] (irq_thread+0x134/0x174)
[<c0069324>] (irq_thread) from [<c0040290>] (kthread+0xdc/0xf4)
[<c0040290>] (kthread) from [<c000f4b8>] (ret_from_fork+0x14/0x3c)
---[ end trace b7bd64a81b340c59 ]---

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-25 22:44:17 +01:00
Rafael J. Wysocki bd2a0f6754 Merge back cpufreq material for 3.19-rc1. 2014-11-18 01:22:29 +01:00
Vince Hsu 619c144c84 cpufreq: respect the min/max settings from user space
When the user space tries to set scaling_(max|min)_freq through
sysfs, the cpufreq_set_policy() asks other driver's opinions
for the max/min frequencies. Some device drivers, like Tegra
CPU EDP which is not upstreamed yet though, may constrain the
CPU maximum frequency dynamically because of board design.
So if the user space access happens and some driver is capping
the cpu frequency at the same time, the user_policy->(max|min)
is overridden by the capped value, and that's not expected by
the user space. And if the user space is not invoked again,
the CPU will always be capped by the user_policy->(max|min)
even no drivers limit the CPU frequency any more.

This patch preserves the user specified min/max settings, so that
every time the cpufreq policy is updated, the new max/min can
be re-evaluated correctly based on the user's expection and
the present device drivers' status.

Signed-off-by: Vince Hsu <vinceh@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-11 23:04:28 +01:00
Geert Uytterhoeven 09712f557b cpufreq: Avoid crash in resume on SMP without OPP
When resuming from s2ram on an SMP system without cpufreq operating
points (e.g. there's no "operating-points" property for the CPU node in
DT, or the platform doesn't use DT yet), the kernel crashes when
bringing CPU 1 online:

    Enabling non-boot CPUs ...
    CPU1: Booted secondary processor
    Unable to handle kernel NULL pointer dereference at virtual address 0000003c
    pgd = ee5e6b00
    [0000003c] *pgd=6e579003, *pmd=6e588003, *pte=00000000
    Internal error: Oops: a07 [#1] SMP ARM
    Modules linked in:
    CPU: 0 PID: 1246 Comm: s2ram Tainted: G        W      3.18.0-rc3-koelsch-01614-g0377af242bb175c8-dirty #589
    task: eeec5240 ti: ee704000 task.ti: ee704000
    PC is at __cpufreq_add_dev.isra.24+0x24c/0x77c
    LR is at __cpufreq_add_dev.isra.24+0x244/0x77c
    pc : [<c0298efc>]    lr : [<c0298ef4>]    psr: 60000153
    sp : ee705d48  ip : ee705d48  fp : ee705d84
    r10: c04e0450  r9 : 00000000  r8 : 00000001
    r7 : c05426a8  r6 : 00000001  r5 : 00000001  r4 : 00000000
    r3 : 00000000  r2 : 00000000  r1 : 20000153  r0 : c0542734

Verify that policy is not NULL before dereferencing it to fix this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: 8414809c6a (cpufreq: Preserve policy structure across suspend/resume)
Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-08 02:10:04 +01:00
Dirk Brandewie c034b02e21 cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers
Currently the core does not expose scaling_cur_freq for set_policy()
drivers this breaks some userspace monitoring tools.
Change the core to expose this file for all drivers and if the
set_policy() driver supports the get() callback use it to retrieve the
current frequency.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=73741
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-23 22:59:59 +02:00
Thomas Petazzoni 51315cdfa0 cpufreq: allow driver-specific data
This commit extends the cpufreq_driver structure with an additional
'void *driver_data' field that can be filled by the ->probe() function
of a cpufreq driver to pass additional custom information to the
driver itself.

A new function called cpufreq_get_driver_data() is added to allow a
cpufreq driver to retrieve those driver data, since they are typically
needed from a cpufreq_policy->init() callback, which does not have
access to the cpufreq_driver structure. This function call is similar
to the existing cpufreq_get_current_driver() function call.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:51:01 +02:00
Rafael J. Wysocki 6f1293ff74 Merge back cpufreq material for v3.18. 2014-10-03 15:41:16 +02:00
Viresh Kumar b1b12babe3 cpufreq: update 'cpufreq_suspended' after stopping governors
Commit 8e30444e15 ("cpufreq: fix cpufreq suspend/resume for intel_pstate")
introduced a bug where the governors wouldn't be stopped anymore for
->target{_index}() drivers during suspend. This happens because
'cpufreq_suspended' is updated before stopping the governors during suspend
and due to this __cpufreq_governor() would return early due to this check:

	/* Don't start any governor operations if we are entering suspend */
	if (cpufreq_suspended)
		return 0;

Fixes: 8e30444e15 ("cpufreq: fix cpufreq suspend/resume for intel_pstate")
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+: 8e30444e15 "cpufreq: fix cpufreq suspend/resume for intel_pstate"
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-30 21:02:34 +02:00
Rasmus Villemoes 7c4f453970 cpufreq: Replace strnicmp with strncasecmp
The kernel used to contain two functions for length-delimited,
case-insensitive string comparison, strnicmp with correct semantics
and a slightly buggy strncasecmp. The latter is the POSIX name, so
strnicmp was renamed to strncasecmp, and strnicmp made into a wrapper
for the new strncasecmp to avoid breaking existing users.

To allow the compat wrapper strnicmp to be removed at some point in
the future, and to avoid the extra indirection cost, do
s/strnicmp/strncasecmp/g.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-29 15:53:04 +02:00
Preeti U Murthy 789ca24374 cpufreq: Allow stop CPU callback to be used by all cpufreq drivers
Commit 367dc4aa93 ("cpufreq: Add stop CPU callback to
cpufreq_driver interface") introduced the stop CPU callback for
intel_pstate drivers. During the CPU_DOWN_PREPARE stage, this
callback is invoked so that drivers can take some action on the
pstate of the cpu before it is taken offline. This callback was
assumed to be useful only for those drivers which have implemented
the set_policy CPU callback because they have no other way to take
action about the cpufreq of a CPU which is being hotplugged out
except in the exit callback which is called very late in the offline
process.

The drivers which implement the target/target_index callbacks were
expected to take care of requirements like the ones that commit
367dc4aa addresses in the GOV_STOP notification event. But there
are disadvantages to restricting the usage of stop CPU callback
to cpufreq drivers that implement the set_policy callbacks and who
want to take explicit action on the setting the cpufreq during a
hotplug operation.

1.GOV_STOP gets called for every CPU offline and drivers would usually
want to take action when the last cpu in the policy->cpus mask
is taken offline. As long as there is more than one cpu in the
policy->cpus mask, cpufreq core itself makes sure that the freq
for the other cpus in this mask is set according to the maximum load.
This is sensible and drivers which implement the target_index callback
would mostly not want to modify that. However the cpufreq core leaves a
loose end when the cpu in the policy->cpus mask is the last one to go offline;
it does nothing explicit to the frequency of the core. Drivers may need
a way to take some action here and stop CPU callback mechanism is the
best way to do it today.

2. We cannot implement driver specific actions in the GOV_STOP mechanism.
So we will need another driver callback which is invoked from here which is
unnecessary.

Therefore this patch extends the usage of stop CPU callback to be used
by all cpufreq drivers as long as they have this callback implemented
and irrespective of whether they are set_policy/target_index drivers.
The assumption is if the drivers find the GOV_STOP path to be a suitable
way of implementing what they want to do with the freq of the cpu
going offine,they will not implement the stop CPU callback at all.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-29 15:47:12 +02:00
Prarit Bhargava 7106e02bae cpufreq: release policy->rwsem on error
While debugging a cpufreq-related hardware failure on a system I saw the
following lockdep warning:

 =========================
 [ BUG: held lock freed! ] 3.17.0-rc4+ #1 Tainted: G            E
 -------------------------
 insmod/2247 is freeing memory ffff88006e1b1400-ffff88006e1b17ff, with a lock still held there!
  (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80
 3 locks held by insmod/2247:
  #0:  (subsys mutex#5){+.+.+.}, at: [<ffffffff81485579>] subsys_interface_register+0x69/0x120
  #1:  (cpufreq_rwsem){.+.+.+}, at: [<ffffffff8156cf73>] __cpufreq_add_dev.isra.21+0x73/0xb80
  #2:  (&policy->rwsem){+.+...}, at: [<ffffffff8156d37d>] __cpufreq_add_dev.isra.21+0x47d/0xb80

 stack backtrace:
 CPU: 0 PID: 2247 Comm: insmod Tainted: G            E  3.17.0-rc4+ #1
 Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 08/24/2013
  0000000000000000 000000008f3063c4 ffff88006f87bb30 ffffffff8171b358
  ffff88006bcf3750 ffff88006f87bb68 ffffffff810e09e1 ffff88006e1b1400
  ffffea0001b86c00 ffffffff8156d327 ffff880073003500 0000000000000246
 Call Trace:
  [<ffffffff8171b358>] dump_stack+0x4d/0x66
  [<ffffffff810e09e1>] debug_check_no_locks_freed+0x171/0x180
  [<ffffffff8156d327>] ? __cpufreq_add_dev.isra.21+0x427/0xb80
  [<ffffffff8121412b>] kfree+0xab/0x2b0
  [<ffffffff8156d327>] __cpufreq_add_dev.isra.21+0x427/0xb80
  [<ffffffff81724cf7>] ? _raw_spin_unlock+0x27/0x40
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffff8156da8e>] cpufreq_add_dev+0xe/0x10
  [<ffffffff814855d1>] subsys_interface_register+0xc1/0x120
  [<ffffffff8156bcf2>] cpufreq_register_driver+0x112/0x340
  [<ffffffff8121415a>] ? kfree+0xda/0x2b0
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffffa003562e>] pcc_cpufreq_init+0x4af/0xe81 [pcc_cpufreq]
  [<ffffffffa003517f>] ? pcc_cpufreq_do_osc+0x17f/0x17f [pcc_cpufreq]
  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
  [<ffffffff811f7472>] ? __vunmap+0xd2/0x120
  [<ffffffff81127155>] load_module+0x1315/0x1b70
  [<ffffffff811222a0>] ? store_uevent+0x70/0x70
  [<ffffffff811229d9>] ? copy_module_from_fd.isra.44+0x129/0x180
  [<ffffffff81127b86>] SyS_finit_module+0xa6/0xd0
  [<ffffffff81725b69>] system_call_fastpath+0x16/0x1b
 cpufreq: __cpufreq_add_dev: ->get() failed
insmod: ERROR: could not insert module pcc-cpufreq.ko: No such device

The warning occurs in the __cpufreq_add_dev() code which does

        down_write(&policy->rwsem);
	...
        if (cpufreq_driver->get && !cpufreq_driver->setpolicy) {
                policy->cur = cpufreq_driver->get(policy->cpu);
                if (!policy->cur) {
                        pr_err("%s: ->get() failed\n", __func__);
                        goto err_get_freq;
                }

If cpufreq_driver->get(policy->cpu) returns an error we execute the
code at err_get_freq, which does not up the policy->rwsem.  This causes
the lockdep warning.

Trivial patch to up the policy->rwsem in the error path.

After the patch has been applied, and an error occurs in the
cpufreq_driver->get(policy->cpu) call we will now see

cpufreq: __cpufreq_add_dev: ->get() failed
cpufreq: __cpufreq_add_dev: ->get() failed
modprobe: ERROR: could not insert 'pcc_cpufreq': No such device

Fixes: 4e97b631f2 (cpufreq: Initialize governor for a new policy under policy->rwsem)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-22 14:32:43 +02:00
Lan Tianyu 8e30444e15 cpufreq: fix cpufreq suspend/resume for intel_pstate
Cpufreq core introduces cpufreq_suspended flag to let cpufreq sysfs nodes
across S2RAM/S2DISK. But the flag is only set in the cpufreq_suspend()
for cpufreq drivers which have target or target_index callback. This
skips intel_pstate driver. This patch is to set the flag before checking
target or target_index callback.

Fixes: 2f0aea9363 (cpufreq: suspend governors on system suspend/hibernate)
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
[rjw: Subject]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-09-22 14:23:18 +02:00