/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * linux/include/linux/cpufreq.h
 *
 * Copyright (C) 2001 Russell King
 *           (C) 2002 - 2003 Dominik Brodowski <linux@brodo.de>
 */
#ifndef _LINUX_CPUFREQ_H
#define _LINUX_CPUFREQ_H

#include <linux/clk.h>
#include <linux/cpumask.h>
#include <linux/completion.h>
#include <linux/kobject.h>
#include <linux/notifier.h>
cpufreq: Use per-policy frequency QoS
Replace the CPU device PM QoS used for the management of min and max
frequency constraints in cpufreq (and its users) with per-policy
frequency QoS to avoid problems with cpufreq policies covering
more than one CPU.
Namely, a cpufreq driver is registered with the subsys interface
which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
currently the PM QoS notifiers are added to the first CPU in the
policy (i.e. CPU0 in the majority of cases).
In turn, when the cpufreq driver is unregistered, the subsys interface
doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
called for the last CPU in the policy, say CPUx, which as a rule is
not CPU0 if the policy covers more than one CPU. Then, the PM QoS
notifiers cannot be removed, because CPUx does not have them, and
they are still there in the device PM QoS notifiers list of CPU0,
which prevents new PM QoS notifiers from being registered for CPU0
on the next attempt to register the cpufreq driver.
The same issue occurs when the first CPU in the policy goes offline
before unregistering the driver.
After this change it does not matter which CPU is the policy CPU at
the driver registration time and whether or not it is online all the
time, because the frequency QoS is per policy and not per CPU.
Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Diagnosed-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2019-10-16 18:47:06 +08:00
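The idea of keeping frequency constraints per policy rather than per CPU can be modelled in plain userspace C. The sketch below is only an illustration under stated assumptions: struct policy_qos and the qos_* helpers are hypothetical names, not the kernel's freq_qos_* API, and the aggregation rule used here (largest minimum request wins, smallest maximum request wins) is assumed for the model.

```c
#include <assert.h>
#include <stddef.h>

#define QOS_MAX_REQS 8	/* arbitrary capacity for this sketch */

/* Hypothetical per-policy request store (not the kernel's freq_constraints). */
struct policy_qos {
	unsigned int min_reqs[QOS_MAX_REQS];	/* lower bounds, in kHz */
	unsigned int max_reqs[QOS_MAX_REQS];	/* upper bounds, in kHz */
	size_t nr_min;
	size_t nr_max;
};

/* Record a lower-bound request; returns 0 on success, -1 if the store is full. */
int qos_add_min(struct policy_qos *q, unsigned int khz)
{
	assert(q != NULL);
	if (q->nr_min == QOS_MAX_REQS)
		return -1;
	q->min_reqs[q->nr_min++] = khz;
	return 0;
}

/* Record an upper-bound request; returns 0 on success, -1 if the store is full. */
int qos_add_max(struct policy_qos *q, unsigned int khz)
{
	assert(q != NULL);
	if (q->nr_max == QOS_MAX_REQS)
		return -1;
	q->max_reqs[q->nr_max++] = khz;
	return 0;
}

/* Effective lower bound: the largest minimum request, or def if there is none. */
unsigned int qos_effective_min(const struct policy_qos *q, unsigned int def)
{
	unsigned int val = def;
	size_t i;

	for (i = 0; i < q->nr_min; i++)
		if (i == 0 || q->min_reqs[i] > val)
			val = q->min_reqs[i];
	return val;
}

/* Effective upper bound: the smallest maximum request, or def if there is none. */
unsigned int qos_effective_max(const struct policy_qos *q, unsigned int def)
{
	unsigned int val = def;
	size_t i;

	for (i = 0; i < q->nr_max; i++)
		if (i == 0 || q->max_reqs[i] < val)
			val = q->max_reqs[i];
	return val;
}
```

Because the store belongs to the policy and not to any one CPU, it does not matter in this model which CPU added a request or whether that CPU is still online when the limits are evaluated.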
#include <linux/pm_qos.h>
cpufreq: Make sure frequency transitions are serialized
Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
should strictly alternate, thereby preventing two different sets of PRECHANGE or
POSTCHANGE notifiers from interleaving arbitrarily.
The following examples illustrate why this is important:
Scenario 1:
-----------
A thread reading the value of cpuinfo_cur_freq, will call
__cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
The ondemand governor can decide to change the frequency of the CPU at the same
time and hence it can end up sending the notifications via ->target().
If the notifiers are not serialized, the following sequence can occur:
- PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
- PRECHANGE Notification for freq B (from target())
- Freq changed by target() to B
- POSTCHANGE Notification for freq B
- POSTCHANGE Notification for freq A
We can see from the above that the last POSTCHANGE Notification happens for freq
A but the hardware is set to run at freq B.
Where would we break then?: adjust_jiffies() in cpufreq.c & cpufreq_callback()
in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the
loops_per_jiffy calculations will get messed up.
Scenario 2:
-----------
The governor calls __cpufreq_driver_target() to change the frequency. At the
same time, if we change scaling_{min|max}_freq from sysfs, it will end up
calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
__cpufreq_driver_target(). And hence we end up issuing concurrent calls to
->target().
Typically, platforms have the following logic in their ->target() routines:
(Eg: cpufreq-cpu0, omap, exynos, etc)
A. If new freq is more than old: Increase voltage
B. Change freq
C. If new freq is less than old: decrease voltage
Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
increase the freq and Y is trying to decrease it, we get the following race
condition:
X.A: voltage gets increased for larger freq
Y.A: nothing happens
Y.B: freq gets decreased
Y.C: voltage gets decreased
X.B: freq gets increased
X.C: nothing happens
Thus we can end up setting a freq which is not supported by the voltage we have
set. That will probably make the clock to the CPU unstable and the system might
not work properly anymore.
This patch introduces a set of synchronization primitives to serialize frequency
transitions, which are to be used as shown below:
cpufreq_freq_transition_begin();
//Perform the frequency change
cpufreq_freq_transition_end();
The _begin() call sends the PRECHANGE notification whereas the _end() call sends
the POSTCHANGE notification. Also, all the necessary synchronization is handled
within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION
flag can also use these APIs for performing frequency transitions (ie., you can
call _begin() from one task, and call the corresponding _end() from a different
task).
The actual synchronization underneath is not that complicated:
The key challenge is to allow drivers to begin the transition from one thread
and end it in a completely different thread (this is to enable drivers that do
asynchronous POSTCHANGE notification from bottom-halves, to also use the same
interface).
To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
wait-queue are added per-policy. The flag and the wait-queue are used in
conjunction to create an "uninterrupted flow" from _begin() to _end(). The
spinlock is used to ensure that only one such "flow" is in flight at any given
time. Put together, this provides us all the necessary synchronization.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-24 16:05:44 +08:00
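The "one flow in flight at a time" rule described above can be sketched in userspace C. This is a simplified model, not the kernel implementation: struct policy_model and the transition_* helpers are hypothetical, the real functions take different arguments, and the sketch busy-waits on a C11 atomic flag where the kernel uses a spinlock together with a wait queue. As in the commit message, the flow may be started and finished from different threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-policy synchronization state. */
struct policy_model {
	atomic_bool transition_ongoing;	/* true between begin and end */
	unsigned int cur;		/* current frequency, in kHz */
	unsigned int nr_prechange;	/* PRECHANGE notifications sent */
	unsigned int nr_postchange;	/* POSTCHANGE notifications sent */
};

/* Claim the transition if none is in flight; returns false otherwise. */
bool transition_try_begin(struct policy_model *p)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&p->transition_ongoing,
					    &expected, true))
		return false;
	p->nr_prechange++;	/* stands in for the PRECHANGE notification */
	return true;
}

/* Wait (by spinning, in this sketch) until the transition can be claimed. */
void transition_begin(struct policy_model *p)
{
	while (!transition_try_begin(p))
		;
}

/* Finish the flow: record the new frequency, notify, release the claim. */
void transition_end(struct policy_model *p, unsigned int new_khz)
{
	assert(atomic_load(&p->transition_ongoing));
	p->cur = new_khz;
	p->nr_postchange++;	/* stands in for the POSTCHANGE notification */
	atomic_store(&p->transition_ongoing, false);
}
```

In this model a second caller cannot send its PRECHANGE until the first caller's POSTCHANGE has been sent, so the two kinds of notification strictly alternate.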
#include <linux/spinlock.h>
#include <linux/sysfs.h>

/*********************************************************************
 *                         CPUFREQ INTERFACE                         *
 *********************************************************************/
/*
 * Frequency values here are CPU kHz
 *
 * Maximum transition latency is in nanoseconds - if it's unknown,
 * CPUFREQ_ETERNAL shall be used.
 */

#define CPUFREQ_ETERNAL			(-1)
#define CPUFREQ_NAME_LEN		16
/* Print length for names. Extra 1 space for accommodating '\n' in prints */
#define CPUFREQ_NAME_PLEN		(CPUFREQ_NAME_LEN + 1)

struct cpufreq_governor;

enum cpufreq_table_sorting {
	CPUFREQ_TABLE_UNSORTED,
	CPUFREQ_TABLE_SORTED_ASCENDING,
	CPUFREQ_TABLE_SORTED_DESCENDING
};

struct cpufreq_cpuinfo {
	unsigned int		max_freq;
	unsigned int		min_freq;

	/* in 10^(-9) s = nanoseconds */
	unsigned int		transition_latency;
};

struct cpufreq_policy {
	/* CPUs sharing clock, require sw coordination */
	cpumask_var_t		cpus;	/* Online CPUs only */
	cpumask_var_t		related_cpus; /* Online + Offline CPUs */
cpufreq: Avoid attempts to create duplicate symbolic links
After commit 87549141d516 (cpufreq: Stop migrating sysfs files on
hotplug) there is a problem with CPUs that share cpufreq policy
objects with other CPUs and are initially offline.
Say CPU1 shares a policy with CPU0 which is online and is registered
first. As part of the registration process, cpufreq_add_dev() is
called for it. It creates the policy object and a symbolic link
to it from the CPU1's sysfs directory. If CPU1 is registered
subsequently and it is offline at that time, cpufreq_add_dev() will
attempt to create a symbolic link to the policy object for it, but
that link is present already, so a warning about that will be
triggered.
To avoid that warning, make cpufreq use an additional CPU mask
containing related CPUs that are actually present for each policy
object. That mask is initialized when the policy object is populated
after its creation (for the first online CPU using it) and it includes
CPUs from the "policy CPUs" mask returned by the cpufreq driver's
->init() callback that are physically present at that time. Symbolic
links to the policy are created only for the CPUs in that mask.
If cpufreq_add_dev() is invoked for an offline CPU, it checks the
new mask and only creates the symlink if the CPU was not in it (the
CPU is added to the mask at the same time).
In turn, cpufreq_remove_dev() drops the given CPU from the new mask,
removes its symlink to the policy object and returns, unless it is
the CPU owning the policy object. In that case, the policy object
is moved to a new CPU's sysfs directory or deleted if the CPU being
removed was the last user of the policy.
While at it, notice that cpufreq_remove_dev() can't fail, because
its return value is ignored, so make it ignore return values from
__cpufreq_remove_dev_prepare() and __cpufreq_remove_dev_finish()
and prevent these functions from aborting on errors returned by
__cpufreq_governor(). Also drop the now unused sif argument from
them.
Fixes: 87549141d516 (cpufreq: Stop migrating sysfs files on hotplug)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Russell King <linux@arm.linux.org.uk>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
2015-07-26 08:07:47 +08:00
	cpumask_var_t		real_cpus; /* Related and present */

	unsigned int		shared_type; /* ACPI: ANY or ALL affected CPUs
						should set cpufreq */
	unsigned int		cpu;    /* cpu managing this policy, must be online */

	struct clk		*clk;
	struct cpufreq_cpuinfo	cpuinfo;/* see above */

	unsigned int		min;    /* in kHz */
	unsigned int		max;    /* in kHz */
	unsigned int		cur;    /* in kHz, only needed if cpufreq
					 * governors are used */
	unsigned int		restore_freq; /* = policy->cur before transition */
	unsigned int		suspend_freq; /* freq to set during suspend */

	unsigned int		policy; /* see above */
	unsigned int		last_policy; /* policy before unplug */
	struct cpufreq_governor	*governor; /* see below */
	void			*governor_data;
	char			last_governor[CPUFREQ_NAME_LEN]; /* last governor used */

	struct work_struct	update; /* if update_policy() needs to be
					 * called, but you're in IRQ context */

	struct freq_constraints	constraints;
	struct freq_qos_request	*min_freq_req;
	struct freq_qos_request	*max_freq_req;

	struct cpufreq_frequency_table	*freq_table;
	enum cpufreq_table_sorting freq_table_sorted;

	struct list_head	policy_list;
	struct kobject		kobj;
	struct completion	kobj_unregister;

	/*
	 * The rules for this semaphore:
	 * - Any routine that wants to read from the policy structure will
	 *   do a down_read on this semaphore.
	 * - Any routine that will write to the policy structure and/or may take away
	 *   the policy altogether (eg. CPU hotplug), will hold this lock in write
	 *   mode before doing so.
	 */
	struct rw_semaphore	rwsem;

	/*
	 * Fast switch flags:
	 * - fast_switch_possible should be set by the driver if it can
	 *   guarantee that frequency can be changed on any CPU sharing the
	 *   policy and that the change will affect all of the policy CPUs then.
	 * - fast_switch_enabled is to be set by governors that support fast
	 *   frequency switching with the help of cpufreq_enable_fast_switch().
	 */
	bool			fast_switch_possible;
	bool			fast_switch_enabled;

	/*
	 * Preferred average time interval between consecutive invocations of
	 * the driver to set the frequency for this policy. To be set by the
	 * scaling driver (0, which is the default, means no preference).
	 */
	unsigned int		transition_delay_us;

	/*
	 * Remote DVFS flag (Not added to the driver structure as we don't want
	 * to access another structure from scheduler hotpath).
	 *
	 * Should be set if CPUs can do DVFS on behalf of other CPUs from
	 * different cpufreq policies.
	 */
	bool			dvfs_possible_from_any_cpu;

	/* Cached frequency lookup from cpufreq_driver_resolve_freq. */
	unsigned int		cached_target_freq;
	int			cached_resolved_idx;

	/* Synchronization for frequency transitions */
	bool			transition_ongoing; /* Tracks transition status */
	spinlock_t		transition_lock;
	wait_queue_head_t	transition_wait;
cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end
Some cpufreq drivers were redundantly invoking the _begin() and _end()
APIs around frequency transitions, and this double invocation (one from
the cpufreq core and the other from the cpufreq driver) used to result
in a self-deadlock, leading to system hangs during boot. (The _begin()
API makes contending callers wait until the previous invocation is
complete. Hence, the cpufreq driver would end up waiting on itself!).
Now all such drivers have been fixed, but debugging this issue was not
very straight-forward (even lockdep didn't catch this). So let us add a
debug infrastructure to the cpufreq core to catch such issues more easily
in the future.
We add a new field called 'transition_task' to the policy structure, to keep
track of the task which is performing the frequency transition. Using this
field, we make note of this task during _begin() and print a warning if we
find a case where the same task is calling _begin() again, before completing
the previous frequency transition using the corresponding _end().
We have left out ASYNC_NOTIFICATION drivers from this debug infrastructure
for 2 reasons:
1. At the moment, we have no way to avoid a particular scenario where this
debug infrastructure can emit false-positive warnings for such drivers.
The scenario is depicted below:
    Task A                              Task B

    /* 1st freq transition */
    Invoke _begin() {
            ...
            ...
    }

    Change the frequency

    /* 2nd freq transition */
    Invoke _begin() {
            ... //waiting for B to
            ... //finish _end() for
            ... //the 1st transition
            ...       |                 Got interrupt for successful
            ...       |                 change of frequency (1st one).
            ...       |
            ...       |                 /* 1st freq transition */
            ...       |                 Invoke _end() {
            ...       |                         ...
            ...       V                 }
            ...
            ...
    }
This scenario is actually deadlock-free because, once Task A changes the
frequency, it is Task B's responsibility to invoke the corresponding
_end() for the 1st frequency transition. Hence it is perfectly legal for
Task A to go ahead and attempt another frequency transition in the meantime.
(Of course it won't be able to proceed until Task B finishes the 1st _end(),
but this doesn't cause a deadlock or a hang).
The debug infrastructure cannot handle this scenario and will treat it as
a deadlock and print a warning. To avoid this, we exclude such drivers
from the purview of this code.
2. Luckily, we don't _need_ this infrastructure for ASYNC_NOTIFICATION drivers
at all! The cpufreq core does not automatically invoke the _begin() and
_end() APIs during frequency transitions in such drivers. Thus, the driver
alone is responsible for invoking _begin()/_end() and hence there shouldn't
be any conflicts which lead to double invocations. So, we can skip these
drivers, since the probability that such drivers will hit this problem is
extremely low, as outlined above.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-05 15:22:39 +08:00
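The double-invocation check described above can be modelled with plain task ids. The sketch below is a single-threaded illustration under stated assumptions: struct transition_debug, enum begin_result and the debug_* helpers are hypothetical, tasks are represented by integers, and a caller that would have to wait gets a return code instead of actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_TASK (-1)	/* hypothetical marker: no transition owner */

/* Hypothetical model of the per-policy debug state. */
struct transition_debug {
	bool ongoing;			/* a transition is in flight */
	int transition_task;		/* id of the task that called begin */
	bool async_notification;	/* driver finishes from another task */
};

enum begin_result {
	BEGIN_OK,		/* transition started */
	BEGIN_WOULD_WAIT,	/* another flow is in flight; caller would wait */
	BEGIN_SELF_DEADLOCK	/* same task called begin twice: warn */
};

/* Start a transition on behalf of task, flagging a repeated begin. */
enum begin_result debug_begin(struct transition_debug *d, int task)
{
	if (d->ongoing) {
		/* Async drivers are excluded: their end comes from another task. */
		if (!d->async_notification && d->transition_task == task)
			return BEGIN_SELF_DEADLOCK;
		return BEGIN_WOULD_WAIT;
	}
	d->ongoing = true;
	d->transition_task = task;
	return BEGIN_OK;
}

/* Finish the transition and forget its owner. */
void debug_end(struct transition_debug *d)
{
	assert(d->ongoing);
	d->ongoing = false;
	d->transition_task = NO_TASK;
}
```

With async_notification set, a repeated begin from the same task is reported as an ordinary wait, which mirrors the reasoning for leaving ASYNC_NOTIFICATION drivers out of the check.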
	struct task_struct	*transition_task; /* Task which is doing the transition */

	/* cpufreq-stats */
	struct cpufreq_stats	*stats;

	/* For cpufreq driver's internal use */
	void			*driver_data;

	/* Pointer to the cooling device if used for thermal mitigation */
	struct thermal_cooling_device *cdev;

	struct notifier_block nb_min;
	struct notifier_block nb_max;
};

struct cpufreq_freqs {
	struct cpufreq_policy *policy;
	unsigned int old;
	unsigned int new;
	u8 flags;		/* flags of cpufreq_driver, see below. */
};

/* Only for ACPI */
#define CPUFREQ_SHARED_TYPE_NONE (0) /* None */
#define CPUFREQ_SHARED_TYPE_HW	 (1) /* HW does needed coordination */
#define CPUFREQ_SHARED_TYPE_ALL	 (2) /* All dependent CPUs should set freq */
#define CPUFREQ_SHARED_TYPE_ANY	 (3) /* Freq can be set from any dependent CPU*/

#ifdef CONFIG_CPU_FREQ
struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu);
struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu);
void cpufreq_cpu_put(struct cpufreq_policy *policy);
#else
static inline struct cpufreq_policy *cpufreq_cpu_get_raw(unsigned int cpu)
{
	return NULL;
}
static inline struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu)
{
	return NULL;
}
static inline void cpufreq_cpu_put(struct cpufreq_policy *policy) { }
#endif

static inline bool policy_is_inactive(struct cpufreq_policy *policy)
{
	return cpumask_empty(policy->cpus);
}

static inline bool policy_is_shared(struct cpufreq_policy *policy)
{
	return cpumask_weight(policy->cpus) > 1;
}

/* /sys/devices/system/cpu/cpufreq: entry point for global variables */
extern struct kobject *cpufreq_global_kobject;

#ifdef CONFIG_CPU_FREQ
unsigned int cpufreq_get(unsigned int cpu);
unsigned int cpufreq_quick_get(unsigned int cpu);
unsigned int cpufreq_quick_get_max(unsigned int cpu);
void disable_cpufreq(void);

u64 get_cpu_idle_time(unsigned int cpu, u64 *wall, int io_busy);

struct cpufreq_policy *cpufreq_cpu_acquire(unsigned int cpu);
void cpufreq_cpu_release(struct cpufreq_policy *policy);
int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
int cpufreq_set_policy(struct cpufreq_policy *policy,
		       struct cpufreq_policy *new_policy);
void refresh_frequency_limits(struct cpufreq_policy *policy);
void cpufreq_update_policy(unsigned int cpu);
void cpufreq_update_limits(unsigned int cpu);
bool have_governor_per_policy(void);
struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
void cpufreq_disable_fast_switch(struct cpufreq_policy *policy);
#else
static inline unsigned int cpufreq_get(unsigned int cpu)
{
	return 0;
}
static inline unsigned int cpufreq_quick_get(unsigned int cpu)
{
	return 0;
}
static inline unsigned int cpufreq_quick_get_max(unsigned int cpu)
{
	return 0;
}
static inline void disable_cpufreq(void) { }
#endif

#ifdef CONFIG_CPU_FREQ_STAT
|
|
|
|
void cpufreq_stats_create_table(struct cpufreq_policy *policy);
|
|
|
|
void cpufreq_stats_free_table(struct cpufreq_policy *policy);
|
|
|
|
void cpufreq_stats_record_transition(struct cpufreq_policy *policy,
|
|
|
|
unsigned int new_freq);
|
|
|
|
#else
|
|
|
|
static inline void cpufreq_stats_create_table(struct cpufreq_policy *policy) { }
|
|
|
|
static inline void cpufreq_stats_free_table(struct cpufreq_policy *policy) { }
|
|
|
|
static inline void cpufreq_stats_record_transition(struct cpufreq_policy *policy,
|
|
|
|
unsigned int new_freq) { }
|
|
|
|
#endif /* CONFIG_CPU_FREQ_STAT */
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*********************************************************************
|
2013-08-07 01:23:04 +08:00
|
|
|
* CPUFREQ DRIVER INTERFACE *
|
2005-04-17 06:20:36 +08:00
|
|
|
*********************************************************************/
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define CPUFREQ_RELATION_L 0 /* lowest frequency at or above target */
|
|
|
|
#define CPUFREQ_RELATION_H 1 /* highest frequency below or at target */
|
2014-07-01 00:59:33 +08:00
|
|
|
#define CPUFREQ_RELATION_C 2 /* closest frequency to target */
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
struct freq_attr {
|
|
|
|
struct attribute attr;
|
|
|
|
ssize_t (*show)(struct cpufreq_policy *, char *);
|
|
|
|
ssize_t (*store)(struct cpufreq_policy *, const char *, size_t count);
|
2005-04-17 06:20:36 +08:00
|
|
|
};
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define cpufreq_freq_attr_ro(_name) \
|
|
|
|
static struct freq_attr _name = \
|
|
|
|
__ATTR(_name, 0444, show_##_name, NULL)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define cpufreq_freq_attr_ro_perm(_name, _perm) \
|
|
|
|
static struct freq_attr _name = \
|
|
|
|
__ATTR(_name, _perm, show_##_name, NULL)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define cpufreq_freq_attr_rw(_name) \
|
|
|
|
static struct freq_attr _name = \
|
|
|
|
__ATTR(_name, 0644, show_##_name, store_##_name)
|
|
|
|
|
2016-11-08 02:02:23 +08:00
|
|
|
#define cpufreq_freq_attr_wo(_name) \
|
|
|
|
static struct freq_attr _name = \
|
|
|
|
__ATTR(_name, 0200, NULL, store_##_name)
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define define_one_global_ro(_name) \
|
2019-01-25 15:23:07 +08:00
|
|
|
static struct kobj_attribute _name = \
|
2013-08-07 01:23:04 +08:00
|
|
|
__ATTR(_name, 0444, show_##_name, NULL)
|
|
|
|
|
|
|
|
#define define_one_global_rw(_name) \
|
2019-01-25 15:23:07 +08:00
|
|
|
static struct kobj_attribute _name = \
|
2013-08-07 01:23:04 +08:00
|
|
|
__ATTR(_name, 0644, show_##_name, store_##_name)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
|
|
|
|
struct cpufreq_driver {
|
2014-11-27 08:37:49 +08:00
|
|
|
char name[CPUFREQ_NAME_LEN];
|
|
|
|
u8 flags;
|
|
|
|
void *driver_data;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* needed by all drivers */
|
2014-11-27 08:37:49 +08:00
|
|
|
int (*init)(struct cpufreq_policy *policy);
|
|
|
|
int (*verify)(struct cpufreq_policy *policy);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* define one out of two */
|
2014-11-27 08:37:49 +08:00
|
|
|
int (*setpolicy)(struct cpufreq_policy *policy);
|
2014-06-03 01:19:28 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* On failure, should always restore frequency to policy->restore_freq
|
|
|
|
* (i.e. old freq).
|
|
|
|
*/
|
2014-11-27 08:37:49 +08:00
|
|
|
int (*target)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq,
|
|
|
|
unsigned int relation); /* Deprecated */
|
|
|
|
int (*target_index)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int index);
|
2016-03-30 09:47:49 +08:00
|
|
|
unsigned int (*fast_switch)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq);
|
2016-07-14 04:25:25 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Caches and returns the lowest driver-supported frequency greater than
|
|
|
|
* or equal to the target frequency, subject to any driver limitations.
|
|
|
|
* Does not set the frequency. Only to be implemented for drivers with
|
|
|
|
* target().
|
|
|
|
*/
|
|
|
|
unsigned int (*resolve_freq)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq);
|
|
|
|
|
2014-06-03 01:19:28 +08:00
|
|
|
/*
|
|
|
|
* Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION
|
|
|
|
* unset.
|
|
|
|
*
|
|
|
|
* get_intermediate should return a stable intermediate frequency
|
|
|
|
* platform wants to switch to and target_intermediate() should set CPU
|
|
|
|
* to that frequency, before jumping to the frequency corresponding
|
|
|
|
* to 'index'. Core will take care of sending notifications and driver
|
|
|
|
* doesn't have to handle them in target_intermediate() or
|
|
|
|
* target_index().
|
|
|
|
*
|
|
|
|
* Drivers can return '0' from get_intermediate() in case they don't
|
|
|
|
* wish to switch to intermediate frequency for some target frequency.
|
|
|
|
* In that case core will directly call ->target_index().
|
|
|
|
*/
|
2014-11-27 08:37:49 +08:00
|
|
|
unsigned int (*get_intermediate)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int index);
|
|
|
|
int (*target_intermediate)(struct cpufreq_policy *policy,
|
|
|
|
unsigned int index);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
/* should be defined, if possible */
|
2014-11-27 08:37:49 +08:00
|
|
|
unsigned int (*get)(unsigned int cpu);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-03-26 19:15:13 +08:00
|
|
|
/* Called to update policy limits on firmware notifications. */
|
|
|
|
void (*update_limits)(unsigned int cpu);
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/* optional */
|
2014-11-27 08:37:49 +08:00
|
|
|
int (*bios_limit)(int cpu, unsigned int *limit);
|
2008-08-05 02:59:07 +08:00
|
|
|
|
2019-02-12 19:06:04 +08:00
|
|
|
int (*online)(struct cpufreq_policy *policy);
|
|
|
|
int (*offline)(struct cpufreq_policy *policy);
|
2014-11-27 08:37:49 +08:00
|
|
|
int (*exit)(struct cpufreq_policy *policy);
|
|
|
|
void (*stop_cpu)(struct cpufreq_policy *policy);
|
|
|
|
int (*suspend)(struct cpufreq_policy *policy);
|
|
|
|
int (*resume)(struct cpufreq_policy *policy);
|
2014-11-27 08:37:51 +08:00
|
|
|
|
|
|
|
/* Will be called after the driver is fully initialized */
|
|
|
|
void (*ready)(struct cpufreq_policy *policy);
|
|
|
|
|
2014-11-27 08:37:49 +08:00
|
|
|
struct freq_attr **attr;
|
2013-12-20 22:24:49 +08:00
|
|
|
|
|
|
|
/* platform specific boost support code */
|
2014-11-27 08:37:49 +08:00
|
|
|
bool boost_enabled;
|
|
|
|
int (*set_boost)(int state);
|
2005-04-17 06:20:36 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
/* flags */
|
2019-01-21 16:47:37 +08:00
|
|
|
|
|
|
|
/* driver isn't removed even if all ->init() calls failed */
|
|
|
|
#define CPUFREQ_STICKY BIT(0)
|
|
|
|
|
|
|
|
/* loops_per_jiffy or other kernel "constants" aren't affected by frequency transitions */
|
|
|
|
#define CPUFREQ_CONST_LOOPS BIT(1)
|
|
|
|
|
|
|
|
/* don't warn on suspend/resume speed mismatches */
|
|
|
|
#define CPUFREQ_PM_NO_WARN BIT(2)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-10-02 16:43:18 +08:00
|
|
|
/*
|
|
|
|
* This should be set by platforms having multiple clock-domains, i.e.
|
|
|
|
* supporting multiple policies. With this flag set, the governor's sysfs
|
|
|
|
* directories are created under cpu/cpu<num>/cpufreq/, so each cluster can
|
|
|
|
* use the same governor with different tunables.
|
|
|
|
*/
|
2019-01-21 16:47:37 +08:00
|
|
|
#define CPUFREQ_HAVE_GOVERNOR_PER_POLICY BIT(3)
|
2013-10-02 16:43:18 +08:00
|
|
|
|
2013-10-29 21:26:06 +08:00
|
|
|
/*
|
|
|
|
* Set by drivers that send POSTCHANGE notifications from outside of their
|
|
|
|
* ->target() routine, so that the core can handle those notifications
|
|
|
|
* specially.
|
|
|
|
*/
|
2019-01-21 16:47:37 +08:00
|
|
|
#define CPUFREQ_ASYNC_NOTIFICATION BIT(4)
|
2013-10-29 21:26:06 +08:00
|
|
|
|
2013-12-03 13:50:45 +08:00
|
|
|
/*
|
|
|
|
* Set by drivers which want cpufreq core to check if CPU is running at a
|
|
|
|
* frequency present in freq-table exposed by the driver. For these drivers if
|
|
|
|
* CPU is found running at an out of table freq, we will try to set it to a freq
|
|
|
|
* from the table. And if that fails, we will stop further boot process by
|
|
|
|
* issuing a BUG_ON().
|
|
|
|
*/
|
2019-01-21 16:47:37 +08:00
|
|
|
#define CPUFREQ_NEED_INITIAL_FREQ_CHECK BIT(5)
|
2013-12-03 13:50:45 +08:00
|
|
|
|
2017-07-19 18:12:48 +08:00
|
|
|
/*
|
|
|
|
* Set by drivers to disallow use of governors with "dynamic_switching" flag
|
|
|
|
* set.
|
|
|
|
*/
|
2019-01-21 16:47:37 +08:00
|
|
|
#define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING BIT(6)
|
2017-07-19 18:12:48 +08:00
|
|
|
|
2019-01-30 13:22:01 +08:00
|
|
|
/*
|
|
|
|
* Set by drivers that want the core to automatically register the cpufreq
|
|
|
|
* driver as a thermal cooling device.
|
|
|
|
*/
|
|
|
|
#define CPUFREQ_IS_COOLING_DEV BIT(7)
|
|
|
|
|
2007-02-27 06:55:48 +08:00
|
|
|
int cpufreq_register_driver(struct cpufreq_driver *driver_data);
|
|
|
|
int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
const char *cpufreq_get_current_driver(void);
|
2014-10-19 17:30:27 +08:00
|
|
|
void *cpufreq_get_driver_data(void);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-06-25 19:32:41 +08:00
|
|
|
static inline int cpufreq_thermal_control_enabled(struct cpufreq_driver *drv)
|
|
|
|
{
|
|
|
|
return IS_ENABLED(CONFIG_CPU_THERMAL) &&
|
|
|
|
(drv->flags & CPUFREQ_IS_COOLING_DEV);
|
|
|
|
}
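The flag test in cpufreq_thermal_control_enabled() is a plain bit mask against the driver's flags field. The following userspace sketch shows the same pattern. DEMO_BIT() and the DEMO_* names are hypothetical stand-ins for the kernel's BIT() macro and the CPUFREQ_* flags, not kernel API, and uint8_t stands in for the u8 type of cpufreq_driver.flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for BIT() and three of the CPUFREQ_* driver flags. */
#define DEMO_BIT(n)			(1u << (n))
#define DEMO_STICKY			DEMO_BIT(0)
#define DEMO_HAVE_GOVERNOR_PER_POLICY	DEMO_BIT(3)
#define DEMO_IS_COOLING_DEV		DEMO_BIT(7)

/* Return true when every bit of 'flag' is not masked out of 'flags'. */
static bool demo_flag_set(uint8_t flags, unsigned int flag)
{
	return (flags & flag) != 0;
}
```

Because the flags above run from bit 0 through bit 7, they occupy exactly the eight bits of the u8 field declared in struct cpufreq_driver.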
|
|
|
|
|
2013-06-19 16:49:33 +08:00
|
|
|
static inline void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
|
|
|
|
unsigned int min, unsigned int max)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
|
|
|
if (policy->min < min)
|
|
|
|
policy->min = min;
|
|
|
|
if (policy->max < min)
|
|
|
|
policy->max = min;
|
|
|
|
if (policy->min > max)
|
|
|
|
policy->min = max;
|
|
|
|
if (policy->max > max)
|
|
|
|
policy->max = max;
|
|
|
|
if (policy->min > policy->max)
|
|
|
|
policy->min = policy->max;
|
|
|
|
return;
|
|
|
|
}
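The clamping order in cpufreq_verify_within_limits() can be tried outside the kernel. This is a minimal sketch, assuming a hypothetical struct demo_policy that carries only the min and max fields; the comparison sequence is copied from the helper above, everything else is illustrative.

```c
/* Hypothetical stand-in for the two struct cpufreq_policy fields used here. */
struct demo_policy {
	unsigned int min;	/* kHz */
	unsigned int max;	/* kHz */
};

/* Same comparison sequence as cpufreq_verify_within_limits(). */
static void demo_verify_within_limits(struct demo_policy *policy,
				      unsigned int min, unsigned int max)
{
	if (policy->min < min)
		policy->min = min;
	if (policy->max < min)
		policy->max = min;
	if (policy->min > max)
		policy->min = max;
	if (policy->max > max)
		policy->max = max;
	/* A request with min above max collapses to a single frequency. */
	if (policy->min > policy->max)
		policy->min = policy->max;
}
```

The last check matters for inverted requests: with limits 800000..2400000, a policy asking for min 3000000 and max 1000000 ends up with both fields at 1000000.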
|
|
|
|
|
2013-10-02 16:43:19 +08:00
|
|
|
static inline void
|
|
|
|
cpufreq_verify_within_cpu_limits(struct cpufreq_policy *policy)
|
|
|
|
{
|
|
|
|
cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
|
|
|
|
policy->cpuinfo.max_freq);
|
|
|
|
}
|
|
|
|
|
2014-03-04 11:00:26 +08:00
|
|
|
#ifdef CONFIG_CPU_FREQ
|
|
|
|
void cpufreq_suspend(void);
|
|
|
|
void cpufreq_resume(void);
|
2014-03-04 11:00:27 +08:00
|
|
|
int cpufreq_generic_suspend(struct cpufreq_policy *policy);
|
2014-03-04 11:00:26 +08:00
|
|
|
#else
|
|
|
|
static inline void cpufreq_suspend(void) {}
|
|
|
|
static inline void cpufreq_resume(void) {}
|
|
|
|
#endif
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
/*********************************************************************
|
|
|
|
* CPUFREQ NOTIFIER INTERFACE *
|
|
|
|
*********************************************************************/
|
2010-04-01 03:56:46 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#define CPUFREQ_TRANSITION_NOTIFIER (0)
|
|
|
|
#define CPUFREQ_POLICY_NOTIFIER (1)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
/* Transition notifiers */
|
|
|
|
#define CPUFREQ_PRECHANGE (0)
|
|
|
|
#define CPUFREQ_POSTCHANGE (1)
|
2010-04-01 03:56:46 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
/* Policy Notifiers */
|
2019-07-23 14:14:09 +08:00
|
|
|
#define CPUFREQ_CREATE_POLICY (0)
|
|
|
|
#define CPUFREQ_REMOVE_POLICY (1)
|
2010-04-01 03:56:46 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#ifdef CONFIG_CPU_FREQ
|
|
|
|
int cpufreq_register_notifier(struct notifier_block *nb, unsigned int list);
|
|
|
|
int cpufreq_unregister_notifier(struct notifier_block *nb, unsigned int list);
|
2010-04-01 03:56:46 +08:00
|
|
|
|
cpufreq: Make sure frequency transitions are serialized
Whenever we change the frequency of a CPU, we call the PRECHANGE and POSTCHANGE
notifiers. They must be serialized, i.e. PRECHANGE and POSTCHANGE notifiers
should strictly alternate, thereby preventing two different sets of PRECHANGE or
POSTCHANGE notifiers from interleaving arbitrarily.
The following examples illustrate why this is important:
Scenario 1:
-----------
A thread reading the value of cpuinfo_cur_freq, will call
__cpufreq_cpu_get()->cpufreq_out_of_sync()->cpufreq_notify_transition()
The ondemand governor can decide to change the frequency of the CPU at the same
time and hence it can end up sending the notifications via ->target().
If the notifiers are not serialized, the following sequence can occur:
- PRECHANGE Notification for freq A (from cpuinfo_cur_freq)
- PRECHANGE Notification for freq B (from target())
- Freq changed by target() to B
- POSTCHANGE Notification for freq B
- POSTCHANGE Notification for freq A
We can see from the above that the last POSTCHANGE Notification happens for freq
A but the hardware is set to run at freq B.
Where would we break then?: adjust_jiffies() in cpufreq.c & cpufreq_callback()
in arch/arm/kernel/smp.c (which also adjusts the jiffies). All the
loops_per_jiffy calculations will get messed up.
Scenario 2:
-----------
The governor calls __cpufreq_driver_target() to change the frequency. At the
same time, if we change scaling_{min|max}_freq from sysfs, it will end up
calling the governor's CPUFREQ_GOV_LIMITS notification, which will also call
__cpufreq_driver_target(). And hence we end up issuing concurrent calls to
->target().
Typically, platforms have the following logic in their ->target() routines:
(Eg: cpufreq-cpu0, omap, exynos, etc)
A. If new freq is more than old: Increase voltage
B. Change freq
C. If new freq is less than old: decrease voltage
Now, if the two concurrent calls to ->target() are X and Y, where X is trying to
increase the freq and Y is trying to decrease it, we get the following race
condition:
X.A: voltage gets increased for larger freq
Y.A: nothing happens
Y.B: freq gets decreased
Y.C: voltage gets decreased
X.B: freq gets increased
X.C: nothing happens
Thus we can end up setting a freq which is not supported by the voltage we have
set. That will probably make the clock to the CPU unstable and the system might
not work properly anymore.
This patch introduces a set of synchronization primitives to serialize frequency
transitions, which are to be used as shown below:
cpufreq_freq_transition_begin();
//Perform the frequency change
cpufreq_freq_transition_end();
The _begin() call sends the PRECHANGE notification whereas the _end() call sends
the POSTCHANGE notification. Also, all the necessary synchronization is handled
within these calls. In particular, even drivers which set the ASYNC_NOTIFICATION
flag can use these APIs for performing frequency transitions (i.e., you can
call _begin() from one task, and call the corresponding _end() from a different
task).
The actual synchronization underneath is not that complicated:
The key challenge is to allow drivers to begin the transition from one thread
and end it in a completely different thread (this is to enable drivers that do
asynchronous POSTCHANGE notification from bottom-halves, to also use the same
interface).
To achieve this, a 'transition_ongoing' flag, a 'transition_lock' spinlock and a
wait-queue are added per-policy. The flag and the wait-queue are used in
conjunction to create an "uninterrupted flow" from _begin() to _end(). The
spinlock is used to ensure that only one such "flow" is in flight at any given
time. Put together, this provides us all the necessary synchronization.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-24 16:05:44 +08:00
|
|
|
void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
|
|
|
|
struct cpufreq_freqs *freqs);
|
|
|
|
void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
|
|
|
|
struct cpufreq_freqs *freqs, int transition_failed);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
#else /* CONFIG_CPU_FREQ */
|
|
|
|
static inline int cpufreq_register_notifier(struct notifier_block *nb,
|
|
|
|
unsigned int list)
|
2009-10-27 07:49:29 +08:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2013-08-07 01:23:04 +08:00
|
|
|
static inline int cpufreq_unregister_notifier(struct notifier_block *nb,
|
|
|
|
unsigned int list)
|
2005-12-03 02:43:20 +08:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2013-08-07 01:23:04 +08:00
|
|
|
#endif /* !CONFIG_CPU_FREQ */
|
|
|
|
|
|
|
|
/**
|
|
|
|
* cpufreq_scale - "old * mult / div" calculation for large values (32-bit-arch
|
|
|
|
* safe)
|
|
|
|
* @old: old value
|
|
|
|
* @div: divisor
|
|
|
|
* @mult: multiplier
|
|
|
|
*
|
|
|
|
*
|
|
|
|
* new = old * mult / div
|
|
|
|
*/
|
|
|
|
static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
|
|
|
|
u_int mult)
|
2011-06-29 01:59:12 +08:00
|
|
|
{
|
2013-08-07 01:23:04 +08:00
|
|
|
#if BITS_PER_LONG == 32
|
|
|
|
u64 result = ((u64) old) * ((u64) mult);
|
|
|
|
do_div(result, div);
|
|
|
|
return (unsigned long) result;
|
|
|
|
|
|
|
|
#elif BITS_PER_LONG == 64
|
|
|
|
unsigned long result = old * ((u64) mult);
|
|
|
|
result /= div;
|
|
|
|
return result;
|
2005-12-03 02:43:20 +08:00
|
|
|
#endif
|
2013-08-07 01:23:04 +08:00
|
|
|
}
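The reason cpufreq_scale() widens to 64 bits before multiplying can be shown in userspace. This is a sketch under stated assumptions: demo_scale() and demo_scale_naive() are hypothetical names, fixed-width uint32_t models a 32-bit unsigned long, and plain C division replaces the kernel's do_div().

```c
#include <stdint.h>

/* "old * mult / div" with a 64-bit intermediate, as on the 32-bit path. */
static uint32_t demo_scale(uint32_t old, uint32_t div, uint32_t mult)
{
	uint64_t result = (uint64_t)old * mult;

	return (uint32_t)(result / div);
}

/* Naive variant: the product is computed in 32 bits and can wrap around. */
static uint32_t demo_scale_naive(uint32_t old, uint32_t div, uint32_t mult)
{
	return old * mult / div;
}
```

For old = 4000000, div = 1000000 and mult = 2000000 the exact product is 8e12, which does not fit in 32 bits, so only the widened version returns 8000000.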
|
2005-12-03 02:43:20 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*********************************************************************
|
2013-08-07 01:23:04 +08:00
|
|
|
* CPUFREQ GOVERNORS *
|
2005-04-17 06:20:36 +08:00
|
|
|
*********************************************************************/
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
/*
|
|
|
|
* If (cpufreq_driver->target) exists, the ->governor decides what frequency
|
|
|
|
* within the limits is used. If (cpufreq_driver->setpolicy) exists, these
|
|
|
|
* two generic policies are available:
|
|
|
|
*/
|
|
|
|
#define CPUFREQ_POLICY_POWERSAVE (1)
|
|
|
|
#define CPUFREQ_POLICY_PERFORMANCE (2)
|
|
|
|
|
2016-03-22 09:51:56 +08:00
|
|
|
/*
|
|
|
|
* The polling frequency depends on the capability of the processor. Default
|
|
|
|
* polling frequency is 1000 times the transition latency of the processor. The
|
|
|
|
* ondemand governor will work on any processor with transition latency <= 10ms,
|
|
|
|
* using appropriate sampling rate.
|
|
|
|
*/
|
|
|
|
#define LATENCY_MULTIPLIER (1000)
|
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
struct cpufreq_governor {
|
|
|
|
char name[CPUFREQ_NAME_LEN];
|
2016-06-03 05:24:15 +08:00
|
|
|
int (*init)(struct cpufreq_policy *policy);
|
|
|
|
void (*exit)(struct cpufreq_policy *policy);
|
|
|
|
int (*start)(struct cpufreq_policy *policy);
|
|
|
|
void (*stop)(struct cpufreq_policy *policy);
|
|
|
|
void (*limits)(struct cpufreq_policy *policy);
|
2013-08-07 01:23:04 +08:00
|
|
|
ssize_t (*show_setspeed) (struct cpufreq_policy *policy,
|
|
|
|
char *buf);
|
|
|
|
int (*store_setspeed) (struct cpufreq_policy *policy,
|
|
|
|
unsigned int freq);
|
2017-07-19 18:12:46 +08:00
|
|
|
/* For governors which change frequency dynamically by themselves */
|
|
|
|
bool dynamic_switching;
|
2013-08-07 01:23:04 +08:00
|
|
|
struct list_head governor_list;
|
|
|
|
struct module *owner;
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Pass a target to the cpufreq driver */
|
2016-03-30 09:47:49 +08:00
|
|
|
unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq);
|
2013-08-07 01:23:04 +08:00
|
|
|
int cpufreq_driver_target(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq,
|
|
|
|
unsigned int relation);
|
|
|
|
int __cpufreq_driver_target(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq,
|
|
|
|
unsigned int relation);
|
2016-07-14 04:25:25 +08:00
|
|
|
unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq);
|
2017-07-19 18:12:42 +08:00
|
|
|
unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy);
|
2013-08-07 01:23:04 +08:00
|
|
|
int cpufreq_register_governor(struct cpufreq_governor *governor);
|
|
|
|
void cpufreq_unregister_governor(struct cpufreq_governor *governor);
|
|
|
|
|
2016-02-05 09:37:42 +08:00
|
|
|
struct cpufreq_governor *cpufreq_default_governor(void);
|
|
|
|
struct cpufreq_governor *cpufreq_fallback_governor(void);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-05-18 20:25:31 +08:00
|
|
|
static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)
|
|
|
|
{
|
|
|
|
if (policy->max < policy->cur)
|
|
|
|
__cpufreq_driver_target(policy, policy->max, CPUFREQ_RELATION_H);
|
|
|
|
else if (policy->min > policy->cur)
|
|
|
|
__cpufreq_driver_target(policy, policy->min, CPUFREQ_RELATION_L);
|
|
|
|
}
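cpufreq_policy_apply_limits() pairs each limit with a relation: the maximum is requested with CPUFREQ_RELATION_H and the minimum with CPUFREQ_RELATION_L, so the table lookup cannot land outside the limit being enforced. The sketch below only records which request would be issued; struct demo_request, enum demo_relation and demo_apply_limits() are hypothetical names, and no driver is called.

```c
#include <stdbool.h>

enum demo_relation {
	DEMO_RELATION_L,	/* lowest frequency at or above target */
	DEMO_RELATION_H,	/* highest frequency below or at target */
};

/* Describes the __cpufreq_driver_target() call that would be made, if any. */
struct demo_request {
	bool needed;
	unsigned int target_freq;
	enum demo_relation relation;
};

static struct demo_request demo_apply_limits(unsigned int min,
					     unsigned int max,
					     unsigned int cur)
{
	struct demo_request req = { .needed = false };

	if (max < cur) {
		/* Running too fast: come down to at most max. */
		req.needed = true;
		req.target_freq = max;
		req.relation = DEMO_RELATION_H;
	} else if (min > cur) {
		/* Running too slow: go up to at least min. */
		req.needed = true;
		req.target_freq = min;
		req.relation = DEMO_RELATION_L;
	}

	return req;
}
```

When the current frequency already lies within the limits, neither branch is taken and no request is issued.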
|
|
|
|
|
2016-03-22 09:50:45 +08:00
|
|
|
/* Governor attribute set */
|
|
|
|
struct gov_attr_set {
|
|
|
|
struct kobject kobj;
|
|
|
|
struct list_head policy_list;
|
|
|
|
struct mutex update_lock;
|
|
|
|
int usage_count;
|
|
|
|
};
|
|
|
|
|
|
|
|
/* sysfs ops for cpufreq governors */
|
|
|
|
extern const struct sysfs_ops governor_sysfs_ops;
|
|
|
|
|
|
|
|
void gov_attr_set_init(struct gov_attr_set *attr_set, struct list_head *list_node);
|
|
|
|
void gov_attr_set_get(struct gov_attr_set *attr_set, struct list_head *list_node);
|
|
|
|
unsigned int gov_attr_set_put(struct gov_attr_set *attr_set, struct list_head *list_node);
|
|
|
|
|
|
|
|
/* Governor sysfs attribute */
|
|
|
|
struct governor_attr {
|
|
|
|
struct attribute attr;
|
|
|
|
ssize_t (*show)(struct gov_attr_set *attr_set, char *buf);
|
|
|
|
ssize_t (*store)(struct gov_attr_set *attr_set, const char *buf,
|
|
|
|
size_t count);
|
|
|
|
};
|
|
|
|
|
2018-05-22 18:01:30 +08:00
|
|
|
static inline bool cpufreq_this_cpu_can_update(struct cpufreq_policy *policy)
|
sched: cpufreq: Allow remote cpufreq callbacks
With Android UI and benchmarks the latency of cpufreq response to
certain scheduling events can become very critical. Currently, callbacks
into cpufreq governors are only made from the scheduler if the target
CPU of the event is the same as the current CPU. This means there are
certain situations where a target CPU may not run the cpufreq governor
for some time.
One testcase to show this behavior is where a task starts running on
CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the
system is configured such that the new tasks should receive maximum
demand initially, this should result in CPU0 increasing frequency
immediately. Because of the above-mentioned limitation, however, this
does not occur.
This patch updates the scheduler core to call the cpufreq callbacks for
remote CPUs as well.
The schedutil, ondemand and conservative governors are updated to
process cpufreq utilization update hooks called for remote CPUs where
the remote CPU is managed by the cpufreq policy of the local CPU.
The intel_pstate driver is updated to always reject remote callbacks.
This is tested with a couple of use cases (Android: hackbench, recentfling,
galleryfling, vellamo; Ubuntu: hackbench) on an ARM hikey board (64-bit
octa-core, single policy). Only galleryfling showed minor improvements,
while the others didn't show much deviation.
The reason is that this patch only targets a corner case, where the
following must all be true to improve performance, and that
doesn't happen too often with these tests:
- Task is migrated to another CPU.
- The task has high demand, and should take the target CPU to higher
OPPs.
- And the target CPU doesn't call into the cpufreq governor until the
next tick.
Based on initial work from Steve Muckle.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Saravana Kannan <skannan@codeaurora.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-28 14:46:38 +08:00
|
|
|
{
|
2017-07-28 14:46:39 +08:00
|
|
|
/*
|
|
|
|
* Allow remote callbacks if:
|
|
|
|
* - dvfs_possible_from_any_cpu flag is set
|
|
|
|
* - the local and remote CPUs share cpufreq policy
|
|
|
|
*/
|
2017-08-04 20:57:39 +08:00
|
|
|
return policy->dvfs_possible_from_any_cpu ||
|
|
|
|
cpumask_test_cpu(smp_processor_id(), policy->cpus);
|
2017-07-28 14:46:38 +08:00
|
|
|
}
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/*********************************************************************
|
|
|
|
* FREQUENCY TABLE HELPERS *
|
|
|
|
*********************************************************************/
|
|
|
|
|
2014-03-28 21:41:47 +08:00
|
|
|
/* Special Values of .frequency field */
|
2014-06-28 05:09:39 +08:00
|
|
|
#define CPUFREQ_ENTRY_INVALID ~0u
|
|
|
|
#define CPUFREQ_TABLE_END ~1u
|
2014-03-28 21:41:47 +08:00
|
|
|
/* Special Values of .flags field */
|
|
|
|
#define CPUFREQ_BOOST_FREQ (1 << 0)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
struct cpufreq_frequency_table {
|
2014-03-28 21:41:47 +08:00
|
|
|
unsigned int flags;
|
2013-03-30 18:55:15 +08:00
|
|
|
unsigned int driver_data; /* driver specific data, not used by core */
|
2005-04-17 06:20:36 +08:00
|
|
|
unsigned int frequency; /* kHz - doesn't need to be in ascending
|
|
|
|
* order */
|
|
|
|
};
|
|
|
|
|
2014-05-05 21:33:50 +08:00
|
|
|
#if defined(CONFIG_CPU_FREQ) && defined(CONFIG_PM_OPP)
|
|
|
|
int dev_pm_opp_init_cpufreq_table(struct device *dev,
|
|
|
|
struct cpufreq_frequency_table **table);
|
|
|
|
void dev_pm_opp_free_cpufreq_table(struct device *dev,
|
|
|
|
struct cpufreq_frequency_table **table);
|
|
|
|
#else
|
|
|
|
static inline int dev_pm_opp_init_cpufreq_table(struct device *dev,
|
|
|
|
struct cpufreq_frequency_table
|
|
|
|
**table)
|
|
|
|
{
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void dev_pm_opp_free_cpufreq_table(struct device *dev,
|
|
|
|
struct cpufreq_frequency_table
|
|
|
|
**table)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2014-04-26 04:15:23 +08:00
|
|
|
/*
|
|
|
|
* cpufreq_for_each_entry - iterate over a cpufreq_frequency_table
|
|
|
|
* @pos: the cpufreq_frequency_table * to use as a loop cursor.
|
|
|
|
* @table: the cpufreq_frequency_table * to iterate over.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#define cpufreq_for_each_entry(pos, table) \
|
|
|
|
for (pos = table; pos->frequency != CPUFREQ_TABLE_END; pos++)
|
|
|
|
|
2018-01-30 13:42:37 +08:00
|
|
|
/*
|
|
|
|
* cpufreq_for_each_entry_idx - iterate over a cpufreq_frequency_table
|
|
|
|
* with index
|
|
|
|
* @pos: the cpufreq_frequency_table * to use as a loop cursor.
|
|
|
|
* @table: the cpufreq_frequency_table * to iterate over.
|
|
|
|
* @idx: the table entry currently being processed
|
|
|
|
*/
|
|
|
|
|
|
|
|
#define cpufreq_for_each_entry_idx(pos, table, idx) \
|
|
|
|
for (pos = table, idx = 0; pos->frequency != CPUFREQ_TABLE_END; \
|
|
|
|
pos++, idx++)
|
|
|
|
|
2014-04-26 04:15:23 +08:00
|
|
|
/*
|
|
|
|
* cpufreq_for_each_valid_entry - iterate over a cpufreq_frequency_table
|
|
|
|
* excluding CPUFREQ_ENTRY_INVALID frequencies.
|
|
|
|
* @pos: the cpufreq_frequency_table * to use as a loop cursor.
|
|
|
|
* @table: the cpufreq_frequency_table * to iterate over.
|
|
|
|
*/
|
|
|
|
|
2016-02-26 07:22:57 +08:00
|
|
|
#define cpufreq_for_each_valid_entry(pos, table) \
|
|
|
|
for (pos = table; pos->frequency != CPUFREQ_TABLE_END; pos++) \
|
|
|
|
if (pos->frequency == CPUFREQ_ENTRY_INVALID) \
|
|
|
|
continue; \
|
|
|
|
else
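The trailing else in cpufreq_for_each_valid_entry() is what lets a caller write an ordinary loop body after the macro: the body binds to that else, so it runs only for entries that are not CPUFREQ_ENTRY_INVALID. The userspace sketch below mirrors the macro; the demo_* names and the one-field struct demo_freq_entry are hypothetical stand-ins for the kernel definitions.

```c
#define DEMO_ENTRY_INVALID	~0u
#define DEMO_TABLE_END		~1u

/* Hypothetical stand-in for struct cpufreq_frequency_table. */
struct demo_freq_entry {
	unsigned int frequency;	/* kHz */
};

/* Same shape as cpufreq_for_each_valid_entry(). */
#define demo_for_each_valid_entry(pos, table)				\
	for (pos = table; pos->frequency != DEMO_TABLE_END; pos++)	\
		if (pos->frequency == DEMO_ENTRY_INVALID)		\
			continue;					\
		else

/* Count usable entries; the statement after the macro is the else branch. */
static unsigned int demo_count_valid(struct demo_freq_entry *table)
{
	struct demo_freq_entry *pos;
	unsigned int count = 0;

	demo_for_each_valid_entry(pos, table)
		count++;

	return count;
}
```

The table must end with an entry whose frequency is the end marker, since the loop has no other bound.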
|
2014-04-26 04:15:23 +08:00
|
|
|
|
2018-01-30 13:42:37 +08:00
|
|
|
/*
|
|
|
|
* cpufreq_for_each_valid_entry_idx - iterate with index over a cpufreq
|
|
|
|
* frequency_table excluding CPUFREQ_ENTRY_INVALID frequencies.
|
|
|
|
* @pos: the cpufreq_frequency_table * to use as a loop cursor.
|
|
|
|
* @table: the cpufreq_frequency_table * to iterate over.
|
|
|
|
* @idx: the table entry currently being processed
|
|
|
|
*/
|
|
|
|
|
|
|
|
#define cpufreq_for_each_valid_entry_idx(pos, table, idx) \
|
|
|
|
cpufreq_for_each_entry_idx(pos, table, idx) \
|
|
|
|
if (pos->frequency == CPUFREQ_ENTRY_INVALID) \
|
|
|
|
continue; \
|
|
|
|
else
|
|
|
|
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy,
|
|
|
|
struct cpufreq_frequency_table *table);
|
|
|
|
|
|
|
|
int cpufreq_frequency_table_verify(struct cpufreq_policy *policy,
|
|
|
|
struct cpufreq_frequency_table *table);
|
2013-10-03 22:57:55 +08:00
|
|
|
int cpufreq_generic_frequency_table_verify(struct cpufreq_policy *policy);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-06-27 18:34:07 +08:00
|
|
|
int cpufreq_table_index_unsorted(struct cpufreq_policy *policy,
|
|
|
|
unsigned int target_freq,
|
|
|
|
unsigned int relation);
|
2013-12-03 13:50:46 +08:00
|
|
|
int cpufreq_frequency_table_get_index(struct cpufreq_policy *policy,
|
|
|
|
unsigned int freq);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-08-07 01:23:04 +08:00
|
|
|
ssize_t cpufreq_show_cpus(const struct cpumask *mask, char *buf);
|
|
|
|
|
2013-12-20 22:24:49 +08:00
|
|
|
#ifdef CONFIG_CPU_FREQ
|
|
|
|
int cpufreq_boost_trigger_state(int state);
|
|
|
|
int cpufreq_boost_enabled(void);
|
2015-07-29 18:53:09 +08:00
|
|
|
int cpufreq_enable_boost_support(void);
|
|
|
|
bool policy_has_boost_freq(struct cpufreq_policy *policy);
|
2016-06-27 18:34:07 +08:00
|
|
|
|
|
|
|
/* Find lowest freq at or above target in a table in ascending order */
static inline int cpufreq_table_find_index_al(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq >= target_freq)
			return idx;

		best = idx;
	}

	return best;
}

/* Find lowest freq at or above target in a table in descending order */
static inline int cpufreq_table_find_index_dl(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq == target_freq)
			return idx;

		if (freq > target_freq) {
			best = idx;
			continue;
		}

		/* No freq found above target_freq */
		if (best == -1)
			return idx;

		return best;
	}

	return best;
}

/* Works only on sorted freq-tables */
static inline int cpufreq_table_find_index_l(struct cpufreq_policy *policy,
					     unsigned int target_freq)
{
	target_freq = clamp_val(target_freq, policy->min, policy->max);

	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING)
		return cpufreq_table_find_index_al(policy, target_freq);
	else
		return cpufreq_table_find_index_dl(policy, target_freq);
}

/* Find highest freq at or below target in a table in ascending order */
static inline int cpufreq_table_find_index_ah(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq == target_freq)
			return idx;

		if (freq < target_freq) {
			best = idx;
			continue;
		}

		/* No freq found below target_freq */
		if (best == -1)
			return idx;

		return best;
	}

	return best;
}

/* Find highest freq at or below target in a table in descending order */
static inline int cpufreq_table_find_index_dh(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq <= target_freq)
			return idx;

		best = idx;
	}

	return best;
}

/* Works only on sorted freq-tables */
static inline int cpufreq_table_find_index_h(struct cpufreq_policy *policy,
					     unsigned int target_freq)
{
	target_freq = clamp_val(target_freq, policy->min, policy->max);

	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING)
		return cpufreq_table_find_index_ah(policy, target_freq);
	else
		return cpufreq_table_find_index_dh(policy, target_freq);
}

/* Find closest freq to target in a table in ascending order */
static inline int cpufreq_table_find_index_ac(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq == target_freq)
			return idx;

		if (freq < target_freq) {
			best = idx;
			continue;
		}

		/* No freq found below target_freq */
		if (best == -1)
			return idx;

		/* Choose the closest freq */
		if (target_freq - table[best].frequency > freq - target_freq)
			return idx;

		return best;
	}

	return best;
}

/* Find closest freq to target in a table in descending order */
static inline int cpufreq_table_find_index_dc(struct cpufreq_policy *policy,
					      unsigned int target_freq)
{
	struct cpufreq_frequency_table *table = policy->freq_table;
	struct cpufreq_frequency_table *pos;
	unsigned int freq;
	int idx, best = -1;

	cpufreq_for_each_valid_entry_idx(pos, table, idx) {
		freq = pos->frequency;

		if (freq == target_freq)
			return idx;

		if (freq > target_freq) {
			best = idx;
			continue;
		}

		/* No freq found above target_freq */
		if (best == -1)
			return idx;

		/* Choose the closest freq */
		if (table[best].frequency - target_freq > target_freq - freq)
			return idx;

		return best;
	}

	return best;
}

/* Works only on sorted freq-tables */
static inline int cpufreq_table_find_index_c(struct cpufreq_policy *policy,
					     unsigned int target_freq)
{
	target_freq = clamp_val(target_freq, policy->min, policy->max);

	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING)
		return cpufreq_table_find_index_ac(policy, target_freq);
	else
		return cpufreq_table_find_index_dc(policy, target_freq);
}

static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
						 unsigned int target_freq,
						 unsigned int relation)
{
	if (unlikely(policy->freq_table_sorted == CPUFREQ_TABLE_UNSORTED))
		return cpufreq_table_index_unsorted(policy, target_freq,
						    relation);

	switch (relation) {
	case CPUFREQ_RELATION_L:
		return cpufreq_table_find_index_l(policy, target_freq);
	case CPUFREQ_RELATION_H:
		return cpufreq_table_find_index_h(policy, target_freq);
	case CPUFREQ_RELATION_C:
		return cpufreq_table_find_index_c(policy, target_freq);
	default:
		/* relation is unsigned, so print it with %u */
		pr_err("%s: Invalid relation: %u\n", __func__, relation);
		return -EINVAL;
	}
}

static inline int cpufreq_table_count_valid_entries(const struct cpufreq_policy *policy)
{
	struct cpufreq_frequency_table *pos;
	int count = 0;

	if (unlikely(!policy->freq_table))
		return 0;

	cpufreq_for_each_valid_entry(pos, policy->freq_table)
		count++;

	return count;
}
#else
static inline int cpufreq_boost_trigger_state(int state)
{
	return 0;
}
static inline int cpufreq_boost_enabled(void)
{
	return 0;
}

static inline int cpufreq_enable_boost_support(void)
{
	return -EINVAL;
}

static inline bool policy_has_boost_freq(struct cpufreq_policy *policy)
{
	return false;
}
#endif

#if defined(CONFIG_ENERGY_MODEL) && defined(CONFIG_CPU_FREQ_GOV_SCHEDUTIL)
void sched_cpufreq_governor_change(struct cpufreq_policy *policy,
				   struct cpufreq_governor *old_gov);
#else
static inline void sched_cpufreq_governor_change(struct cpufreq_policy *policy,
						 struct cpufreq_governor *old_gov) { }
#endif

extern void arch_freq_prepare_all(void);
extern unsigned int arch_freq_get_on_cpu(int cpu);

extern void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
				unsigned long max_freq);

/* the following are really really optional */
extern struct freq_attr cpufreq_freq_attr_scaling_available_freqs;
extern struct freq_attr cpufreq_freq_attr_scaling_boost_freqs;
extern struct freq_attr *cpufreq_generic_attr[];
int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy);

unsigned int cpufreq_generic_get(unsigned int cpu);
void cpufreq_generic_init(struct cpufreq_policy *policy,
			  struct cpufreq_frequency_table *table,
			  unsigned int transition_latency);
#endif /* _LINUX_CPUFREQ_H */