Merge branches 'pm-core', 'pm-sleep', 'powercap', 'pm-domains' and 'pm-em'
Merge core device power management changes for v5.20-rc1:

 - Extend support for wakeirq to callback wrappers used during system
   suspend and resume (Ulf Hansson).

 - Defer waiting for device probe before loading a hibernation image
   till the first actual device access to avoid possible deadlocks
   reported by syzbot (Tetsuo Handa).

 - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
   Helgaas).

 - Add Raptor Lake-P to the list of processors supported by the Intel
   RAPL driver (George D Sworo).

 - Add Alder Lake-N and Raptor Lake-P to the list of processors for
   which Power Limit4 is supported in the Intel RAPL driver (Sumeet
   Pawnikar).

 - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
   attempting to remove it (Hsin-Yi Wang).

 - Change the Energy Model code to represent power in micro-Watts and
   adjust its users accordingly (Lukasz Luba).

* pm-core:
  PM: runtime: Extend support for wakeirq for force_suspend|resume

* pm-sleep:
  PM: hibernate: defer device probing when resuming from hibernation
  PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP

* powercap:
  powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
  powercap: intel_rapl: Add support for RAPTORLAKE_P

* pm-domains:
  PM: domains: Ensure genpd_debugfs_dir exists before remove

* pm-em:
  cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
  firmware: arm_scmi: Get detailed power scale from perf
  Documentation: EM: Switch to micro-Watts scale
  PM: EM: convert power field to micro-Watts precision and align drivers
commit 954a83fc60
@ -20,20 +20,20 @@ possible source of information on its own, the EM framework intervenes as an
|
|||
abstraction layer which standardizes the format of power cost tables in the
|
||||
kernel, hence enabling to avoid redundant work.
|
||||
|
||||
-The power values might be expressed in milli-Watts or in an 'abstract scale'.
+The power values might be expressed in micro-Watts or in an 'abstract scale'.
|
||||
Multiple subsystems might use the EM and it is up to the system integrator to
|
||||
check that the requirements for the power value scale types are met. An example
|
||||
can be found in the Energy-Aware Scheduler documentation
|
||||
Documentation/scheduler/sched-energy.rst. For some subsystems like thermal or
|
||||
powercap power values expressed in an 'abstract scale' might cause issues.
|
||||
These subsystems are more interested in estimation of power used in the past,
|
||||
-thus the real milli-Watts might be needed. An example of these requirements can
+thus the real micro-Watts might be needed. An example of these requirements can
|
||||
be found in the Intelligent Power Allocation in
|
||||
Documentation/driver-api/thermal/power_allocator.rst.
|
||||
Kernel subsystems might implement automatic detection to check whether EM
|
||||
registered devices have inconsistent scale (based on EM internal flag).
|
||||
Important thing to keep in mind is that when the power values are expressed in
|
||||
-an 'abstract scale' deriving real energy in milli-Joules would not be possible.
+an 'abstract scale' deriving real energy in micro-Joules would not be possible.
|
||||
|
||||
The figure below depicts an example of drivers (Arm-specific here, but the
|
||||
approach is applicable to any architecture) providing power costs to the EM
|
||||
|
@ -98,7 +98,7 @@ Drivers are expected to register performance domains into the EM framework by
|
|||
calling the following API::
|
||||
|
||||
int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
|
||||
-		struct em_data_callback *cb, cpumask_t *cpus, bool milliwatts);
+		struct em_data_callback *cb, cpumask_t *cpus, bool microwatts);
|
||||
|
||||
Drivers must provide a callback function returning <frequency, power> tuples
|
||||
for each performance state. The callback function provided by the driver is free
|
||||
|
@ -106,10 +106,10 @@ to fetch data from any relevant location (DT, firmware, ...), and by any mean
|
|||
deemed necessary. Only for CPU devices, drivers must specify the CPUs of the
|
||||
performance domains using cpumask. For other devices than CPUs the last
|
||||
argument must be set to NULL.
|
||||
-The last argument 'milliwatts' is important to set with correct value. Kernel
+The last argument 'microwatts' is important to set with correct value. Kernel
|
||||
subsystems which use EM might rely on this flag to check if all EM devices use
|
||||
the same scale. If there are different scales, these subsystems might decide
|
||||
-to: return warning/error, stop working or panic.
+to return warning/error, stop working or panic.
|
||||
See Section 3. for an example of driver implementing this
|
||||
callback, or Section 2.4 for further documentation on this API
|
||||
|
||||
|
@ -137,7 +137,7 @@ The .get_cost() allows to provide the 'cost' values which reflect the
|
|||
efficiency of the CPUs. This would allow to provide EAS information which
|
||||
has different relation than what would be forced by the EM internal
|
||||
formulas calculating 'cost' values. To register an EM for such platform, the
|
||||
-driver must set the flag 'milliwatts' to 0, provide .get_power() callback
+driver must set the flag 'microwatts' to 0, provide .get_power() callback
|
||||
and provide .get_cost() callback. The EM framework would handle such platform
|
||||
properly during registration. A flag EM_PERF_DOMAIN_ARTIFICIAL is set for such
|
||||
platform. Special care should be taken by other frameworks which are using EM
|
||||
|
|
|
@ -222,6 +222,9 @@ static void genpd_debug_remove(struct generic_pm_domain *genpd)
|
|||
{
|
||||
struct dentry *d;
|
||||
|
||||
+if (!genpd_debugfs_dir)
+	return;
|
||||
|
||||
d = debugfs_lookup(genpd->name, genpd_debugfs_dir);
|
||||
debugfs_remove(d);
|
||||
}
|
||||
|
|
|
@ -1862,10 +1862,13 @@ int pm_runtime_force_suspend(struct device *dev)
|
|||
|
||||
callback = RPM_GET_CALLBACK(dev, runtime_suspend);
|
||||
|
||||
+dev_pm_enable_wake_irq_check(dev, true);
|
||||
ret = callback ? callback(dev) : 0;
|
||||
if (ret)
|
||||
goto err;
|
||||
|
||||
+dev_pm_enable_wake_irq_complete(dev);
|
||||
|
||||
/*
|
||||
* If the device can stay in suspend after the system-wide transition
|
||||
* to the working state that will follow, drop the children counter of
|
||||
|
@ -1882,6 +1885,7 @@ int pm_runtime_force_suspend(struct device *dev)
|
|||
return 0;
|
||||
|
||||
err:
|
||||
+dev_pm_disable_wake_irq_check(dev, true);
|
||||
pm_runtime_enable(dev);
|
||||
return ret;
|
||||
}
|
||||
|
@ -1915,9 +1919,11 @@ int pm_runtime_force_resume(struct device *dev)
|
|||
|
||||
callback = RPM_GET_CALLBACK(dev, runtime_resume);
|
||||
|
||||
+dev_pm_disable_wake_irq_check(dev, false);
|
||||
ret = callback ? callback(dev) : 0;
|
||||
if (ret) {
|
||||
pm_runtime_set_suspended(dev);
|
||||
+dev_pm_enable_wake_irq_check(dev, false);
|
||||
goto out;
|
||||
}
|
||||
|
||||
|
|
|
@ -500,36 +500,6 @@ void device_set_wakeup_capable(struct device *dev, bool capable)
|
|||
}
|
||||
EXPORT_SYMBOL_GPL(device_set_wakeup_capable);
|
||||
|
||||
-/**
- * device_init_wakeup - Device wakeup initialization.
- * @dev: Device to handle.
- * @enable: Whether or not to enable @dev as a wakeup device.
- *
- * By default, most devices should leave wakeup disabled. The exceptions are
- * devices that everyone expects to be wakeup sources: keyboards, power buttons,
- * possibly network interfaces, etc. Also, devices that don't generate their
- * own wakeup requests but merely forward requests from one bus to another
- * (like PCI bridges) should have wakeup enabled by default.
- */
-int device_init_wakeup(struct device *dev, bool enable)
-{
-	int ret = 0;
-
-	if (!dev)
-		return -EINVAL;
-
-	if (enable) {
-		device_set_wakeup_capable(dev, true);
-		ret = device_wakeup_enable(dev);
-	} else {
-		device_wakeup_disable(dev);
-		device_set_wakeup_capable(dev, false);
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(device_init_wakeup);
|
||||
|
||||
/**
|
||||
* device_set_wakeup_enable - Enable or disable a device to wake up the system.
|
||||
* @dev: Device to handle.
|
||||
|
|
|
@ -51,7 +51,7 @@ static const u16 cpufreq_mtk_offsets[REG_ARRAY_SIZE] = {
|
|||
};
|
||||
|
||||
static int __maybe_unused
|
||||
-mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *mW,
+mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *uW,
|
||||
unsigned long *KHz)
|
||||
{
|
||||
struct mtk_cpufreq_data *data;
|
||||
|
@ -71,8 +71,9 @@ mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *mW,
|
|||
i--;
|
||||
|
||||
*KHz = data->table[i].frequency;
|
||||
-*mW = readl_relaxed(data->reg_bases[REG_EM_POWER_TBL] +
-		    i * LUT_ROW_SIZE) / 1000;
+/* Provide micro-Watts value to the Energy Model */
+*uW = readl_relaxed(data->reg_bases[REG_EM_POWER_TBL] +
+		    i * LUT_ROW_SIZE);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
|
|
@ -19,6 +19,7 @@
|
|||
#include <linux/slab.h>
|
||||
#include <linux/scmi_protocol.h>
|
||||
#include <linux/types.h>
|
||||
+#include <linux/units.h>
|
||||
|
||||
struct scmi_data {
|
||||
int domain_id;
|
||||
|
@ -99,6 +100,7 @@ static int __maybe_unused
|
|||
scmi_get_cpu_power(struct device *cpu_dev, unsigned long *power,
|
||||
unsigned long *KHz)
|
||||
{
|
||||
+enum scmi_power_scale power_scale = perf_ops->power_scale_get(ph);
|
||||
unsigned long Hz;
|
||||
int ret, domain;
|
||||
|
||||
|
@ -112,6 +114,10 @@ scmi_get_cpu_power(struct device *cpu_dev, unsigned long *power,
|
|||
if (ret)
|
||||
return ret;
|
||||
|
||||
+/* Convert the power to uW if it is mW (ignore bogoW) */
+if (power_scale == SCMI_POWER_MILLIWATTS)
+	*power *= MICROWATT_PER_MILLIWATT;
|
||||
|
||||
/* The EM framework specifies the frequency in KHz. */
|
||||
*KHz = Hz / 1000;
|
||||
|
||||
|
@ -249,8 +255,9 @@ static int scmi_cpufreq_exit(struct cpufreq_policy *policy)
|
|||
static void scmi_cpufreq_register_em(struct cpufreq_policy *policy)
|
||||
{
|
||||
struct em_data_callback em_cb = EM_DATA_CB(scmi_get_cpu_power);
|
||||
-bool power_scale_mw = perf_ops->power_scale_mw_get(ph);
+enum scmi_power_scale power_scale = perf_ops->power_scale_get(ph);
 struct scmi_data *priv = policy->driver_data;
+bool em_power_scale = false;
|
||||
|
||||
/*
|
||||
* This callback will be called for each policy, but we don't need to
|
||||
|
@ -262,9 +269,13 @@ static void scmi_cpufreq_register_em(struct cpufreq_policy *policy)
|
|||
if (!priv->nr_opp)
|
||||
return;
|
||||
|
||||
+if (power_scale == SCMI_POWER_MILLIWATTS
+    || power_scale == SCMI_POWER_MICROWATTS)
+	em_power_scale = true;
|
||||
|
||||
em_dev_register_perf_domain(get_cpu_device(policy->cpu), priv->nr_opp,
|
||||
&em_cb, priv->opp_shared_cpus,
|
||||
-			    power_scale_mw);
+			    em_power_scale);
|
||||
}
|
||||
|
||||
static struct cpufreq_driver scmi_cpufreq_driver = {
|
||||
|
|
|
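A minimal userspace sketch of the scale handling in the hunks above, not kernel code. The enum mirrors `scmi_power_scale` as introduced by this merge; `demo_power_for_em()` and `demo_em_microwatts_flag()` are hypothetical helper names, and `MICROWATT_PER_MILLIWATT` is assumed to be 1000 as in `<linux/units.h>`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Mirrors the enum added to the SCMI protocol header by this merge. */
enum scmi_power_scale {
	SCMI_POWER_BOGOWATTS,
	SCMI_POWER_MILLIWATTS,
	SCMI_POWER_MICROWATTS
};

/* Assumed to match the kernel definition in <linux/units.h>. */
#define MICROWATT_PER_MILLIWATT 1000UL

/*
 * Hypothetical helper: scale a firmware power value the way the
 * scmi_get_cpu_power() hunk does. Only milli-Watts are converted;
 * micro-Watts and the abstract (bogo) scale pass through unchanged.
 */
static unsigned long demo_power_for_em(enum scmi_power_scale scale,
				       unsigned long power)
{
	if (scale == SCMI_POWER_MILLIWATTS) {
		/* Guard added for the sketch only: avoid wrapping. */
		assert(power <= ULONG_MAX / MICROWATT_PER_MILLIWATT);
		power *= MICROWATT_PER_MILLIWATT;
	}
	return power;
}

/*
 * Hypothetical helper: the value handed to the EM as the 'microwatts'
 * flag. Both real scales end up in micro-Watts, so both yield true.
 */
static bool demo_em_microwatts_flag(enum scmi_power_scale scale)
{
	return scale == SCMI_POWER_MILLIWATTS ||
	       scale == SCMI_POWER_MICROWATTS;
}
```

Under these assumptions, a firmware value of 250 reported in milli-Watts reaches the EM as 250000, while the same value reported in micro-Watts or in the abstract scale is passed through as 250.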
@ -170,8 +170,7 @@ struct perf_dom_info {
|
|||
struct scmi_perf_info {
|
||||
u32 version;
|
||||
int num_domains;
|
||||
-bool power_scale_mw;
-bool power_scale_uw;
+enum scmi_power_scale power_scale;
|
||||
u64 stats_addr;
|
||||
u32 stats_size;
|
||||
struct perf_dom_info *dom_info;
|
||||
|
@ -201,9 +200,13 @@ static int scmi_perf_attributes_get(const struct scmi_protocol_handle *ph,
|
|||
u16 flags = le16_to_cpu(attr->flags);
|
||||
|
||||
pi->num_domains = le16_to_cpu(attr->num_domains);
|
||||
-pi->power_scale_mw = POWER_SCALE_IN_MILLIWATT(flags);
+
+if (POWER_SCALE_IN_MILLIWATT(flags))
+	pi->power_scale = SCMI_POWER_MILLIWATTS;
 if (PROTOCOL_REV_MAJOR(pi->version) >= 0x3)
-	pi->power_scale_uw = POWER_SCALE_IN_MICROWATT(flags);
+	if (POWER_SCALE_IN_MICROWATT(flags))
+		pi->power_scale = SCMI_POWER_MICROWATTS;
|
||||
|
||||
pi->stats_addr = le32_to_cpu(attr->stats_addr_low) |
|
||||
(u64)le32_to_cpu(attr->stats_addr_high) << 32;
|
||||
pi->stats_size = le32_to_cpu(attr->stats_size);
|
||||
|
@ -792,11 +795,12 @@ static bool scmi_fast_switch_possible(const struct scmi_protocol_handle *ph,
|
|||
return dom->fc_info && dom->fc_info->level_set_addr;
|
||||
}
|
||||
|
||||
-static bool scmi_power_scale_mw_get(const struct scmi_protocol_handle *ph)
+static enum scmi_power_scale
+scmi_power_scale_get(const struct scmi_protocol_handle *ph)
|
||||
{
|
||||
struct scmi_perf_info *pi = ph->get_priv(ph);
|
||||
|
||||
-return pi->power_scale_mw;
+return pi->power_scale;
|
||||
}
|
||||
|
||||
static const struct scmi_perf_proto_ops perf_proto_ops = {
|
||||
|
@ -811,7 +815,7 @@ static const struct scmi_perf_proto_ops perf_proto_ops = {
|
|||
.freq_get = scmi_dvfs_freq_get,
|
||||
.est_power_get = scmi_dvfs_est_power_get,
|
||||
.fast_switch_possible = scmi_fast_switch_possible,
|
||||
-.power_scale_mw_get = scmi_power_scale_mw_get,
+.power_scale_get = scmi_power_scale_get,
|
||||
};
|
||||
|
||||
static int scmi_perf_set_notify_enabled(const struct scmi_protocol_handle *ph,
|
||||
|
|
|
@ -1443,12 +1443,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
|
|||
* It provides the power used by @dev at @kHz if it is the frequency of an
|
||||
* existing OPP, or at the frequency of the first OPP above @kHz otherwise
|
||||
* (see dev_pm_opp_find_freq_ceil()). This function updates @kHz to the ceiled
|
||||
- * frequency and @mW to the associated power.
+ * frequency and @uW to the associated power.
|
||||
*
|
||||
* Returns 0 on success or a proper -EINVAL value in case of error.
|
||||
*/
|
||||
static int __maybe_unused
|
||||
-_get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
+_get_dt_power(struct device *dev, unsigned long *uW, unsigned long *kHz)
|
||||
{
|
||||
struct dev_pm_opp *opp;
|
||||
unsigned long opp_freq, opp_power;
|
||||
|
@ -1465,7 +1465,7 @@ _get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
|
|||
return -EINVAL;
|
||||
|
||||
*kHz = opp_freq / 1000;
|
||||
-*mW = opp_power / 1000;
+*uW = opp_power;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -1475,14 +1475,14 @@ _get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
|
|||
* This computes the power estimated by @dev at @kHz if it is the frequency
|
||||
* of an existing OPP, or at the frequency of the first OPP above @kHz otherwise
|
||||
* (see dev_pm_opp_find_freq_ceil()). This function updates @kHz to the ceiled
|
||||
- * frequency and @mW to the associated power. The power is estimated as
+ * frequency and @uW to the associated power. The power is estimated as
|
||||
* P = C * V^2 * f with C being the device's capacitance and V and f
|
||||
* respectively the voltage and frequency of the OPP.
|
||||
*
|
||||
* Returns -EINVAL if the power calculation failed because of missing
|
||||
* parameters, 0 otherwise.
|
||||
*/
|
||||
-static int __maybe_unused _get_power(struct device *dev, unsigned long *mW,
+static int __maybe_unused _get_power(struct device *dev, unsigned long *uW,
|
||||
unsigned long *kHz)
|
||||
{
|
||||
struct dev_pm_opp *opp;
|
||||
|
@ -1512,9 +1512,10 @@ static int __maybe_unused _get_power(struct device *dev, unsigned long *mW,
|
|||
return -EINVAL;
|
||||
|
||||
tmp = (u64)cap * mV * mV * (Hz / 1000000);
|
||||
-do_div(tmp, 1000000000);
+/* Provide power in micro-Watts */
+do_div(tmp, 1000000);

-*mW = (unsigned long)tmp;
+*uW = (unsigned long)tmp;
|
||||
*kHz = Hz / 1000;
|
||||
|
||||
return 0;
|
||||
|
|
|
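A worked userspace sketch of the P = C * V^2 * f arithmetic from the `_get_power()` hunk above, to show why the divisor drops from 10^9 to 10^6. `demo_opp_power_uw()` is a hypothetical name, and the unit of `cap` (uW/MHz/V^2) is an assumption based on the usual devicetree dynamic power coefficient.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace rendition of the arithmetic in _get_power().
 *   cap: dynamic power coefficient, assumed to be in uW/MHz/V^2
 *   mV:  OPP voltage in millivolts
 *   Hz:  OPP frequency in hertz
 * Returns the estimated power in micro-Watts.
 */
static unsigned long demo_opp_power_uw(uint32_t cap, unsigned long mV,
				       unsigned long Hz)
{
	uint64_t tmp;

	/* Sketch-only guard: a sub-MHz frequency would truncate to 0 MHz. */
	assert(Hz >= 1000000);

	/* cap * mV^2 * MHz, computed in 64 bits as in the kernel code. */
	tmp = (uint64_t)cap * mV * mV * (Hz / 1000000);

	/*
	 * mV^2 is 10^6 times V^2; dividing that factor out leaves uW.
	 * The previous divisor of 10^9 additionally scaled uW down to mW.
	 */
	tmp /= 1000000;

	return (unsigned long)tmp;
}
```

For example, with cap = 100, 1000 mV and 1 GHz the product is 100 * 10^6 * 1000 = 10^11, which gives 100000 uW (100 mW) after the division.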
@ -53,7 +53,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
|
|||
|
||||
for (i = 0; i < pd->nr_perf_states; i++) {
|
||||
|
||||
-power = pd->table[i].power * MICROWATT_PER_MILLIWATT * nr_cpus;
+power = pd->table[i].power * nr_cpus;
|
||||
|
||||
if (power > power_limit)
|
||||
break;
|
||||
|
@ -63,8 +63,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
|
|||
|
||||
freq_qos_update_request(&dtpm_cpu->qos_req, freq);
|
||||
|
||||
-power_limit = pd->table[i - 1].power *
-	MICROWATT_PER_MILLIWATT * nr_cpus;
+power_limit = pd->table[i - 1].power * nr_cpus;
|
||||
|
||||
return power_limit;
|
||||
}
|
||||
|
|
|
@ -1109,6 +1109,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
|
|||
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &rapl_defaults_core),
|
||||
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, &rapl_defaults_core),
|
||||
X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, &rapl_defaults_core),
|
||||
+X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, &rapl_defaults_core),
|
||||
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &rapl_defaults_spr_server),
|
||||
X86_MATCH_INTEL_FAM6_MODEL(LAKEFIELD, &rapl_defaults_core),
|
||||
|
||||
|
|
|
@ -140,7 +140,9 @@ static const struct x86_cpu_id pl4_support_ids[] = {
|
|||
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_TIGERLAKE_L, X86_FEATURE_ANY },
|
||||
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE, X86_FEATURE_ANY },
|
||||
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_L, X86_FEATURE_ANY },
|
||||
+{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_N, X86_FEATURE_ANY },
 { X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE, X86_FEATURE_ANY },
+{ X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE_P, X86_FEATURE_ANY },
|
||||
{}
|
||||
};
|
||||
|
||||
|
|
|
@ -21,6 +21,7 @@
|
|||
#include <linux/pm_qos.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/thermal.h>
|
||||
+#include <linux/units.h>
|
||||
|
||||
#include <trace/events/thermal.h>
|
||||
|
||||
|
@ -101,6 +102,7 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
|
|||
static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_cdev,
|
||||
u32 freq)
|
||||
{
|
||||
+unsigned long power_mw;
|
||||
int i;
|
||||
|
||||
for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
|
||||
|
@ -108,16 +110,23 @@ static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_cdev,
|
|||
break;
|
||||
}
|
||||
|
||||
-return cpufreq_cdev->em->table[i + 1].power;
+power_mw = cpufreq_cdev->em->table[i + 1].power;
+power_mw /= MICROWATT_PER_MILLIWATT;
+
+return power_mw;
|
||||
}
|
||||
|
||||
static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
|
||||
u32 power)
|
||||
{
|
||||
+unsigned long em_power_mw;
|
||||
int i;
|
||||
|
||||
for (i = cpufreq_cdev->max_level; i > 0; i--) {
|
||||
-if (power >= cpufreq_cdev->em->table[i].power)
+/* Convert EM power to milli-Watts to make safe comparison */
+em_power_mw = cpufreq_cdev->em->table[i].power;
+em_power_mw /= MICROWATT_PER_MILLIWATT;
+if (power >= em_power_mw)
|
||||
break;
|
||||
}
|
||||
|
||||
|
|
|
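A minimal userspace sketch of the lookup in the `cpu_power_to_freq()` hunk above: the thermal side keeps its budget in milli-Watts, so each EM entry (now micro-Watts) is converted before the comparison. The struct, the table values and `demo_power_to_freq()` are hypothetical; only the conversion step is taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match the kernel definition in <linux/units.h>. */
#define MICROWATT_PER_MILLIWATT 1000UL

/* Hypothetical stand-in for one EM performance state. */
struct demo_perf_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* uW, as stored by the EM after this merge */
};

/* Made-up table, sorted by ascending power. */
static const struct demo_perf_state demo_table[] = {
	{  500000, 150000 },
	{ 1000000, 400000 },
	{ 1500000, 900500 },
};
#define DEMO_NR_STATES (sizeof(demo_table) / sizeof(demo_table[0]))

/*
 * Return the frequency of the highest state whose power, converted to
 * milli-Watts, fits within power_mw. Falls back to the lowest state.
 */
static unsigned long demo_power_to_freq(const struct demo_perf_state *table,
					size_t nr_states,
					unsigned long power_mw)
{
	size_t i;

	assert(table && nr_states > 0);

	for (i = nr_states - 1; i > 0; i--) {
		/* Convert EM power to milli-Watts before comparing. */
		unsigned long em_power_mw =
			table[i].power / MICROWATT_PER_MILLIWATT;

		if (power_mw >= em_power_mw)
			break;
	}

	return table[i].frequency;
}
```

Because the division truncates, the 900500 uW entry compares as 900 mW, so a 900 mW budget still selects the top state in this made-up table. Comparing the raw micro-Watt value against a milli-Watt budget would instead reject every state above the lowest.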
@ -200,7 +200,11 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd
|
|||
res = dfc->power_ops->get_real_power(df, power, freq, voltage);
|
||||
if (!res) {
|
||||
state = dfc->capped_state;
|
||||
|
||||
+/* Convert EM power into milli-Watts first */
 dfc->res_util = dfc->em_pd->table[state].power;
+dfc->res_util /= MICROWATT_PER_MILLIWATT;
|
||||
|
||||
dfc->res_util *= SCALE_ERROR_MITIGATION;
|
||||
|
||||
if (*power > 1)
|
||||
|
@ -218,8 +222,10 @@ static int devfreq_cooling_get_requested_power(struct thermal_cooling_device *cd
|
|||
|
||||
_normalize_load(&status);
|
||||
|
||||
-/* Scale power for utilization */
+/* Convert EM power into milli-Watts first */
 *power = dfc->em_pd->table[perf_idx].power;
+*power /= MICROWATT_PER_MILLIWATT;
+/* Scale power for utilization */
|
||||
*power *= status.busy_time;
|
||||
*power >>= 10;
|
||||
}
|
||||
|
@ -244,6 +250,7 @@ static int devfreq_cooling_state2power(struct thermal_cooling_device *cdev,
|
|||
|
||||
perf_idx = dfc->max_state - state;
|
||||
*power = dfc->em_pd->table[perf_idx].power;
|
||||
+*power /= MICROWATT_PER_MILLIWATT;
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
@ -254,7 +261,7 @@ static int devfreq_cooling_power2state(struct thermal_cooling_device *cdev,
|
|||
struct devfreq_cooling_device *dfc = cdev->devdata;
|
||||
struct devfreq *df = dfc->devfreq;
|
||||
struct devfreq_dev_status status;
|
||||
-unsigned long freq;
+unsigned long freq, em_power_mw;
|
||||
s32 est_power;
|
||||
int i;
|
||||
|
||||
|
@ -279,9 +286,13 @@ static int devfreq_cooling_power2state(struct thermal_cooling_device *cdev,
|
|||
* Find the first cooling state that is within the power
|
||||
* budget. The EM power table is sorted ascending.
|
||||
*/
|
||||
-for (i = dfc->max_state; i > 0; i--)
-	if (est_power >= dfc->em_pd->table[i].power)
+for (i = dfc->max_state; i > 0; i--) {
+	/* Convert EM power to milli-Watts to make safe comparison */
+	em_power_mw = dfc->em_pd->table[i].power;
+	em_power_mw /= MICROWATT_PER_MILLIWATT;
+	if (est_power >= em_power_mw)
 		break;
+}
|
||||
|
||||
*state = dfc->max_state - i;
|
||||
dfc->capped_state = *state;
|
||||
|
|
|
@ -62,7 +62,7 @@ struct em_perf_domain {
|
|||
/*
|
||||
* em_perf_domain flags:
|
||||
*
|
||||
- * EM_PERF_DOMAIN_MILLIWATTS: The power values are in milli-Watts or some
+ * EM_PERF_DOMAIN_MICROWATTS: The power values are in micro-Watts or some
|
||||
* other scale.
|
||||
*
|
||||
* EM_PERF_DOMAIN_SKIP_INEFFICIENCIES: Skip inefficient states when estimating
|
||||
|
@ -71,7 +71,7 @@ struct em_perf_domain {
|
|||
* EM_PERF_DOMAIN_ARTIFICIAL: The power values are artificial and might be
|
||||
* created by platform missing real power information
|
||||
*/
|
||||
-#define EM_PERF_DOMAIN_MILLIWATTS BIT(0)
+#define EM_PERF_DOMAIN_MICROWATTS BIT(0)
|
||||
#define EM_PERF_DOMAIN_SKIP_INEFFICIENCIES BIT(1)
|
||||
#define EM_PERF_DOMAIN_ARTIFICIAL BIT(2)
|
||||
|
||||
|
@ -79,22 +79,44 @@ struct em_perf_domain {
|
|||
#define em_is_artificial(em) ((em)->flags & EM_PERF_DOMAIN_ARTIFICIAL)
|
||||
|
||||
#ifdef CONFIG_ENERGY_MODEL
|
||||
-#define EM_MAX_POWER 0xFFFF
+/*
+ * The max power value in micro-Watts. The limit of 64 Watts is set as
+ * a safety net to not overflow multiplications on 32bit platforms. The
+ * 32bit value limit for total Perf Domain power implies a limit of
+ * maximum CPUs in such domain to 64.
+ */
+#define EM_MAX_POWER (64000000) /* 64 Watts */

 /*
- * Increase resolution of energy estimation calculations for 64-bit
- * architectures. The extra resolution improves decision made by EAS for the
- * task placement when two Performance Domains might provide similar energy
- * estimation values (w/o better resolution the values could be equal).
- *
- * We increase resolution only if we have enough bits to allow this increased
- * resolution (i.e. 64-bit). The costs for increasing resolution when 32-bit
- * are pretty high and the returns do not justify the increased costs.
+ * To avoid possible energy estimation overflow on 32bit machines add
+ * limits to number of CPUs in the Perf. Domain.
+ * We are safe on 64bit machine, thus some big number.
  */
 #ifdef CONFIG_64BIT
-#define em_scale_power(p) ((p) * 1000)
+#define EM_MAX_NUM_CPUS 4096
 #else
-#define em_scale_power(p) (p)
+#define EM_MAX_NUM_CPUS 16
 #endif
|
||||
|
||||
+/*
+ * To avoid an overflow on 32bit machines while calculating the energy
+ * use a different order in the operation. First divide by the 'cpu_scale'
+ * which would reduce big value stored in the 'cost' field, then multiply by
+ * the 'sum_util'. This would allow to handle existing platforms, which have
+ * e.g. power ~1.3 Watt at max freq, so the 'cost' value > 1mln micro-Watts.
+ * In such scenario, where there are 4 CPUs in the Perf. Domain the 'sum_util'
+ * could be 4096, then multiplication: 'cost' * 'sum_util' would overflow.
+ * This reordering of operations has some limitations, we lose small
+ * precision in the estimation (comparing to 64bit platform w/o reordering).
+ *
+ * We are safe on 64bit machine.
+ */
+#ifdef CONFIG_64BIT
+#define em_estimate_energy(cost, sum_util, scale_cpu) \
+	(((cost) * (sum_util)) / (scale_cpu))
+#else
+#define em_estimate_energy(cost, sum_util, scale_cpu) \
+	(((cost) / (scale_cpu)) * (sum_util))
+#endif
|
||||
|
||||
struct em_data_callback {
|
||||
|
@ -112,7 +134,7 @@ struct em_data_callback {
|
|||
* and frequency.
|
||||
*
|
||||
* In case of CPUs, the power is the one of a single CPU in the domain,
|
||||
- * expressed in milli-Watts or an abstract scale. It is expected to
+ * expressed in micro-Watts or an abstract scale. It is expected to
|
||||
* fit in the [0, EM_MAX_POWER] range.
|
||||
*
|
||||
* Return 0 on success.
|
||||
|
@ -148,7 +170,7 @@ struct em_perf_domain *em_cpu_get(int cpu);
|
|||
struct em_perf_domain *em_pd_get(struct device *dev);
|
||||
int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
|
||||
struct em_data_callback *cb, cpumask_t *span,
|
||||
-				bool milliwatts);
+				bool microwatts);
|
||||
void em_dev_unregister_perf_domain(struct device *dev);
|
||||
|
||||
/**
|
||||
|
@ -273,7 +295,7 @@ static inline unsigned long em_cpu_energy(struct em_perf_domain *pd,
|
|||
* pd_nrg = ------------------------ (4)
|
||||
* scale_cpu
|
||||
*/
|
||||
-return ps->cost * sum_util / scale_cpu;
+return em_estimate_energy(ps->cost, sum_util, scale_cpu);
|
||||
}
|
||||
|
||||
/**
|
||||
|
@ -297,7 +319,7 @@ struct em_data_callback {};
|
|||
static inline
|
||||
int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
|
||||
struct em_data_callback *cb, cpumask_t *span,
|
||||
-				bool milliwatts)
+				bool microwatts)
|
||||
{
|
||||
return -EINVAL;
|
||||
}
|
||||
|
|
|
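A small userspace sketch of the overflow that the em_estimate_energy() comment above describes, using the numbers from that comment (cost of about 1.3 W in micro-Watts, sum_util of 4096) and assuming scale_cpu = 1024. The three function names are hypothetical; `uint32_t` stands in for `unsigned long` on a 32-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit order: multiply first, then divide (full precision). */
static uint64_t demo_energy_64(uint64_t cost, uint64_t sum_util,
			       uint64_t scale_cpu)
{
	assert(scale_cpu != 0);
	return (cost * sum_util) / scale_cpu;
}

/* 32-bit, old order: the product wraps modulo 2^32 before the division. */
static uint32_t demo_energy_32_naive(uint32_t cost, uint32_t sum_util,
				     uint32_t scale_cpu)
{
	assert(scale_cpu != 0);
	return (cost * sum_util) / scale_cpu;
}

/* 32-bit, reordered: divide first, then multiply (small precision loss). */
static uint32_t demo_energy_32_reordered(uint32_t cost, uint32_t sum_util,
					 uint32_t scale_cpu)
{
	assert(scale_cpu != 0);
	return (cost / scale_cpu) * sum_util;
}
```

With cost = 1300000, sum_util = 4096 and scale_cpu = 1024, the 64-bit order gives 5200000. The old order on 32 bits wraps (1300000 * 4096 exceeds 2^32) and yields 1005696, while the reordered form yields 5197824, which is the small precision loss the comment mentions.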
@ -109,7 +109,6 @@ extern struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws);
|
|||
extern int device_wakeup_enable(struct device *dev);
|
||||
extern int device_wakeup_disable(struct device *dev);
|
||||
extern void device_set_wakeup_capable(struct device *dev, bool capable);
|
||||
-extern int device_init_wakeup(struct device *dev, bool val);
|
||||
extern int device_set_wakeup_enable(struct device *dev, bool enable);
|
||||
extern void __pm_stay_awake(struct wakeup_source *ws);
|
||||
extern void pm_stay_awake(struct device *dev);
|
||||
|
@ -167,13 +166,6 @@ static inline int device_set_wakeup_enable(struct device *dev, bool enable)
|
|||
return 0;
|
||||
}
|
||||
|
||||
-static inline int device_init_wakeup(struct device *dev, bool val)
-{
-	device_set_wakeup_capable(dev, val);
-	device_set_wakeup_enable(dev, val);
-	return 0;
-}
|
||||
|
||||
static inline bool device_may_wakeup(struct device *dev)
|
||||
{
|
||||
return dev->power.can_wakeup && dev->power.should_wakeup;
|
||||
|
@ -217,4 +209,27 @@ static inline void pm_wakeup_hard_event(struct device *dev)
|
|||
return pm_wakeup_dev_event(dev, 0, true);
|
||||
}
|
||||
|
||||
+/**
+ * device_init_wakeup - Device wakeup initialization.
+ * @dev: Device to handle.
+ * @enable: Whether or not to enable @dev as a wakeup device.
+ *
+ * By default, most devices should leave wakeup disabled. The exceptions are
+ * devices that everyone expects to be wakeup sources: keyboards, power buttons,
+ * possibly network interfaces, etc. Also, devices that don't generate their
+ * own wakeup requests but merely forward requests from one bus to another
+ * (like PCI bridges) should have wakeup enabled by default.
+ */
+static inline int device_init_wakeup(struct device *dev, bool enable)
+{
+	if (enable) {
+		device_set_wakeup_capable(dev, true);
+		return device_wakeup_enable(dev);
+	} else {
+		device_wakeup_disable(dev);
+		device_set_wakeup_capable(dev, false);
+		return 0;
+	}
+}
|
||||
|
||||
#endif /* _LINUX_PM_WAKEUP_H */
|
||||
|
|
|
@ -60,6 +60,12 @@ struct scmi_clock_info {
|
|||
};
|
||||
};
|
||||
|
||||
+enum scmi_power_scale {
+	SCMI_POWER_BOGOWATTS,
+	SCMI_POWER_MILLIWATTS,
+	SCMI_POWER_MICROWATTS
+};
|
||||
|
||||
struct scmi_handle;
|
||||
struct scmi_device;
|
||||
struct scmi_protocol_handle;
|
||||
|
@ -135,7 +141,7 @@ struct scmi_perf_proto_ops {
|
|||
unsigned long *rate, unsigned long *power);
|
||||
bool (*fast_switch_possible)(const struct scmi_protocol_handle *ph,
|
||||
struct device *dev);
|
||||
-bool (*power_scale_mw_get)(const struct scmi_protocol_handle *ph);
+enum scmi_power_scale (*power_scale_get)(const struct scmi_protocol_handle *ph);
|
||||
};
|
||||
|
||||
/**
|
||||
|
|
|
@ -145,7 +145,7 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd,
|
|||
|
||||
/*
|
||||
* The power returned by active_state() is expected to be
|
||||
- * positive and to fit into 16 bits.
+ * positive and be in range.
|
||||
*/
|
||||
if (!power || power > EM_MAX_POWER) {
|
||||
dev_err(dev, "EM: invalid power: %lu\n",
|
||||
|
@ -170,7 +170,7 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd,
|
|||
goto free_ps_table;
|
||||
}
|
||||
} else {
|
||||
-power_res = em_scale_power(table[i].power);
+power_res = table[i].power;
|
||||
cost = div64_u64(fmax * power_res, table[i].frequency);
|
||||
}
|
||||
|
||||
|
@ -201,9 +201,17 @@ static int em_create_pd(struct device *dev, int nr_states,
|
|||
{
|
||||
struct em_perf_domain *pd;
|
||||
struct device *cpu_dev;
|
||||
-int cpu, ret;
+int cpu, ret, num_cpus;
|
||||
|
||||
if (_is_cpu_device(dev)) {
|
||||
+	num_cpus = cpumask_weight(cpus);
+
+	/* Prevent max possible energy calculation to not overflow */
+	if (num_cpus > EM_MAX_NUM_CPUS) {
+		dev_err(dev, "EM: too many CPUs, overflow possible\n");
+		return -EINVAL;
+	}
|
||||
|
||||
pd = kzalloc(sizeof(*pd) + cpumask_size(), GFP_KERNEL);
|
||||
if (!pd)
|
||||
return -ENOMEM;
|
||||
|
@ -314,13 +322,13 @@ EXPORT_SYMBOL_GPL(em_cpu_get);
|
|||
* @cpus : Pointer to cpumask_t, which in case of a CPU device is
|
||||
* obligatory. It can be taken from i.e. 'policy->cpus'. For other
|
||||
* type of devices this should be set to NULL.
|
||||
- * @milliwatts : Flag indicating that the power values are in milliWatts or
+ * @microwatts : Flag indicating that the power values are in micro-Watts or
|
||||
* in some other scale. It must be set properly.
|
||||
*
|
||||
* Create Energy Model tables for a performance domain using the callbacks
|
||||
* defined in cb.
|
||||
*
|
||||
- * The @milliwatts is important to set with correct value. Some kernel
+ * The @microwatts is important to set with correct value. Some kernel
|
||||
* sub-systems might rely on this flag and check if all devices in the EM are
|
||||
* using the same scale.
|
||||
*
|
||||
|
@ -331,7 +339,7 @@ EXPORT_SYMBOL_GPL(em_cpu_get);
|
|||
*/
|
||||
int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
|
||||
struct em_data_callback *cb, cpumask_t *cpus,
|
||||
-				bool milliwatts)
+				bool microwatts)
|
||||
{
|
||||
unsigned long cap, prev_cap = 0;
|
||||
unsigned long flags = 0;
|
||||
|
@ -381,8 +389,8 @@ int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
|
|||
}
|
||||
}
|
||||
|
||||
-if (milliwatts)
-	flags |= EM_PERF_DOMAIN_MILLIWATTS;
+if (microwatts)
+	flags |= EM_PERF_DOMAIN_MICROWATTS;
|
||||
else if (cb->get_cost)
|
||||
flags |= EM_PERF_DOMAIN_ARTIFICIAL;
|
||||
|
||||
|
|
|
@ -26,6 +26,7 @@
|
|||
|
||||
#include "power.h"
|
||||
|
||||
+static bool need_wait;
|
||||
|
||||
static struct snapshot_data {
|
||||
struct snapshot_handle handle;
|
||||
|
@ -78,7 +79,7 @@ static int snapshot_open(struct inode *inode, struct file *filp)
|
|||
* Resuming. We may need to wait for the image device to
|
||||
* appear.
|
||||
*/
|
||||
-wait_for_device_probe();
+need_wait = true;
|
||||
|
||||
data->swap = -1;
|
||||
data->mode = O_WRONLY;
|
||||
|
@ -168,6 +169,11 @@ static ssize_t snapshot_write(struct file *filp, const char __user *buf,
|
|||
ssize_t res;
|
||||
loff_t pg_offp = *offp & ~PAGE_MASK;
|
||||
|
||||
+if (need_wait) {
+	wait_for_device_probe();
+	need_wait = false;
+}
|
||||
|
||||
lock_system_sleep();
|
||||
|
||||
data = filp->private_data;
|
||||
|
@ -244,6 +250,11 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
|
|||
loff_t size;
|
||||
sector_t offset;
|
||||
|
||||
+if (need_wait) {
+	wait_for_device_probe();
+	need_wait = false;
+}
|
||||
|
||||
if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
|
||||
return -ENOTTY;
|
||||
if (_IOC_NR(cmd) > SNAPSHOT_IOC_MAXNR)
|
||||
|
|