Additional power management updates for 5.6-rc1

Update the recently merged CPR (Core Power Reduction) support in the
 AVS (Adaptive Voltage Scaling) subsystem (Brendan Higgins, Nathan
 Chancellor, Niklas Cassel) and the rockchip-io AVS driver (Heiko
 Stuebner), add two more module parameters to intel_idle on top of the
 recently merged material, clean up a piece of cpuidle documentation
 and consolidate system sleep states documentation (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl49O1kSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx7OMP/34DA5zPXhughdawhrw4z9N8Xl/VKbva
 k6wr26C4Bwrd19i4MU6LdrjbbMBfbg7gsUtuO/sF9dXjsFlcMrfsgsdIO5QH26qL
 yD68wwv/F7/MWUWhY7wVcTI5bnBnS1WF6ygnhRCfhI7R7Zq/WWQwK6D9/yZG/9uo
 AfTEv+yfXSBg+ByzaFqYt7S6QTTnUPrp29ROOJNE7Eryz2n1S9cSxCZPAQbqVNfr
 0SRX9EPUNkNTYfR+QEtIs6uXrCWpbAePdxenirRYMvTYidjDvWG3kXNp+MlwR9co
 ieiYmCELOnlmYXu2vOPmBWW5rnxP+4aQuOVpuCPRxRwecKUxZwcGL0eh7e4A0wP8
 /WwiTHyKphlqPWSaYvWKWDI2razDNnFnqkmXgfb68rAZNUNRRMbTErHzu2w4A1vj
 NiGCWqR+5Wimp4Ztk/WfFQYxPb2wr+a8PdwNWtH03goB2JOvHSZbNCB+BqX0yJJt
 GECDn6mFKCqJRQXfsxa+RIrVj3oXouExlb6VtKc3AHPSn+XyzQgRbo+CH0XvTnNI
 uoiL3PY4ayR6WBJjveRW9lLw3Acul475O1Xd0OzCNx2gsD6cRG7sE9cj4vRltdpl
 qW/t7vkS4JiqASgQEfZui4EZFxIk0Brv9v7KNl43bIcyW5SCkyjAbWeCR18I9l6l
 kxn8zXSTxtKZ
 =Yc9a
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:

 - Update the recently merged CPR (Core Power Reduction) support in the
   AVS (Adaptive Voltage Scaling) subsystem (Brendan Higgins, Nathan
   Chancellor, Niklas Cassel)

 - Update the rockchip-io AVS driver (Heiko Stuebner)

 - Add two more module parameters to intel_idle on top of the recently
   merged material (Rafael Wysocki)

 - Clean up a piece of cpuidle documentation and consolidate system
   sleep states documentation (Rafael Wysocki)

* tag 'pm-5.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: Documentation: Clean up PM QoS description
  Documentation: admin-guide: PM: Update sleep states documentation
  intel_idle: Introduce 'states_off' module parameter
  intel_idle: Introduce 'use_acpi' module parameter
  power: avs: qcom-cpr: Avoid clang -Wsometimes-uninitialized in cpr_scale
  power: avs: qcom-cpr: add unspecified HAS_IOMEM dependency
  PM / AVS: rockchip-io: fix the supply naming for the emmc supply on px30
  power: avs: qcom-cpr: add a printout after the driver has been initialized
This commit is contained in:
Linus Torvalds 2020-02-07 12:49:10 -08:00
commit ba7dcfc7ba
8 changed files with 125 additions and 115 deletions

View File

@ -632,16 +632,16 @@ class priority list and destroyed. If that happens, the priority list mechanism
will be used, again, to determine the new effective value for the whole list
and that value will become the new real constraint.
In turn, for each CPU there is only one resume latency PM QoS request
associated with the :file:`power/pm_qos_resume_latency_us` file under
In turn, for each CPU there is one resume latency PM QoS request associated with
the :file:`power/pm_qos_resume_latency_us` file under
:file:`/sys/devices/system/cpu/cpu<N>/` in ``sysfs`` and writing to it causes
this single PM QoS request to be updated regardless of which user space
process does that. In other words, this PM QoS request is shared by the entire
user space, so access to the file associated with it needs to be arbitrated
to avoid confusion. [Arguably, the only legitimate use of this mechanism in
practice is to pin a process to the CPU in question and let it use the
``sysfs`` interface to control the resume latency constraint for it.] It
still only is a request, however. It is a member of a priority list used to
``sysfs`` interface to control the resume latency constraint for it.] It is
still only a request, however. It is an entry in a priority list used to
determine the effective value to be set as the resume latency constraint for the
CPU in question every time the list of requests is updated this way or another
(there may be other requests coming from kernel code in that list).

View File

@ -60,6 +60,9 @@ of the system. The former are always used if the processor model at hand is
recognized by ``intel_idle`` and the latter are used if that is required for
the given processor model (which is the case for all server processor models
recognized by ``intel_idle``) or if the processor model is not recognized.
[There is a module parameter that can be used to make the driver use the ACPI
tables with any processor model recognized by it; see
`below <intel-idle-parameters_>`_.]
If the ACPI tables are going to be used for building the list of available idle
states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
@ -165,7 +168,7 @@ and ``idle=nomwait``. If any of them is present in the kernel command line, the
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
Apart from that there are two module parameters recognized by ``intel_idle``
Apart from that there are four module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
@ -186,9 +189,28 @@ QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states
even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
kernel has been configured with ACPI support), can be set to make the driver
ignore the system's ACPI tables entirely (it is unset by default).
The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
if the kernel has been configured with ACPI support) can be set to make the
driver ignore the system's ACPI tables entirely or use them for all of the
recognized processor models, respectively (they both are unset by default and
``use_acpi`` has no effect if ``no_acpi`` is set).
The value of the ``states_off`` module parameter (0 by default) represents a
list of idle states to be disabled by default in the form of a bitmask.
Namely, the positions of the bits that are set in the ``states_off`` value are
the indices of idle states to be disabled by default (as reflected by the names
of the corresponding idle state directories in ``sysfs``, :file:`state0`,
:file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the index of the given
idle state; see :ref:`idle-states-representation` in :doc:`cpuidle`).
For example, if ``states_off`` is equal to 3, the driver will disable idle
states 0 and 1 by default, and if it is equal to 8, idle state 3 will be
disabled by default and so on (bit positions beyond the maximum idle state index
are ignored).
The idle states disabled this way can be enabled (on a per-CPU basis) from user
space via ``sysfs``.
.. _intel-idle-core-and-package-idle-states:

View File

@ -153,8 +153,11 @@ for the given CPU architecture includes the low-level code for system resume.
Basic ``sysfs`` Interfaces for System Suspend and Hibernation
=============================================================
The following files located in the :file:`/sys/power/` directory can be used by
user space for sleep states control.
The power management subsystem provides userspace with a unified ``sysfs``
interface for system sleep regardless of the underlying system architecture or
platform. That interface is located in the :file:`/sys/power/` directory
(assuming that ``sysfs`` is mounted at :file:`/sys`) and it consists of the
following attributes (files):
``state``
This file contains a list of strings representing sleep states supported
@ -162,9 +165,9 @@ user space for sleep states control.
to start a transition of the system into the sleep state represented by
that string.
In particular, the strings "disk", "freeze" and "standby" represent the
In particular, the "disk", "freeze" and "standby" strings represent the
:ref:`hibernation <hibernation>`, :ref:`suspend-to-idle <s2idle>` and
:ref:`standby <standby>` sleep states, respectively. The string "mem"
:ref:`standby <standby>` sleep states, respectively. The "mem" string
is interpreted in accordance with the contents of the ``mem_sleep`` file
described below.
@ -177,7 +180,7 @@ user space for sleep states control.
associated with the "mem" string in the ``state`` file described above.
The strings that may be present in this file are "s2idle", "shallow"
and "deep". The string "s2idle" always represents :ref:`suspend-to-idle
and "deep". The "s2idle" string always represents :ref:`suspend-to-idle
<s2idle>` and, by convention, "shallow" and "deep" represent
:ref:`standby <standby>` and :ref:`suspend-to-RAM <s2ram>`,
respectively.
@ -185,15 +188,17 @@ user space for sleep states control.
Writing one of the listed strings into this file causes the system
suspend variant represented by it to be associated with the "mem" string
in the ``state`` file. The string representing the suspend variant
currently associated with the "mem" string in the ``state`` file
is listed in square brackets.
currently associated with the "mem" string in the ``state`` file is
shown in square brackets.
If the kernel does not support system suspend, this file is not present.
``disk``
This file contains a list of strings representing different operations
that can be carried out after the hibernation image has been saved. The
possible options are as follows:
This file controls the operating mode of hibernation (Suspend-to-Disk).
Specifically, it tells the kernel what to do after creating a
hibernation image.
Reading from it returns a list of supported options encoded as:
``platform``
Put the system into a special low-power state (e.g. ACPI S4) to
@ -201,6 +206,11 @@ user space for sleep states control.
platform firmware to take a simplified initialization path after
wakeup.
It is only available if the platform provides a special
mechanism to put the system to sleep after creating a
hibernation image (platforms with ACPI do that as a rule, for
example).
``shutdown``
Power off the system.
@ -214,22 +224,53 @@ user space for sleep states control.
the hibernation image and continue. Otherwise, use the image
to restore the previous state of the system.
It is available if system suspend is supported.
``test_resume``
Diagnostic operation. Load the image as though the system had
just woken up from hibernation and the currently running kernel
instance was a restore kernel and follow up with full system
resume.
Writing one of the listed strings into this file causes the option
Writing one of the strings listed above into this file causes the option
represented by it to be selected.
The currently selected option is shown in square brackets which means
The currently selected option is shown in square brackets, which means
that the operation represented by it will be carried out after creating
and saving the image next time hibernation is triggered by writing
``disk`` to :file:`/sys/power/state`.
and saving the image when hibernation is triggered by writing ``disk``
to :file:`/sys/power/state`.
If the kernel does not support hibernation, this file is not present.
``image_size``
This file controls the size of hibernation images.
It can be written a string representing a non-negative integer that will
be used as a best-effort upper limit of the image size, in bytes. The
hibernation core will do its best to ensure that the image size will not
exceed that number, but if that turns out to be impossible to achieve, a
hibernation image will still be created and its size will be as small as
possible. In particular, writing '0' to this file causes the size of
hibernation images to be minimum.
Reading from it returns the current image size limit, which is set to
around 2/5 of the available RAM size by default.
``pm_trace``
This file controls the "PM trace" mechanism saving the last suspend
or resume event point in the RTC memory across reboots. It helps to
debug hard lockups or reboots due to device driver failures that occur
during system suspend or resume (which is more common) more effectively.
If it contains "1", the fingerprint of each suspend/resume event point
in turn will be stored in the RTC memory (overwriting the actual RTC
information), so it will survive a system crash if one occurs right
after storing it and it can be used later to identify the driver that
caused the crash to happen.
It contains "0" by default, which may be changed to "1" by writing a
string representing a nonzero integer into it.
According to the above, there are two ways to make the system go into the
:ref:`suspend-to-idle <s2idle>` state. The first one is to write "freeze"
directly to :file:`/sys/power/state`. The second one is to write "s2idle" to
@ -244,6 +285,7 @@ system go into the :ref:`suspend-to-RAM <s2ram>` state (write "deep" into
The default suspend variant (ie. the one to be used without writing anything
into :file:`/sys/power/mem_sleep`) is either "deep" (on the majority of systems
supporting :ref:`suspend-to-RAM <s2ram>`) or "s2idle", but it can be overridden
by the value of the "mem_sleep_default" parameter in the kernel command line.
On some ACPI-based systems, depending on the information in the ACPI tables, the
default may be "s2idle" even if :ref:`suspend-to-RAM <s2ram>` is supported.
by the value of the ``mem_sleep_default`` parameter in the kernel command line.
On some systems with ACPI, depending on the information in the ACPI tables, the
default may be "s2idle" even if :ref:`suspend-to-RAM <s2ram>` is supported in
principle.

View File

@ -1,79 +0,0 @@
===========================================
Power Management Interface for System Sleep
===========================================
Copyright (c) 2016 Intel Corp., Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The power management subsystem provides userspace with a unified sysfs interface
for system sleep regardless of the underlying system architecture or platform.
The interface is located in the /sys/power/ directory (assuming that sysfs is
mounted at /sys).
/sys/power/state is the system sleep state control file.
Reading from it returns a list of supported sleep states, encoded as:
- 'freeze' (Suspend-to-Idle)
- 'standby' (Power-On Suspend)
- 'mem' (Suspend-to-RAM)
- 'disk' (Suspend-to-Disk)
Suspend-to-Idle is always supported. Suspend-to-Disk is always supported
too as long the kernel has been configured to support hibernation at all
(ie. CONFIG_HIBERNATION is set in the kernel configuration file). Support
for Suspend-to-RAM and Power-On Suspend depends on the capabilities of the
platform.
If one of the strings listed in /sys/power/state is written to it, the system
will attempt to transition into the corresponding sleep state. Refer to
Documentation/admin-guide/pm/sleep-states.rst for a description of each of
those states.
/sys/power/disk controls the operating mode of hibernation (Suspend-to-Disk).
Specifically, it tells the kernel what to do after creating a hibernation image.
Reading from it returns a list of supported options encoded as:
- 'platform' (put the system into sleep using a platform-provided method)
- 'shutdown' (shut the system down)
- 'reboot' (reboot the system)
- 'suspend' (trigger a Suspend-to-RAM transition)
- 'test_resume' (resume-after-hibernation test mode)
The currently selected option is printed in square brackets.
The 'platform' option is only available if the platform provides a special
mechanism to put the system to sleep after creating a hibernation image (ACPI
does that, for example). The 'suspend' option is available if Suspend-to-RAM
is supported. Refer to Documentation/power/basic-pm-debugging.rst for the
description of the 'test_resume' option.
To select an option, write the string representing it to /sys/power/disk.
/sys/power/image_size controls the size of hibernation images.
It can be written a string representing a non-negative integer that will be
used as a best-effort upper limit of the image size, in bytes. The hibernation
core will do its best to ensure that the image size will not exceed that number.
However, if that turns out to be impossible to achieve, a hibernation image will
still be created and its size will be as small as possible. In particular,
writing '0' to this file will enforce hibernation images to be as small as
possible.
Reading from this file returns the current image size limit, which is set to
around 2/5 of available RAM by default.
/sys/power/pm_trace controls the PM trace mechanism saving the last suspend
or resume event point in the RTC across reboots.
It helps to debug hard lockups or reboots due to device driver failures that
occur during system suspend or resume (which is more common) more effectively.
If /sys/power/pm_trace contains '1', the fingerprint of each suspend/resume
event point in turn will be stored in the RTC memory (overwriting the actual
RTC information), so it will survive a system crash if one occurs right after
storing it and it can be used later to identify the driver that caused the crash
to happen (see Documentation/power/s2ram.rst for more information).
Initially it contains '0' which may be changed to '1' by writing a string
representing a nonzero integer into it.

View File

@ -63,6 +63,7 @@ static struct cpuidle_driver intel_idle_driver = {
};
/* intel_idle.max_cstate=0 disables driver */
static int max_cstate = CPUIDLE_STATE_MAX - 1;
static unsigned int disabled_states_mask;
static unsigned int mwait_substates;
@ -1131,6 +1132,10 @@ static bool no_acpi __read_mostly;
module_param(no_acpi, bool, 0444);
MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
static bool force_use_acpi __read_mostly; /* No effect if no_acpi is set. */
module_param_named(use_acpi, force_use_acpi, bool, 0444);
MODULE_PARM_DESC(use_acpi, "Use ACPI _CST for building the idle states list");
static struct acpi_processor_power acpi_state_table __initdata;
/**
@ -1230,6 +1235,9 @@ static void __init intel_idle_init_cstates_acpi(struct cpuidle_driver *drv)
if (cx->type > ACPI_STATE_C2)
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
if (disabled_states_mask & BIT(cstate))
state->flags |= CPUIDLE_FLAG_OFF;
state->enter = intel_idle;
state->enter_s2idle = intel_idle_s2idle;
}
@ -1258,6 +1266,8 @@ static bool __init intel_idle_off_by_default(u32 mwait_hint)
return true;
}
#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
#define force_use_acpi (false)
static inline bool intel_idle_acpi_cst_extract(void) { return false; }
static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
@ -1460,8 +1470,10 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
if ((disabled_states_mask & BIT(drv->state_count)) ||
((icpu->use_acpi || force_use_acpi) &&
intel_idle_off_by_default(mwait_hint) &&
!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
drv->state_count++;
@ -1480,6 +1492,10 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv)
{
cpuidle_poll_state_init(drv);
if (disabled_states_mask & BIT(0))
drv->states[0].flags |= CPUIDLE_FLAG_OFF;
drv->state_count = 1;
if (icpu)
@ -1607,7 +1623,7 @@ static int __init intel_idle_init(void)
icpu = (const struct idle_cpu *)id->driver_data;
if (icpu) {
cpuidle_state_table = icpu->state_table;
if (icpu->use_acpi)
if (icpu->use_acpi || force_use_acpi)
intel_idle_acpi_cst_extract();
} else if (!intel_idle_acpi_cst_extract()) {
return -ENODEV;
@ -1660,3 +1676,11 @@ device_initcall(intel_idle_init);
* is the easiest way (currently) to continue doing that.
*/
module_param(max_cstate, int, 0444);
/*
* The positions of the bits that are set in this number are the indices of the
* idle states to be disabled by default (as reflected by the names of the
* corresponding idle state directories in sysfs, "state0", "state1" ...
* "state<i>" ..., where <i> is the index of the given state).
*/
module_param_named(states_off, disabled_states_mask, uint, 0444);
MODULE_PARM_DESC(states_off, "Mask of disabled idle states");

View File

@ -14,7 +14,7 @@ menuconfig POWER_AVS
config QCOM_CPR
tristate "QCOM Core Power Reduction (CPR) support"
depends on POWER_AVS
depends on POWER_AVS && HAS_IOMEM
select PM_OPP
select REGMAP
help

View File

@ -517,7 +517,7 @@ static int cpr_scale(struct cpr_drv *drv, enum voltage_change_dir dir)
dev_dbg(drv->dev,
"UP: -> new_uV: %d last_uV: %d perf state: %u\n",
new_uV, last_uV, cpr_get_cur_perf_state(drv));
} else if (dir == DOWN) {
} else {
if (desc->clamp_timer_interval &&
error_steps < desc->down_threshold) {
/*
@ -567,7 +567,7 @@ static int cpr_scale(struct cpr_drv *drv, enum voltage_change_dir dir)
/* Disable auto nack down */
reg_mask = RBCPR_CTL_SW_AUTO_CONT_NACK_DN_EN;
val = 0;
} else if (dir == DOWN) {
} else {
/* Restore default threshold for UP */
reg_mask = RBCPR_CTL_UP_THRESHOLD_MASK;
reg_mask <<= RBCPR_CTL_UP_THRESHOLD_SHIFT;
@ -1547,8 +1547,6 @@ static int cpr_pd_attach_dev(struct generic_pm_domain *domain,
goto unlock;
}
dev_dbg(drv->dev, "number of OPPs: %d\n", drv->num_corners);
drv->corners = devm_kcalloc(drv->dev, drv->num_corners,
sizeof(*drv->corners),
GFP_KERNEL);
@ -1586,6 +1584,9 @@ static int cpr_pd_attach_dev(struct generic_pm_domain *domain,
acc_desc->enable_mask,
acc_desc->enable_mask);
dev_info(drv->dev, "driver initialized with %u OPPs\n",
drv->num_corners);
unlock:
mutex_unlock(&drv->lock);

View File

@ -152,18 +152,18 @@ static void px30_iodomain_init(struct rockchip_iodomain *iod)
int ret;
u32 val;
/* if no VCCIO0 supply we should leave things alone */
/* if no VCCIO6 supply we should leave things alone */
if (!iod->supplies[PX30_IO_VSEL_VCCIO6_SUPPLY_NUM].reg)
return;
/*
* set vccio0 iodomain to also use this framework
* set vccio6 iodomain to also use this framework
* instead of a special gpio.
*/
val = PX30_IO_VSEL_VCCIO6_SRC | (PX30_IO_VSEL_VCCIO6_SRC << 16);
ret = regmap_write(iod->grf, PX30_IO_VSEL, val);
if (ret < 0)
dev_warn(iod->dev, "couldn't update vccio0 ctrl\n");
dev_warn(iod->dev, "couldn't update vccio6 ctrl\n");
}
static void rk3288_iodomain_init(struct rockchip_iodomain *iod)