[ Upstream commit 8d74f1da776da9b0306630b13a3025214fa44618 ]
Currently EXPORT_*_SIMPLE_DEV_PM_OPS() use EXPORT_*_DEV_PM_OPS() set
of macros to export dev_pm_ops symbol, which export the symbol in case
CONFIG_PM=y but don't take CONFIG_PM_SLEEP into consideration.
Since _SIMPLE_ variants of _PM_OPS() do not include runtime PM handles
and are only used in case CONFIG_PM_SLEEP=y, we should not be exporting
dev_pm_ops symbol for them in case CONFIG_PM_SLEEP=n.
This can be fixed by having two distinct set of export macros for both
_RUNTIME_ and _SIMPLE_ variants of _PM_OPS(), such that the export of
dev_pm_ops symbol used in each variant depends on CONFIG_PM and
CONFIG_PM_SLEEP respectively.
Introduce _DEV_SLEEP_PM_OPS() set of export macros for _SIMPLE_ variants
of _PM_OPS(), which export dev_pm_ops symbol only in case CONFIG_PM_SLEEP=y
and discard it otherwise.
Fixes: 34e1ed189f ("PM: Improve EXPORT_*_DEV_PM_OPS macros")
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
_DEFINE_DEV_PM_OPS() helps to define PM operations for the system sleep
and/or runtime PM cases. Some of the existing users want to have _noirq()
variants to be set. For that purpose introduce a new helper which sets
up _noirq() callbacks to be assigned and struct dev_pm_ops be provided.
Acked-by: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230717172821.62827-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Now, EXPORT_SYMBOL() is populated in two stages. In the first stage,
all of EXPORT_SYMBOL/EXPORT_SYMBOL_GPL go into the same section,
'.export_symbol'.
'sec' does not make sense any more. Rename it to 'license'.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Commit 7b4537199a ("kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS") made modpost output CRCs in the same way
whether the EXPORT_SYMBOL() is placed in *.c or *.S.
For further cleanups, this commit applies a similar approach to the
entire data structure of EXPORT_SYMBOL().
The EXPORT_SYMBOL() compilation is split into two stages.
When a source file is compiled, EXPORT_SYMBOL() will be converted into
a dummy symbol in the .export_symbol section.
For example,
EXPORT_SYMBOL(foo);
EXPORT_SYMBOL_NS_GPL(bar, BAR_NAMESPACE);
will be encoded into the following assembly code:
.section ".export_symbol","a"
__export_symbol_foo:
.asciz "" /* license */
.asciz "" /* name space */
.balign 8
.quad foo /* symbol reference */
.previous
.section ".export_symbol","a"
__export_symbol_bar:
.asciz "GPL" /* license */
.asciz "BAR_NAMESPACE" /* name space */
.balign 8
.quad bar /* symbol reference */
.previous
They are mere markers to tell modpost the name, license, and namespace
of the symbols. They will be dropped from the final vmlinux and modules
because the *(.export_symbol) will go into /DISCARD/ in the linker script.
Then, modpost extracts all the information about EXPORT_SYMBOL() from the
.export_symbol section, and generates the final C code:
KSYMTAB_FUNC(foo, "", "");
KSYMTAB_FUNC(bar, "_gpl", "BAR_NAMESPACE");
KSYMTAB_FUNC() (or KSYMTAB_DATA() if it is data) is expanded to struct
kernel_symbol that will be linked to the vmlinux or a module.
With this change, EXPORT_SYMBOL() works in the same way for *.c and *.S
files, providing the following benefits.
[1] Deprecate EXPORT_DATA_SYMBOL()
In the old days, EXPORT_SYMBOL() was only available in C files. To export
a symbol in *.S, EXPORT_SYMBOL() was placed in a separate *.c file.
arch/arm/kernel/armksyms.c is one example written in the classic manner.
Commit 22823ab419 ("EXPORT_SYMBOL() for asm") removed this limitation.
Since then, EXPORT_SYMBOL() can be placed close to the symbol definition
in *.S files. It was a nice improvement.
However, as that commit mentioned, you need to use EXPORT_DATA_SYMBOL()
for data objects on some architectures.
In the new approach, modpost checks symbol's type (STT_FUNC or not),
and outputs KSYMTAB_FUNC() or KSYMTAB_DATA() accordingly.
There are only two users of EXPORT_DATA_SYMBOL:
EXPORT_DATA_SYMBOL_GPL(empty_zero_page) (arch/ia64/kernel/head.S)
EXPORT_DATA_SYMBOL(ia64_ivt) (arch/ia64/kernel/ivt.S)
They are transformed as follows and output into .vmlinux.export.c
KSYMTAB_DATA(empty_zero_page, "_gpl", "");
KSYMTAB_DATA(ia64_ivt, "", "");
The other EXPORT_SYMBOL users in ia64 assembly are output as
KSYMTAB_FUNC().
EXPORT_DATA_SYMBOL() is now deprecated.
[2] merge <linux/export.h> and <asm-generic/export.h>
There are two similar header implementations:
include/linux/export.h for .c files
include/asm-generic/export.h for .S files
Ideally, the functionality should be consistent between them, but they
tend to diverge.
Commit 8651ec01da ("module: add support for symbol namespaces.") did
not support the namespace for *.S files.
This commit shifts the essential implementation part to C, which supports
EXPORT_SYMBOL_NS() for *.S files.
<asm/export.h> and <asm-generic/export.h> will remain as a wrapper of
<linux/export.h> for a while.
They will be removed after #include <asm/export.h> directives are all
replaced with #include <linux/export.h>.
[3] Implement CONFIG_TRIM_UNUSED_KSYMS in one-pass algorithm (by a later commit)
When CONFIG_TRIM_UNUSED_KSYMS is enabled, Kbuild recursively traverses
the directory tree to determine which EXPORT_SYMBOL to trim. If an
EXPORT_SYMBOL turns out to be unused by anyone, Kbuild begins the
second traverse, where some source files are recompiled with their
EXPORT_SYMBOL() tuned into a no-op.
We can do this better now; modpost can selectively emit KSYMTAB entries
that are really used by modules.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Add a pair of macros for exporting functions only if CONFIG_PM
is enabled.
The naming follows the style of the standard EXPORT_SYMBOL_*()
macros that they replace.
Sometimes a module wants to export PM functions directly to other
drivers, not a complete struct dev_pm_ops. A typical example is
where a core library exports the generic (shared) implementation
and calling code wraps one or more of these in custom code.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Update the _EXPORT_DEV_PM_OPS() internal macro. It was not used anywhere
outside pm.h and pm_runtime.h, so it is safe to update it.
Before, this macro would take a few parameters to be used as sleep and
runtime callbacks. This made it unsuitable to use with different
callbacks, for instance the "noirq" ones.
It is now semantically different: instead of creating a conditionally
exported dev_pm_ops structure, it only contains part of the definition.
This macro should however never be used directly (hence the trailing
underscore). Instead, the following four macros are provided:
- EXPORT_DEV_PM_OPS(name)
- EXPORT_GPL_DEV_PM_OPS(name)
- EXPORT_NS_DEV_PM_OPS(name, ns)
- EXPORT_NS_GPL_DEV_PM_OPS(name, ns)
For instance, it is now possible to conditionally export noirq
suspend/resume PM functions like this:
EXPORT_GPL_DEV_PM_OPS(foo_pm_ops) = {
NOIRQ_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn)
};
The existing helper macros EXPORT_*_SIMPLE_DEV_PM_OPS() and
EXPORT_*_RUNTIME_DEV_PM_OPS() have been updated to use these new macros.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add Tegra234 cpufreq support (Sumit Gupta).
- Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
Rex-BC Chen, and Jia-Wei Chang).
- Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
Pierre Gondois).
- Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
Oudjana).
- Use list iterator only inside the list_for_each_entry loop (Xiaomeng
Tong, and Jakob Koschel).
- New APIs related to finding OPP based on interconnect bandwidth
(Krzysztof Kozlowski).
- Fix the missing of_node_put() in _bandwidth_supported() (Dan
Carpenter).
- Cleanups (Krzysztof Kozlowski, and Viresh Kumar).
- Add Out of Band mode description to the intel-speed-select utility
documentation (Srinivas Pandruvada).
- Add power sequences support to the system reboot and power off
code and make related platform-specific changes for multiple
platforms (Dmitry Osipenko, Geert Uytterhoeven).
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKU8lESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxVz0P91LNCbkDSt60jzNkXdEjsvUnI/YjJ+QJ
/+ta7iCwf90obb6s9soBkTyU8Ia7hJ/IWDJW/5xhdG0ySYF17hGNIGKK9xKGsJFK
tzzWtjFsvT3PeUZQERekqWp8OYskHYmQMj8o4jqqFF7DZD/AswTgkVLALUd7YhVL
UvLmcKsUA7eXy3ZrhtrGSzVSEbKOGXBLFyjy3IuWjfz6Uk/nGQRNKGf7byRWLM44
y7zb75/5+p4MPyyJP8M/uiXzEYDKuubRtfx9PdmLgBUSMbtho6eB1x47dZWooaxe
YKmcFjF80AmnwxHb+Te2rZHPeIYr+5hLBaEq7xaLQf/nAS3y5z1PIfI2wVQ5mXPz
D599jHHda/6oSAKCVTq2fKfnlR6fetm5j66xOQINpD+G5b5tNSpllXJDamFZxFgP
DiQAOFzdnRYnK7yTiLWVl1q76SVRxqsGz7/5Ak+NRj2OQK2wRkLzHuZfiV/8r0pk
ksi6Ew9TerXkstoTQsSToPQxB2VvosSajNU3Oy27pmM0oal1XxP0LIPz9sMor5/g
tfk5f6Yz/+FFIfXj3cZffZNdhsJgejmcqPdrSdCOV3sBrblnIMQNpHiYg4jGztoj
IjYKYPVpSaWiSZLQOaK2moTEvm9CfQz1TQCF+/Kz88LX6/7ZaDJFxHG2FDEob0sg
6KVbrZWweLI=
=PAh+
-----END PGP SIGNATURE-----
Merge tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the ARM cpufreq drivers and fix up the CPPC cpufreq
driver after recent changes, update the OPP code and PM documentation
and add power sequences support to the system reboot and power off
code.
Specifics:
- Add Tegra234 cpufreq support (Sumit Gupta)
- Clean up and enhance the Mediatek cpufreq driver (Wan Jiabing,
Rex-BC Chen, and Jia-Wei Chang)
- Fix up the CPPC cpufreq driver after recent changes (Zheng Bin,
Pierre Gondois)
- Minor update to dt-binding for Qcom's opp-v2-kryo-cpu (Yassine
Oudjana)
- Use list iterator only inside the list_for_each_entry loop
(Xiaomeng Tong, and Jakob Koschel)
- New APIs related to finding OPP based on interconnect bandwidth
(Krzysztof Kozlowski)
- Fix the missing of_node_put() in _bandwidth_supported() (Dan
Carpenter)
- Cleanups (Krzysztof Kozlowski, and Viresh Kumar)
- Add Out of Band mode description to the intel-speed-select utility
documentation (Srinivas Pandruvada)
- Add power sequences support to the system reboot and power off code
and make related platform-specific changes for multiple platforms
(Dmitry Osipenko, Geert Uytterhoeven)"
* tag 'pm-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (60 commits)
cpufreq: CPPC: Fix unused-function warning
cpufreq: CPPC: Fix build error without CONFIG_ACPI_CPPC_CPUFREQ_FIE
Documentation: admin-guide: PM: Add Out of Band mode
kernel/reboot: Change registration order of legacy power-off handler
m68k: virt: Switch to new sys-off handler API
kernel/reboot: Add devm_register_restart_handler()
kernel/reboot: Add devm_register_power_off_handler()
soc/tegra: pmc: Use sys-off handler API to power off Nexus 7 properly
reboot: Remove pm_power_off_prepare()
regulator: pfuze100: Use devm_register_sys_off_handler()
ACPI: power: Switch to sys-off handler API
memory: emif: Use kernel_can_power_off()
mips: Use do_kernel_power_off()
ia64: Use do_kernel_power_off()
x86: Use do_kernel_power_off()
sh: Use do_kernel_power_off()
m68k: Switch to new sys-off handler API
powerpc: Use do_kernel_power_off()
xen/x86: Use do_kernel_power_off()
parisc: Use do_kernel_power_off()
...
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously hidden
behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC
/ 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL
HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYpFUogAKCRDfioYZHlFs
Zz+VAP9o/NkYhbaM2Ne9ImgsdJii96gA8nN7q/q/ZoXjsSx2WQD+NRC5d3ZwZDCa
9YKEkntnvbnAZOCs+ZUuyZBgNh6vsgU=
=p92w
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for this cycle.
The highlight is new driver-core infrastructure and CXL subsystem
changes for allowing lockdep to validate device_lock() usage. Thanks
to PeterZ for setting me straight on the current capabilities of the
lockdep API, and Greg acked it as well.
On the CXL ACPI side this update adds support for CXL _OSC so that
platform firmware knows that it is safe to still grant Linux native
control of PCIe hotplug and error handling in the presence of CXL
devices. A circular dependency problem was discovered between suspend
and CXL memory for cases where the suspend image might be stored in
CXL memory where that image also contains the PCI register state to
restore to re-enable the device. Disable suspend for now until an
architecture is defined to clarify that conflict.
Lastly a collection of reworks, fixes, and cleanups to the CXL
subsystem where support for snooping mailbox commands and properly
handling the "mem_enable" flow are the highlights.
Summary:
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously
hidden behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag
set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL
DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
configurations (CXL HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure"
* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
cxl/port: Enable HDM Capability after validating DVSEC Ranges
cxl/port: Reuse 'struct cxl_hdm' context for hdm init
cxl/port: Move endpoint HDM Decoder Capability init to port driver
cxl/pci: Drop @info argument to cxl_hdm_decode_init()
cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
cxl/mem: Skip range enumeration if mem_enable clear
cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
cxl/pci: Move cxl_await_media_ready() to the core
cxl/mem: Validate port connectivity before dvsec ranges
cxl/mem: Fix cxl_mem_probe() error exit
cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
cxl/mem: Drop mem_enabled check from wait_for_media()
nvdimm: Fix firmware activation deadlock scenarios
device-core: Kill the lockdep_mutex
nvdimm: Drop nd_device_lock()
ACPI: NFIT: Drop nfit_device_lock()
nvdimm: Replace lockdep_mutex with local lock classes
cxl: Drop cxl_device_lock()
cxl/acpi: Add root device lockdep validation
...
All pm_power_off_prepare() users were converted to sys-off handler API.
Remove the obsolete global callback variable.
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CXL specification claims S3 support at a hardware level, but at a
system software level there are some missing pieces. Section 9.4 (CXL
2.0) rightly claims that "CXL mem adapters may need aux power to retain
memory context across S3", but there is no enumeration mechanism for the
OS to determine if a given adapter has that support. Moreover the save
state and resume image for the system may inadvertantly end up in a CXL
device that needs to be restored before the save state is recoverable.
I.e. a circular dependency that is not resolvable without a third party
save-area.
Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly
allows for suspend, but requires unbinding all CXL memory devices before
the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind
flow is intended to also tear down all CXL memory regions associated
with a given cxl_memdev.
It is reasonable to assume that any device participating in a System RAM
range published in the EFI memory map is covered by aux power and
save-area outside the device itself. So this restriction can be
minimized in the future once pre-existing region enumeration support
arrives, and perhaps a spec update to clarify if the EFI memory map is
sufficent for determining the range of devices managed by
platform-firmware for S3 support.
Per Rafael, if the CXL configuration prevents suspend then it should
fail early before tasks are frozen, and mem_sleep should stop showing
'mem' as an option [1]. Effectively CXL augments the platform suspend
->valid() op since, for example, the ACPI ops are not aware of the CXL /
PCI dependencies. Given the split role of platform firmware vs OS
provisioned CXL memory it is up to the cxl_mem driver to determine if
the CXL configuration has elements that platform firmware may not be
prepared to restore.
Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1]
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
As more drivers start to use namespaces, we need to have varients of these
useful macros that allow the export to be in a particular namespace.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, suspend_report_result() prints only function information.
If any driver uses a common PM function, nobody knows who exactly
called the failing function.
A device pinter is needed to recognize the failing device.
For example:
PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0
become after the change:
serial 00:05: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
pci 0000:00:01.3: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0
Signed-off-by: Youngjin Jang <yj84.jang@samsung.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A lot of drivers create a dev_pm_ops struct with the system sleep
suspend/resume callbacks set to pm_runtime_force_suspend() and
pm_runtime_force_resume().
These drivers can now use the DEFINE_RUNTIME_DEV_PM_OPS() macro, which
will use pm_runtime_force_{suspend,resume}() as the system sleep
callbacks, while having the same dead code removal characteristic that
is already provided by DEFINE_SIMPLE_DEV_PM_OPS().
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These macros are defined conditionally, according to CONFIG_PM:
- if CONFIG_PM is enabled, these macros resolve to
DEFINE_SIMPLE_DEV_PM_OPS(), and the dev_pm_ops symbol will be
exported.
- if CONFIG_PM is disabled, these macros will result in a dummy static
dev_pm_ops to be created with the __maybe_unused flag. The dev_pm_ops
will then be discarded by the compiler, along with the provided
callback functions if they are not used anywhere else.
In the second case, the symbol is not exported, which should be
perfectly fine - users of the symbol should all use the pm_ptr() or
pm_sleep_ptr() macro, so the dev_pm_ops marked as "extern" in the
client's code will never be accessed.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Keep this macro in line with the other ones. This makes it possible to
use them in the cases where the underlying dev_pm_ops structure is
exported.
Restore the "static" qualifier in the two drivers where the
DEFINE_SIMPLE_DEV_PM_OPS macro was used.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The deprecated UNIVERSAL_DEV_PM_OPS() macro uses the provided callbacks
for both runtime PM and system sleep, which is very likely to be a
mistake, as a system sleep can be triggered while a given device is
already PM-suspended, which would cause the suspend callback to be
called twice.
The amount of users of UNIVERSAL_DEV_PM_OPS() is also tiny (16
occurences) compared to the number of places where
SET_SYSTEM_SLEEP_PM_OPS() is used with pm_runtime_force_suspend() and
pm_runtime_force_resume(), which makes me think that none of these cases
are actually valid.
As the new macro DEFINE_UNIVERSAL_DEV_PM_OPS() which was introduced to
replace UNIVERSAL_DEV_PM_OPS() is currently unused, remove it before
someone starts to use it in yet another invalid case.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In some cases (for example, during system-wide suspend and resume of
devices) it is useful to know whether or not runtime PM has ever been
enabled for a given device and, if so, what the runtime PM status of
it had been right before runtime PM was disabled for it last time.
For this reason, introduce a new struct dev_pm_info field called
last_status that will be used for capturing the runtime PM status of
the device when its power.disable_depth counter changes from 0 to 1.
The new field will be set to RPM_INVALID to start with and whenever
power.disable_depth changes from 1 to 0, so it will be valid only
when runtime PM of the device is currently disabled, but it has been
enabled at least once.
Immediately use power.last_status in rpm_resume() to make it handle
the case when PM runtime is disabled for the device, but its runtime
PM status is RPM_ACTIVE more consistently. Namely, make it return 1
if power.last_status is also equal to RPM_ACTIVE in that case (the
idea being that if the status was RPM_ACTIVE last time when
power.disable_depth was changing from 0 to 1 and it is still
RPM_ACTIVE, it can be assumed to reflect what happened to the device
last time when it was using runtime PM) and -EACCES otherwise.
Update the documentation to provide a description of last_status and
change the description of pm_runtime_resume() in it to reflect the
new behavior of rpm_active().
While at it, rearrange the code in pm_runtime_enable() to be more
straightforward and replace the WARN() macro in it with a pr_warn()
invocation which is less disruptive.
Link: https://lore.kernel.org/linux-pm/20211026222626.39222-1-ulf.hansson@linaro.org/t/#u
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This commit introduces the following macros:
SYSTEM_SLEEP_PM_OPS()
LATE_SYSTEM_SLEEP_PM_OPS()
NOIRQ_SYSTEM_SLEEP_PM_OPS()
RUNTIME_PM_OPS()
These new macros are very similar to their SET_*_PM_OPS() equivalent.
They however differ in the fact that the callbacks they set will always
be seen as referenced by the compiler. This means that the callback
functions don't need to be wrapped with a #ifdef CONFIG_PM guard, or
tagged with __maybe_unused, to prevent the compiler from complaining
about unused static symbols. The compiler will then simply evaluate at
compile time whether or not these symbols are dead code.
The callbacks that are only useful with CONFIG_PM_SLEEP is enabled, are
now also wrapped with a new pm_sleep_ptr() macro, which is inspired from
pm_ptr(). This is needed for drivers that use different callbacks for
sleep and runtime PM, to handle the case where CONFIG_PM is set and
CONFIG_PM_SLEEP is not.
This commit also deprecates the following macros:
SIMPLE_DEV_PM_OPS()
UNIVERSAL_DEV_PM_OPS()
And introduces the following macros:
DEFINE_SIMPLE_DEV_PM_OPS()
DEFINE_UNIVERSAL_DEV_PM_OPS()
These macros are similar to the functions they were created to replace,
with the following differences:
- They use the new macros introduced above, and as such always
reference the provided callback functions.
- They are not tagged with __maybe_unused. They are meant to be used
with pm_ptr() or pm_sleep_ptr() for DEFINE_UNIVERSAL_DEV_PM_OPS()
and DEFINE_SIMPLE_DEV_PM_OPS() respectively.
- They declare the symbol static, since every driver seems to do that
anyway; and if a non-static use-case is needed an indirection pointer
could be used.
The point of this change, is to progressively switch from a code model
where PM callbacks are all protected behind CONFIG_PM guards, to a code
model where the PM callbacks are always seen by the compiler, but
discarded if not used.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The pm_ptr() macro was previously conditionally defined, according to
the value of the CONFIG_PM option. This meant that the pointed structure
was either referenced (if CONFIG_PM was set), or never referenced (if
CONFIG_PM was not set), causing it to be detected as unused by the
compiler.
This worked fine, but required the __maybe_unused compiler attribute to
be used to every symbol pointed to by a pointer wrapped with pm_ptr().
We can do better. With this change, the pm_ptr() is now defined the
same, independently of the value of CONFIG_PM. It now uses the (?:)
ternary operator to conditionally resolve to its argument. Since the
condition is known at compile time, the compiler will then choose to
discard the unused symbols, which won't need to be tagged with
__maybe_unused anymore.
This pm_ptr() macro is usually used with pointers to dev_pm_ops
structures created with SIMPLE_DEV_PM_OPS() or similar macros. These do
use a __maybe_unused flag, which is now useless with this change, so it
later can be removed. However in the meantime it causes no harm, and all
the drivers still compile fine with the new pm_ptr() macro.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As pm_runtime_need_not_resume() relies also on usage_count, it can return
a different value in pm_runtime_force_suspend() compared to when called in
pm_runtime_force_resume(). Different return values can happen if anything
calls PM runtime functions in between, and causes the parent child_count
to increase on every resume.
So far I've seen the issue only for omapdrm that does complicated things
with PM runtime calls during system suspend for legacy reasons:
omap_atomic_commit_tail() for omapdrm.0
dispc_runtime_get()
wakes up 58000000.dss as it's the dispc parent
dispc_runtime_resume()
rpm_resume() increases parent child_count
dispc_runtime_put() won't idle, PM runtime suspend blocked
pm_runtime_force_suspend() for 58000000.dss, !pm_runtime_need_not_resume()
__update_runtime_status()
system suspended
pm_runtime_force_resume() for 58000000.dss, pm_runtime_need_not_resume()
pm_runtime_enable() only called because of pm_runtime_need_not_resume()
omap_atomic_commit_tail() for omapdrm.0
dispc_runtime_get()
wakes up 58000000.dss as it's the dispc parent
dispc_runtime_resume()
rpm_resume() increases parent child_count
dispc_runtime_put() won't idle, PM runtime suspend blocked
...
rpm_suspend for 58000000.dss but parent child_count is now unbalanced
Let's fix the issue by adding a flag for needs_force_resume and use it in
pm_runtime_force_resume() instead of pm_runtime_need_not_resume().
Additionally omapdrm system suspend could be simplified later on to avoid
lots of unnecessary PM runtime calls and the complexity it adds. The
driver can just use internal functions that are shared between the PM
runtime and system suspend related functions.
Fixes: 4918e1f87c ("PM / runtime: Rework pm_runtime_force_suspend/resume()")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
struct device is declared twice, so remove the duplicate.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The clock API splits its interface into sleepable ant atomic contexts:
- clk_prepare/clk_unprepare for stuff that might sleep
- clk_enable_clk_disable for anything that may be done in atomic context
The code handling runtime PM for clocks only calls clk_disable() on
suspend requests, and clk_enable on resume requests. This means that
runtime PM with clock providers that only have the prepare/unprepare
methods implemented is basically useless.
Many clock implementations can't accommodate atomic contexts. This is
often the case when communication with the clock happens through another
subsystem like I2C or SCMI.
Let's make the clock PM code useful with such clocks by safely invoking
clk_prepare/clk_unprepare upon resume/suspend requests. Of course, when
such clocks are registered with the PM layer then pm_runtime_irq_safe()
can't be used, and neither pm_runtime_suspend() nor pm_runtime_resume()
may be invoked in atomic context.
For clocks that do implement the enable and disable methods then
everything just works as before.
A note on sparse:
According to https://lwn.net/Articles/109066/ there are things
that sparse can't cope with. In particular, pm_clk_op_lock() and
pm_clk_op_unlock() may or may not lock/unlock psd->lock depending on
some runtime condition. To work around that we tell it the lock is
always untaken for the purpose of static analisys.
Thanks to Naresh Kamboju for reporting issues with the initial patch.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 8234f6734c ("PM-runtime: Switch autosuspend over to using
hrtimers") switched PM runtime autosuspend to use hrtimers and all
related time accounting in ns, but missed to update the timer_expires
data type in struct dev_pm_info to u64.
This causes the timer_expires value to be truncated on 32-bit
architectures when assignment is done from u64 values:
rpm_suspend()
|- dev->power.timer_expires = expires;
Fix it by changing the timer_expires type to u64.
Fixes: 8234f6734c ("PM-runtime: Switch autosuspend over to using hrtimers")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: 5.0+ <stable@vger.kernel.org> # 5.0+
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This way, when the dev_pm_ops instance is not referenced anywhere, it
will simply be dropped by the compiler without a warning.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This macro is analogous to the infamous of_match_ptr(). If CONFIG_PM
is enabled, this macro will resolve to its argument, otherwise to NULL.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Update the documentation of the driver flags for system-wide power
management to reflect the current code flows and be more consistent.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rename DPM_FLAG_LEAVE_SUSPENDED to DPM_FLAG_MAY_SKIP_RESUME which
matches its purpose more closely.
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de> # for I2C
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which
matches its purpose more closely.
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for PCI parts
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Because all callers of dev_pm_smart_suspend_and_suspended use it only
for checking whether or not to skip driver suspend callbacks for a
device, rename it to dev_pm_skip_suspend() in analogy with
dev_pm_skip_resume().
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
The name of dev_pm_may_skip_resume() may be easily confused with the
power.may_skip_resume flag which is not checked by that function, so
rename the former as dev_pm_skip_resume().
No functional impact.
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
For a subsystem/driver that either doesn't support runtime PM or makes use
of pm_runtime_set_active() during ->probe(), may try to access its device
when probing, even if it may not be fully powered on from the PM domain's
point of view. This may be the case when the used PM domain is a genpd
provider, that implements genpd's ->start|stop() device callbacks.
There are cases where the subsystem/driver managed to avoid the above
problem, simply by calling pm_runtime_enable() and pm_runtime_get_sync()
during ->probe(). However, this approach comes with a drawback, especially
if the subsystem/driver implements a ->runtime_resume() callback.
More precisely, the subsystem/driver then needs to use a device flag, which
is checked in its ->runtime_resume() callback, as to avoid powering on its
resources the first time the callback is invoked. This is needed because
the subsystem/driver has already powered on the resources for the device,
during ->probe() and before it called pm_runtime_get_sync().
In a way to avoid this boilerplate code and the inefficient check for "if
(first_time_suspend)" in the ->runtime_resume() callback for these
subsystems/drivers, let's introduce and export a dev_pm_domain_start()
function, that may be called during ->probe() instead.
Moreover, let the dev_pm_domain_start() invoke an optional ->start()
callback, added to the struct dev_pm_domain, as to allow a PM domain
specific implementation.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Note that after previous changes dpm_noirq_begin() and
dpm_noirq_end() each have only one caller, so move the code from
them to their respective callers and drop them.
Also note that dpm_noirq_resume_devices() and
dpm_noirq_suspend_devices() need not be exported any more, so make
them both static.
This change is not expected to alter functionality.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
After recent hibernation-related changes, there are no more callers
of dev_pm_skip_next_resume_phases() except for the PM core itself
in which it is more straightforward to run the statements from
that function directly, so do that and drop it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Convert the PM documents to ReST, in order to allow them to
build with Sphinx.
The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not write to the free software foundation inc
59 temple place suite 330 boston ma 02111 1307 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 1334 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid the open-coding of the accounted time acquisition in
runtime_active|suspend_time_show() and make them call
pm_runtime_active|suspended_time() instead.
Note that this change also indirectly avoids holding dev->power.lock
around the do_div() computation and the sprintf() call which is an
additional improvement.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All device objects in the driver model contain fields that control the
handling of various power management activities. However, it's not
always useful. There are few instances where pseudo devices are added
to the model just to take advantage of many other features like
kobjects, udev events, and so on. One such example is cpu devices and
their caches.
The sysfs for the cpu caches are managed by adding devices with cpu
as the parent in cpu_device_create() when secondary cpu is brought
online. Generally when the secondary CPUs are hotplugged back in as part
of resume from suspend-to-ram, we call cpu_device_create() from the cpu
hotplug state machine while the cpu device associated with that CPU is
not yet ready to be resumed as the device_resume() call happens bit
later. It's not really needed to set the flag is_prepared for cpu
devices as they are mostly pseudo device and hotplug framework deals
with state machine and not managed through the cpu device.
This often results in annoying warning when resuming:
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
cache: parent cpu1 should not be sleeping
CPU1 is up
CPU2: Booted secondary processor
cache: parent cpu2 should not be sleeping
CPU2 is up
.... and so on.
So in order to fix these kind of errors, we could just completely avoid
doing any power management related initialisations and operations if
they are not used by these devices.
Add no_pm flags to indicate that the device doesn't require any sort of
PM activities and all of them can be completely skipped. We can use the
same flag to also avoid adding not used *power* sysfs entries for these
devices. For now, lets use this for cpu cache devices.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Replace jiffies-based accounting for runtime_active_time and
runtime_suspended_time with ktime-based accounting. This makes the
runtime debug counters inline with genpd and other PM subsytems which
use ktime-based accounting.
Timekeeping is initialized before driver_init(). It's only at that time
that PM-runtime can be enabled.
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
[switch from ktime to raw nsec]
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PM-runtime uses the timer infrastructure for autosuspend. This implies
that the minimum time before autosuspending a device is in the range
of 1 tick included to 2 ticks excluded
-On arm64 this means between 4ms and 8ms with default jiffies
configuration
-And on arm, it is between 10ms and 20ms
These values are quite high for embedded systems which sometimes want
the duration to be in the range of 1 ms.
It is possible to switch autosuspend over to using hrtimers to get
finer granularity for short durations and take advantage of slack to
retain some margins and get long timeouts with minimum wakeups.
On an arm64 platform that uses 1ms for autosuspending timeout of its
GPU, idle power is reduced by 10% with hrtimer.
The latency impact on arm64 hikey octo cores is:
- mark_last_busy: from 1.11 us to 1.25 us
- rpm_suspend: from 15.54 us to 15.38 us
[Only the code path of rpm_suspend() that starts hrtimer has been
measured.]
arm64 image (arm64 default defconfig) decreases by around 3KB
with following details:
$ size vmlinux-timer
text data bss dec hex filename
12034646 6869268 386840 19290754 1265a82 vmlinux
$ size vmlinux-hrtimer
text data bss dec hex filename
12030550 6870164 387032 19287746 1264ec2 vmlinux
The latency impact on arm 32bits snowball dual cores is :
- mark_last_busy: from 0.31 us usec to 0.77 us
- rpm_suspend: from 6.83 us to 6.67 usec
The increase of the image for snowball platform that I used for
testing performance impact, is neglictable (244B).
$ size vmlinux-timer
text data bss dec hex filename
7157961 2119580 264120 9541661 91981d build-ux500/vmlinux
size vmlinux-hrtimer
text data bss dec hex filename
7157773 2119884 264248 9541905 919911 vmlinux-hrtimer
And arm 32bits image (multi_v7_defconfig) increases by around 1.7KB
with following details:
$ size vmlinux-timer
text data bss dec hex filename
13304443 6803420 402768 20510631 138f7a7 vmlinux
$ size vmlinux-hrtimer
text data bss dec hex filename
13304299 6805276 402768 20512343 138fe57 vmlinux
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Middle-layer code doing suspend-time optimizations for devices with
the DPM_FLAG_SMART_SUSPEND flag set (currently, the PCI bus type and
the ACPI PM domain) needs to make the core skip ->thaw_early and
->thaw callbacks for those devices in some cases and it sets the
power.direct_complete flag for them for this purpose.
However, it turns out that setting power.direct_complete outside of
the PM core is a bad idea as it triggers an excess invocation of
pm_runtime_enable() in device_resume().
For this reason, provide a helper to clear power.is_late_suspended
and power.is_suspended to be invoked by the middle-layer code in
question instead of setting power.direct_complete and make that code
call the new helper.
Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Fixes: 05087360fd (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
instruct the PM core and middle-layer (bus type, PM domain, etc.)
code that it is desirable to leave the device in runtime suspend
after system-wide transitions to the working state (for example,
the device may be slow to resume and it may be better to avoid
resuming it right away).
Generally, the middle-layer code involved in the handling of the
device is expected to indicate to the PM core whether or not the
device may be left in suspend with the help of the device's
power.may_skip_resume status bit. That has to happen in the "noirq"
phase of the preceding system suspend (or analogous) transition.
The middle layer is then responsible for handling the device as
appropriate in its "noirq" resume callback which is executed
regardless of whether or not the device may be left suspended, but
the other resume callbacks (except for ->complete) will be skipped
automatically by the core if the device really can be left in
suspend.
The additional power.must_resume status bit introduced for the
implementation of this mechanisn is used internally by the PM core
to track the requirement to resume the device (which may depend on
its children etc).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Make the PCI bus type take DPM_FLAG_SMART_SUSPEND into account in its
system-wide PM callbacks and make sure that all code that should not
run in parallel with pci_pm_runtime_resume() is executed in the "late"
phases of system suspend, freeze and poweroff transitions.
[Note that the pm_runtime_suspended() check in pci_dev_keep_suspended()
is an optimization, because if is not passed, all of the subsequent
checks may be skipped and some of them are much more overhead in
general.]
Also use the observation that if the device is in runtime suspend
at the beginning of the "late" phase of a system-wide suspend-like
transition, its state cannot change going forward (runtime PM is
disabled for it at that time) until the transition is over and the
subsequent system-wide PM callbacks should be skipped for it (as
they generally assume the device to not be suspended), so add checks
for that in pci_pm_suspend_late/noirq(), pci_pm_freeze_late/noirq()
and pci_pm_poweroff_late/noirq().
Moreover, if pci_pm_resume_noirq() or pci_pm_restore_noirq() is
called during the subsequent system-wide resume transition and if
the device was left in runtime suspend previously, its runtime PM
status needs to be changed to "active" as it is going to be put
into the full-power state, so add checks for that too to these
functions.
In turn, if pci_pm_thaw_noirq() runs after the device has been
left in runtime suspend, the subsequent "thaw" callbacks need
to be skipped for it (as they may not work correctly with a
suspended device), so set the power.direct_complete flag for the
device then to make the PM core skip those callbacks.
In addition to the above add a core helper for checking if
DPM_FLAG_SMART_SUSPEND is set and the device runtime PM status is
"suspended" at the same time, which is done quite often in the new
code (and will be done elsewhere going forward too).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Define and document a SMART_SUSPEND flag to instruct bus types and PM
domains that the system suspend callbacks provided by the driver can
cope with runtime-suspended devices, so from the driver's perspective
it should be safe to leave devices in runtime suspend during system
suspend.
Setting that flag may also cause middle-layer code (bus types,
PM domains etc.) to skip invocations of the ->suspend_late and
->suspend_noirq callbacks provided by the driver if the device
is in runtime suspend at the beginning of the "late" phase of
the system-wide suspend transition, in which case the driver's
system-wide resume callbacks may be invoked back-to-back with
its ->runtime_suspend callback, so the driver has to be able to
cope with that too.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
The motivation for this change is to provide a way to work around
a problem with the direct-complete mechanism used for avoiding
system suspend/resume handling for devices in runtime suspend.
The problem is that some middle layer code (the PCI bus type and
the ACPI PM domain in particular) returns positive values from its
system suspend ->prepare callbacks regardless of whether the driver's
->prepare returns a positive value or 0, which effectively prevents
drivers from being able to control the direct-complete feature.
Some drivers need that control, however, and the PCI bus type has
grown its own flag to deal with this issue, but since it is not
limited to PCI, it is better to address it by adding driver flags at
the core level.
To that end, add a driver_flags field to struct dev_pm_info for flags
that can be set by device drivers at the probe time to inform the PM
core and/or bus types, PM domains and so on on the capabilities and/or
preferences of device drivers. Also add two static inline helpers
for setting that field and testing it against a given set of flags
and make the driver core clear it automatically on driver remove
and probe failures.
Define and document two PM driver flags related to the direct-
complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
respectively, to indicate to the PM core that the direct-complete
mechanism should never be used for the device and to inform the
middle layer code (bus types, PM domains etc) that it can only
request the PM core to use the direct-complete mechanism for
the device (by returning a positive value from its ->prepare
callback) if it also has been requested by the driver.
While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
According to recent changes for ACPI, the are longer any users of
pm_complete_with_resume_check(), thus let's drop it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Put the device interrupts disabling and enabling as well as
cpuidle_pause() and cpuidle_resume() called during the "noirq"
stages of system suspend into separate functions to allow the
core suspend-to-idle code to be optimized (later).
The only functional difference this makes is that debug facilities
and diagnostic tools will not include the above operations into the
"noirq" device suspend/resume duration measurements.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial. The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.
For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>