The exported function pm_children_suspended() has only one caller, which is
the runtime PM internal function, rpm_check_suspend_allowed().
Let's clean-up this code, by removing pm_children_suspended() altogether
and instead do the one-liner check directly in rpm_check_suspend_allowed().
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Further testing with false negatives suppressed by commit 293e2421fe
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified a few more unprotected uses of RCU from the idle loop.
Because RCU actively ignores idle-loop code (for energy-efficiency
reasons, among other things), using RCU from the idle loop can result
in too-short grace periods, in turn resulting in arbitrary misbehavior.
The affected function is rpm_suspend().
The resulting lockdep-RCU splat is as follows:
------------------------------------------------------------------------
Warning from omap3
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/rpm.h:63 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: (&(&dev->power.lock)->rlock){-.-...}, at: [<c052ee24>] __pm_runtime_suspend+0x54/0x84
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP36xx (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c052d7b4>] (rpm_suspend+0x604/0x7e4)
[<c052d7b4>] (rpm_suspend) from [<c052ee34>] (__pm_runtime_suspend+0x64/0x84)
[<c052ee34>] (__pm_runtime_suspend) from [<c04bf3bc>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
[<c04bf3bc>] (omap2_gpio_prepare_for_idle) from [<c01255e8>] (omap_sram_idle+0x140/0x244)
[<c01255e8>] (omap_sram_idle) from [<c0126b48>] (omap3_enter_idle_bm+0xfc/0x1ec)
[<c0126b48>] (omap3_enter_idle_bm) from [<c0601db8>] (cpuidle_enter_state+0x80/0x3d4)
[<c0601db8>] (cpuidle_enter_state) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
------------------------------------------------------------------------
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This commit appends a few _rcuidle suffixes to fix the following
RCU-used-from-idle bug:
> ===============================
> [ INFO: suspicious RCU usage. ]
> 4.6.0-rc5-next-20160426+ #1116 Not tainted
> -------------------------------
> include/trace/events/rpm.h:95 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> RCU used illegally from idle CPU!
> rcu_scheduler_active = 1, debug_locks = 0
> RCU used illegally from extended quiescent state!
> 1 lock held by swapper/0/0:
> #0: (&(&dev->power.lock)->rlock){-.-...}, at: [<c052cc2c>] __rpm_callback+0x58/0x60
>
> stack backtrace:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1116
> Hardware name: Generic OMAP36xx (Flattened Device Tree)
> [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
> [<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4)
> [<c047fd68>] (dump_stack) from [<c052d5d0>] (rpm_suspend+0x580/0x768)
> [<c052d5d0>] (rpm_suspend) from [<c052ec58>] (__pm_runtime_suspend+0x64/0x84)
> [<c052ec58>] (__pm_runtime_suspend) from [<c04bf25c>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
> [<c04bf25c>] (omap2_gpio_prepare_for_idle) from [<c0125568>] (omap_sram_idle+0x140/0x244)
> [<c0125568>] (omap_sram_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec)
> [<c01269dc>] (omap3_enter_idle_bm) from [<c0601bdc>] (cpuidle_enter_state+0x80/0x3d4)
> [<c0601bdc>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0)
> [<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
> [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
In the immortal words of Steven Rostedt, "*Whack* *Whack* *Whack*!!!"
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
WhACKED-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-core:
PM / runtime: Asynchronous "idle" in pm_runtime_allow()
PM / runtime: print error when activating a child to unactive parent
* pm-clk:
PM / clk: Add support for adding a specific clock from device-tree
PM / clk: export symbols for existing pm_clk_<...> API fcns
* pm-domains:
PM / Domains: Convert pm_genpd_init() to return an error code
PM / Domains: Stop/start devices during system PM suspend/resume in genpd
PM / Domains: Allow runtime PM during system PM phases
PM / Runtime: Avoid resuming devices again in pm_runtime_force_resume()
PM / Domains: Remove redundant pm_request_idle() call in genpd
PM / Domains: Remove redundant wrapper functions for system PM
PM / Domains: Allow genpd to power on during system PM phases
* pm-pci:
PCI / PM: check all fields in pci_set_platform_pm()
Arjan reports that it takes a relatively long time to enable runtime
PM for multiple devices at system startup, because all writes to the
"control" attribute in sysfs are handled synchronously and if the
device is suspended as a result of the write, it will block until
that operation is complete.
That may be avoided by passing the RPM_ASYNC flag to rpm_idle()
in pm_runtime_allow() which will make it execute the device's
"idle" callback asynchronously, so writes to "control" changing
it from "on" to "auto" will return without waiting.
Reported-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
The code currently silently bails out with -EBUSY if you try to
activate a child to an inactive parent.
This typically happens when you have a runtime suspended parent
and runtime resume your child, but forgot to set .ignore_children
on the parent to true with pm_suspend_ignore_children(dev).
Silently ignoring this error is not good as it gives rise to
other strange behaviour like double-resume of devices after
silently bailing out of the .runtime_resume() callback.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the runtime PM status of the device isn't RPM_SUSPENDED, prevent the
pm_runtime_force_resume() from invoking the ->runtime_resume() callback
for the device, as it's not the expected behaviour from the subsystem/driver.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As pm_runtime_set_active() may fail because the device's parent isn't
active, we can end up executing the ->runtime_resume() callback for the
device when it isn't allowed.
Fix this by invoking pm_runtime_set_active() before running the callback
and let's also deal with the error code.
Fixes: 37f204164d (PM: Add pm_runtime_suspend|resume_force functions)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a new runtime PM function, pm_runtime_get_if_in_use(),
that will increment the device's runtime PM usage counter and
return 1 if its status is RPM_ACTIVE and its usage counter
is greater than 0 at the same time (0 will be returned otherwise).
This is useful for things that should only be done if the device
is active (from the runtime PM perspective) and used by somebody
(as indicated by the usage counter) already and they are not worth
bothering otherwise.
Requested-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are two common expectations among several subsystems/drivers that
deploys runtime PM support, but which isn't met by the driver core.
Expectation 1)
At ->probe() the subsystem/driver expects the runtime PM status of the
device to be RPM_SUSPENDED, which is the initial status being assigned at
device registration.
This expectation is especially common among some of those subsystems/
drivers that manages devices with an attached PM domain, as those requires
the ->runtime_resume() callback at the PM domain level to be invoked
during ->probe().
Moreover these subsystems/drivers entirely relies on runtime PM resources
being managed at the PM domain level, thus don't implement their own set
of runtime PM callbacks.
These are two scenarios that suffers from this unmet expectation.
i) A failed ->probe() sequence requests probe deferral:
->probe()
...
pm_runtime_enable()
pm_runtime_get_sync()
...
err:
pm_runtime_put()
pm_runtime_disable()
...
As there are no guarantees that such sequence turns the runtime PM status
of the device into RPM_SUSPENDED, the re-trying ->probe() may start with
the status in RPM_ACTIVE.
In such case the runtime PM core won't invoke the ->runtime_resume()
callback because of a pm_runtime_get_sync(), as it considers the device to
be already runtime resumed.
ii) A driver re-bind sequence:
At driver unbind, the subsystem/driver's >remove() callback invokes a
sequence of runtime PM APIs, to undo actions during ->probe() and to put
the device into low power state.
->remove()
...
pm_runtime_put()
pm_runtime_disable()
...
Similar as in the failing ->probe() case, this sequence don't guarantee
the runtime PM status of the device to turn into RPM_SUSPENDED.
Trying to re-bind the driver thus causes the same issue as when re-trying
->probe(), in the probe deferral scenario.
Expectation 2)
Drivers that invokes the pm_runtime_irq_safe() API during ->probe(),
triggers the runtime PM core to increase the usage count for the device's
parent and permanently make it runtime resumed.
The usage count is only dropped at device removal, which also allows it to
be runtime suspended again.
A re-trying ->probe() repeats the call to pm_runtime_irq_safe() and thus
once more triggers the usage count of the device's parent to be increased.
This leads to not only an imbalance issue of the usage count of the
device's parent, but also to keep it runtime resumed permanently even if
->probe() fails.
To address these issues, let's change the policy of the driver core to
meet these expectations. More precisely, at ->probe() failures and driver
unbind, restore the initial states of runtime PM.
Although to still allow subsystem's to control PM for devices that doesn't
->probe() successfully, don't restore the initial states unless runtime PM
is disabled.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Turns out we can automate the handling for the device_may_wakeup()
quite a bit by using the kernel wakeup source list as suggested
by Rafael J. Wysocki <rjw@rjwysocki.net>.
And as some hardware has separate dedicated wake-up interrupt
in addition to the IO interrupt, we can automate the handling by
adding a generic threaded interrupt handler that just calls the
device PM runtime to wake up the device.
This allows dropping code from device drivers as we currently
are doing it in multiple ways, and often wrong.
For most drivers, we should be able to drop the following
boilerplate code from runtime_suspend and runtime_resume
functions:
...
device_init_wakeup(dev, true);
...
if (device_may_wakeup(dev))
enable_irq_wake(irq);
...
if (device_may_wakeup(dev))
disable_irq_wake(irq);
...
device_init_wakeup(dev, false);
...
We can replace it with just the following init and exit
time code:
...
device_init_wakeup(dev, true);
dev_pm_set_wake_irq(dev, irq);
...
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
...
And for hardware with dedicated wake-up interrupts:
...
device_init_wakeup(dev, true);
dev_pm_set_dedicated_wake_irq(dev, irq);
...
dev_pm_clear_wake_irq(dev);
device_init_wakeup(dev, false);
...
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If we don't update last_busy in rpm_resume, devices can go back
to sleep immediately after resume. This happens at least in
cases where the device has been powered off and does not have
any interrupt pending until there's something in the FIFO.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so quite a few
depend on CONFIG_PM or even may be dropped entirely in some cases.
Replace CONFIG_PM_RUNTIME with CONFIG_PM in the PM core code.
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PM uses three separate functions to fetch RPM callbacks.
These functions uses quite complicated macro in their body.
The patch replaces these routines with one small macro and
one helper function.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This patch provides two new runtime PM helper functions which intend to
be used from system suspend/resume callbacks, to make sure devices are
put into low power state during system suspend and brought back to full
power at system resume.
The prerequisite is to have all levels of a device's runtime PM
callbacks to be defined through the SET_PM_RUNTIME_PM_OPS macro, which
means these are available for CONFIG_PM.
By using the new runtime PM helper functions especially the two
scenarios below will be addressed.
1) The PM core prevents .runtime_suspend callbacks from being invoked
during system suspend. That means even for a runtime PM centric
subsystem and driver, the device needs to be put into low power state
from a system suspend callback. Otherwise it may very well be left in
full power state (runtime resumed) while the system is suspended. By
using the new helper functions, we make sure to walk the hierarchy of
a device's power domain, subsystem and driver.
2) Subsystems and drivers need to cope with all the combinations of
CONFIG_PM_SLEEP and CONFIG_PM_RUNTIME. The two new helper functions
smothly addresses this.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
While fetching the proper runtime PM callback, we walk the hierarchy of
device's power domains, subsystems and drivers.
This is common for rpm_suspend(), rpm_idle() and rpm_resume(). Let's
clean up the code by using a macro that handles this.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For devices which don't have a .runtime_idle() callback or if it
returns 0, rpm_idle() will end up in triggering a call to
rpm_suspend(), thus trying to carry out a runtime suspend directly
from runtime_idle().
In the above situation we want to respect devices which has enabled
autosuspend, we therfore append the flag sent to rpm_suspend with
RPM_AUTO.
Do note that drivers still needs to update the device last busy mark,
to control the delay for this circumstance.
Updated runtime PM documentation accordingly.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The "runtime idle" helper routine, rpm_idle(), currently ignores
return values from .runtime_idle() callbacks executed by it.
However, it turns out that many subsystems use
pm_generic_runtime_idle() which checks the return value of the
driver's callback and executes pm_runtime_suspend() for the device
unless that value is not 0. If that logic is moved to rpm_idle()
instead, pm_generic_runtime_idle() can be dropped and its users
will not need any .runtime_idle() callbacks any more.
Moreover, the PCI, SCSI, and SATA subsystems' .runtime_idle()
routines, pci_pm_runtime_idle(), scsi_runtime_idle(), and
ata_port_runtime_idle(), respectively, as well as a few drivers'
ones may be simplified if rpm_idle() calls rpm_suspend() after 0 has
been returned by the .runtime_idle() callback executed by it.
To reduce overall code bloat, make the changes described above.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
For irq safe devices return the runtime reference for the parent
by using the asyncronous runtime PM API. Thus we don't have to
wait for it to become idle|suspended. Instead we can move on and
handle the next device in queue.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Apply the introduced memalloc_noio_save() and memalloc_noio_restore() to
force memory allocation with no I/O during runtime_resume/runtime_suspend
callback on device with the flag of 'memalloc_noio' set.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the flag memalloc_noio in 'struct dev_pm_info' to help PM core
to teach mm not allocating memory with GFP_KERNEL flag for avoiding
probable deadlock.
As explained in the comment, any GFP_KERNEL allocation inside
runtime_resume() or runtime_suspend() on any one of device in the path
from one block or network device to the root device in the device tree
may cause deadlock, the introduced pm_runtime_set_memalloc_noio() sets
or clears the flag on device in the path recursively.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are several drivers where the return value of
pm_runtime_get_sync() is used to decide whether or not it is safe to
access hardware and that don't provide .suspend() callbacks for system
suspend (but may use late/noirq callbacks.) If such a driver happens
to call pm_runtime_get_sync() during system suspend, after the core
has disabled runtime PM, it will get the error code and will decide
that the hardware should not be accessed, although this may be a wrong
conclusion, depending on the state of the device when runtime PM was
disabled.
Drivers might work around this problem by using a test like:
ret = pm_runtime_get_sync(dev);
if (!ret || (ret == -EACCES && driver_private_data(dev)->suspended)) {
/* access hardware */
}
where driver_private_data(dev)->suspended is a flag set by the
driver's .suspend() method (that would have to be added for this
purpose). However, that potentially would need to be done by multiple
drivers which means quite a lot of duplicated code and bloat.
To avoid that we can use the observation that the core sets
dev->power.is_suspended before disabling runtime PM and use that
instead of the driver's private flag. Still, potentially many drivers
would need to repeat that same check in quite a few places, so it's
better to let the core do it.
Then we can be a bit smarter and check whether or not runtime PM was
disabled by the core only (disable_depth == 1) or by someone else in
addition to the core (disable_depth > 1). In the former case
rpm_resume() can return 1 if the runtime PM status is RPM_ACTIVE,
because it means the device was active when the core disabled runtime
PM. In the latter case it should still return -EACCES, because it
isn't clear why runtime PM has been disabled.
Tested on AM3730/Beagle-xM where a wakeup IRQ firing during the late
suspend phase triggers runtime PM activity in the I2C driver since the
wakeup IRQ is on an I2C-connected PMIC.
[rjw: Modified whitespace to follow the file's convention.]
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If __dev_pm_qos_read_value(dev) returns a negative value,
rpm_suspend() should return -EPERM for dev even if its
power.no_callbacks flag is set. For this to happen, the device's
power.no_callbacks flag has to be checked after the PM QoS check,
so move the PM QoS check to rpm_check_suspend_allowed() (this will
make it cover idle notifications as well as runtime suspend too).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
The power.deferred_resume can only be set if the runtime PM status
of device is RPM_SUSPENDING and it should be cleared after its
status has been changed, regardless of whether or not the runtime
suspend has been successful. However, it only is cleared on
suspend failure, while it may remain set on successful suspend and
is happily leaked to rpm_resume() executed in that case.
That shouldn't happen, so if power.deferred_resume is set in
rpm_suspend() after the status has been changed to RPM_SUSPENDED,
clear it before calling rpm_resume(). Then, it doesn't need to be
cleared before changing the status to RPM_SUSPENDING any more,
because it's always cleared after the status has been changed to
either RPM_SUSPENDED (on success) or RPM_ACTIVE (on failure).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
For devices whose power.no_callbacks flag is set, rpm_resume()
should return 1 if the device's parent is already active, so that
the callers of pm_runtime_get() don't think that they have to wait
for the device to resume (asynchronously) in that case (the core
won't queue up an asynchronous resume in that case, so there's
nothing to wait for anyway).
Modify the code accordingly (and make sure that an idle notification
will be queued up on success, even if 1 is to be returned).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable@vger.kernel.org
After the previous changes in default_stop_ok() and
default_power_down_ok() for PM domains, there are two fields in
struct dev_pm_info that aren't necessary any more, suspend_time
and max_time_suspended_ns.
Remove those fields along with all of the code that accesses them,
which simplifies the runtime PM framework quite a bit.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1535) fixes a bug in the runtime PM core. When a
runtime suspend attempt completes, whether successfully or not, the
device's power.wait_queue is supposed to be signalled. But this
doesn't happen in the failure pathway of rpm_suspend() when another
autosuspend attempt is rescheduled. As a result, a task can get stuck
indefinitely on the wait queue (I have seen this happen in testing).
The patch fixes the problem by moving the wake_up_all() call up near
the start of the failure code.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Make the PM core execute driver PM callbacks directly if the
corresponding subsystem callbacks are not present.
There are three reasons for doing that. First, it reflects the
behavior of drivers/base/dd.c:really_probe() that runs the driver's
.probe() callback directly if the bus type's one is not defined, so
this change will remove one arbitrary difference between the PM core
and the remaining parts of the driver core. Second, it will allow
some subsystems, whose PM callbacks don't do anything except for
executing driver callbacks, to be simplified quite a bit by removing
those "forward-only" callbacks. Finally, it will allow us to remove
one level of indirection in the system suspend and resume code paths
where it is not necessary, which is going to lead to less debug noise
with initcall_debug passed in the kernel command line (messages won't
be printed for driverless devices whose subsystems don't provide
PM callbacks among other things).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Make the runtime PM core use device PM QoS constraints to check if
it is allowed to suspend a given device, so that an error code is
returned if the device's own PM QoS constraint is negative or one of
its children has already been suspended for too long. If this is
not the case, the maximum estimated time the device is allowed to be
suspended, computed as the minimum of the device's PM QoS constraint
and the PM QoS constraints of its children (reduced by the difference
between the current time and their suspend times) is stored in a new
device's PM field power.max_time_suspended_ns that can be used by
the device's subsystem or PM domain to decide whether or not to put
the device into lower-power (and presumably higher-latency) states
later (if the constraint is 0, which means "no constraint", the
power.max_time_suspended_ns is set to -1).
Additionally, the time of execution of the subsystem-level
.runtime_suspend() callback for the device is recorded in the new
power.suspend_time field for later use by the device's subsystem or
PM domain along with power.max_time_suspended_ns (it also is used
by the core code when the device's parent is suspended).
Introduce a new helper function,
pm_runtime_update_max_time_suspended(), allowing subsystems and PM
domains (or device drivers) to update the power.max_time_suspended_ns
field, for example after changing the power state of a suspended
device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include <linux/module.h>
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include <linux/module.h>
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...
Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
Originally, the runtime PM core would send an idle notification
whenever a suspend attempt failed. The idle callback routine could
then schedule a delayed suspend for some time later.
However this behavior was changed by commit
f71648d73c (PM / Runtime: Remove idle
notification after failing suspend). No notifications were sent, and
there was no clear mechanism to retry failed suspends.
This caused problems for the usbhid driver, because it fails
autosuspend attempts as long as a key is being held down. Therefore
this patch (as1492) adds a mechanism for retrying failed
autosuspends. If the callback routine updates the last_busy field so
that the next autosuspend expiration time is in the future, the
autosuspend will automatically be rescheduled.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Henrik Rydberg <rydberg@euromail.se>
Cc: <stable@kernel.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
With delta type being int, its value is made zero
for all values of now > 0x80000000.
Hence fixing it.
Signed-off-by: venu byravarasu <vbyravarasu@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Most of these files were implicitly getting EXPORT_SYMBOL via
device.h which was including module.h, but that path will be broken
soon.
[ with input from Stephen Rothwell <sfr@canb.auug.org.au> ]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
If .runtime_suspend() returns -EAGAIN or -EBUSY, the device should
still be in ACTIVE state, so it is not necessary to send an idle
notification to its parent. If .runtime_suspend() returns other
fatal failure, it doesn't make sense to send idle notification to
its parent.
Skip parent idle notification when failure is returned from
.runtime_suspend() and update comments in rpm_suspend() to reflect
that change.
[rjw: Modified the subject and changelog slightly.]
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch fix kerneldoc comments for rpm_suspend():
- 'Cancel a pending idle notification' should be put before, also
should be changed to 'Cancel a pending idle notification,
autosuspend or suspend'.
- idle notification for the device after succeeding suspend has
been removed, so update the comment accordingly.
[rjw: Modified the subject and changelog slightly.]
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch replaces dev_dbg with trace_rpm_* inside
the three important functions:
rpm_idle
rpm_suspend
rpm_resume
Trace points have the below advantages compared with dev_dbg:
- trace points include much runtime information(such as
running cpu, current task, ...)
- most of linux distributions may disable "verbose debug"
driver debug compile switch, so it is very difficult to
report/debug runtime pm related problems from distribution
users without this kind of debug information.
- for upstream kernel users, enableing the debug switch will
produce many useless "rpm_resume" output, and it is very noise.
- dev_dbg inside rpm_suspend/rpm_resume may have some effects
on runtime pm behaviour of console devicer
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The rpm_suspend() and rpm_resume() routines execute subsystem or PM
domain callbacks under power.lock if power.irq_safe is set for the
given device. This is inconsistent with that rpm_idle() does after
commit 02b2677 (PM / Runtime: Allow _put_sync() from
interrupts-disabled context) and is problematic for subsystems and PM
domains wanting to use power.lock for synchronization in their
runtime PM callbacks.
This change requires the code checking if the device's runtime PM
status is RPM_SUSPENDING or RPM_RESUMING to be modified too, to take
the power.irq_safe set case into account (that code wasn't reachable
before with power.irq_safe set, because it's executed with the
device's power.lock held).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Some of the entry points to pm runtime are not safe to
call in atomic context unless pm_runtime_irq_safe() has
been called. Inspecting the code, it is not immediately
obvious that the functions sleep at all, as they run
inside a spin_lock_irqsave, but under some conditions
they can drop the lock and turn on irqs.
If a driver incorrectly calls the pm_runtime apis, it can
cause sleeping and irq processing when it expects to stay
in atomic context.
Add might_sleep_if to the majority of the __pm_runtime_* entry points
to enforce correct usage.
Add pm_runtime_put_sync_autosuspend to the list of
functions that can be called in atomic context.
Signed-off-by: Colin Cross <ccross@android.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Currently the use of pm_runtime_put_sync() is not safe from
interrupts-disabled context because rpm_idle() will release the
spinlock and enable interrupts for the idle callbacks. This enables
interrupts during a time where interrupts were expected to be
disabled, and can have strange side effects on drivers that expected
interrupts to be disabled.
This is not a bug since the documentation clearly states that only
_put_sync_suspend() is safe in IRQ-safe mode.
However, pm_runtime_put_sync() could be made safe when in IRQ-safe
mode by releasing the spinlock but not re-enabling interrupts, which
is what this patch aims to do.
Problem was found when using some buggy drivers that set
pm_runtime_irq_safe() and used _put_sync() in interrupts-disabled
context.
Reported-by: Colin Cross <ccross@google.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
dev->power.deferred_resume is used as a bool typically, so change
one assignment to false from 0, like other places.
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
The runtime PM documentation and kerneldoc comments sometimes spell
"runtime" with a dash (i.e. "run-time"). Replace all of those
instances with "runtime" to make the naming consistent.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Some callers of pm_runtime_get_sync() and other runtime PM helper
functions, scsi_autopm_get_host() and scsi_autopm_get_device() in
particular, need to distinguish error codes returned when runtime PM
is disabled (i.e. power.disable_depth is nonzero for the given
device) from error codes returned in other situations. For this
reason, make the runtime PM helper functions return -EACCES when
power.disable_depth is nonzero and ensure that this error code
won't be returned by them in any other circumstances. Modify
scsi_autopm_get_host() and scsi_autopm_get_device() to check the
error code returned by pm_runtime_get_sync() and ignore -EACCES.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The naming convention used by commit 7538e3db6e015e890825fbd9f86599b
(PM: Add support for device power domains), which introduced the
struct dev_power_domain type for representing device power domains,
evidently confuses some developers who tend to think that objects
of this type must correspond to "power domains" as defined by
hardware, which is not the case. Namely, at the kernel level, a
struct dev_power_domain object can represent arbitrary set of devices
that are mutually dependent power management-wise and need not belong
to one hardware power domain. To avoid that confusion, rename struct
dev_power_domain to struct dev_pm_domain and rename the related
pointers in struct device and struct pm_clk_notifier_block from
pwr_domain to pm_domain.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Change the PM core's behavior related to power domains in such a way
that, if a power domain is defined for a given device, its callbacks
will be executed instead of and not in addition to the device
subsystem's PM callbacks.
The idea behind the initial implementation of power domains handling
by the PM core was that power domain callbacks would be executed in
addition to subsystem callbacks, so that it would be possible to
extend the subsystem callbacks by using power domains. It turns out,
however, that this wouldn't be really convenient in some important
situations.
For example, there are systems in which power can only be removed
from entire power domains. On those systems it is not desirable to
execute device drivers' PM callbacks until it is known that power is
going to be removed from the devices in question, which means that
they should be executed by power domain callbacks rather then by
subsystem (e.g. bus type) PM callbacks, because subsystems generally
have no information about what devices belong to which power domain.
Thus, for instance, if the bus type in question is the platform bus
type, its PM callbacks generally should not be called in addition to
power domain callbacks, because they run device drivers' callbacks
unconditionally if defined.
While in principle the default subsystem PM callbacks, or a subset of
them, may be replaced with different functions, it doesn't seem
correct to do so, because that would change the subsystem's behavior
with respect to all devices in the system, regardless of whether or
not they belong to any power domains. Thus, the only remaining
option is to make power domain callbacks take precedence over
subsystem callbacks.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Kevin Hilman <khilman@ti.com>
The code handling system-wide power transitions (eg. suspend-to-RAM)
can in theory execute callbacks provided by the device's bus type,
device type and class in each phase of the power transition. In
turn, the runtime PM core code only calls one of those callbacks at
a time, preferring bus type callbacks to device type or class
callbacks and device type callbacks to class callbacks.
It seems reasonable to make them both behave in the same way in that
respect. Moreover, even though a device may belong to two subsystems
(eg. bus type and device class) simultaneously, in practice power
management callbacks for system-wide power transitions are always
provided by only one of them (ie. if the bus type callbacks are
defined, the device class ones are not and vice versa). Thus it is
possible to modify the code handling system-wide power transitions
so that it follows the core runtime PM code (ie. treats the
subsystem callbacks as mutually exclusive).
On the other hand, the core runtime PM code will choose to execute,
for example, a runtime suspend callback provided by the device type
even if the bus type's struct dev_pm_ops object exists, but the
runtime_suspend pointer in it happens to be NULL. This is confusing,
because it may lead to the execution of callbacks from different
subsystems during different operations (eg. the bus type suspend
callback may be executed during runtime suspend of the device, while
the device type callback will be executed during system suspend).
Make all of the power management code treat subsystem callbacks in
a consistent way, such that:
(1) If the device's type is defined (eg. dev->type is not NULL)
and its pm pointer is not NULL, the callbacks from dev->type->pm
will be used.
(2) If dev->type is NULL or dev->type->pm is NULL, but the device's
class is defined (eg. dev->class is not NULL) and its pm pointer
is not NULL, the callbacks from dev->class->pm will be used.
(3) If dev->type is NULL or dev->type->pm is NULL and dev->class is
NULL or dev->class->pm is NULL, the callbacks from dev->bus->pm
will be used provided that both dev->bus and dev->bus->pm are
not NULL.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Reasoning-sounds-sane-to: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>