Currently the sh_cmt clocksource timer is disabled or enabled
unconditionally on clocksource suspend resp. resume, even if a
better clocksource is present (e.g. arch_sys_counter) and the
sh_cmt clocksource is not enabled.
As sh_cmt is a syscore device when its timer is enabled, this
may lead to a genpd.prepared_count imbalance in the presence of
PM Domains, which may cause a lock-up during reboot after s2ram.
During suspend:
- pm_genpd_prepare() is called for all non-syscore devices (incl.
sh_cmt), increasing genpd.prepared_count for each device,
- clocksource.suspend() is called for all clocksource devices,
- sh_cmt_clocksource_suspend() calls sh_cmt_stop(), which is a no-op
as the clocksource was not enabled.
During resume:
- clocksource.resume() is called for all clocksource devices,
- sh_cmt_clocksource_resume() calls sh_cmt_start(), which enables the
clocksource timer, and turns sh_cmt into a syscore device,
- pm_genpd_complete() is called for all non-syscore devices (excl.
sh_cmt now!), decreasing genpd.prepared_count for each device but
sh_cmt.
Now genpd.prepared_count of the PM Domain containing sh_cmt is
still 1 instead of zero. On subsequent suspend/resume cycles,
sh_cmt is still a syscore device, hence it's skipped for
pm_genpd_{prepare,complete}(), keeping the imbalance of
genpd.prepared_count at 1.
During reboot:
- platform_drv_shutdown() is called for any platform device that has
a driver with a .shutdown() method (only rcar-dmac on R-Car Gen2),
- platform_drv_shutdown() calls dev_pm_domain_detach(), which
calls genpd_dev_pm_detach(),
- genpd_dev_pm_detach() keeps calling pm_genpd_remove_device() until
it doesn't return -EAGAIN[*],
- If the device is part of the same PM Domain as sh_cmt,
pm_genpd_remove_device() always fails with -EAGAIN due to
genpd.prepared_count > 0.
- Infinite loop in genpd_dev_pm_detach()[*].
[*] Commit 93af5e9354 ("PM / Domains: Avoid infinite loops in
attach/detach code") already limited the number of loop iterations,
avoiding the lock-up.
To fix this, only disable or enable the clocksource timer on
clocksource suspend resp. resume if the clocksource was enabled.
This was tested on r8a7791/koelsch with the CPG Clock Domain:
- using arch_sys_counter as the clocksource, which is the default, and
which showed the problem,
- using sh_cmt as a clocksource ("echo ffca0000.timer > \
/sys/devices/system/clocksource/clocksource0/current_clocksource"),
which behaves the same as before.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438875126-12596-2-git-send-email-daniel.lezcano@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>