Currently the _of_add_opp_table_v2 call loops through the OPP nodes in
the operating-points-v2 table in the device tree and calls
_opp_add_static_v2 for each to add them to the table. It counts each
iteration through this loop as an added OPP; however, there are cases
where _opp_add_static_v2() returns 0 but no new OPP is added to the
list.
This can happen while adding a duplicate OPP or if the OPP isn't supported
by hardware.
Because of this the count variable will contain the number of OPP nodes
in the table in device tree but not necessarily the ones that are
actually added.
As this count value is what is checked to determine if there are any
valid OPPs, if a platform has an operating-points-v2 table with all OPP
nodes containing opp-supported-hw values that are not currently
supported, then _of_add_opp_table_v2 will fail to abort as it should due
to an empty table.
Additionally, since commit 3ba98324e8 ("PM / OPP: Get
performance state using genpd helper"), the same count variable is
compared against the number of OPPs containing performance states, and
the code requires that either all or none have pstates set. However, for
any OPP table with entries that do not get added by
_opp_add_static_v2 due to incompatible opp-supported-hw fields, these
numbers will not match and _of_add_opp_table_v2 will incorrectly fail.
We need to clearly identify all three cases (success, failure,
unsupported/duplicate OPPs) and then increment count only in the success
case. Change the return type of _opp_add_static_v2() to return the pointer
to the newly added OPP instead of an integer. This routine now returns a
valid pointer if the OPP is really added, NULL for unsupported or
duplicate OPPs, and an error value cast as a pointer on errors.
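The three outcomes can be modelled in a small userspace sketch. The
ERR_PTR()/PTR_ERR()/IS_ERR() helpers below are simplified stand-ins for the
kernel macros of the same name, and add_static_opp(), count_added() and
enum node_kind are hypothetical names used only for illustration; this is
not the OPP core code itself:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct opp { unsigned long rate; };

/* Hypothetical classification of a DT node, for illustration only. */
enum node_kind { NODE_OK, NODE_UNSUPPORTED, NODE_BROKEN };

/* Valid pointer on success, NULL if skipped, ERR_PTR() on a real error. */
static struct opp *add_static_opp(enum node_kind kind, struct opp *storage)
{
	if (kind == NODE_BROKEN)
		return ERR_PTR(-EINVAL);
	if (kind == NODE_UNSUPPORTED)
		return NULL;	/* unsupported or duplicate: not added, not an error */
	return storage;
}

/* Returns the number of OPPs really added, or a negative errno. */
static int count_added(const enum node_kind *nodes, size_t n,
		       struct opp *storage)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		struct opp *opp = add_static_opp(nodes[i], &storage[i]);

		if (IS_ERR(opp))
			return (int)PTR_ERR(opp);
		if (opp)
			count++;	/* count only OPPs that were really added */
	}
	return count;
}
```

With this shape a table whose nodes are all unsupported yields a count of 0,
so the caller can tell an empty table apart from a populated one.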
Ideally the fixes tag in this commit should point back to the commit
that introduced OPP v2 initially, as that's where we started incorrectly
accounting for duplicate OPPs:
commit 274659029c ("PM / OPP: Add support to parse "operating-points-v2" bindings")
But it wasn't a real problem until recently as the count was only used
to check if any OPPs are added or not. And so this commit points to a
rather recent commit where we added more code that depends on the value
of "count".
Fixes: 3ba98324e8 ("PM / OPP: Get performance state using genpd helper")
Reported-by: Dave Gerlach <d-gerlach@ti.com>
Reported-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Return error number instead of 0 on failures.
Fixes: a1e8c13600 ("PM / OPP: "opp-hz" is optional for power domains")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The error handling wasn't appropriate in
dev_pm_opp_of_cpumask_add_table(). For example it returns 0 on success
and also for the case where cpumask is empty or cpu_device wasn't found
for any of the CPUs.
It should really return an error in such cases, so that the callers can be
aware of the outcome.
Fix it.
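A minimal userspace sketch of the intended behaviour follows. The bitmask
representation, the have_dev parameter and the choice of -ENODEV as the
error code are assumptions made for illustration, not taken from the patch:

```c
#include <errno.h>

#define NR_CPUS 8

/*
 * Hypothetical model: bit n of cpumask selects CPU n, and bit n of
 * have_dev is set when a device exists for CPU n.
 */
static int cpumask_add_table(unsigned int cpumask, unsigned int have_dev)
{
	if (!cpumask)
		return -ENODEV;	/* an empty mask is an error, not success */

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(cpumask & (1u << cpu)))
			continue;
		if (!(have_dev & (1u << cpu)))
			return -ENODEV;	/* a missing CPU device is reported too */
		/* ... the OPP table for this CPU would be added here ... */
	}
	return 0;
}
```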
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Both _of_add_opp_table_v1() and _of_add_opp_table_v2() contain similar
code to get the OPP table and their parent routine also parses the DT to
find the OPP table's node pointer. This can be simplified by getting the
OPP table in advance and then passing it as argument to these routines.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
When two or more devices are sharing their clock and voltage rails, they
share the same OPP table. But there are some corner cases where the OPP
core incorrectly creates separate OPP tables for them.
For example, CPU 0 and 1 share clock/voltage rails. The platform
specific code calls dev_pm_opp_set_regulators() for CPU0 and the OPP
core creates an OPP table for it (the individual OPPs aren't initialized
as of now). The same is repeated for CPU1 then. Because
_opp_get_opp_table() doesn't compare DT node pointers currently, it
fails to find the link between CPU0 and CPU1 and so creates a new OPP
table.
Fix this by calling _managed_opp() from _opp_get_opp_table().
_managed_opp() gains an additional argument (index) to get the right node
pointer. This resulted in simplifying code in _of_add_opp_table_v2() as
well.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Currently there are two separate ways to free the OPP table based on how
it is created in the first place.
We call _dev_pm_opp_remove_table() to free the static and/or dynamic
OPP, OPP list devices, etc. This is done for the case where the OPP
table is added while initializing the OPPs, like via the path
dev_pm_opp_of_add_table().
We also call dev_pm_opp_put_opp_table() in some cases which eventually
frees the OPP table structure once the reference count reaches 0. This
is used by the first case as well as other cases like
dev_pm_opp_set_regulators() where the OPPs aren't necessarily
initialized at this point.
This whole thing is a bit unclear and messy and obstructs any further
cleanup/fixup of the OPP core.
This patch tries to streamline this by keeping a single path for OPP
table destruction, i.e. dev_pm_opp_put_opp_table().
All the cleanup happens in _opp_table_kref_release() now after the
reference count reaches 0. _dev_pm_opp_remove_table() is removed as it
isn't required anymore.
We don't drop the reference to the OPP table after creating it from
_of_add_opp_table_v{1|2}() anymore and the same is dropped only when we
try to remove them.
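The single destruction path can be pictured with a plain reference count.
This is a userspace sketch only: the int counter stands in for struct kref
(and is not atomic), and table_create(), table_get(), table_put() and
table_release() are hypothetical names rather than the OPP core's functions:

```c
#include <stdlib.h>

struct opp_table {
	int refcount;	/* stand-in for struct kref, not atomic */
	int *opps;	/* stand-in for the list of OPPs */
};

static int tables_released;	/* counts completed releases, for illustration */

static struct opp_table *table_create(void)
{
	struct opp_table *t = calloc(1, sizeof(*t));

	if (t)
		t->refcount = 1;	/* the creator holds the first reference */
	return t;
}

static void table_get(struct opp_table *t)
{
	t->refcount++;
}

/* All cleanup lives here and runs once, when the last reference is dropped. */
static void table_release(struct opp_table *t)
{
	free(t->opps);
	free(t);
	tables_released++;
}

/* Returns 1 if this call dropped the last reference and freed the table. */
static int table_put(struct opp_table *t)
{
	if (--t->refcount > 0)
		return 0;
	table_release(t);
	return 1;
}
```

Every user, whether it created OPPs or only configured the table, drops its
reference through the same put call, so there is one place where teardown
can happen.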
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Only one platform was depending on this feature and it is already
updated now. Stop removing dynamic OPPs from _dev_pm_opp_remove_table().
This simplifies a lot of paths and removes unnecessary parameters.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The static OPPs don't always get freed with the OPP table; it can happen
before that as well. For example, if the OPP table is first created
using helpers like dev_pm_opp_set_supported_hw() and the OPPs are
created at a later point. Now when the OPPs are removed, the OPP table
stays until the time dev_pm_opp_put_supported_hw() is called.
Later patches will streamline the freeing of OPP table and that requires
the static OPPs to get freed with help of a separate kernel reference.
This patch prepares for that by creating a separate kref for static OPPs
list.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The reference count is only required to be incremented for every call
that may lead to adding the OPP table. For static OPPs the same should
be done from the parent routine which adds all static OPPs together and
so only one refcount for all static OPPs.
Update code to reflect that.
The refcount is incremented every time a dynamic OPP is created (as that
can lead to creating the OPP table) and the same is dropped when the OPP
is removed.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Parse the DT properties present in the OPP table from
_of_init_opp_table(), which is a dedicated routine for DT parsing.
Minor relocation of helpers is required for this.
It is possible now for _managed_opp() to return a partially initialized
OPP table if the OPP table is created via helpers like
dev_pm_opp_set_supported_hw(), and we need another flag to indicate
whether the static OPPs are already parsed, to make sure we don't
incorrectly skip initializing the static OPPs.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This is a preparatory patch required for the next commit which will
start using OPP table's node pointer in _of_init_opp_table(), which
requires the index in order to read the OPP table's phandle.
This commit adds the index argument in the call chains in order to get
it delivered to _of_init_opp_table().
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The dev_list needs to be protected with a lock, else we may have
simultaneous access (addition/removal) to it and that would be racy.
Extend scope of the opp_table lock to protect dev_list as well.
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
dev_pm_opp_of_cpumask_add_table() creates the OPP table for all CPUs
present in the cpumask and on errors it should revert all changes it has
done.
It actually is doing a bit more than that. On errors, it tries to free
all the OPP tables, even the ones it hasn't created yet. This may also
end up freeing the OPP tables which were created from a separate path,
like dev_pm_opp_set_supported_hw().
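The intended rollback can be sketched in userspace as follows. The created[]
array, add_table(), remove_table() and the fail_cpu parameter are
hypothetical and exist only to make the example self-contained:

```c
#define NR_CPUS 4

/* Number of tables that exist for each CPU in this model. */
static int created[NR_CPUS];

/* Hypothetical per-CPU add; fails for the CPU index given in fail_cpu. */
static int add_table(int cpu, int fail_cpu)
{
	if (cpu == fail_cpu)
		return -1;
	created[cpu]++;
	return 0;
}

static void remove_table(int cpu)
{
	created[cpu]--;
}

static int cpumask_add_tables(int fail_cpu)
{
	int cpu, ret;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		ret = add_table(cpu, fail_cpu);
		if (ret)
			goto rollback;
	}
	return 0;

rollback:
	/* Undo only CPUs [0, cpu): the ones this call really added. */
	while (cpu-- > 0)
		remove_table(cpu);
	return ret;
}
```

A table that already existed for a later CPU, for example one created
through a separate path, is left untouched by the rollback.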
Reported-and-tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP table was freed, but not the individual OPPs which is done from
_dev_pm_opp_remove_table(). Fix it by calling _dev_pm_opp_remove_table()
as well.
Cc: 4.18 <stable@vger.kernel.org> # v4.18
Fixes: 3ba98324e8 ("PM / OPP: Get performance state using genpd helper")
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This commit fixes a rare but possible case when the clk rate is updated
without update of the regulator voltage.
At boot up, CPUfreq checks if the system is running at the right freq. This
is a sanity check in case a bootloader set a clk rate that is outside of the
freq table present with the cpufreq core. In such cases the system can be
unstable, so it is better to change it to a freq that is present in the
freq-table.
The CPUfreq takes next freq that is >= policy->cur and this is our
target_freq that needs to be set now.
dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the
old_freq (a current rate). If these are equal it returns early. If not,
it searches for OPP (old_opp) that fits best to old_freq (not listed in
the table) and updates old_freq (!).
Here, we can end up with old_freq = old_opp.rate = target_freq, which
is not handled in _generic_set_opp_regulator(). It's supposed to update
voltage only when freq > old_freq || freq < old_freq.
    if (freq > old_freq) {
            ret = _set_opp_voltage(dev, reg, new_supply);
    [...]
    if (freq < old_freq) {
            ret = _set_opp_voltage(dev, reg, new_supply);
            if (ret)
This results in no voltage update while the clk rate is updated.
Example:
    freq-table = {
            1000MHz   1.15V
             666MHz   1.10V
             333MHz   1.05V
    }
    boot-up-freq        = 800MHz   # not listed in freq-table
    freq = target_freq  = 1GHz
    old_freq            = 800MHz
    old_opp = _find_freq_ceil(opp_table, &old_freq);   # (old_freq is modified!)
    old_freq            = 1GHz
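The sequence can be reproduced in a simplified userspace model. The table
values come from the example above (rates in MHz, voltages in microvolts);
find_freq_ceil(), voltage_updated() and set_rate_updates_voltage() are
hypothetical stand-ins that only mimic the control flow described here, not
the kernel functions themselves:

```c
#include <stddef.h>

struct opp { unsigned long rate_mhz; unsigned long microvolt; };

/* Table from the example, sorted by ascending rate. */
static const struct opp table[] = {
	{ 333, 1050000 }, { 666, 1100000 }, { 1000, 1150000 },
};

/*
 * Returns the lowest OPP with rate >= *freq and, like the helper in the
 * description, overwrites *freq with that OPP's rate.
 */
static const struct opp *find_freq_ceil(unsigned long *freq)
{
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].rate_mhz >= *freq) {
			*freq = table[i].rate_mhz;
			return &table[i];
		}
	}
	return NULL;
}

/* Returns 1 if the two comparisons quoted above would program the regulator. */
static int voltage_updated(unsigned long freq, unsigned long old_freq)
{
	return freq > old_freq || freq < old_freq;
}

/* Returns 1 if a rate change from boot_freq to target also updates voltage. */
static int set_rate_updates_voltage(unsigned long target,
				    unsigned long boot_freq)
{
	unsigned long freq = target, old_freq = boot_freq;

	if (freq == old_freq)
		return 0;	/* early return: nothing to do */
	find_freq_ceil(&freq);
	find_freq_ceil(&old_freq);	/* old_freq is modified here */
	return voltage_updated(freq, old_freq);
}
```

In this model a boot frequency of 800MHz with a 1GHz target passes the early
equality check, is then rounded up to 1GHz, and so neither comparison fires.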
Fixes: 6a0712f6f1 ("PM / OPP: Add dev_pm_opp_set_rate()")
Cc: 4.6+ <stable@vger.kernel.org> # v4.6+
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP binding says:
Property: operating-points-v2
...
This can contain more than one phandle for power domain
providers that provide multiple power domains. That is, one
phandle for each power domain. If only one phandle is available,
then the same OPP table will be used for all power domains
provided by the power domain provider.
But the OPP core isn't allowing the same OPP table to be used for
multiple domains. Update dev_pm_opp_of_add_table_indexed() to allow
that.
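The rule quoted from the binding can be sketched as an index mapping. The
helper name resolve_table_index() and its error convention are assumptions
for illustration and do not describe the actual change to
dev_pm_opp_of_add_table_indexed():

```c
/*
 * Hypothetical helper: map a power-domain index to the index of the
 * phandle to use in "operating-points-v2". Returns -1 on invalid input.
 */
static int resolve_table_index(int phandle_count, int domain_index)
{
	if (phandle_count <= 0 || domain_index < 0)
		return -1;
	if (phandle_count == 1)
		return 0;	/* one table shared by all power domains */
	if (domain_index >= phandle_count)
		return -1;	/* no phandle for this domain */
	return domain_index;
}
```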
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
It should be fine to call dev_pm_opp_register_set_opp_helper() for all
possible CPUs, even if some of them share the OPP table as the caller
may not be aware of sharing policy.
Let's increment the reference count of the OPP table and return its
pointer. The caller needs to call dev_pm_opp_register_put_opp_helper()
the same number of times later on to drop all the references.
To avoid adding another counter to count how many times
dev_pm_opp_register_set_opp_helper() is called for the same OPP table,
dev_pm_opp_register_put_opp_helper() frees the resources on the very
first call made to it, assuming that the caller would be calling it
sequentially for all the CPUs. We can revisit that if that assumption is
broken in the future.
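The scheme described above, where every set call takes a reference and the
first put call already frees the resource, might look like this in a
userspace model. struct opp_table here is a toy, and register_helper(),
unregister_helper() and dummy_set_opp() are hypothetical names:

```c
#include <stddef.h>

struct opp_table {
	int refcount;		/* references taken by register_helper() */
	int (*set_opp)(void);	/* the shared resource */
};

/* Placeholder callback used only to have something to register. */
static int dummy_set_opp(void)
{
	return 0;
}

/* May be called once per CPU, even when the CPUs share one table. */
static void register_helper(struct opp_table *t, int (*fn)(void))
{
	if (!t->set_opp)
		t->set_opp = fn;	/* only the first caller installs it */
	t->refcount++;
}

/* Must be called as many times as register_helper() was. */
static void unregister_helper(struct opp_table *t)
{
	t->set_opp = NULL;	/* resource released on the very first call */
	t->refcount--;
}
```

The trade-off is that no extra counter is needed, at the cost of assuming
that callers drop all their references back to back.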
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It should be fine to call dev_pm_opp_set_regulators() for all possible
CPUs, even if some of them share the OPP table as the caller may not be
aware of sharing policy.
Let's increment the reference count of the OPP table and return its
pointer. The caller needs to call dev_pm_opp_put_regulators() the same
number of times later on to drop all the references.
To avoid adding another counter to count how many times
dev_pm_opp_set_regulators() is called for the same OPP table,
dev_pm_opp_put_regulators() frees the resources on the very first call
made to it, assuming that the caller would be calling it sequentially
for all the CPUs. We can revisit that if that assumption is broken in
the future.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It should be fine to call dev_pm_opp_set_prop_name() for all possible
CPUs, even if some of them share the OPP table as the caller may not be
aware of sharing policy.
Let's increment the reference count of the OPP table and return its
pointer. The caller needs to call dev_pm_opp_put_prop_name() the same
number of times later on to drop all the references.
To avoid adding another counter to count how many times
dev_pm_opp_set_prop_name() is called for the same OPP table,
dev_pm_opp_put_prop_name() frees the resources on the very first call
made to it, assuming that the caller would be calling it sequentially
for all the CPUs. We can revisit that if that assumption is broken in
the future.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It should be fine to call dev_pm_opp_set_supported_hw() for all possible
CPUs, even if some of them share the OPP table as the caller may not be
aware of sharing policy.
Let's increment the reference count of the OPP table and return its
pointer. The caller needs to call dev_pm_opp_put_supported_hw() the same
number of times later on to drop all the references.
To avoid adding another counter to count how many times
dev_pm_opp_set_supported_hw() is called for the same OPP table,
dev_pm_opp_put_supported_hw() frees the resources on the very first call
made to it, assuming that the caller would be calling it sequentially
for all the CPUs. We can revisit that if that assumption is broken in
the future.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Smatch complains that it's possible we print "rate" in the debug output
when it hasn't been initialized. It should be zero on that path.
Fixes: a1e8c13600 ("PM / OPP: "opp-hz" is optional for power domains")
[ Viresh: Added the Fixes tag ]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The genpd core provides an API now to retrieve the performance state
from DT, use that instead of the ->get_pstate() callback.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
This adds a new helper to let the power domain drivers access
opp->np, so that they can read platform specific properties from the
node.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
A device's DT node or its OPP nodes can contain a phandle to other
device's OPP node, in the "required-opps" property.
This patch implements a routine to find that required OPP from the node
that contains the "required-opps" property.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
The "operating-points-v2" property can contain a list of phandles now,
specifically for the power domain providers that provide multiple
domains.
Add support to parse that.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
"opp-hz" property is optional for power domains now and we shouldn't
error out if it is missing for power domains.
This patch creates two new routines, _get_opp_count() and
_opp_is_duplicate(), by separating existing code from their parent
functions. Also skip duplicate OPP check for power domain OPPs as they
may not have the "opp-hz" field, but a platform specific performance
state binding to uniquely identify OPP nodes.
By default the debugfs OPP nodes are named using the "rate" value, but
that isn't possible for the power domain OPP nodes and hence they use
the index of the OPP node in the OPP node list instead.
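The naming rule can be sketched as below. The function name
opp_debugfs_name() and the "opp:%lu" format string are assumptions made for
illustration; only the choice between rate and index comes from the text:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: name a debugfs node after the OPP's rate, or after
 * its index in the OPP list when the rate is 0 (no "opp-hz" property).
 */
static void opp_debugfs_name(char *buf, size_t len, unsigned long rate,
			     unsigned long index)
{
	unsigned long id = rate ? rate : index;

	snprintf(buf, len, "opp:%lu", id);
}
```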
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
After checking all possible call chains to
dev_pm_opp_init_cpufreq_table() here,
my tool finds that this function is never called in atomic context,
namely never in an interrupt handler or holding a spinlock.
And dev_pm_opp_init_cpufreq_table() calls dev_pm_opp_get_opp_count(),
which calls mutex_lock that can sleep.
It indicates that dev_pm_opp_init_cpufreq_table() can call functions
which may sleep.
Thus GFP_ATOMIC is not necessary, and it can be replaced with GFP_KERNEL.
This is found by a static analysis tool named DCNS written by myself.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Fixes the following sparse warning:
drivers/opp/ti-opp-supply.c:276:5: warning:
symbol 'ti_opp_supply_set_opp' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a ti-opp-supply driver that will use the new multiple regulator
support that is part of the OPP core. This is needed on TI platforms like
DRA7/AM57 in order to control both CPU regulator and Adaptive Body Bias
(ABB) regulator. These regulators must be scaled in sequence during an
OPP transition depending on whether or not the frequency is being scaled
up or down.
This driver also implements AVS Class0 for these parts by looking up the
required values from registers in the SoC and programming adjusted
optimal voltage values for each OPP.
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This adds the dev_pm_opp_{un}register_get_pstate_helper() helper
routines which will be used to set the get_pstate() callback for a
device. This callback will be later called internally by the OPP core to
get performance state corresponding to an OPP.
This is required temporarily until the time we have proper DT bindings
to include the performance state information.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The genpd framework now provides an API to request device's power
domain to update its performance state. Use that interface from the
OPP core for devices whose power domains support performance states.
Note that this commit doesn't add any mechanism by which performance
states are made available to the OPP core. That would be done by a
later commit.
Note that the current implementation is restricted to the case where
the device doesn't have separate regulators for itself. We shouldn't
over engineer the code before we have real use case for them. We can
always come back and add more code to support such cases later on.
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 762792913f (PM / OPP: Fix get sharing CPUs when hotplug
is used) moved away from using cpu_dev->of_node because of some
limitations.
However, commit 7467c9d959 (of: return of_get_cpu_node from
of_cpu_device_node_get if CPUs are not registered) added support to
fall back to of_get_cpu_node() if it is called when CPUs are not
registered yet.
Add the missing of_node_put() for the CPU device nodes. Also go back
to using of_cpu_device_node_get() in dev_pm_opp_of_get_sharing_cpus()
to avoid scanning the device tree again.
Acked-by: Viresh Kumar <vireshk@kernel.org>
Fixes: 762792913f (PM / OPP: Fix get sharing CPUs when hotplug is used)
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The routine is named incorrectly since the first attempt as there is
nothing like a put_opp() helper. We wanted to unregister the set_opp()
helper here and so it should rather be named as
dev_pm_opp_unregister_set_opp_helper().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The for_each_available_child_of_node() loop in _of_add_opp_table_v2()
doesn't drop the reference to "np" on errors. Fix that.
Fixes: 274659029c (PM / OPP: Add support to parse "operating-points-v2" bindings)
Cc: 4.3+ <stable@vger.kernel.org> # 4.3+
Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com>
[ VK: Improved commit log. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On some i.MX6 platforms which do not have speed grading
check, opp table will not be created in platform code,
so cpufreq driver prints the following error message:
cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19)
However, this is not really an error in this case because the
imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count()
and if it fails, it means that platform code does not provide
OPP and then dev_pm_opp_of_add_table() will be called.
In order to avoid such confusing error message, move it to
debug level.
It is up to the caller of dev_pm_opp_get_opp_count() to check its
return value and decide if it will print an error or not.
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The drivers/base/power/ directory is special and contains code related
to power management core like system suspend/resume, hibernation, etc.
It was fine to keep the OPP code inside it when we had just one file for
it, but it is growing now and already has a directory for itself.
Let's move it directly under the drivers/ directory, just like cpufreq and
cpuidle.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>