The Energy Model (EM) can be created based on DT entry:
'dynamic-power-coefficient'. It's a 'simple' EM which is limited to the
dynamic power. It has to fit into the math formula which requires also
information about voltage. Some of the platforms don't expose voltage
information, thus it's not possible to use EM registration using DT.
This patch aims to fix it. It introduces new implementation of the EM
registration callback. The new mechanism relies on the new OPP feature
allowing to get power (which is coming from "opp-microwatt" DT property)
expressed in micro-Watts.
The patch also opens new opportunity to better support platforms, which
have a decent static power. It allows to register the EM based on real
power measurements which models total power (static + dynamic), so better
reflects real HW.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add new property to the OPP: power value. The OPP entry in the DT can have
"opp-microwatt". Add the needed code to handle this new property in the
existing infrastructure.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It is difficult to find which OPPs are active at the moment, specially
if there are multiple OPPs with same frequency available in the device
tree (controlled by supported hardware feature).
Expose name of the DT node to find out the exact OPP.
While at it, also expose level field.
Reported-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Fix sparse warning:
drivers/opp/of.c:924 _opp_add_static_v2() warn: passing zero to 'ERR_PTR'
For duplicate OPPs 'ret' be set to zero.
Fixes: deac8703da ("PM / OPP: _of_add_opp_table_v2(): increment count only if OPP is added")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Elements of the 'names' array are not changed by the code, constify them
for consistency.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The 'required-opps' property is optional. So of_count_phandle_with_args()
can return -ENOENT when queried for required-opps. Handle this case.
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
* pm-pci:
PCI: PM: Enable PME if it can be signaled from D3cold
PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
PCI: Use pci_update_current_state() in pci_enable_device_flags()
* pm-sleep:
PM: sleep: unmark 'state' functions as kernel-doc
PM: sleep: check RTC features instead of ops in suspend_test
PM: sleep: s2idle: Replace deprecated CPU-hotplug functions
* pm-domains:
PM: domains: Fix domain attach for CONFIG_PM_OPP=n
arm64: dts: sc7180: Add required-opps for i2c
PM: domains: Add support for 'required-opps' to set default perf state
opp: Don't print an error if required-opps is missing
* powercap:
powercap: Add Power Limit4 support for Alder Lake SoC
powercap: intel_rapl: Replace deprecated CPU-hotplug functions
Commit 4fa82a87ba ("opp: Allow required-opps to be used for non genpd
use cases") dereferences the pointers in required_opp_tables but these
might be set to an ERR_PTR if the list still has lazy links pending,
resulting in segfaults. Prior to this patch IS_ERR was also checked on
required_opp_tables[i] before reading ->is_genpd inside
_opp_table_alloc_required_tables, which is at the same time the
predicate to add this table to the lazy list. This segfault is solved
by reordering the checks to bail on lazy pending tables before reading
->is_genpd.
Fixes: 4fa82a87ba ("opp: Allow required-opps to be used for non genpd use cases")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The 'required-opps' property is considered optional, hence remove
the pr_err() in of_parse_required_opp() when we find the property is
missing.
While at it, also fix the return value of
of_get_required_opp_performance_state() when of_parse_required_opp()
fails, return a -ENODEV instead of the -EINVAL.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The current_opp is released only when whole OPP table is released,
otherwise it's only marked as removed by dev_pm_opp_remove_table().
Functions like dev_pm_opp_put_clkname() and dev_pm_opp_put_supported_hw()
are checking whether OPP table is empty and it's not if current_opp is
set since it holds the refcount of OPP, this produces a noisy warning
from these functions about busy OPP table. Remove the checks to fix it.
Cc: stable@vger.kernel.org
Fixes: 81c4d8a3c4 ("opp: Keep track of currently programmed OPP")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
This WARN can be triggered per-core and the stack trace is not useful.
Replace it with plain dev_err(). Fix a comment while at it.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Don't limit required_opp_table to genpd only. One possible use case is
cpufreq based devfreq governor, which can use required-opps property to
derive devfreq from cpufreq.
Though the OPP core still doesn't support non-genpd required-opps in
_set_required_opps().
Suggested-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
[ Viresh: Update _set_required_opps() to check for genpd ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Using list_del_init() instead of list_del() + INIT_LIST_HEAD()
to simpify the code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Make devm_pm_opp_attach_genpd() to return error code instead of
opp_table pointer in order to have return type consistent with the
other resource-managed OPP helpers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Make devm_pm_opp_register_set_opp_helper() to return error code instead
of opp_table pointer in order to have return type consistent with the
other resource-managed OPP helpers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We are required to call dev_pm_opp_put() from outside of the
opp_table->lock as debugfs removal needs to happen lock-less to avoid
circular dependency issues.
commit cf1fac943c ("opp: Reduce the size of critical section in
_opp_kref_release()") tried to fix that introducing a new routine
_opp_get_next() which keeps returning OPPs that can be freed by the
callers and this routine shall be called without holding the
opp_table->lock.
Though the commit overlooked the fact that the OPPs can be referenced by
other users as well and this routine will end up dropping references
which were taken by other users and hence freeing the OPPs prematurely.
In effect, other users of the OPPs will end up having invalid pointers
at hand. We didn't see any crash reports earlier as the exact situation
never happened, though it is certainly possible.
We need a way to mark which OPPs are no longer referenced by the OPP
core, so we don't drop extra references to them accidentally.
This commit adds another OPP flag, "removed", which is used to track
this. And now we should never end up dropping extra references to the
OPPs.
Cc: v5.11+ <stable@vger.kernel.org> # v5.11+
Fixes: cf1fac943c ("opp: Reduce the size of critical section in _opp_kref_release()")
Signed-off-by: Beata Michalska <beata.michalska@arm.com>
[ Viresh: Almost rewrote entire patch, added new "removed" field,
rewrote commit log and added the correct Fixes tag. ]
Co-developed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We skip the OPP update if the current and target OPPs are same. This is
fine for the devices that don't support frequency but may cause issues
for the ones that need to program frequency.
An OPP entry doesn't really signify a single operating frequency but
rather the highest frequency at which the other properties of the OPP
entry apply. And we may reach here with different frequency values,
while all of them would point to the same OPP entry in the OPP table.
We just need to update the clock frequency in that case, though in order
to not add special exit points we reuse the code flow from a normal
path.
While at it, rearrange the conditionals in the 'if' statement to check
'enabled' flag at the end.
Fixes: 81c4d8a3c4 ("opp: Keep track of currently programmed OPP")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
[ Viresh: Improved commit log and subject, rename current_freq as
current_rate, document it, remove local variable and rearrange
code. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add a function that allows looking up required OPPs given a source OPP
table, destination OPP table and the source OPP.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
[ Viresh: Rearranged code, fixed return errors ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Not all devices that need to use OPP core need to have clocks, a missing
clock is fine in which case -ENOENT shall be returned by clk_get().
Anything else is an error and must be handled properly.
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The bandwidth must be scaled at a different point in the code flow based
on if we are scaling up or down the frequency, otherwise this may cause
undesired effects as the device will try to use more of the memory
bandwidth which may be shared across several devices. Much like how
regulators and required-opps are programmed.
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Reported-by: Akhil P Oommen <akhilpo@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The OPP core currently requires the required opp tables to be available
before the dependent OPP table is added, as it needs to create links
from the dependent OPP table to the required ones. This may not be
convenient for all the platforms though, as this requires strict
ordering for probing the drivers.
This patch allows lazy-linking of the required-opps. The OPP tables for
which the required-opp-tables aren't available at the time of their
initialization, are added to a special list of OPP tables:
lazy_opp_tables. Later on, whenever a new OPP table is registered with
the OPP core, we check if it is required by an OPP table in the pending
list; if yes, then we complete the linking then and there.
An OPP table is marked unusable until the time all its required-opp
tables are available. And if lazy-linking fails for an OPP table, the
OPP core disables all of its OPPs to make sure no one can use them.
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
All the users have migrated to dev_pm_opp_set_opp() now, get rid of the
duplicate API, dev_pm_opp_set_bw(), which only performs a part of the new API.
While at it, remove the unnecessary parameter to _set_opp_bw().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The new helper dev_pm_opp_set_opp() can be used for configuring the
devices for a particular OPP and can be used by different type of
devices, even the ones which don't change frequency (like power
domains).
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Drop the unnecessary parameters and follow the pattern from
_generic_set_opp_regulator().
While at it, also remove the local variable old_freq.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
In order to avoid conditional statements at the caller site, this patch
updates _generic_set_opp_clk_only() to work for devices that don't
change frequency (like power domains, etc.). Return 0 if the clk pointer
passed to this routine is not valid.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The _generic_set_opp_regulator() helper will be used for devices which
don't change frequency (like power domains, etc.) later on, prepare for
that by not relying on frequency for making decisions here.
While at it, update its parameters to pass only what is necessary.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The _set_opp() helper will be used for devices which don't change frequency
(like power domains, etc.) later on, prepare for that by not relying on
frequency for making decisions here.
While at it, also update the debug print to contain all relevant
information.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The _set_opp() helper will be used for devices which don't change their
frequency (like power domains, etc.) later on, prepare for that by
breaking the generic part out of dev_pm_opp_set_rate().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The dev_pm_opp_set_rate() helper needs to know the currently programmed
OPP to make few decisions and currently we try to find it on every
invocation of this routine.
Lets start keeping track of the current_opp programmed for the devices
of the opp table, that will be quite useful going forward.
If we fail to find the current OPP, we pick the first one available in
the list, as the list is in ascending order of frequencies, level, or
bandwidth and that's the best guess we can make anyway.
Note that we used to do the frequency comparison a bit early in
dev_pm_opp_set_rate() previously, and now instead we check the target
opp, which shall be more accurate anyway.
We need to make sure that current_opp's memory doesn't get freed while
it is being used and so we keep a reference of it until the time it is
used.
Now that current_opp will always be set, we can drop some unnecessary
checks as well.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Clock is not optional for users who call into dev_pm_opp_set_rate().
Remove the unnecessary checks.
While at it also drop the local variable for clk and use opp_table->clk
instead.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
This routine has nothing to do with frequency, it just disables all the
resources previously enabled. Rename it to match its purpose.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Check whether OPP table has regulators in _set_opp_custom() and set up
dev_pm_set_opp_data accordingly. Now _set_opp_custom() works properly,
i.e. it doesn't crash if OPP table doesn't have assigned regulators.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Rearrange the routine a bit ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Until now the ->set_opp() helper (i.e. special implementation for
setting the OPPs for platforms) was implemented only to take care of
multiple regulators case, but going forward we would need that for other
use cases as well.
This patch prepares for that by allocating the regulator specific part
from dev_pm_opp_set_regulators() and the opp helper part from
dev_pm_opp_register_set_opp_helper().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Print OPP level in debug message of _opp_add_static_v2(). This helps to
chase GENPD bugs.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
NVIDIA Tegra SoCs have a power domains topology such that child domains
only clamp a power rail, while parent domain controls shared performance
state of the multiple child domains. In this case child's domain doesn't
need to have OPP table. Hence we want to allow children power domains to
pass performance state to the parent domain if child's domain doesn't have
OPP table.
The dev_pm_opp_xlate_performance_state() gets src_table=NULL if a child
power domain doesn't have OPP table and in this case we should pass the
performance state to the parent domain.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add resource-managed version of dev_pm_opp_attach_genpd().
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Manually apply the patch and relocate the routines ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add resource-managed version of dev_pm_opp_register_set_opp_helper().
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Manually apply the patch and relocate the routines ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
A few drivers have device's clk but they don't want the OPP core to
handle that. Add a new helper for them, dev_pm_opp_of_add_table_noclk().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
We acquire the clk at the time the OPP table is allocated, though it
works fine, it is not the best place to do so. One of the main reason
being we may need to acquire it again from dev_pm_opp_set_clkname() if
the platform wants another clock to be acquired instead.
There is also requirement from some of the platforms where they do not
want the OPP core to manage the clock at all.
This patch hence defers acquiring the clk until the time we are certain
about which clk we need to acquire and if we really need to acquire one.
With this commit, the clk will get acquired either from
dev_pm_opp_set_clkname() or while we initialize the OPPs within the
table.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
The implementation of dev_pm_opp_of_add_table() and
dev_pm_opp_of_add_table_indexed() are almost identical. Create
_of_add_table_indexed() to reduce code redundancy.
Also remove the duplication of the doc style comments by referring to
dev_pm_opp_of_add_table() from dev_pm_opp_of_add_table_indexed().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Extend OPP API with dev_pm_opp_sync_regulators() function, which syncs
voltage state of regulators.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Added unlikely() ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add dev_pm_opp_get_required_pstate() which allows OPP users to retrieve
required performance state of a given OPP.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add a ceil version of the dev_pm_opp_find_level(). It's handy to have if
levels don't start from 0 in OPP table and zero usually means a minimal
level.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The debug message always prints rate=0 instead of a proper value, fix it.
Fixes: 6c591eec67 ("OPP: Add helpers for reading the binding properties")
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
[ Viresh: Added Fixes tag ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
A required OPP may not be available, and thus, all OPPs which are using
this required OPP should be unavailable too.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Fix adding OPP entries in a wrong (opposite) order if OPP rate is
unavailable. The OPP comparison was erroneously skipped, thus OPPs
were left unsorted.
Tested-by: Peter Geis <pgwipeout@gmail.com>
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Matt Merhar <mattmerhar@protonmail.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
In function _allocate_opp_table, opp_dev is allocated and referenced
by opp_table via _add_opp_dev. But in the case that the subsequent calls
return -EPROBE_DEFER, it will jump to err label and opp_table will be
freed. Then opp_dev becomes an unreferenced object to cause memory leak.
So let's call _remove_opp_dev to do the cleanup.
This fixes the following kmemleak report:
unreferenced object 0xffff000801524a00 (size 128):
comm "swapper/0", pid 1, jiffies 4294892465 (age 84.616s)
hex dump (first 32 bytes):
40 00 56 01 08 00 ff ff 40 00 56 01 08 00 ff ff @.V.....@.V.....
b8 52 77 7f 08 00 ff ff 00 3c 4c 00 08 00 ff ff .Rw......<L.....
backtrace:
[<00000000b1289fb1>] kmemleak_alloc+0x30/0x40
[<0000000056da48f0>] kmem_cache_alloc+0x3d4/0x588
[<00000000a84b3b0e>] _add_opp_dev+0x2c/0x88
[<0000000062a380cd>] _add_opp_table_indexed+0x124/0x268
[<000000008b4c8f1f>] dev_pm_opp_of_add_table+0x20/0x1d8
[<00000000e5316798>] dev_pm_opp_of_cpumask_add_table+0x48/0xf0
[<00000000db0a8ec2>] dt_cpufreq_probe+0x20c/0x448
[<0000000030a3a26c>] platform_probe+0x68/0xd8
[<00000000c618e78d>] really_probe+0xd0/0x3a0
[<00000000642e856f>] driver_probe_device+0x58/0xb8
[<00000000f10f5307>] device_driver_attach+0x74/0x80
[<0000000004f254b8>] __driver_attach+0x58/0xe0
[<0000000009d5d19e>] bus_for_each_dev+0x70/0xc8
[<0000000000d22e1c>] driver_attach+0x24/0x30
[<0000000001d4e952>] bus_add_driver+0x14c/0x1f0
[<0000000089928aaa>] driver_register+0x64/0x120
Cc: v5.10 <stable@vger.kernel.org> # v5.10
Fixes: dd461cd918 ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
[ Viresh: Added the stable tag ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The opp binding now allows to have an empty opp table and shared-opp to
still describe that devices share v/f lines.
When initialising an empty opp table, allow such case by:
- treating such conditions with warnings in place of errors
- don't fail on empty table
Signed-off-by: Nicola Mazzucato <nicola.mazzucato@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It has been found that some users (like cpufreq-dt and others on LKML)
have abused the helper dev_pm_opp_get_opp_table() to create the OPP
table instead of just finding it, which is the wrong thing to do. This
routine was meant for OPP core's internal working and exposed the whole
functionality by mistake.
Change the scope of dev_pm_opp_get_opp_table() to only finding the
table. The internal helpers _opp_get_opp_table*() are thus renamed to
_add_opp_table*(), dev_pm_opp_get_opp_table_indexed() is removed (as we
don't need the index field for finding the OPP table) and so the only
user, genpd, is updated.
Note that the prototype of _add_opp_table() was already left in opp.h by
mistake when it was removed earlier and so we weren't required to add it
now.
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There is a lot of stuff here which can be done outside of the
opp_table->lock, do that. This helps avoiding a circular dependency
lockdeps around debugfs.
Reported-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There are different platforms and devices which might use different scale
for the power values. Kernel sub-systems might need to check if all
Energy Model (EM) devices are using the same scale. Address that issue and
store the information inside EM for each device. Thanks to that they can
be easily compared and proper action triggered.
Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The readers of dev_list expect the updates to it to take place from
within the opp_table->lock and this is missing in the case where the
dev_list is updated for already managed OPPs.
Fix that by calling _add_opp_dev() from there and remove the now unused
_add_opp_dev_unlocked() callback. While at it, also reduce the length of
the critical section in _add_opp_dev().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There is a lot of stuff here which can be done outside of the big
opp_table_lock, do that. This helps avoiding few circular dependency
lockdeps around debugfs and interconnects.
Reported-by: Rob Clark <robdclark@gmail.com>
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We returned earlier by mistake even when there were no failures. Fix it.
Fixes: dd461cd918 ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.com>
The patch missed returning 0 early in case of success and hence the
static OPPs got removed by mistake. Fix it.
Fixes: 90d46d71cc ("opp: Handle multiple calls for same OPP table in _of_add_opp_table_v1()")
Reported-by: Aisheng Dong <aisheng.dong@nxp.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Dong Aisheng <aisheng.dong@nxp.com>
Pull opertaing performance points (OPP) framework fixes for 5.10-rc1
from Viresh Kumar:
"- Return -EPROBE_DEFER properly from dev_pm_opp_get_opp_table()
(Stephan Gerhold).
- Minor cleanups around required-opps (Stephan Gerhold).
- Extends opp-supported-hw property to contain multiple versions
(Viresh Kumar).
- Multiple cleanups around dev_pm_opp_attach_genpd() (Viresh Kumar).
- Multiple fixes, cleanups in the OPP core for overall better design
(Viresh Kumar)."
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Allow opp-level to be set to 0
opp: Prevent memory leak in dev_pm_opp_attach_genpd()
ARM: tegra: Pass multiple versions in opp-supported-hw property
opp: Allow opp-supported-hw to contain multiple versions
dt-bindings: opp: Allow opp-supported-hw to contain multiple versions
opp: Set required OPPs in reverse order when scaling down
opp: Reduce code duplication in _set_required_opps()
opp: Drop unnecessary check from dev_pm_opp_attach_genpd()
opp: Handle multiple calls for same OPP table in _of_add_opp_table_v1()
opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER
opp: Remove _dev_pm_opp_find_and_remove_table() wrapper
opp: Split out _opp_set_rate_zero()
opp: Reuse the enabled flag in !target_freq path
opp: Rename regulator_enabled and use it as status of all resources
The DT bindings don't put such a constraint, nor should the kernel. It
is perfectly fine for opp-level to be set to 0, if we need to put the
performance state votes for a domain for a particular OPP.
Reported-by: Stephan Gerhold <stephan@gerhold.net>
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
If dev_pm_opp_attach_genpd() is called multiple times (once for each CPU
sharing the table), then it would result in unwanted behavior like
memory leak, attaching the domain multiple times, etc.
Handle that by checking and returning earlier if the domains are already
attached. Now that dev_pm_opp_detach_genpd() can get called multiple
times as well, we need to protect that too.
Note that the virtual device pointers aren't returned in this case, as
they may become unavailable to some callers during the middle of the
operation.
Reported-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The bindings allow multiple versions to be passed to "opp-supported-hw"
property, either of which can result in enabling of the OPP.
Update code to allow that.
Tested-by: Stephan Gerhold <stephan@gerhold.net>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP core already has well-defined semantics to ensure required
OPPs/regulators are set before/after the frequency change, depending
on if we scale up or down.
Similar requirements might exist for the order of required OPPs
when multiple power domains need to be scaled for a frequency change.
For example, on Qualcomm platforms using CPR (Core Power Reduction),
we need to scale the VDDMX and CPR power domain. When scaling up,
MX should be scaled up before CPR. When scaling down, CPR should be
scaled down before MX.
In general, if there are multiple "required-opps" in the device tree
I would expect that the order is either irrelevant, or there is some
dependency between the power domains. In that case, the power domains
should be scaled down in reverse order.
This commit updates _set_required_opps() to set required OPPs in
reverse order when scaling down.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
[ Viresh: Fix rebase conflict and minor rearrangement of the code ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Move call to dev_pm_genpd_set_performance_state() to a separate
function so we can avoid duplicating the code for the single and
multiple genpd case.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
[ Viresh: Validate virtual device before use ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Since commit c0ab9e0812 ("opp: Allocate genpd_virt_devs from
dev_pm_opp_attach_genpd()"), the allocation of the virtual devices is
moved to dev_pm_opp_attach_genpd() and this check isn't required anymore
as it will always fail. Drop it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Until now for V1 OPP bindings we used to call
dev_pm_opp_of_cpumask_add_table() first and then
dev_pm_opp_set_sharing_cpus() in the cpufreq-dt driver.
A later patch will though update the cpufreq-dt driver to optimize the
code a bit and we will call dev_pm_opp_set_sharing_cpus() first followed
by dev_pm_opp_of_cpumask_add_table(), which doesn't work well today as
it tries to re parse the OPP entries. This should work nevertheless for
V1 bindings as the same works for V2 bindings.
Adapt the same approach from V2 bindings and fix this.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Pull operating performance points (OPP) framework fixes for 5.9-rc4 from
Viresh Kumar:
"This fixes reference counting for OPP tables. Few patches are getting
queued (for various subsystems) for 5.10 which depend on this to be
fixed first."
* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Don't drop reference for an OPP table that was never parsed
dev_pm_opp_remove_table() should drop a reference to the OPP table only
if the DT OPP table was parsed earlier with a call to
dev_pm_opp_of_add_table() earlier. Else it may end up dropping the
reference to the OPP table, which was added as a result of other calls
like dev_pm_opp_set_clkname(). And would hence result in undesirable
behavior later on when caller would try to free the resource again.
Fixes: 03758d6026 ("opp: Replace list_kref with a local counter")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Anders Roxell <anders.roxell@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP core manages various resources, e.g. clocks or interconnect paths.
These resources are looked up when the OPP table is allocated once
dev_pm_opp_get_opp_table() is called the first time (either directly
or indirectly through one of the many helper functions).
At this point, the resources may not be available yet, i.e. looking them
up will result in -EPROBE_DEFER. Unfortunately, dev_pm_opp_get_opp_table()
is currently unable to propagate this error code since it only returns
the allocated OPP table or NULL.
This means that all consumers of the OPP core are required to make sure
that all necessary resources are available. Usually this happens by
requesting them, checking the result and releasing them immediately after.
For example, we have added "dev_pm_opp_of_find_icc_paths(dev, NULL)" to
several drivers now just to make sure the interconnect providers are
ready before the OPP table is allocated. If this call is missing,
the OPP core will only warn about this and then attempt to continue
without interconnect. This will eventually fail horribly, e.g.:
cpu cpu0: _allocate_opp_table: Error finding interconnect paths: -517
... later ...
of: _read_bw: Mismatch between opp-peak-kBps and paths (1 0)
cpu cpu0: _opp_add_static_v2: opp key field not found
cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -22
This example happens when trying to use interconnects for a CPU OPP
table together with qcom-cpufreq-nvmem.c. qcom-cpufreq-nvmem calls
dev_pm_opp_set_supported_hw(), which ends up allocating the OPP table
early. To fix the problem with the current approach we would need to add
yet another call to dev_pm_opp_of_find_icc_paths(dev, NULL).
But actually qcom-cpufreq-nvmem.c has nothing to do with interconnects...
This commit attempts to make this more robust by allowing
dev_pm_opp_get_opp_table() to return an error pointer. Fixing all
the usages is trivial because the function is usually used indirectly
through another helper (e.g. dev_pm_opp_set_supported_hw() above).
These other helpers already return an error pointer.
The example above then works correctly because set_supported_hw() will
return -EPROBE_DEFER, and qcom-cpufreq-nvmem.c already propagates that
error. It should also be possible to remove the remaining usages of
"dev_pm_opp_of_find_icc_paths(dev, NULL)" from other drivers as well.
Note that this commit currently only handles -EPROBE_DEFER for the
clock/interconnects within _allocate_opp_table(). Other errors are just
ignored as before. Eventually those should be propagated as well.
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
[ Viresh: skip checking return value of dev_pm_opp_get_opp_table() for
EPROBE_DEFER in domain.c, fix NULL return value and reorder
code a bit in core.c, and update exynos-asv.c ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The OPP core needs to track if the resources of devices are
enabled/configured or not, as it disables the resources when target_freq
is set to 0.
Handle that with the new enabled flag and remove otherwise complex
conditional statements.
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Expand the scope of the regulator_enabled flag and use it to track
status of all the resources. This will be used for other stuff in the
next patch.
Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
dev_pm_opp_set_rate() can now be called with freq = 0 in order
to either drop performance or bandwidth votes or to disable
regulators on platforms which support them.
In such cases, a subsequent call to dev_pm_opp_set_rate() with
the same frequency ends up returning early because 'old_freq == freq'
Instead make it fall through and put back the dropped performance
and bandwidth votes and/or enable back the regulators.
Cc: v5.3+ <stable@vger.kernel.org> # v5.3+
Fixes: cd7ea58286 ("opp: Make dev_pm_opp_set_rate() handle freq = 0 to drop performance votes")
Reported-by: Sajida Bhanu <sbhanu@codeaurora.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
[ Viresh: Don't skip clk_set_rate() and massaged changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We get the opp_table pointer at the top of the function and so we should
put the pointer at the end of the function like all other exit paths
from this function do.
Cc: v5.8+ <stable@vger.kernel.org> # v5.8+
Fixes: b00e667a6d ("opp: Remove bandwidth votes when target_freq is zero")
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
[ Viresh: Split the patch into two ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We get the opp_table pointer at the top of the function and so we should
put the pointer at the end of the function like all other exit paths
from this function do.
Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
Fixes: aca48b61f9 ("opp: Manage empty OPP tables with clk handle")
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
[ Viresh: Split the patch into two ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Pull ARM cpufreq driver changes for v5.9-rc1 from Viresh Kumar:
"Here are the details:
- Adaptive voltage scaling (AVS) support and minor cleanups for
brcmstb driver (Florian Fainelli and Markus Mayer).
- A new tegra driver and cleanup for the existing one (Sumit Gupta and
Jon Hunter).
- Bandwidth level support for Qcom driver along with OPP changes (Sibi
Sankar).
- Cleanups to sti, cpufreq-dt, ap806, CPPC drivers (Viresh Kumar, Lee
Jones, Ivan Kokshaysky, Sven Auhagen, and Xin Hao).
- Make schedutil default governor for ARM (Valentin Schneider).
- Fix dependency issues for imx (Walter Lozano).
- Cleanup around cached_resolved_idx in cpufreq core (Viresh Kumar)."
* 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: make schedutil the default for arm and arm64
cpufreq: cached_resolved_idx can not be negative
cpufreq: Add Tegra194 cpufreq driver
dt-bindings: arm: Add NVIDIA Tegra194 CPU Complex binding
cpufreq: imx: Select NVMEM_IMX_OCOTP
cpufreq: sti-cpufreq: Fix some formatting and misspelling issues
cpufreq: tegra186: Simplify probe return path
cpufreq: CPPC: Reuse caps variable in few routines
cpufreq: ap806: fix cpufreq driver needs ap cpu clk
cpufreq: cppc: Reorder code and remove apply_hisi_workaround variable
cpufreq: dt: fix oops on armada37xx
cpufreq: brcmstb-avs-cpufreq: send S2_ENTER / S2_EXIT commands to AVS
cpufreq: brcmstb-avs-cpufreq: Support polling AVS firmware
cpufreq: brcmstb-avs-cpufreq: more flexible interface for __issue_avs_command()
cpufreq: qcom: Disable fast switch when scaling DDR/L3
cpufreq: qcom: Update the bandwidth levels on frequency change
OPP: Add and export helper to set bandwidth
cpufreq: blacklist SC7180 in cpufreq-dt-platdev
cpufreq: blacklist SDM845 in cpufreq-dt-platdev
* pm-em:
OPP: refactor dev_pm_opp_of_register_em() and update related drivers
Documentation: power: update Energy Model description
PM / EM: change name of em_pd_energy to em_cpu_energy
PM / EM: remove em_register_perf_domain
PM / EM: add support for other devices than CPUs in Energy Model
PM / EM: update callback structure and add device pointer
PM / EM: introduce em_dev_register_perf_domain function
PM / EM: change naming convention from 'capacity' to 'performance'
* pm-core:
mmc: jz4740: Use pm_ptr() macro
PM: Make *_DEV_PM_OPS macros use __maybe_unused
PM: core: introduce pm_ptr() macro
Add and export 'dev_pm_opp_set_bw' to set the bandwidth
levels associated with an OPP.
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Export dev_pm_opp_adjust_voltage() as it may be used by modules later
on.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
[ Viresh: Rewrote commit log ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Currently, when using _of_add_opp_table_v2 parsed_static_opps is
increased and this value is used in _opp_remove_all_static() to
check if there are static opp entries that need to be freed.
Unfortunately this does not happen when using _of_add_opp_table_v1(),
which leads to warnings.
This patch increases parsed_static_opps in _of_add_opp_table_v1() in a
similar way as in _of_add_opp_table_v2().
Fixes: 03758d6026 ("opp: Replace list_kref with a local counter")
Cc: v5.6+ <stable@vger.kernel.org> # v5.6+
Signed-off-by: Walter Lozano <walter.lozano@collabora.com>
[ Viresh: Do the operation with lock held and set the value to 1 instead
of incrementing it ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The Energy Model framework supports not only CPU devices. Drop the CPU
specific interface with cpumask and add struct device. Add also a return
value, user might use it. This new interface provides easy way to create
a simple Energy Model, which then might be used by e.g. thermal subsystem.
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Energy Model framework is going to support devices other that CPUs. In
order to make this happen change the callback function and add pointer to
a device as an argument.
Update the related users to use new function and new callback from the
Energy Model.
Acked-by: Quentin Perret <qperret@google.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The DT node of the device may contain interconnect paths while the OPP
table doesn't have the bandwidth values. There is no need to parse the
paths in such cases.
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Tested-by: Sibi Sankar <sibis@codeaurora.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
[ Viresh: Support the case of !opp_table and massaged changelog ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
We already drop several votes when target_freq is set to zero, drop
bandwidth votes as well.
Reported-by: Sibi Sankar <sibis@codeaurora.org>
Reviewed-by: Georgi Djakov <georgi.djakov@linaro.org>
Tested-by: Georgi Djakov <georgi.djakov@linaro.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
Tested-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Add enable regulators to dev_pm_opp_set_regulators() and disable
regulators to dev_pm_opp_put_regulators(). Even if bootloader
leaves regulators enabled, they should be enabled in kernel in
order to increase the reference count.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Clément Péron <peron.clem@gmail.com>
Tested-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: Kamil Konieczny <k.konieczny@samsung.com>
[ Viresh: Enable the regulator only after it is programmed and add a
flag to track its status. ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>