list cpu_data_list has been inited staticly through LIST_HEAD,
so there's no need to call another INIT_LIST_HEAD. Simply remove
it from cppc_cpufreq_init.
Signed-off-by: Han Wang <zjuwanghan@outlook.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The Frequency Invariance Engine (FIE) is providing a frequency scaling
correction factor that helps achieve more accurate load-tracking.
Normally, this scaling factor can be obtained directly with the help of
the cpufreq drivers as they know the exact frequency the hardware is
running at. But that isn't the case for CPPC cpufreq driver.
Another way of obtaining that is using the arch specific counter
support, which is already present in kernel, but that hardware is
optional for platforms.
This patch updates the CPPC driver to register itself with the topology
core to provide its own implementation (cppc_scale_freq_tick()) of
topology_scale_freq_tick() which gets called by the scheduler on every
tick. Note that the arch specific counters have higher priority than
CPPC counters, if available, though the CPPC driver doesn't need to have
any special handling for that.
On an invocation of cppc_scale_freq_tick(), we schedule an irq work
(since we reach here from hard-irq context), which then schedules a
normal work item and cppc_scale_freq_workfn() updates the per_cpu
arch_freq_scale variable based on the counter updates since the last
tick.
To allow platforms to disable this CPPC counter-based frequency
invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE,
which is enabled by default.
This also exports sched_setattr_nocheck() as the CPPC driver can be
built as a module.
Cc: linux-acpi@vger.kernel.org
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
It's a classic example of memleak, we allocate something, we fail and
never free the resources.
Make sure we free all resources on policy ->init() failures.
Fixes: a28b2bfc09 ("cppc_cpufreq: replace per-cpu data array with a list")
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Commit 367dc4aa93 ("cpufreq: Add stop CPU callback to cpufreq_driver
interface") added the ->stop_cpu() callback to allow the drivers to do
clean up before the CPU is completely down and its state can't be
modified.
At that time the CPU hotplug framework used to call the cpufreq core's
registered notifier for different events like CPU_DOWN_PREPARE and
CPU_POST_DEAD. The ->stop_cpu() callback was called during the
CPU_DOWN_PREPARE event.
This is no longer the case, cpuhp_cpufreq_offline() is called only
once by the CPU hotplug core now and we don't really need two
separate callbacks for cpufreq drivers, i.e. ->stop_cpu() and
-<exit(), as everything can be done from the ->exit() callback
itself.
Migrate to using the ->exit() callback instead of ->stop_cpu().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Minor edits in the changelog and subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 4c38f2df71.
There are few races in the frequency invariance support for CPPC driver,
namely the driver doesn't stop the kthread_work and irq_work on policy
exit during suspend/resume or CPU hotplug.
A proper fix won't be possible for the 5.13-rc, as it requires a lot of
changes. Lets revert the patch instead for now.
Fixes: 4c38f2df71 ("cpufreq: CPPC: Add support for frequency invariance")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Simplify case when setting default in cppc_cpufreq_get_transition_delay_us.
Signed-off-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The Frequency Invariance Engine (FIE) is providing a frequency scaling
correction factor that helps achieve more accurate load-tracking.
Normally, this scaling factor can be obtained directly with the help of
the cpufreq drivers as they know the exact frequency the hardware is
running at. But that isn't the case for CPPC cpufreq driver.
Another way of obtaining that is using the arch specific counter
support, which is already present in kernel, but that hardware is
optional for platforms.
This patch updates the CPPC driver to register itself with the topology
core to provide its own implementation (cppc_scale_freq_tick()) of
topology_scale_freq_tick() which gets called by the scheduler on every
tick. Note that the arch specific counters have higher priority than
CPPC counters, if available, though the CPPC driver doesn't need to have
any special handling for that.
On an invocation of cppc_scale_freq_tick(), we schedule an irq work
(since we reach here from hard-irq context), which then schedules a
normal work item and cppc_scale_freq_workfn() updates the per_cpu
arch_freq_scale variable based on the counter updates since the last
tick.
To allow platforms to disable this CPPC counter-based frequency
invariance support, this is all done under CONFIG_ACPI_CPPC_CPUFREQ_FIE,
which is enabled by default.
This also exports sched_setattr_nocheck() as the CPPC driver can be
built as a module.
Cc: linux-acpi@vger.kernel.org
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The cppc_cpudata per-cpu storage was inefficient (1) additional to causing
functional issues (2) when CPUs are hotplugged out, due to per-cpu data
being improperly initialised.
(1) The amount of information needed for CPPC performance control in its
cpufreq driver depends on the domain (PSD) coordination type:
ANY: One set of CPPC control and capability data (e.g desired
performance, highest/lowest performance, etc) applies to all
CPUs in the domain.
ALL: Same as ANY. To be noted that this type is not currently
supported. When supported, information about which CPUs
belong to a domain is needed in order for frequency change
requests to be sent to each of them.
HW: It's necessary to store CPPC control and capability
information for all the CPUs. HW will then coordinate the
performance state based on their limitations and requests.
NONE: Same as HW. No HW coordination is expected.
Despite this, the previous initialisation code would indiscriminately
allocate memory for all CPUs (all_cpu_data) and unnecessarily
duplicate performance capabilities and the domain sharing mask and type
for each possible CPU.
(2) With the current per-cpu structure, when having ANY coordination,
the cppc_cpudata cpu information is not initialised (will remain 0)
for all CPUs in a policy, other than policy->cpu. When policy->cpu is
hotplugged out, the driver will incorrectly use the uninitialised (0)
value of the other CPUs when making frequency changes. Additionally,
the previous values stored in the perf_ctrls.desired_perf will be
lost when policy->cpu changes.
Therefore replace the array of per cpu data with a list. The memory for
each structure is allocated at policy init, where a single structure
can be allocated per policy, not per cpu. In order to accommodate the
struct list_head node in the cppc_cpudata structure, the now unused cpu
and cur_policy variables are removed.
For example, on a arm64 Juno platform with 6 CPUs: (0, 1, 2, 3) in PSD1,
(4, 5) in PSD2 - ANY coordination, the memory allocation comparison shows:
Before patch:
- ANY coordination:
total slack req alloc/free caller
0 0 0 0/1 _kernel_size_le_hi32+0x0xffff800008ff7810
0 0 0 0/6 _kernel_size_le_hi32+0x0xffff800008ff7808
128 80 48 1/0 _kernel_size_le_hi32+0x0xffff800008ffc070
768 0 768 6/0 _kernel_size_le_hi32+0x0xffff800008ffc0e4
After patch:
- ANY coordination:
total slack req alloc/free caller
256 0 256 2/0 _kernel_size_le_hi32+0x0xffff800008fed410
0 0 0 0/2 _kernel_size_le_hi32+0x0xffff800008fed274
Additional notes:
- A pointer to the policy's cppc_cpudata is stored in policy->driver_data
- Driver registration is skipped if _CPC entries are not present.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the existing sysfs attribute "freqdomain_cpus" to expose
information to userspace about CPUs in the same frequency domain.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The previous coordination type handling in the cppc_cpufreq init code
created some confusion: the comment mentioned "Support only SW_ANY for
now" while only the SW_ALL/ALL case resulted in a failure. The other
coordination types (HW_ALL/HW, NONE) were silently supported.
Clarify support for coordination types while describing in comments the
intended behavior.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Considering only the currently supported coordination types (ANY, HW,
NONE), this change only makes a difference for the ANY type, when
policy->cpu is hotplugged out. In that case the new policy->cpu will
be different from ((struct cppc_cpudata *)policy->driver_data)->cpu.
While in this case the controls of *ANY* CPU could be used to drive
frequency changes, it's more consistent to use policy->cpu as the
leading CPU, as used in all other cppc_cpufreq functions. Additionally,
the debug prints in cppc_set_perf() would no longer create confusion
when referring to a CPU that is hotplugged out.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPPC performance capabilities are used significantly throughout
the driver.
Simplify the use of them by introducing a local pointer "caps" to
point to cpu_data->perf_caps, in functions that access performance
capabilities often.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In order to maintain the typical naming convention in the cpufreq
framework:
- replace the use of "cpu" variable name for cppc_cpudata pointers
with "cpu_data"
- replace variable names "cpu_num" and "cpunum" with "cpu"
- make cpu variables unsigned int
Where pertinent, also move the initialisation of cpu_data variable to
its declaration and make consistent use of the local "cpu" variable.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix a few trivial issues in the cppc_cpufreq driver:
- indentation of function arguments
- consistent use of tabs (vs space) in defines
- spelling: s/Offest/Offset, s/trasition/transition
- order of local variables, from long pointers to structures to
short ret and i (index) variables, to improve readability
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The 'caps' variable has been defined in cppc_cpufreq_khz_to_perf() and
cppc_cpufreq_perf_to_khz() routines, so there is no need to get
'highest_perf' value through 'cpu->caps.highest_perf', we can use
'caps->highest_perf' instead.
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
[ Viresh: Updated commit log ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
With the current approach we have an extra check in the
cppc_cpufreq_get_rate() callback, which checks if hisilicon's get rate
implementation should be used instead. While it works fine, the approach
isn't very straight forward, over that we have an extra check in the
routine.
Rearrange code and update the cpufreq driver's get() callback pointer
directly for the hisilicon case. This gets the extra variable is removed
and the extra check isn't required anymore as well.
Tested-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
To add SW BOOST support for CPPC, we need to get the max frequency of
boost mode and non-boost mode. ACPI spec 6.2 section 8.4.7.1 describes
the following two CPC registers.
"Highest performance is the absolute maximum performance an individual
processor may reach, assuming ideal conditions. This performance level
may not be sustainable for long durations, and may only be achievable if
other platform components are in a specific state; for example, it may
require other processors be in an idle state.
Nominal Performance is the maximum sustained performance level of the
processor, assuming ideal operating conditions. In absence of an
external constraint (power, thermal, etc.) this is the performance level
the platform is expected to be able to maintain continuously. All
processors are expected to be able to sustain their nominal performance
state simultaneously."
To add SW BOOST support for CPPC, we can use Highest Performance as the
max performance in boost mode and Nominal Performance as the max
performance in non-boost mode. If the Highest Performance is greater
than the Nominal Performance, we assume SW BOOST is supported.
The current CPPC driver does not support SW BOOST and use 'Highest
Performance' as the max performance the CPU can achieve. 'Nominal
Performance' is used to convert 'performance' to 'frequency'. That
means, if firmware enable boost and provide a value for Highest
Performance which is greater than Nominal Performance, boost feature is
enabled by default.
Because SW BOOST is disabled by default, so, after this patch, boost
feature is disabled by default even if boost is enabled by firmware.
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the process of modifying a cpufreq policy, the cpufreq core makes
a copy of it including all of the internals which is stored on the
CPU stack. Because struct cpufreq_policy is relatively large, this
may cause the size of the stack frame to exceed the 2 KB limit and
so the GCC complains when -Wframe-larger-than= is used.
In fact, it is not necessary to copy the entire policy structure
in order to modify it, however.
First, because cpufreq_set_policy() obtains the min and max policy
limits from frequency QoS now, it is not necessary to pass the limits
to it from the callers. The only things that need to be passed to it
from there are the new governor pointer or (if there is a built-in
governor in the driver) the "policy" value representing the governor
choice. They both can be passed as individual arguments, though, so
make cpufreq_set_policy() take them this way and rework its callers
accordingly. This avoids making copies of cpufreq policies in the
callers of cpufreq_set_policy().
Second, cpufreq_set_policy() still needs to pass the new policy
data to the ->verify() callback of the cpufreq driver whose task
is to sanitize the min and max policy limits. It still does not
need to make a full copy of struct cpufreq_policy for this purpose,
but it needs to pass a few items from it to the driver in case they
are needed (different drivers have different needs in that respect
and all of them have to be covered). For this reason, introduce
struct cpufreq_policy_data to hold copies of the members of
struct cpufreq_policy used by the existing ->verify() driver
callbacks and pass a pointer to a temporary structure of that
type to ->verify() (instead of passing a pointer to full struct
cpufreq_policy to it).
While at it, notice that intel_pstate and longrun don't really need
to verify the "policy" value in struct cpufreq_policy, so drop those
check from them to avoid copying "policy" into struct
cpufreq_policy_data (which allows it to be slightly smaller).
Also while at it fix up white space in a couple of places and make
cpufreq_set_policy() static (as it can be so).
Fixes: 3000ce3c52 ("cpufreq: Use per-policy frequency QoS")
Link: https://lore.kernel.org/linux-pm/CAMuHMdX6-jb1W8uC2_237m8ctCpsnGp=JCxqt8pCWVqNXHmkVg@mail.gmail.com
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Put the ACPI table to release the table mapping after using it
successfully.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bail out if we match the OEM information, to save some possible
extra iteration.
Also update the code to fix minor coding style issue.
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 315 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hisilicon chips do not support delivered performance counter register
and reference performance counter register. But the platform can
calculate the real performance using its own method. We reuse the
desired performance register to store the real performance calculated by
the platform. After the platform finished the frequency adjust, it gets
the real performance and writes it into desired performance register. Os
can use it to calculate the real frequency.
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
[ rjw: Drop unnecessary braces ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Clang warns:
drivers/cpufreq/cppc_cpufreq.c:431:36: warning: variable 'cppc_acpi_ids'
is not needed and will not be emitted [-Wunneeded-internal-declaration]
static const struct acpi_device_id cppc_acpi_ids[] = {
^
1 warning generated.
Mark the definition as used so that Clang understands we don't want this
warning while not inhibiting Clang's dead code elimination from removing
the unreferenced internal symbol when moving the data it contains to the
globally available symbol via MODULE_DEVICE_TABLE.
$ nm -S drivers/cpufreq/cppc_cpufreq.o | grep acpi | tail -1
0000000000000000 0000000000000040 R __mod_acpi__cppc_acpi_ids_device_table
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Per Section 8.4.7.1.3 of ACPI 6.2, the platform provides performance
feedback via set of performance counters. To determine the actual
performance level delivered over time, OSPM may read a set of
performance counters from the Reference Performance Counter Register
and the Delivered Performance Counter Register.
OSPM calculates the delivered performance over a given time period by
taking a beginning and ending snapshot of both the reference and
delivered performance counters, and calculating:
delivered_perf = reference_perf X (delta of delivered_perf counter / delta of reference_perf counter).
Implement the above and hook this up to the cpufreq->get method.
Signed-off-by: George Cherian <george.cherian@cavium.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* acpi-cppc:
mailbox: PCC: erroneous error message when parsing ACPI PCCT
ACPI / CPPC: Fix invalid PCC channel status errors
ACPI / CPPC: Document CPPC sysfs interface
cpufreq / CPPC: Support for CPPC v3
ACPI / CPPC: Check for valid PCC subspace only if PCC is used
ACPI / CPPC: Add support for CPPC v3
* acpi-misc:
ACPI: Add missing prototype_for arch_post_acpi_subsys_init()
ACPI: add missing newline to printk
* acpi-battery:
ACPI / battery: Add quirk to avoid checking for PMIC with native driver
ACPI / battery: Ignore AC state in handle_discharging on systems where it is broken
ACPI / battery: Add handling for devices which wrongly report discharging state
ACPI / battery: Remove initializer for unused ident dmi_system_id
ACPI / AC: Remove initializer for unused ident dmi_system_id
* acpi-ac:
ACPI / AC: Add quirk to avoid checking for PMIC with native driver
Add support to specify platform specific transition_delay_us instead
of using the transition delay derived from PCC.
With commit 3d41386d55 (cpufreq: CPPC: Use transition_delay_us
depending transition_latency) we are setting transition_delay_us
directly and not applying the LATENCY_MULTIPLIER. Because of that,
on Qualcomm Centriq we can end up with a very high rate of frequency
change requests when using the schedutil governor (default
rate_limit_us=10 compared to an earlier value of 10000).
The PCC subspace describes the rate at which the platform can accept
commands on the CPPC's PCC channel. This includes read and write
command on the PCC channel that can be used for reasons other than
frequency transitions. Moreover the same PCC subspace can be used by
multiple freq domains and deriving transition_delay_us from it as we
do now can be sub-optimal.
Moreover if a platform does not use PCC for desired_perf register then
there is no way to compute the transition latency or the delay_us.
CPPC does not have a standard defined mechanism to get the transition
rate or the latency at the moment.
Given the above limitations, it is simpler to have a platform specific
transition_delay_us and rely on PCC derived value only if a platform
specific value is not available.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Fixes: 3d41386d55 (cpufreq: CPPC: Use transition_delay_us depending transition_latency)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use CPPC v3 entries to convert the abstract processor performance to
processor frequency in KHz.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When multiple CPUs are related in one cpufreq policy, the first online
CPU will be chosen by default to handle cpufreq operations. Let's take
cpu0 and cpu1 as an example.
When cpu0 is offline, policy->cpu will be shifted to cpu1. cpu1's perf
capabilities should be initialized. Otherwise, perf capabilities are 0s
and speed change can not take effect.
This patch copies perf capabilities of the first online CPU to other
shared CPUs when policy shared type is CPUFREQ_SHARED_TYPE_ANY.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that the driver has started to set transition_delay_us directly,
there is no need to set transition_latency along with it, as it is not
used by the cpufreq core.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With commit e948bc8fbe (cpufreq: Cap the default transition delay
value to 10 ms) the cpufreq was not honouring the delay passed via
ACPI (PCCT). Due to which on ARM based platforms using CPPC the
cpufreq governor tries to change the frequency of CPUs faster than
expected.
This leads to continuous error messages like the following.
" ACPI CPPC: PCC check channel failed. Status=0 "
Earlier (without above commit) the default transition delay was
taken form the value passed from PCCT. Use the same value provided
by PCCT to set the transition_delay_us.
Fixes: e948bc8fbe (cpufreq: Cap the default transition delay value to 10 ms)
Signed-off-by: George Cherian <george.cherian@cavium.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
policy->cpu is copied into policy->cpus in cpufreq_online() before
calling into cpufreq_driver->init(). So there's no need to set the
same in the individual driver init() functions again.
This patch removes the redundant setting of policy->cpu in policy->cpus
in intel_pstate and cppc drivers.
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Description of Lowest Perfomance in ACPI 6.1 specification states:
"Lowest Performance is the absolute lowest performance level of
the platform. Selecting a performance level lower than the lowest
nonlinear performance level may actually cause an efficiency penalty,
but should reduce the instantaneous power consumption of the processor.
In traditional terms, this represents the T-state range of performance
levels."
Set the default value of policy->min to Lowest Nonlinear Performance
to avoid any potential efficiency penalty.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
MODULE_DEVICE_TABLE is added so that CPPC cpufreq module can be
automatically loaded when we have a acpi processor device with
"ACPI0007" hid.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Fix two cpufreq regressions causing undesirable changes in
behavior to appear (one in the core and one in the conservative
governor) introduced during the 4.8 cycle (Aaro Koskinen, Rafael
Wysocki).
- Fix the way the intel_pstate driver accesses MSRs related to the
hardware-managed P-states (HWP) feature during the initialization
which currently is unsafe and may cause the processor to generate
a general protection fault (Srinivas Pandruvada).
- Rework the intel_pstate's P-state selection algorithm used on Atom
processors to avoid known problems with the current one and to
make the computation more straightforward, which also happens to
improve performance in multiple benchmarks a bit (Rafael Wysocki).
- Improve two comments in the intel_pstate driver (Rafael Wysocki).
- Fix the desired performance computation in the CPPC cpufreq driver
(Hoan Tran).
- Fix the devfreq core to avoid printing misleading error messages
in some cases (Tobias Jakobi).
- Fix the error code path in devfreq_add_device() to use proper
locking around list modifications (Axel Lin).
- Fix a build failure and remove a couple of redundant updates of
variables in the exynos-nocp devfreq driver (Axel Lin).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJYANqMAAoJEILEb/54YlRx+U4P/A1ZJ/93u+ChipehTckNDogR
xMCNsUz6Pn9VIdilEnaUcsCaNc93R7e6KjwgSO7Caeriw4syW3YZz2LuGQTihs8b
5vnvVvya9Bw1aXUweeayogMyOYZV1y1G/yzq7/+c02/cgxO8WBPnmGrE17Mhu43q
IF1pQJ257e0HgKKspuzy+twRCLwnOqHbvWtQnEi2rzuaGrsK7XZk9yRuaXK4NshQ
+M9hrHlw+OdmI+9lLmH8Ap2G68EJ4Q2o69sbAQ6MWgxRU44D0uEqgbT16cIdDs3J
c9VCgiqHuhj2bfd9vqNAjr4bGdy4iwcEKyz2nkIl0KEq9tTPtJky8v6WUzV0+rbR
xVbGIWg8X5wKe/Ndve2GLDrqhuVJ0hZkRdqpzRgm08VBGpRlmM0Gjqk+uEKqA2n2
IhidwTlzbQFVh437cjqupCKVXPb2POdgNyk4fEK7WVckRR3K7LR+rXoWN1uwW2YJ
9rjQBX0n2UfZ9Ft+gVO6/faWZlqLPmx60lHQSXNHvNY04HfZ5EiRFGEZEX1g0Uep
16nYHpB+qx/GwR7druGQVVY58YEp2g68jbpL2ehr2lLBYVSExy0kiOrS7GpoA0vd
ngImjroJ842wQYjfek4Gi8VfGu+tsuMIVdjltOn1sVZ1QvprgF/atZHcN84eV8BU
OyEGOQ7H1idEZ14Oa19C
=3yoB
-----END PGP SIGNATURE-----
Merge tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"This includes a couple of fixes for cpufreq regressions introduced in
4.8, a rework of the intel_pstate algorithm used on Atom processors
(that took some time to test) plus a fix and a couple of cleanups in
that driver, a CPPC cpufreq driver fix, and a some devfreq fixes and
cleanups (core and exynos-nocp).
Specifics:
- Fix two cpufreq regressions causing undesirable changes in behavior
to appear (one in the core and one in the conservative governor)
introduced during the 4.8 cycle (Aaro Koskinen, Rafael Wysocki).
- Fix the way the intel_pstate driver accesses MSRs related to the
hardware-managed P-states (HWP) feature during the initialization
which currently is unsafe and may cause the processor to generate a
general protection fault (Srinivas Pandruvada).
- Rework the intel_pstate's P-state selection algorithm used on Atom
processors to avoid known problems with the current one and to make
the computation more straightforward, which also happens to improve
performance in multiple benchmarks a bit (Rafael Wysocki).
- Improve two comments in the intel_pstate driver (Rafael Wysocki).
- Fix the desired performance computation in the CPPC cpufreq driver
(Hoan Tran).
- Fix the devfreq core to avoid printing misleading error messages in
some cases (Tobias Jakobi).
- Fix the error code path in devfreq_add_device() to use proper
locking around list modifications (Axel Lin).
- Fix a build failure and remove a couple of redundant updates of
variables in the exynos-nocp devfreq driver (Axel Lin)"
* tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: CPPC: Correct desired_perf calculation
cpufreq: conservative: Fix next frequency selection
cpufreq: skip invalid entries when searching the frequency
cpufreq: intel_pstate: Fix struct pstate_adjust_policy kerneldoc
cpufreq: intel_pstate: Proportional algorithm for Atom
PM / devfreq: Skip status update on uninitialized previous_freq
PM / devfreq: Add proper locking around list_del()
PM / devfreq: exynos-nocp: Remove redundant code
PM / devfreq: exynos-nocp: Select REGMAP_MMIO
cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance()
cpufreq: intel_pstate: Fix unsafe HWP MSR access
The desired_perf is an abstract performance number. Its value should
be in the range of [lowest perf, highest perf] of CPPC.
The correct calculation is
desired_perf = freq * cppc_highest_perf / cppc_dmi_max_khz
And cppc_cpufreq_set_target() returns if desired_perf is exactly
the same with the old perf.
Signed-off-by: Hoan Tran <hotran@apm.com>
Reviewed-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Update of the ACPICA code in the kernel to upstream revision 20160831 with
the following major changes:
* New mechanism for GPE masking.
* Fixes for issues related to the LoadTable operator and table loading.
* Fixes for issues related to so-called module-level code (MLC), that is
AML that doesn't belong to any methods.
* Change of the return value of the _OSI method to reflect the Windows
behavior.
* GAS (Generic Address Structure) support fix related to 32-bit FADT
addresses.
* Elimination of unnecessary FADT version 2 support.
* ACPI tools fixes and cleanups.
From Bob Moore, Lv Zheng, and Jung-uk Kim.
- ACPI sysfs interface updates to fix GPE handling (on top of the new GPE
masking mechanism in ACPICA) and issues related to table loading (Lv Zheng).
- New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action Table),
needed on some platforms to replace the iTCO watchdog that doesn't work there
and related updates of the intel_pmc_ipc, i2c/i801 and MFD/lcp_ich drivers
(Mika Westerberg).
- Driver core fix to prevent it from leaking secondary fwnode objects during
device removal (Lukas Wunner).
- New definitions of built-in properties for UART in ACPI-based x86 SoC drivers
and a 8250_dw driver quirk for the APM X-Gene SoC (Heikki Krogerus).
- New device ID for the Vulcan SPI controller and constification of local
strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel, Julia Lawall).
- Fix for a bug causing the allocation of PCI resorces to fail if
ACPI-enumerated child platform devices are registered below the PCI
devices in question (Mika Westerberg).
- Change of the default polarity for PCI legacy IRQs to high on systems
booting wth ACPI on platforms with a GIC interrupt controller model
fixing the discrepancy between the specification and HW behavior (Lorenzo
Pieralisi).
- Fixes for the handling of system suspend/resume in the ACPI EC driver and
update of that driver to make it cope with the cases when the EC device
defined in the ECDT has to be used throughout the entire system life cycle
(Lv Zheng).
- Update of the ACPI CPPC library to allow it to batch requests sent over the
PCC channel (to reduce overhead), to support the fixed functional hardware
(FFH) CPPC registers access type, to notify the mailbox framework about TX
completions when the interrupt flag is set for the PCC mailbox, and to
support HW-Reduced Communication Subspace type 2 (Ashwin Chaugule, Prashanth
Prakash, Srinivas Pandruvada, Hoan Tran).
- ACPI button driver fix and documentation update related to the handling of
laptop lids (Lv Zheng).
- ACPI battery driver initialization fix (Carlos Garnacho).
- ACPI GPIO enumeration documentation update (Mika Westerberg).
- Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv Zheng).
- Assorted cleanups of the ACPI table parsing code and the x86-specific ACPI
code (Al Stone).
- Fixes for assorted ACPI-related issues found in linux-next (Wei Yongjun).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJX8Y5+AAoJEILEb/54YlRx73oP/RiAi86NKjOj+GfYceVe37jn
6lSqoMugjgTQHRYvYiQCjJ/BR0GzQZqUkz9TAu1Op14+rhTH3OhSfPizzJWCpVfA
G9l9ZRQNnsKNs14bbYmWtmWduh46dFLVFJqo+M/0H3ZMFZu6Adcb+1SBtXHUoQ6L
z69ngFxTu3yRvqS4cmm5h7SOx5W2uZZl8zViJW8jgyGhUBStG87gzR6wsYBldGCk
XFxcaGWBXRccWGAQLSwfs0psQccEooCqbpsDqaUdrK/mI0rsQr88f25ZxEE7Zw7H
bv3py1cgJBZRq36L7eBGQXjIE7YQey6qG2lug2zsUJWe+vzy2vHjHVJHuBXKKgv3
txOA6QZx63UgEyN3zFT7K5ek6uOnkKdeE+s+Laj+K/x4V2R6gbtgO011EVcXy+bI
NvqsO76tfPHpwrn5s1VVc5lcEBEPHKHb+WulHrqhSSU4ivk0gtJDeSI+c8xta6YT
XwSry5tozDLkG1uEZqkyY1XTlOUAHO8E6YcrlOv2z1+mG7L8OH/vCp1apzgexsZA
1683AH5cwKc3KaP+4QdKGdxY2BDxb7OTVh3cGy4kAYb6tqQ/vj7vlRiJvtaMBtFw
xJn3buuagwJzKtgebpA565opvyFAfUX/RNFlTP63aXAefSAgq6KLq70vKFxkIZto
H1LpUbmiEbuBml8CBGb1
=xDOQ
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"First off, the ACPICA code in the kernel is updated to upstream
revision 20160831 that brings in a few bug fixes and cleanups. In
particular, it is possible to mask GPEs now (and the sysfs interface
for GPE control is fixed on top of that), problems related to the
table loading mechanism are fixed and all code related to FADT version
2 (which has never been part of the ACPI specification) is dropped.
On the new features front, there is a new watchdog driver based on the
ACPI WDAT (ACPI Watchdog Action Table), needed on some platforms to
replace the iTCO watchdog that doesn't work there, and some UART
devices get new definitions of built-in properties (to be accessed via
the generic device properties API).
Also, included is a fix for an ACPI-related PCI resorces allocation
issue and a few problems in the EC driver and in the button and
battery drivers are fixed.
In addition to that, the ACPI CPPC library is updated to make batching
of requests sent over the PCC channel possible (which reduces the PCC
usage overhead substantially in some cases) and to support functional
fixed hardware (FFH) type of CPPC registers access (which will allow
CPPC to be used on x86 too in the future).
As usual, there are some assorted fixes and cleanups too.
Specifics:
- Update of the ACPICA code in the kernel to upstream revision
20160831 with the following major changes:
* New mechanism for GPE masking.
* Fixes for issues related to the LoadTable operator and table
loading.
* Fixes for issues related to so-called module-level code (MLC),
that is AML that doesn't belong to any methods.
* Change of the return value of the _OSI method to reflect the
Windows behavior.
* GAS (Generic Address Structure) support fix related to 32-bit
FADT addresses.
* Elimination of unnecessary FADT version 2 support.
* ACPI tools fixes and cleanups.
From Bob Moore, Lv Zheng, and Jung-uk Kim.
- ACPI sysfs interface updates to fix GPE handling (on top of the new
GPE masking mechanism in ACPICA) and issues related to table
loading (Lv Zheng).
- New watchdog driver based on the ACPI WDAT (ACPI Watchdog Action
Table), needed on some platforms to replace the iTCO watchdog that
doesn't work there and related updates of the intel_pmc_ipc,
i2c/i801 and MFD/lcp_ich drivers (Mika Westerberg).
- Driver core fix to prevent it from leaking secondary fwnode objects
during device removal (Lukas Wunner).
- New definitions of built-in properties for UART in ACPI-based x86
SoC drivers and a 8250_dw driver quirk for the APM X-Gene SoC
(Heikki Krogerus).
- New device ID for the Vulcan SPI controller and constification of
local strucures in the AMD SoC (APD) ACPI driver (Kamlakant Patel,
Julia Lawall).
- Fix for a bug causing the allocation of PCI resorces to fail if
ACPI-enumerated child platform devices are registered below the PCI
devices in question (Mika Westerberg).
- Change of the default polarity for PCI legacy IRQs to high on
systems booting wth ACPI on platforms with a GIC interrupt
controller model fixing the discrepancy between the specification
and HW behavior (Lorenzo Pieralisi).
- Fixes for the handling of system suspend/resume in the ACPI EC
driver and update of that driver to make it cope with the cases
when the EC device defined in the ECDT has to be used throughout
the entire system life cycle (Lv Zheng).
- Update of the ACPI CPPC library to allow it to batch requests sent
over the PCC channel (to reduce overhead), to support the fixed
functional hardware (FFH) CPPC registers access type, to notify the
mailbox framework about TX completions when the interrupt flag is
set for the PCC mailbox, and to support HW-Reduced Communication
Subspace type 2 (Ashwin Chaugule, Prashanth Prakash, Srinivas
Pandruvada, Hoan Tran).
- ACPI button driver fix and documentation update related to the
handling of laptop lids (Lv Zheng).
- ACPI battery driver initialization fix (Carlos Garnacho).
- ACPI GPIO enumeration documentation update (Mika Westerberg).
- Assorted updates of the core ACPI bus type code (Lukas Wunner, Lv
Zheng).
- Assorted cleanups of the ACPI table parsing code and the
x86-specific ACPI code (Al Stone).
- Fixes for assorted ACPI-related issues found in linux-next (Wei
Yongjun)"
* tag 'acpi-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (98 commits)
ACPI / documentation: Use recommended name in GPIO property names
watchdog: wdat_wdt: Fix warning for using 0 as NULL
watchdog: wdat_wdt: fix return value check in wdat_wdt_probe()
platform/x86: intel_pmc_ipc: Do not create iTCO watchdog when WDAT table exists
i2c: i801: Do not create iTCO watchdog when WDAT table exists
mfd: lpc_ich: Do not create iTCO watchdog when WDAT table exists
ACPI / bus: Adjust ACPI subsystem initialization for new table loading mode
ACPICA: Parser: Fix a regression in LoadTable support
ACPICA: Tables: Fix "UNLOAD" code path lock issues
ACPI / watchdog: Add support for WDAT hardware watchdog
ACPI / platform: Pay attention to parent device's resources
PCI: Add pci_find_resource()
ACPI / CPPC: Support PCC with interrupt flag
ACPI / sysfs: Update sysfs signature handling code
ACPI / sysfs: Fix an issue for LoadTable opcode
ACPICA: Tables: Fix a regression in acpi_tb_find_table()
ACPI / tables: Remove duplicated include from tables.c
ACPI / APD: constify local structures
x86: ACPI: make variable names clearer in acpi_parse_madt_lapic_entries()
x86: ACPI: remove extraneous white space after semicolon
...
This patch fixes overflow issue when calculating the desired_perf.
Fixes: ad38677df4 (cpufreq: CPPC: Force reporting values in KHz to fix user space interface)
Signed-off-by: Hoan Tran <hotran@apm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When CPPC is being used by ACPI on arm64, user space tools such as
cpupower report CPU frequency values from sysfs that are incorrect.
What the driver was doing was reporting the values given by ACPI tables
in whatever scale was used to provide them. However, the ACPI spec
defines the CPPC values as unitless abstract numbers. Internal kernel
structures such as struct perf_cap, in contrast, expect these values
to be in KHz. When these struct values get reported via sysfs, the
user space tools also assume they are in KHz, causing them to report
incorrect values (for example, reporting a CPU frequency of 1MHz when
it should be 1.8GHz).
The downside is that this approach has some assumptions:
(1) It relies on SMBIOS3 being used, *and* that the Max Frequency
value for a processor is set to a non-zero value.
(2) It assumes that all processors run at the same speed, or that
the CPPC values have all been scaled to reflect relative speed.
This patch retrieves the largest CPU Max Frequency from a type 4 DMI
record that it can find. This may not be an issue, however, as a
sampling of DMI data on x86 and arm64 indicates there is often only
one such record regardless. Since CPPC is relatively new, it is
unclear if the ACPI ASL will always be written to reflect any sort
of relative performance of processors of differing speeds.
(3) It assumes that performance and frequency both scale linearly.
For arm64 servers, this may be sufficient, but it does rely on
firmware values being set correctly. Hence, other approaches will
be considered in the future.
This has been tested on three arm64 servers, with and without DMI, with
and without CPPC support.
Signed-off-by: Al Stone <ahs3@redhat.com>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Since struct cpudata is defined in a header file, add prefix cppc_ to
make it not a generic name. Otherwise it causes compile issue in locally
define structure with the same name.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Compute the expected transition latency for frequency transitions
using the values from the PCCT tables when the desired perf
register is in PCC.
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Reviewed-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a function to cleanup at module exit and export
appropriate GPL string to enable moduler support
for the cppc_cpufreq driver.
Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPU policy struct indicates the co-ordination type
for all CPUs of a common freq domain. Initialize it
correctly using the CPU specific data gathered from
CPPC ACPI lib via acpi_get_psd_map().
The PSD object is optional, so the cpu->shared_type
can also be 0. So instead of assuming any value other
than SW_ANY(0xFD) is unsupported, explictly check
if shared_type is SW_ALL and then bail.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The kfree() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This driver utilizes the methods introduced in a previous
patch titled - "ACPI: Introduce CPU performance controls using CPPC"
and enables usage with existing CPUFreq governors.
Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Reviewed-by: Al Stone <al.stone@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>