Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The firmware interface used by the pcc-cpufreq driver is
fundamentally not scalable and using it for dynamic CPU performance
scaling on systems with many CPUs leads to degraded performance.
For this reason, disable dynamic CPU performance scaling on systems
with pcc-cpufreq where the number of CPUs present at the driver init
time is greater than 4. Also make the driver print corresponding
complaints to the kernel log.
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 790d849bf8.
Using a v4.7-rc7 kernel on a HP ProLiant triggered following messages
pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor
The last line was shown for each CPU in the system.
Testing v4.5 (where commit 790d849b was integrated) triggered
similar messages. Same behaviour on a 2nd HP Proliant system.
So commit 790d849bf (cpufreq: pcc-cpufreq: update default value of
cpuinfo_transition_latency) causes the system to use performance
governor which, I guess, was not the intention of the patch.
Enabling debug output in pcc-cpufreq provides following verbose output:
pcc-cpufreq: (v1.10.00) driver loaded with frequency limits: 1200 MHz, 2800 MHz
pcc_get_offset: for CPU 0: pcc_cpu_data input_offset: 0x44, pcc_cpu_data output_offset: 0x48
init: policy->max is 2800000, policy->min is 1200000
get: get_freq for CPU 0
get: SUCCESS: (virtual) output_offset for cpu 0 is 0xffffc9000d7c0048, contains a value of: 0xff06. Speed is: 168000 MHz
cpufreq: ondemand governor failed, too long transition latency of HW, fallback to performance governor
target: CPU 0 should go to target freq: 2800000 (virtual) input_offset is 0xffffc9000d7c0044
target: was SUCCESSFUL for cpu 0
I am asking to revert 790d849bf to re-enable usage of ondemand
governor with pcc-cpufreq.
Fixes: 790d849bf (cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency)
CC: <stable@vger.kernel.org> # 4.5+
Signed-off-by: Andreas Herrmann <aherrmann@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 920de6ebfa (ACPICA: Hardware: Enhance
acpi_hw_validate_register() with access_width/bit_offset awareness)
apparently exposed a latent bug, doorbell.access_width is initialized
to 64, but per Lv Zheng, it should be 4, and indeed, making that
change does bring pcc-cpufreq back to life.
Fixes: 920de6ebfa (ACPICA: Hardware: Enhance acpi_hw_validate_register() with access_width/bit_offset awareness)
Suggested-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpufreq documentation specifies
policy->cpuinfo.transition_latency the time it takes on this CPU to
switch between two frequencies in
nanoseconds (if appropriate, else
specify CPUFREQ_ETERNAL)
currently pcc-cpufreq does not expose the value and sets it to zero. I
changed the pcc-cpufreq driver and it's documentation to conform to the
default value specified in Documentation/cpu-freq/cpu-drivers.txt
Signed-off-by: Jacob Tanenbaum <jtanenba@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The pcc-cpufreq driver is not automatically loaded on systems where
the platform's power management setting requires this driver.
Instead, on those systems no CPU frequency driver is registered and
active.
Make the autoloading matching criteria for loading the pcc-cpufreq
driver the same as done in acpi-cpufreq by commit c655affbd5
("ACPI / cpufreq: Add ACPI processor device IDs to acpi-cpufreq").
x86 CPU frequency drivers are now typically autoloaded by specifying
MODULE_DEVICE_TABLE entries and x86cpu model specific matching.
But pcc-cpufreq was omitted when acpi-cpufreq and other drivers were
changed to use this approach.
Both acpi-cpufreq and pcc-cpufreq depend on a distinct and mutually
exclusive set of ACPI methods which are not directly tied to specific
processor model numbers. Both of these drivers have init routines
which look for their required ACPI methods. As a result, only the
appropriate driver registers as the cpu frequency driver and the other
one ends up being unloaded.
Tested on various systems where acpi-cpufreq, intel_pstate, and
pcc-cpufreq are the expected cpu frequency drivers.
Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Signed-off-by: Joseph Szczypek <joseph.szczypek@hp.com>
Reported-by: Trinh Dao <trinh.dao@hp.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPUFreq core has new infrastructure that would guarantee serialized calls to
target() or target_index() callbacks. These are called
cpufreq_freq_transition_begin() and cpufreq_freq_transition_end().
This patch converts existing drivers to use these new set of routines.
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the current code, if we fail during a frequency transition, we
simply send the POSTCHANGE notification with the old frequency. This
isn't enough.
One of the core users of these notifications is the code responsible
for keeping loops_per_jiffy aligned with frequency changes. And mostly
it is written as:
if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) ||
(val == CPUFREQ_POSTCHANGE && freq->old > freq->new)) {
update-loops-per-jiffy...
}
So, suppose we are changing to a higher frequency and failed during
transition, then following will happen:
- CPUFREQ_PRECHANGE notification with freq-new > freq-old
- CPUFREQ_POSTCHANGE notification with freq-new == freq-old
The first one will update loops_per_jiffy and second one will do
nothing. Even if we send the 2nd notification by exchanging values of
freq-new and old, some users of these notifications might get
unstable.
This can be fixed by simply calling cpufreq_notify_post_transition()
with error code and this routine will take care of sending
notifications in the correct order.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[rjw: Folded 3 patches into one, rebased unicore2 changes]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pm-cpufreq: (167 commits)
cpufreq: create per policy rwsem instead of per CPU cpu_policy_rwsem
intel_pstate: Add Baytrail support
intel_pstate: Refactor driver to support CPUs with different MSR layouts
cpufreq: Implement light weight ->target_index() routine
PM / OPP: rename header to linux/pm_opp.h
PM / OPP: rename data structures to dev_pm equivalents
PM / OPP: rename functions to dev_pm_opp*
cpufreq / governor: Remove fossil comment
cpufreq: exynos4210: Use the common clock framework to set APLL clock rate
cpufreq: exynos4x12: Use the common clock framework to set APLL clock rate
cpufreq: Detect spurious invocations of update_policy_cpu()
cpufreq: pmac64: enable cpufreq on iMac G5 (iSight) model
cpufreq: pmac64: provide cpufreq transition latency for older G5 models
cpufreq: pmac64: speed up frequency switch
cpufreq: highbank-cpufreq: Enable Midway/ECX-2000
exynos-cpufreq: fix false return check from "regulator_set_voltage"
speedstep-centrino: Remove unnecessary braces
acpi-cpufreq: Add comment under ACPI_ADR_SPACE_SYSTEM_IO case
cpufreq: arm-big-little: use clk_get instead of clk_get_sys
cpufreq: exynos: Show a list of available frequencies
...
Conflicts:
drivers/devfreq/exynos/exynos5_bus.c
Many common initializations of struct policy are moved to core now and hence
this driver doesn't need to do it. This patch removes such code.
Most recent of those changes is to call ->get() in the core after calling
->init().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Most of the users of cpufreq_verify_within_limits() calls it for
limiting with min/max from policy->cpuinfo. We can make that code
simple by introducing another routine which will do this for them
automatically.
This patch adds another routine cpufreq_verify_within_cpu_limits()
and updates others to use it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
acpi_has_method() is a new ACPI API introduced to check
the existence of an ACPI control method.
It can be used to replace acpi_get_handle() in the case that
1. the calling function doesn't need the ACPI handle of the control method.
and
2. the calling function doesn't care the reason why the method is unavailable.
Convert acpi_get_handle() to acpi_has_method()
in drivers/cpufreq/pcc_freq.c in this patch.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We don't need to set .owner = THIS_MODULE any more in cpufreq drivers
as this field isn't used any more by the cpufreq core.
This patch removes it and updates all dependent drivers accordingly.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
PRECHANGE and POSTCHANGE notifiers must be called in groups, i.e
either both should be called or both shouldn't be.
In case we have started PRECHANGE notifier and found an error, we
must call POSTCHANGE notifier with freqs.new = freqs.old to guarantee
that the sequence of calling notifiers is complete.
This patch fixes it.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
policy->cpus contains all online cpus that have single shared clock line. And
their frequencies are always updated together.
Many SMP system's cpufreq drivers take care of this in individual drivers but
the best place for this code is in cpufreq core.
This patch modifies cpufreq_notify_transition() to notify frequency change for
all cpus in policy->cpus and hence updates all users of this API.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Convert a 0 error return code to a negative one, as returned elsewhere in the
function.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
identifier ret;
expression e,e1,e2,e3,e4,x;
@@
(
if (\(ret != 0\|ret < 0\) || ...) { ... return ...; }
|
ret = 0
)
... when != ret = e1
*x = \(kmalloc\|kzalloc\|kcalloc\|devm_kzalloc\|ioremap\|ioremap_nocache\|devm_ioremap\|devm_ioremap_nocache\)(...);
... when != x = e2
when != ret = e3
*if (x == NULL || ...)
{
... when != ret = e4
* return ret;
}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
per_cpu(processors, n) can be NULL, resulting in:
Loading CPUFreq modules[ 437.661360] BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa0434314>] pcc_cpufreq_cpu_init+0x74/0x220 [pcc_cpufreq]
It's better to avoid the oops by failing the driver, and allowing the
system to boot.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>