Merge branches 'pm-avs', 'pm-docs' and 'pm-tools'

* pm-avs:
  ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition
  power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call

* pm-docs:
  PM: Wrap documentation to fit in 80 columns

* pm-tools:
  cpupower: ToDo: Update ToDo with ideas for per_cpu_schedule handling
  cpupower: mperf_monitor: Update cpupower to use the RDPRU instruction
  cpupower: mperf_monitor: Introduce per_cpu_schedule flag
  cpupower: Move needs_root variable into a sub-struct
  cpupower : Handle set and info subcommands correctly
  pm-graph info added to MAINTAINERS
  tools/power/cpupower: Fix initializer override in hsw_ext_cstates
Rafael J. Wysocki 2019-11-26 10:28:34 +01:00
commit e350b60f4e
24 changed files with 194 additions and 94 deletions

@ -39,9 +39,10 @@ c) Compile the driver directly into the kernel and try the test modes of
d) Attempt to hibernate with the driver compiled directly into the kernel d) Attempt to hibernate with the driver compiled directly into the kernel
in the "reboot", "shutdown" and "platform" modes. in the "reboot", "shutdown" and "platform" modes.
e) Try the test modes of suspend (see: Documentation/power/basic-pm-debugging.rst, e) Try the test modes of suspend (see:
2). [As far as the STR tests are concerned, it should not matter whether or Documentation/power/basic-pm-debugging.rst, 2). [As far as the STR tests are
not the driver is built as a module.] concerned, it should not matter whether or not the driver is built as a
module.]
f) Attempt to suspend to RAM using the s2ram tool with the driver loaded f) Attempt to suspend to RAM using the s2ram tool with the driver loaded
(see: Documentation/power/basic-pm-debugging.rst, 2). (see: Documentation/power/basic-pm-debugging.rst, 2).

@ -215,30 +215,31 @@ VI. Are there any precautions to be taken to prevent freezing failures?
Yes, there are. Yes, there are.
First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a piece of code First of all, grabbing the 'system_transition_mutex' lock to mutually exclude a
from system-wide sleep such as suspend/hibernation is not encouraged. piece of code from system-wide sleep such as suspend/hibernation is not
If possible, that piece of code must instead hook onto the suspend/hibernation encouraged. If possible, that piece of code must instead hook onto the
notifiers to achieve mutual exclusion. Look at the CPU-Hotplug code suspend/hibernation notifiers to achieve mutual exclusion. Look at the
(kernel/cpu.c) for an example. CPU-Hotplug code (kernel/cpu.c) for an example.
However, if that is not feasible, and grabbing 'system_transition_mutex' is deemed necessary, However, if that is not feasible, and grabbing 'system_transition_mutex' is
it is strongly discouraged to directly call mutex_[un]lock(&system_transition_mutex) since deemed necessary, it is strongly discouraged to directly call
that could lead to freezing failures, because if the suspend/hibernate code mutex_[un]lock(&system_transition_mutex) since that could lead to freezing
successfully acquired the 'system_transition_mutex' lock, and hence that other entity failed failures, because if the suspend/hibernate code successfully acquired the
to acquire the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE 'system_transition_mutex' lock, and hence that other entity failed to acquire
state. As a consequence, the freezer would not be able to freeze that task, the lock, then that task would get blocked in TASK_UNINTERRUPTIBLE state. As a
leading to freezing failure. consequence, the freezer would not be able to freeze that task, leading to
freezing failure.
However, the [un]lock_system_sleep() APIs are safe to use in this scenario, However, the [un]lock_system_sleep() APIs are safe to use in this scenario,
since they ask the freezer to skip freezing this task, since it is anyway since they ask the freezer to skip freezing this task, since it is anyway
"frozen enough" as it is blocked on 'system_transition_mutex', which will be released "frozen enough" as it is blocked on 'system_transition_mutex', which will be
only after the entire suspend/hibernation sequence is complete. released only after the entire suspend/hibernation sequence is complete. So, to
So, to summarize, use [un]lock_system_sleep() instead of directly using summarize, use [un]lock_system_sleep() instead of directly using
mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures. mutex_[un]lock(&system_transition_mutex). That would prevent freezing failures.
V. Miscellaneous V. Miscellaneous
================ ================
/sys/power/pm_freeze_timeout controls how long it will cost at most to freeze /sys/power/pm_freeze_timeout controls how long it will cost at most to freeze
all user space processes or all freezable kernel threads, in unit of millisecond. all user space processes or all freezable kernel threads, in unit of
The default value is 20000, with range of unsigned integer. millisecond. The default value is 20000, with range of unsigned integer.
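The pm_freeze_timeout text above describes a value in milliseconds that fits an unsigned integer. Below is a minimal user-space sketch of how a tool might validate text read from that file. The helper name parse_freeze_timeout_ms is hypothetical and is not part of the kernel or of this commit. The sketch assumes the file holds a decimal number, optionally followed by a newline.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse text read from /sys/power/pm_freeze_timeout
 * (assumed to be decimal milliseconds, optionally followed by a newline)
 * into an unsigned int.
 * Returns 0 on success, -1 if the text is not a number in range.
 */
static int parse_freeze_timeout_ms(const char *text, unsigned int *out_ms)
{
	char *end;
	unsigned long value;

	assert(text != NULL && out_ms != NULL);

	/* strtoul() accepts a leading sign or whitespace, so require a digit. */
	if (text[0] < '0' || text[0] > '9')
		return -1;

	errno = 0;
	value = strtoul(text, &end, 10);
	if (errno != 0 || value > UINT_MAX)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;

	*out_ms = (unsigned int)value;
	return 0;
}
```

A caller would first read the file contents into a buffer, for example with fgets(), and then pass that buffer to the helper.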

@ -73,19 +73,21 @@ factors. Example usage: Thermal management or other exceptional situations where
SoC framework might choose to disable a higher frequency OPP to safely continue SoC framework might choose to disable a higher frequency OPP to safely continue
operations until that OPP could be re-enabled if possible. operations until that OPP could be re-enabled if possible.
OPP library facilitates this concept in it's implementation. The following OPP library facilitates this concept in its implementation. The following
operational functions operate only on available opps: operational functions operate only on available opps:
opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq, dev_pm_opp_get_opp_count opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
dev_pm_opp_get_opp_count
dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer which can then dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer
be used for dev_pm_opp_enable/disable functions to make an opp available as required. which can then be used for dev_pm_opp_enable/disable functions to make an
opp available as required.
WARNING: Users of OPP library should refresh their availability count using WARNING: Users of OPP library should refresh their availability count using
get_opp_count if dev_pm_opp_enable/disable functions are invoked for a device, the get_opp_count if dev_pm_opp_enable/disable functions are invoked for a
exact mechanism to trigger these or the notification mechanism to other device, the exact mechanism to trigger these or the notification mechanism
dependent subsystems such as cpufreq are left to the discretion of the SoC to other dependent subsystems such as cpufreq are left to the discretion of
specific framework which uses the OPP library. Similar care needs to be taken the SoC specific framework which uses the OPP library. Similar care needs
care to refresh the cpufreq table in cases of these operations. to be taken care to refresh the cpufreq table in cases of these operations.
2. Initial OPP List Registration 2. Initial OPP List Registration
================================ ================================
@ -99,11 +101,11 @@ OPPs dynamically using the dev_pm_opp_enable / disable functions.
dev_pm_opp_add dev_pm_opp_add
Add a new OPP for a specific domain represented by the device pointer. Add a new OPP for a specific domain represented by the device pointer.
The OPP is defined using the frequency and voltage. Once added, the OPP The OPP is defined using the frequency and voltage. Once added, the OPP
is assumed to be available and control of it's availability can be done is assumed to be available and control of its availability can be done
with the dev_pm_opp_enable/disable functions. OPP library internally stores with the dev_pm_opp_enable/disable functions. OPP library
and manages this information in the opp struct. This function may be internally stores and manages this information in the opp struct.
used by SoC framework to define a optimal list as per the demands of This function may be used by SoC framework to define a optimal list
SoC usage environment. as per the demands of SoC usage environment.
WARNING: WARNING:
Do not use this function in interrupt context. Do not use this function in interrupt context.
@ -354,7 +356,7 @@ struct dev_pm_opp
struct device struct device
This is used to identify a domain to the OPP layer. The This is used to identify a domain to the OPP layer. The
nature of the device and it's implementation is left to the user of nature of the device and its implementation is left to the user of
OPP library such as the SoC framework. OPP library such as the SoC framework.
Overall, in a simplistic view, the data structure operations is represented as Overall, in a simplistic view, the data structure operations is represented as

@ -426,12 +426,12 @@ pm->runtime_idle() callback.
2.4. System-Wide Power Transitions 2.4. System-Wide Power Transitions
---------------------------------- ----------------------------------
There are a few different types of system-wide power transitions, described in There are a few different types of system-wide power transitions, described in
Documentation/driver-api/pm/devices.rst. Each of them requires devices to be handled Documentation/driver-api/pm/devices.rst. Each of them requires devices to be
in a specific way and the PM core executes subsystem-level power management handled in a specific way and the PM core executes subsystem-level power
callbacks for this purpose. They are executed in phases such that each phase management callbacks for this purpose. They are executed in phases such that
involves executing the same subsystem-level callback for every device belonging each phase involves executing the same subsystem-level callback for every device
to the given subsystem before the next phase begins. These phases always run belonging to the given subsystem before the next phase begins. These phases
after tasks have been frozen. always run after tasks have been frozen.
2.4.1. System Suspend 2.4.1. System Suspend
^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^
@ -636,12 +636,12 @@ System restore requires a hibernation image to be loaded into memory and the
pre-hibernation memory contents to be restored before the pre-hibernation system pre-hibernation memory contents to be restored before the pre-hibernation system
activity can be resumed. activity can be resumed.
As described in Documentation/driver-api/pm/devices.rst, the hibernation image is loaded As described in Documentation/driver-api/pm/devices.rst, the hibernation image
into memory by a fresh instance of the kernel, called the boot kernel, which in is loaded into memory by a fresh instance of the kernel, called the boot kernel,
turn is loaded and run by a boot loader in the usual way. After the boot kernel which in turn is loaded and run by a boot loader in the usual way. After the
has loaded the image, it needs to replace its own code and data with the code boot kernel has loaded the image, it needs to replace its own code and data with
and data of the "hibernated" kernel stored within the image, called the image the code and data of the "hibernated" kernel stored within the image, called the
kernel. For this purpose all devices are frozen just like before creating image kernel. For this purpose all devices are frozen just like before creating
the image during hibernation, in the the image during hibernation, in the
prepare, freeze, freeze_noirq prepare, freeze, freeze_noirq
@ -691,8 +691,8 @@ controlling the runtime power management of their devices.
At the time of this writing there are two ways to define power management At the time of this writing there are two ways to define power management
callbacks for a PCI device driver, the recommended one, based on using a callbacks for a PCI device driver, the recommended one, based on using a
dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and the dev_pm_ops structure described in Documentation/driver-api/pm/devices.rst, and
"legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and the "legacy" one, in which the .suspend(), .suspend_late(), .resume_early(), and
.resume() callbacks from struct pci_driver are used. The legacy approach, .resume() callbacks from struct pci_driver are used. The legacy approach,
however, doesn't allow one to define runtime power management callbacks and is however, doesn't allow one to define runtime power management callbacks and is
not really suitable for any new drivers. Therefore it is not covered by this not really suitable for any new drivers. Therefore it is not covered by this

@ -8,8 +8,8 @@ one of the parameters.
Two different PM QoS frameworks are available: Two different PM QoS frameworks are available:
1. PM QoS classes for cpu_dma_latency 1. PM QoS classes for cpu_dma_latency
2. the per-device PM QoS framework provides the API to manage the per-device latency 2. The per-device PM QoS framework provides the API to manage the
constraints and PM QoS flags. per-device latency constraints and PM QoS flags.
Each parameters have defined units: Each parameters have defined units:
@ -47,14 +47,14 @@ void pm_qos_add_request(handle, param_class, target_value):
pm_qos API functions. pm_qos API functions.
void pm_qos_update_request(handle, new_target_value): void pm_qos_update_request(handle, new_target_value):
Will update the list element pointed to by the handle with the new target value Will update the list element pointed to by the handle with the new target
and recompute the new aggregated target, calling the notification tree if the value and recompute the new aggregated target, calling the notification tree
target is changed. if the target is changed.
void pm_qos_remove_request(handle): void pm_qos_remove_request(handle):
Will remove the element. After removal it will update the aggregate target and Will remove the element. After removal it will update the aggregate target
call the notification tree if the target was changed as a result of removing and call the notification tree if the target was changed as a result of
the request. removing the request.
int pm_qos_request(param_class): int pm_qos_request(param_class):
Returns the aggregated value for a given PM QoS class. Returns the aggregated value for a given PM QoS class.
@ -167,9 +167,9 @@ int dev_pm_qos_expose_flags(device, value)
change the value of the PM_QOS_FLAG_NO_POWER_OFF flag. change the value of the PM_QOS_FLAG_NO_POWER_OFF flag.
void dev_pm_qos_hide_flags(device) void dev_pm_qos_hide_flags(device)
Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS list Drop the request added by dev_pm_qos_expose_flags() from the device's PM QoS
of flags and remove sysfs attribute pm_qos_no_power_off from the device's power list of flags and remove sysfs attribute pm_qos_no_power_off from the device's
directory. power directory.
Notification mechanisms: Notification mechanisms:
@ -179,8 +179,8 @@ int dev_pm_qos_add_notifier(device, notifier, type):
Adds a notification callback function for the device for a particular request Adds a notification callback function for the device for a particular request
type. type.
The callback is called when the aggregated value of the device constraints list The callback is called when the aggregated value of the device constraints
is changed. list is changed.
int dev_pm_qos_remove_notifier(device, notifier, type): int dev_pm_qos_remove_notifier(device, notifier, type):
Removes the notification callback function for the device. Removes the notification callback function for the device.

@ -268,8 +268,8 @@ defined in include/linux/pm.h:
`unsigned int runtime_auto;` `unsigned int runtime_auto;`
- if set, indicates that the user space has allowed the device driver to - if set, indicates that the user space has allowed the device driver to
power manage the device at run time via the /sys/devices/.../power/control power manage the device at run time via the /sys/devices/.../power/control
`interface;` it may only be modified with the help of the pm_runtime_allow() `interface;` it may only be modified with the help of the
and pm_runtime_forbid() helper functions pm_runtime_allow() and pm_runtime_forbid() helper functions
`unsigned int no_callbacks;` `unsigned int no_callbacks;`
- indicates that the device does not use the runtime PM callbacks (see - indicates that the device does not use the runtime PM callbacks (see

@ -106,8 +106,8 @@ execution during resume):
* Release system_transition_mutex lock. * Release system_transition_mutex lock.
It is to be noted here that the system_transition_mutex lock is acquired at the very It is to be noted here that the system_transition_mutex lock is acquired at the
beginning, when we are just starting out to suspend, and then released only very beginning, when we are just starting out to suspend, and then released only
after the entire cycle is complete (i.e., suspend + resume). after the entire cycle is complete (i.e., suspend + resume).
:: ::
@ -165,7 +165,8 @@ Important files and functions/entry points:
- kernel/power/process.c : freeze_processes(), thaw_processes() - kernel/power/process.c : freeze_processes(), thaw_processes()
- kernel/power/suspend.c : suspend_prepare(), suspend_enter(), suspend_finish() - kernel/power/suspend.c : suspend_prepare(), suspend_enter(), suspend_finish()
- kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](), [disable|enable]_nonboot_cpus() - kernel/cpu.c: cpu_[up|down](), _cpu_[up|down](),
[disable|enable]_nonboot_cpus()

@ -118,7 +118,8 @@ In a really perfect world::
echo 1 > /proc/acpi/sleep # for standby echo 1 > /proc/acpi/sleep # for standby
echo 2 > /proc/acpi/sleep # for suspend to ram echo 2 > /proc/acpi/sleep # for suspend to ram
echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power conservative echo 3 > /proc/acpi/sleep # for suspend to ram, but with more power
# conservative
echo 4 > /proc/acpi/sleep # for suspend to disk echo 4 > /proc/acpi/sleep # for suspend to disk
echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system echo 5 > /proc/acpi/sleep # for shutdown unfriendly the system
@ -192,8 +193,8 @@ Q:
A: A:
The freezing of tasks is a mechanism by which user space processes and some The freezing of tasks is a mechanism by which user space processes and some
kernel threads are controlled during hibernation or system-wide suspend (on some kernel threads are controlled during hibernation or system-wide suspend (on
architectures). See freezing-of-tasks.txt for details. some architectures). See freezing-of-tasks.txt for details.
Q: Q:
What is the difference between "platform" and "shutdown"? What is the difference between "platform" and "shutdown"?
@ -282,7 +283,8 @@ A:
suspend(PMSG_FREEZE): devices are frozen so that they don't interfere suspend(PMSG_FREEZE): devices are frozen so that they don't interfere
with state snapshot with state snapshot
state snapshot: copy of whole used memory is taken with interrupts disabled state snapshot: copy of whole used memory is taken with interrupts
disabled
resume(): devices are woken up so that we can write image to swap resume(): devices are woken up so that we can write image to swap
@ -353,8 +355,8 @@ Q:
A: A:
Generally, yes, you can. However, it requires you to use the "resume=" and Generally, yes, you can. However, it requires you to use the "resume=" and
"resume_offset=" kernel command line parameters, so the resume from a swap file "resume_offset=" kernel command line parameters, so the resume from a swap
cannot be initiated from an initrd or initramfs image. See file cannot be initiated from an initrd or initramfs image. See
swsusp-and-swap-files.txt for details. swsusp-and-swap-files.txt for details.
Q: Q:

@ -13000,6 +13000,15 @@ L: linux-scsi@vger.kernel.org
S: Supported S: Supported
F: drivers/scsi/pm8001/ F: drivers/scsi/pm8001/
PM-GRAPH UTILITY
M: "Todd E Brandt" <todd.e.brandt@linux.intel.com>
L: linux-pm@vger.kernel.org
W: https://01.org/pm-graph
B: https://bugzilla.kernel.org/buglist.cgi?component=pm-graph&product=Tools
T: git git://github.com/intel/pm-graph
S: Supported
F: tools/power/pm-graph
PNP SUPPORT PNP SUPPORT
M: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> M: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
S: Maintained S: Maintained

@ -905,7 +905,7 @@ static int omap_sr_probe(struct platform_device *pdev)
sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir); sr_info->dbg_dir = debugfs_create_dir(sr_info->name, sr_dbg_dir);
debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir, debugfs_create_file("autocomp", S_IRUGO | S_IWUSR, sr_info->dbg_dir,
(void *)sr_info, &pm_sr_fops); sr_info, &pm_sr_fops);
debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir, debugfs_create_x32("errweight", S_IRUGO, sr_info->dbg_dir,
&sr_info->err_weight); &sr_info->err_weight);
debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir, debugfs_create_x32("errmaxlimit", S_IRUGO, sr_info->dbg_dir,

@ -293,6 +293,9 @@ struct omap_sr_data {
struct voltagedomain *voltdm; struct voltagedomain *voltdm;
}; };
extern struct omap_sr_data omap_sr_pdata[OMAP_SR_NR];
#ifdef CONFIG_POWER_AVS_OMAP #ifdef CONFIG_POWER_AVS_OMAP
/* Smartreflex module enable/disable interface */ /* Smartreflex module enable/disable interface */

@ -8,3 +8,17 @@ ToDos sorted by priority:
- Add another c1e debug idle monitor - Add another c1e debug idle monitor
-> Is by design racy with BIOS, but could be added -> Is by design racy with BIOS, but could be added
with a --force option and some "be careful" messages with a --force option and some "be careful" messages
- Add cpu_start()/cpu_stop() callbacks for monitor
-> This is to move the per_cpu logic from inside the
monitor to outside it. This can be given higher
priority in fork_it.
- Fork as many processes as there are CPUs in case the
per_cpu_schedule flag is set.
-> Bind forked process to each cpu.
-> Execute start measures via the forked processes on
each cpu.
-> Run test executable in a forked process.
-> Execute stop measures via the forked processes on
each cpu.
This would be ideal as it will not introduce noise in the
tested executable.

@ -10,6 +10,7 @@
#include <errno.h> #include <errno.h>
#include <string.h> #include <string.h>
#include <getopt.h> #include <getopt.h>
#include <sys/utsname.h>
#include "helpers/helpers.h" #include "helpers/helpers.h"
#include "helpers/sysfs.h" #include "helpers/sysfs.h"
@ -30,6 +31,7 @@ int cmd_info(int argc, char **argv)
extern char *optarg; extern char *optarg;
extern int optind, opterr, optopt; extern int optind, opterr, optopt;
unsigned int cpu; unsigned int cpu;
struct utsname uts;
union { union {
struct { struct {
@ -39,6 +41,13 @@ int cmd_info(int argc, char **argv)
} params = {}; } params = {};
int ret = 0; int ret = 0;
ret = uname(&uts);
if (!ret && (!strcmp(uts.machine, "ppc64le") ||
!strcmp(uts.machine, "ppc64"))) {
fprintf(stderr, _("Subcommand not supported on POWER.\n"));
return ret;
}
setlocale(LC_ALL, ""); setlocale(LC_ALL, "");
textdomain(PACKAGE); textdomain(PACKAGE);

@ -10,6 +10,7 @@
#include <errno.h> #include <errno.h>
#include <string.h> #include <string.h>
#include <getopt.h> #include <getopt.h>
#include <sys/utsname.h>
#include "helpers/helpers.h" #include "helpers/helpers.h"
#include "helpers/sysfs.h" #include "helpers/sysfs.h"
@ -31,6 +32,7 @@ int cmd_set(int argc, char **argv)
extern char *optarg; extern char *optarg;
extern int optind, opterr, optopt; extern int optind, opterr, optopt;
unsigned int cpu; unsigned int cpu;
struct utsname uts;
union { union {
struct { struct {
@ -41,6 +43,13 @@ int cmd_set(int argc, char **argv)
int perf_bias = 0; int perf_bias = 0;
int ret = 0; int ret = 0;
ret = uname(&uts);
if (!ret && (!strcmp(uts.machine, "ppc64le") ||
!strcmp(uts.machine, "ppc64"))) {
fprintf(stderr, _("Subcommand not supported on POWER.\n"));
return ret;
}
setlocale(LC_ALL, ""); setlocale(LC_ALL, "");
textdomain(PACKAGE); textdomain(PACKAGE);

@ -131,6 +131,10 @@ out:
if (ext_cpuid_level >= 0x80000007 && if (ext_cpuid_level >= 0x80000007 &&
(cpuid_edx(0x80000007) & (1 << 9))) (cpuid_edx(0x80000007) & (1 << 9)))
cpu_info->caps |= CPUPOWER_CAP_AMD_CBP; cpu_info->caps |= CPUPOWER_CAP_AMD_CBP;
if (ext_cpuid_level >= 0x80000008 &&
cpuid_ebx(0x80000008) & (1 << 4))
cpu_info->caps |= CPUPOWER_CAP_AMD_RDPRU;
} }
if (cpu_info->vendor == X86_VENDOR_INTEL) { if (cpu_info->vendor == X86_VENDOR_INTEL) {
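The capability test added in the hunk above can be sketched as a pure function: given the highest supported extended CPUID leaf and the EBX value of leaf 0x80000008, it sets the RDPRU capability bit. The helper name set_rdpru_cap is hypothetical. The constants are the ones shown in the diff. Obtaining the two inputs from the CPUID instruction is left out, so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>

#define CPUPOWER_CAP_AMD_RDPRU	0x00000080	/* value from the diff */

/*
 * Hypothetical helper mirroring the condition in the hunk above:
 * leaf 0x80000008 must be supported and bit 4 of its EBX must be set.
 */
static void set_rdpru_cap(unsigned int ext_cpuid_level, unsigned int ebx,
			  unsigned int *caps)
{
	assert(caps != NULL);

	if (ext_cpuid_level >= 0x80000008 && (ebx & (1u << 4)))
		*caps |= CPUPOWER_CAP_AMD_RDPRU;
}
```

Because the flag is OR-ed into the existing mask, capabilities detected earlier are preserved.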

@ -69,6 +69,7 @@ enum cpupower_cpu_vendor {X86_VENDOR_UNKNOWN = 0, X86_VENDOR_INTEL,
#define CPUPOWER_CAP_HAS_TURBO_RATIO 0x00000010 #define CPUPOWER_CAP_HAS_TURBO_RATIO 0x00000010
#define CPUPOWER_CAP_IS_SNB 0x00000020 #define CPUPOWER_CAP_IS_SNB 0x00000020
#define CPUPOWER_CAP_INTEL_IDA 0x00000040 #define CPUPOWER_CAP_INTEL_IDA 0x00000040
#define CPUPOWER_CAP_AMD_RDPRU 0x00000080
#define CPUPOWER_AMD_CPBDIS 0x02000000 #define CPUPOWER_AMD_CPBDIS 0x02000000

@ -328,7 +328,7 @@ struct cpuidle_monitor amd_fam14h_monitor = {
.stop = amd_fam14h_stop, .stop = amd_fam14h_stop,
.do_register = amd_fam14h_register, .do_register = amd_fam14h_register,
.unregister = amd_fam14h_unregister, .unregister = amd_fam14h_unregister,
.needs_root = 1, .flags.needs_root = 1,
.overflow_s = OVERFLOW_MS / 1000, .overflow_s = OVERFLOW_MS / 1000,
}; };
#endif /* #if defined(__i386__) || defined(__x86_64__) */ #endif /* #if defined(__i386__) || defined(__x86_64__) */

@ -207,6 +207,6 @@ struct cpuidle_monitor cpuidle_sysfs_monitor = {
.stop = cpuidle_stop, .stop = cpuidle_stop,
.do_register = cpuidle_register, .do_register = cpuidle_register,
.unregister = cpuidle_unregister, .unregister = cpuidle_unregister,
.needs_root = 0, .flags.needs_root = 0,
.overflow_s = UINT_MAX, .overflow_s = UINT_MAX,
}; };

@ -408,7 +408,7 @@ int cmd_monitor(int argc, char **argv)
dprint("Try to register: %s\n", all_monitors[num]->name); dprint("Try to register: %s\n", all_monitors[num]->name);
test_mon = all_monitors[num]->do_register(); test_mon = all_monitors[num]->do_register();
if (test_mon) { if (test_mon) {
if (test_mon->needs_root && !run_as_root) { if (test_mon->flags.needs_root && !run_as_root) {
fprintf(stderr, _("Available monitor %s needs " fprintf(stderr, _("Available monitor %s needs "
"root access\n"), test_mon->name); "root access\n"), test_mon->name);
continue; continue;

@ -60,7 +60,10 @@ struct cpuidle_monitor {
struct cpuidle_monitor* (*do_register) (void); struct cpuidle_monitor* (*do_register) (void);
void (*unregister)(void); void (*unregister)(void);
unsigned int overflow_s; unsigned int overflow_s;
int needs_root; struct {
unsigned int needs_root:1;
unsigned int per_cpu_schedule:1;
} flags;
}; };
extern long long timespec_diff_us(struct timespec start, struct timespec end); extern long long timespec_diff_us(struct timespec start, struct timespec end);
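The change above replaces the int needs_root member with a sub-struct of one-bit flags, which is why the monitor definitions elsewhere in this commit switch to `.flags.needs_root`. Below is a minimal sketch of the pattern using a reduced stand-in structure. struct monitor, example_monitor and monitor_usable are hypothetical names, not cpupower code.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical stand-in for struct cpuidle_monitor. */
struct monitor {
	const char *name;
	struct {
		unsigned int needs_root:1;
		unsigned int per_cpu_schedule:1;
	} flags;
};

/*
 * Nested designated initializer, in the same form the monitors use after
 * this change. Flags that are not named are zero-initialized.
 */
static struct monitor example_monitor = {
	.name = "example",
	.flags.needs_root = 1,
};

/* Returns 1 if the caller may use the monitor, else 0. */
static int monitor_usable(const struct monitor *mon, int run_as_root)
{
	assert(mon != NULL);
	return !(mon->flags.needs_root && !run_as_root);
}
```

Further flags can be added to the sub-struct without touching existing initializers, since omitted members default to zero.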

@ -39,7 +39,6 @@ static cstate_t hsw_ext_cstates[HSW_EXT_CSTATE_COUNT] = {
{ {
.name = "PC9", .name = "PC9",
.desc = N_("Processor Package C9"), .desc = N_("Processor Package C9"),
.desc = N_("Processor Package C2"),
.id = PC9, .id = PC9,
.range = RANGE_PACKAGE, .range = RANGE_PACKAGE,
.get_count_percent = hsw_ext_get_count_percent, .get_count_percent = hsw_ext_get_count_percent,
@ -188,7 +187,7 @@ struct cpuidle_monitor intel_hsw_ext_monitor = {
.stop = hsw_ext_stop, .stop = hsw_ext_stop,
.do_register = hsw_ext_register, .do_register = hsw_ext_register,
.unregister = hsw_ext_unregister, .unregister = hsw_ext_unregister,
.needs_root = 1, .flags.needs_root = 1,
.overflow_s = 922000000 /* 922337203 seconds TSC overflow .overflow_s = 922000000 /* 922337203 seconds TSC overflow
at 20GHz */ at 20GHz */
}; };
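The comment "922337203 seconds TSC overflow at 20GHz" matches the time a 64-bit counter takes to wrap when it is incremented 20 * 10^9 times per second. The overflow_s value 922000000 looks like that figure rounded down. Below is a small sketch of the arithmetic. The function name counter_overflow_seconds is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Whole seconds until a 64-bit counter that advances `hz` times per
 * second wraps around, starting from zero.
 */
static uint64_t counter_overflow_seconds(uint64_t hz)
{
	assert(hz != 0);
	return UINT64_MAX / hz;
}
```

Calling counter_overflow_seconds(20000000000ULL) reproduces the value quoted in the comment.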

@ -19,6 +19,10 @@
#define MSR_APERF 0xE8 #define MSR_APERF 0xE8
#define MSR_MPERF 0xE7 #define MSR_MPERF 0xE7
#define RDPRU ".byte 0x0f, 0x01, 0xfd"
#define RDPRU_ECX_MPERF 0
#define RDPRU_ECX_APERF 1
#define MSR_TSC 0x10 #define MSR_TSC 0x10
#define MSR_AMD_HWCR 0xc0010015 #define MSR_AMD_HWCR 0xc0010015
@ -86,15 +90,51 @@ static int mperf_get_tsc(unsigned long long *tsc)
 	return ret;
 }
-static int mperf_init_stats(unsigned int cpu)
+static int get_aperf_mperf(int cpu, unsigned long long *aval,
+			    unsigned long long *mval)
 {
-	unsigned long long val;
+	unsigned long low_a, high_a;
+	unsigned long low_m, high_m;
 	int ret;
-	ret = read_msr(cpu, MSR_APERF, &val);
-	aperf_previous_count[cpu] = val;
-	ret |= read_msr(cpu, MSR_MPERF, &val);
-	mperf_previous_count[cpu] = val;
+	/*
+	 * Running on the cpu from which we read the registers will
+	 * prevent APERF/MPERF from going out of sync because of IPI
+	 * latency introduced by read_msr()s.
+	 */
+	if (mperf_monitor.flags.per_cpu_schedule) {
+		if (bind_cpu(cpu))
+			return 1;
+	}
+	if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_RDPRU) {
+		asm volatile(RDPRU
+			     : "=a" (low_a), "=d" (high_a)
+			     : "c" (RDPRU_ECX_APERF));
+		asm volatile(RDPRU
+			     : "=a" (low_m), "=d" (high_m)
+			     : "c" (RDPRU_ECX_MPERF));
+		*aval = ((low_a) | (high_a) << 32);
+		*mval = ((low_m) | (high_m) << 32);
+		return 0;
+	}
+	ret = read_msr(cpu, MSR_APERF, aval);
+	ret |= read_msr(cpu, MSR_MPERF, mval);
+	return ret;
+}
+static int mperf_init_stats(unsigned int cpu)
+{
+	unsigned long long aval, mval;
+	int ret;
+	ret = get_aperf_mperf(cpu, &aval, &mval);
+	aperf_previous_count[cpu] = aval;
+	mperf_previous_count[cpu] = mval;
 	is_valid[cpu] = !ret;
 	return 0;
@ -102,13 +142,12 @@ static int mperf_init_stats(unsigned int cpu)
 static int mperf_measure_stats(unsigned int cpu)
 {
-	unsigned long long val;
+	unsigned long long aval, mval;
 	int ret;
-	ret = read_msr(cpu, MSR_APERF, &val);
-	aperf_current_count[cpu] = val;
-	ret |= read_msr(cpu, MSR_MPERF, &val);
-	mperf_current_count[cpu] = val;
+	ret = get_aperf_mperf(cpu, &aval, &mval);
+	aperf_current_count[cpu] = aval;
+	mperf_current_count[cpu] = mval;
 	is_valid[cpu] = !ret;
 	return 0;
@ -305,6 +344,9 @@ struct cpuidle_monitor *mperf_register(void)
if (init_maxfreq_mode()) if (init_maxfreq_mode())
return NULL; return NULL;
if (cpupower_cpu_info.vendor == X86_VENDOR_AMD)
mperf_monitor.flags.per_cpu_schedule = 1;
/* Free this at program termination */ /* Free this at program termination */
is_valid = calloc(cpu_count, sizeof(int)); is_valid = calloc(cpu_count, sizeof(int));
mperf_previous_count = calloc(cpu_count, sizeof(unsigned long long)); mperf_previous_count = calloc(cpu_count, sizeof(unsigned long long));
@ -333,7 +375,7 @@ struct cpuidle_monitor mperf_monitor = {
.stop = mperf_stop, .stop = mperf_stop,
.do_register = mperf_register, .do_register = mperf_register,
.unregister = mperf_unregister, .unregister = mperf_unregister,
.needs_root = 1, .flags.needs_root = 1,
.overflow_s = 922000000 /* 922337203 seconds TSC overflow .overflow_s = 922000000 /* 922337203 seconds TSC overflow
at 20GHz */ at 20GHz */
}; };
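In this file's RDPRU path, each counter arrives as two register halves (low in EAX, high in EDX) that are joined with a shift and an OR. The diff does this with unsigned long operands, which relies on that type being 64 bits wide. Below is a minimal sketch of the same combination using fixed-width types, so the shift is well defined regardless of the width of long. The helper names combine_halves and split_halves are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Join the low and high 32-bit halves of a counter into one 64-bit value. */
static uint64_t combine_halves(uint32_t low, uint32_t high)
{
	/* Widen before shifting so the high half is not shifted out. */
	return (uint64_t)low | ((uint64_t)high << 32);
}

/* Inverse operation, included so the round trip can be checked. */
static void split_halves(uint64_t value, uint32_t *low, uint32_t *high)
{
	assert(low != NULL && high != NULL);
	*low = (uint32_t)value;
	*high = (uint32_t)(value >> 32);
}
```

Splitting a value and recombining the halves returns the original value, which makes the pair easy to test without the RDPRU instruction itself.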

@ -208,7 +208,7 @@ struct cpuidle_monitor intel_nhm_monitor = {
.stop = nhm_stop, .stop = nhm_stop,
.do_register = intel_nhm_register, .do_register = intel_nhm_register,
.unregister = intel_nhm_unregister, .unregister = intel_nhm_unregister,
.needs_root = 1, .flags.needs_root = 1,
.overflow_s = 922000000 /* 922337203 seconds TSC overflow .overflow_s = 922000000 /* 922337203 seconds TSC overflow
at 20GHz */ at 20GHz */
}; };

@ -192,7 +192,7 @@ struct cpuidle_monitor intel_snb_monitor = {
.stop = snb_stop, .stop = snb_stop,
.do_register = snb_register, .do_register = snb_register,
.unregister = snb_unregister, .unregister = snb_unregister,
.needs_root = 1, .flags.needs_root = 1,
.overflow_s = 922000000 /* 922337203 seconds TSC overflow .overflow_s = 922000000 /* 922337203 seconds TSC overflow
at 20GHz */ at 20GHz */
}; };