- Eliminate significant AML processing overhead related to using
operation regions in system memory by reworking the management
of memory mappings in the ACPI code to defer unmap operations
(to do them outside of the ACPICA locks, among other things) and
making the memory operation reagion handler avoid releasing memory
mappings created by it too early (Rafael Wysocki).
- Update the ACPICA code in the kernel to upstream revision
20200717:
* Prevent operation region reference counts from overflowing in
some cases (Erik Kaneda).
* Replace one-element array with flexible-array (Gustavo A. R.
Silva).
- Fix ACPI PCI hotplug reference counting (Rafael Wysocki).
- Drop last bits of the ACPI procfs interface (Thomas Renninger).
- Drop some redundant checks from the code parsing ACPI tables
related to NUMA (Hanjun Guo).
- Avoid redundant object evaluation in the ACPI device properties
handling code (Heikki Krogerus).
- Avoid unecessary memory overhead related to storing the signatures
of the ACPI tables recognized by the kernel (Ard Biesheuvel).
- Add missing newline characters when printing module parameter
values in some places (Xiongfeng Wang).
- Update the link to the ACPI specifications in some places (Tiezhu
Yang).
- Use the fallthrough pseudo-keyword in the ACPI code (Gustavo A. R.
Silva).
- Drop redundant variable initialization from the APEI code (Colin
Ian King).
- Drop uninitialized_var() from the ACPI PAD driver (Jason Yan).
- Replace HTTP links with HTTPS ones in the ACPI code (Alexander A.
Klimov).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl8oO8gSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx2nUP/iSRAW0DK4PYDNLDV1Q+y5RrQw44iMDf
yfLQu3agardM1KGtPuYw5zmU0UoEYtW8s2r027bxw9Hvn0IzBh5TiDvcVjMEnbVC
+6m/fWg3EStfZ9w2dxDzXDMIk/oiEZsjtWSRaDTfAIH2jc/xVcSXDojlMgBPQDu5
hIITjMbGGx783o4PNCYbIZy1ReJgd8MNQ+Xp3MCpTgbFgHMHKBOJ6B/nS8aTfilO
eE5JvzhXED7qITaXYWxI9OZpRTPTNQ3eaEPbWvnw4KJ5boMfyREMGdTBipXO+kSA
SwKhFysYEUAZM7Ffq0eTnWSCU7VWogAsTauIgs4+d9z8VrGhWi5+b6N/E/uwTKtj
HF98xtk+Loe8V24LwN0snvv51O7P5nAH47QxwIBvQssfR8ZSgdwHtUQcckybAJhx
LLmPtJrM8ZAefc9H4o0eVqumjoh1amGKC9dTY0g1j0UIE0y3ZIFHTvDNvhpTzgBk
5uUHHEiolGNWHVrs1LIMOEejqx62m+EjVc9b8XUdJqHoboTccMM73DRk/00meP/7
br/VfMI0aTjPLssvSC/ZSlTZt+ddrBm+cXw9eqruDQwdQaqxpJu+D3odjdaYSjpg
luiYQrQdoDmIDh4UNuJbvG/Hub3CLzvJSqGWLExNbX7nWXxH4HIx/8PcNtVkKZRV
qBXotIc+i4VD
=Nn2Q
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These eliminate significant AML processing overhead related to using
operation regions in system memory, update the ACPICA code in the
kernel to upstream revision 20200717 (including a fix to prevent
operation region reference counts from overflowing in some cases),
remove the last bits of the (long deprecated) ACPI procfs interface
and do some assorted cleanups.
Specifics:
- Eliminate significant AML processing overhead related to using
operation regions in system memory by reworking the management of
memory mappings in the ACPI code to defer unmap operations (to do
them outside of the ACPICA locks, among other things) and making
the memory operation reagion handler avoid releasing memory
mappings created by it too early (Rafael Wysocki).
- Update the ACPICA code in the kernel to upstream revision 20200717:
* Prevent operation region reference counts from overflowing in
some cases (Erik Kaneda).
* Replace one-element array with flexible-array (Gustavo A. R.
Silva).
- Fix ACPI PCI hotplug reference counting (Rafael Wysocki).
- Drop last bits of the ACPI procfs interface (Thomas Renninger).
- Drop some redundant checks from the code parsing ACPI tables
related to NUMA (Hanjun Guo).
- Avoid redundant object evaluation in the ACPI device properties
handling code (Heikki Krogerus).
- Avoid unecessary memory overhead related to storing the signatures
of the ACPI tables recognized by the kernel (Ard Biesheuvel).
- Add missing newline characters when printing module parameter
values in some places (Xiongfeng Wang).
- Update the link to the ACPI specifications in some places (Tiezhu
Yang).
- Use the fallthrough pseudo-keyword in the ACPI code (Gustavo A. R.
Silva).
- Drop redundant variable initialization from the APEI code (Colin
Ian King).
- Drop uninitialized_var() from the ACPI PAD driver (Jason Yan).
- Replace HTTP links with HTTPS ones in the ACPI code (Alexander A.
Klimov)"
* tag 'acpi-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits)
ACPI: APEI: remove redundant assignment to variable rc
ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check
ACPI: NUMA: Remove the useless sub table pointer check
ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array()
ACPICA: Update version to 20200717
ACPICA: Do not increment operation_region reference counts for field units
ACPICA: Replace one-element array with flexible-array
ACPI: Replace HTTP links with HTTPS ones
ACPI: Use valid link to the ACPI specification
ACPI: OSL: Clean up the removal of unused memory mappings
ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem()
ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address()
ACPICA: Preserve memory opregion mappings
ACPI: OSL: Implement deferred unmapping of ACPI memory
ACPI: Use fallthrough pseudo-keyword
PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
ACPI: tables: avoid relocations for table signature array
ACPI: PAD: Eliminate usage of uninitialized_var() macro
ACPI: sysfs: add newlines when printing module parameters
ACPI: EC: add newline when printing 'ec_event_clearing' module parameter
...
Currently, acpi.info is an invalid link to access ACPI specification,
the new valid link is https://uefi.org/specifications.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Manpage of cpupower is listing wrong sub-commands in "See Also"
section. The option for cpupower-idle(1) should actually be
cpupower-idle-info(1) and cpupower-idle-set(1). This patch corrects
this anomaly.
Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fix up multiple instances of "intervall" to correct
"interval" (all save one Italian instance).
Signed-off-by: Nick Black <dankamongmen@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The token before "-" should be the program name, no spaces allowed.
See man(7) and lexgrog(1).
Signed-off-by: Mattia Dongili <malattia@linux.it>
Signed-off-by: Thomas Renninger <trenn@suse.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
cpupower idle-set -D <latency>
currently only disables all C-states that have a higher latency than the
specified <latency>. But if deep sleep states were already disabled and
have a lower latency, they should get enabled again.
For example:
This call:
cpupower idle-set -D 30
disables all C-states with a higher or equal latency than 30.
If one then calls:
cpupower idle-set -D 100
C-states with a latency between 30-99 will get enabled again with this patch
now. It is ensured that only C-states with a latency of 100 and higher are
disabled.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These kernel interfaces got removed by:
commit 8e7fbcbc22
Author: Peter Zijlstra <peterz@infradead.org>
Date: Mon Jan 9 11:28:35 2012 +0100
sched: Remove stale power aware scheduling remnants and dysfunctional knobs
No need to further keep them as userspace configurations.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The command "cpupower frequency-info" can be used when using cpupower to
monitor and test processor behaviour to determine if the processor is
behaving as expected. This data can be compared to the output of
/proc/cpuinfo or the output of
/sys/devices/system/cpu/cpuX/cpufreq/scaling_available_frequencies
to determine if the cpu is in an expected state.
When doing this I noticed comparison test failures due to the way the
data is displayed in cpupower. For example,
[root@intel-s3e37-02 cpupower]# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
2262000 2261000 2128000 1995000 1862000 1729000 1596000 1463000 1330000
1197000 1064000
compared to
[root@intel-s3e37-02 cpupower]# cpupower frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.0 us.
hardware limits: 1.06 GHz - 2.26 GHz
available frequency steps: 2.26 GHz, 2.26 GHz, 2.13 GHz, 2.00 GHz, 1.86 GHz, 1.73 GHz, 1.60 GHz, 1.46 GHz, 1.33 GHz, 1.20 GHz, 1.06 GHz
available cpufreq governors: conservative, userspace, powersave, ondemand, performance
current policy: frequency should be within 1.06 GHz and 2.26 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 2.26 GHz (asserted by call to hardware).
boost state support:
Supported: yes
Active: yes
shows very different values for the available frequency steps. The cpupower
output rounds off values at 2 decimal points and this causes problems with
test scripts. For example, with the data above,
1.064 is 1.06
1.197 is 1.20
1.596 is 1.60
1.995 is 2.00
2.128 is 2.13
and most confusingly,
2.261 is 2.26
2.262 is 2.26
Truncating these values serves no real purpose other than making the output
pretty. Since the default has been to round off these values I am adding
a -n/--no-rounding option to the cpupower utility that will display the
data without rounding off the still significant digits.
After patch,
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.000 us.
hardware limits: 1.064000 GHz - 2.262000 GHz
available frequency steps: 2.262000 GHz, 2.261000 GHz, 2.128000 GHz, 1.995000 GHz, 1.862000 GHz, 1.729000 GHz, 1.596000 GHz, 1.463000 GHz, 1.330000 GHz, 1.197000 GHz, 1.064000 GHz
available cpufreq governors: conservative, userspace, powersave, ondemand, performance
current policy: frequency should be within 1.064000 GHz and 2.262000 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 2.262000 GHz (asserted by call to hardware).
boost state support:
Supported: yes
Active: yes
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
[rjw: Subject]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cpupower idle-set subcommand was introduce recently.
This patch provides the missing manpage.
If cpupower is properly installed it will show up automatically
(similar to git), when invoking:
cpupower help idle-set
or
cpupower idle-set --help
Some parts have been taken over and adjusted from
git commit 62d6ae880e
documentation submitted by Carsten Emde.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Example:
cpupower idle-set -d 3
will disable C-state 3 on all processors (set commands are active on
all CPUs by default), same as:
cpupower -c all idle-set -d 3
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If an MSR based monitor is run in parallel this is not needed. This is the
default case on all/most Intel machines.
But when only sysfs info is read via cpupower monitor -m Idle_Stats (typically
the case for non root users) or when other monitors are PCI based (AMD),
Idle_Stats, read from sysfs can be totally bogus:
cpupower monitor -m Idle_Stats
PKG |CORE|CPU | POLL | C1-N | C3-N | C6-N
0| 0| 0| 0.00| 0.00| 0.24| 99.81
0| 0| 32| 0.00| 0.00| 0.00| 100.7
...
0| 17| 20| 0.00| 0.00| 0.00| 173.1
0| 17| 52| 0.00| 0.00| 0.07| 173.0
0| 18| 68| 0.00| 0.00| 0.00| 0.00
0| 18| 76| 0.00| 0.00| 0.00| 0.00
...
With the -c option all cores are woken up and the kernel
did update cpuidle statistics before reading out sysfs.
This causes some overhead. Therefore avoid if possible, use
if needed:
cpupower monitor -c -m Idle_Stats
PKG |CORE|CPU | POLL | C1-N | C3-N | C6-N
0| 0| 0| 0.00| 0.00| 0.00| 100.2
0| 0| 32| 0.00| 0.00| 0.00| 100.2
...
0| 8| 8| 0.00| 0.00| 0.00| 99.82
0| 8| 40| 0.00| 0.00| 0.00| 99.81
0| 9| 24| 0.00| 0.00| 0.00| 100.3
0| 9| 56| 0.00| 0.00| 0.00| 100.2
0| 16| 4| 0.00| 0.00| 0.00| 99.75
0| 16| 36| 0.00| 0.00| 0.00| 99.38
...
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It's been broken forever (i.e. it's not scheduling in a power
aware fashion), as reported by Suresh and others sending
patches, and nobody cares enough to fix it properly ...
so remove it to make space free for something better.
There's various problems with the code as it stands today, first
and foremost the user interface which is bound to topology
levels and has multiple values per level. This results in a
state explosion which the administrator or distro needs to
master and almost nobody does.
Furthermore large configuration state spaces aren't good, it
means the thing doesn't just work right because it's either
under so many impossibe to meet constraints, or even if
there's an achievable state workloads have to be aware of
it precisely and can never meet it for dynamic workloads.
So pushing this kind of decision to user-space was a bad idea
even with a single knob - it's exponentially worse with knobs
on every node of the topology.
There is a proposal to replace the user interface with a single
3 state knob:
sched_balance_policy := { performance, power, auto }
where 'auto' would be the preferred default which looks at things
like Battery/AC mode and possible cpufreq state or whatever the hw
exposes to show us power use expectations - but there's been no
progress on it in the past many months.
Aside from that, the actual implementation of the various knobs
is known to be broken. There have been sporadic attempts at
fixing things but these always stop short of reaching a mergable
state.
Therefore this wholesale removal with the hopes of spurring
people who care to come forward once again and work on a
coherent replacement.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
cpupower-frequency-* manpages slightly differed from the others.
- Use uppercase letters in the title
- Show cpupower Manual in the header
- Remove Mattia from left down corner of the manpage, he is already
listed as author
- Remove --help, prints this message -> not needed
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
The last missing manpage for cpupower tools.
More info about other architecture's sleep state specialities would be great.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
The name of the monitor is updated at runtime to the name of the
CPU type.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Andreas Herrmann <herrmann.der.user@googlemail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Instead of printing something non-formatted to stdout, call
man(1) to show the man page for the proper subcommand.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
CPU power consumption vs performance tuning is no longer
limited to CPU frequency switching anymore: deep sleep states,
traditional dynamic frequency scaling and hidden turbo/boost
frequencies are tied close together and depend on each other.
The first two exist on different architectures like PPC, Itanium and
ARM, the latter (so far) only on X86. On X86 the APU (CPU+GPU) will
only run most efficiently if CPU and GPU has proper power management
in place.
Users and Developers want to have *one* tool to get an overview what
their system supports and to monitor and debug CPU power management
in detail. The tool should compile and work on as many architectures
as possible.
Once this tool stabilizes a bit, it is intended to replace the
Intel-specific tools in tools/power/x86
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>