Commit Graph

48 Commits

Author SHA1 Message Date
Lina Iyer f49735f497 cpuidle: record state entry rejection statistics
CPUs may fail to enter the chosen idle state if there was a
pending interrupt, causing the cpuidle driver to return an error
value.

Record that and export it via sysfs along with the other idle state
statistics.

This could prove useful in understanding the behavior of the governor
and the system during use cases that involve multiple CPUs.

Signed-off-by: Lina Iyer <ilina@codeaurora.org>
[ rjw: Changelog and documentation edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-23 14:10:31 +02:00
Qiushi Wu c343bf1ba5 cpuidle: Fix three reference count leaks
kobject_init_and_add() takes a reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object.

Previous commit "b8eb718348b8" fixed a similar problem.

Signed-off-by: Qiushi Wu <wu000273@umn.edu>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-29 18:07:18 +02:00
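
Note on c343bf1ba5: the cleanup rule described in that commit message can be illustrated with a user-space toy model. This is only a sketch under stated assumptions: `toy_obj`, `toy_init_and_add()`, `toy_put()` and `toy_register()` are hypothetical stand-ins and not kernel APIs, and real kobject reference counting is more involved.

```c
/* Toy object with a reference count and a flag set when it is released. */
struct toy_obj {
	int refs;
	int released;
};

/*
 * Toy stand-in for an "init and add" helper: it takes a reference
 * even when it reports failure, which is the behavior the commit
 * message describes.
 */
static int toy_init_and_add(struct toy_obj *o, int fail)
{
	o->refs = 1;
	o->released = 0;
	return fail ? -1 : 0;
}

/* Drop one reference; release the object when the count reaches zero. */
static void toy_put(struct toy_obj *o)
{
	if (--o->refs == 0)
		o->released = 1;
}

/* Correct caller: on error, drop the reference taken by the helper. */
static int toy_register(struct toy_obj *o, int fail)
{
	int ret = toy_init_and_add(o, fail);

	if (ret) {
		toy_put(o);	/* omitting this call is the leak */
		return ret;
	}
	return 0;
}
```

In this model, a failed `toy_register()` leaves `released` set to 1, while a successful one leaves the object alive with one reference.
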
Hanjun Guo cce55cc902 cpuidle: sysfs: Remove sysfs_switch and switch attributes
Since the cpuidle governor can be switched via sysfs by default,
remove sysfs_switch and cpuidle_switch_attrs.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19 17:41:17 +02:00
Hanjun Guo b52e93e4e8 cpuidle: Make cpuidle governor switchable to be the default behaviour
For now, the cpuidle governor can be switched via sysfs only when the
boot option "cpuidle_sysfs_switch" is passed, but it's important
to be able to switch the governor to adapt to different workloads,
especially after the TEO and haltpoll governors were introduced.

Add available_governors and current_governor into the default
attributes, but keep current_governor_ro for compatibility.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19 17:41:17 +02:00
Hanjun Guo ef7e7d65eb cpuidle: sysfs: Accept governor name with 15 characters
CPUIDLE_NAME_LEN is 16, so it's possible to accept governor name
with 15 characters, but now store_current_governor() rejects
governor name with 15 characters as it returns -EINVAL if count
equals CPUIDLE_NAME_LEN.

Refactor the code to accept such a case and simplify it.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19 17:41:17 +02:00
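
Note on ef7e7d65eb: the boundary condition in that commit message can be reproduced in user space. The following is a minimal sketch, not the kernel code. `old_accepts()` and `parse_governor_name()` are hypothetical helpers, and the sketch assumes the sysfs write carries a trailing newline (as `echo` produces), so a 15-character name arrives as 16 bytes.

```c
#include <stddef.h>
#include <string.h>

#define CPUIDLE_NAME_LEN 16	/* value stated in the commit message */

/* Hypothetical model of the old check: reject once count reaches the buffer size. */
static int old_accepts(size_t count)
{
	return count != 0 && count < CPUIDLE_NAME_LEN;
}

/*
 * Hypothetical relaxed parser: strip one trailing newline first, then
 * accept any name of at most CPUIDLE_NAME_LEN - 1 characters.
 * Returns 0 on success and -1 on rejection.
 */
static int parse_governor_name(const char *buf, size_t count,
			       char out[CPUIDLE_NAME_LEN])
{
	if (count > 0 && buf[count - 1] == '\n')
		count--;
	if (count == 0 || count >= CPUIDLE_NAME_LEN)
		return -1;
	memcpy(out, buf, count);
	out[count] = '\0';
	return 0;
}
```

With a 15-character name followed by a newline, `old_accepts()` says no while `parse_governor_name()` succeeds; a 16-character name is still rejected because it cannot fit with its terminating NUL.
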
Hanjun Guo 3f9f8daad3 cpuidle: sysfs: Fix the overlap for showing available governors
When showing the available governors, it's "%s " in scnprintf(),
not "%s", so if the governor name has 15 characters, it will
overlap with the next one. Fix it by adding one more to the
size.

While we are at it, fix the minor coding style issue and remove
the "/sizeof(char)" since sizeof(char) always equals 1.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-19 17:41:17 +02:00
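
Note on 3f9f8daad3: the overlap is easy to see in user space. This is a sketch under stated assumptions: `append_name()` and `list_governors()` are hypothetical helpers, and standard `snprintf()` stands in for the kernel's `scnprintf()`, with its return value clamped to the number of characters actually stored.

```c
#include <stddef.h>
#include <stdio.h>

#define CPUIDLE_NAME_LEN 16

/*
 * Append "name " to buf using at most `slot` bytes including the NUL.
 * Returns the number of characters actually stored.
 */
static size_t append_name(char *buf, size_t slot, const char *name)
{
	int n = snprintf(buf, slot, "%s ", name);

	if (n < 0)
		return 0;
	return (size_t)n < slot ? (size_t)n : slot - 1;
}

/* Build a space-separated list; `out` must be large enough for all names. */
static void list_governors(char *out, size_t slot,
			   const char *const names[], size_t count)
{
	size_t pos = 0;
	size_t k;

	out[0] = '\0';
	for (k = 0; k < count; k++)
		pos += append_name(out + pos, slot, names[k]);
}
```

With a slot of `CPUIDLE_NAME_LEN` bytes, a 15-character name leaves no room for the separating space, so the next name follows it directly; a slot of `CPUIDLE_NAME_LEN + 1` keeps the space.
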
Hanjun Guo eba933ceeb cpuidle: sysfs: Minor coding style corrections
Fix two minor coding style issues.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-29 13:33:24 +02:00
Hanjun Guo 2f516e7cbe cpuidle: sysfs: Remove the unused define_one_r(o/w) macros
The define_one_ro and define_one_rw macros are not used,
so remove them.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-04-29 13:33:24 +02:00
Rafael J. Wysocki e6cf623ba3 Merge branch 'intel_idle+acpi'
Merge changes updating the ACPI processor driver in order to export
acpi_processor_evaluate_cst() to the code outside of it and adding
ACPI support to the intel_idle driver based on that.

* intel_idle+acpi:
  Documentation: admin-guide: PM: Add intel_idle document
  intel_idle: Use ACPI _CST on server systems
  intel_idle: Add module parameter to prevent ACPI _CST from being used
  intel_idle: Allow ACPI _CST to be used for selected known processors
  cpuidle: Allow idle states to be disabled by default
  intel_idle: Use ACPI _CST for processor models without C-state tables
  intel_idle: Refactor intel_idle_cpuidle_driver_init()
  ACPI: processor: Export acpi_processor_evaluate_cst()
  ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR
  ACPI: processor: Clean up acpi_processor_evaluate_cst()
  ACPI: processor: Introduce acpi_processor_evaluate_cst()
  ACPI: processor: Export function to claim _CST control
2020-01-23 00:35:50 +01:00
Benjamin Gaignard a09da3fbc1 cpuidle: sysfs: fix warnings when compiling with W=1
Fix kernel documentation comments to remove warnings when
compiling with W=1.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-01-23 00:31:31 +01:00
Rafael J. Wysocki 75a8026741 cpuidle: Allow idle states to be disabled by default
In certain situations it may be useful to prevent some idle states
from being used by default while allowing user space to enable them
later on.

For this purpose, introduce a new state flag, CPUIDLE_FLAG_OFF, to
mark idle states that should be disabled by default, make the core
set CPUIDLE_STATE_DISABLED_BY_USER for those states at the
initialization time and add a new state attribute in sysfs,
"default_status", to inform user space of the initial status of
the given idle state ("disabled" if CPUIDLE_FLAG_OFF is set for it,
"enabled" otherwise).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-12-27 11:02:08 +01:00
Rafael J. Wysocki c1d51f684c cpuidle: Use nanoseconds as the unit of time
Currently, the cpuidle subsystem uses microseconds as the unit of
time which (among other things) causes the idle loop to incur some
integer division overhead for no clear benefit.

In order to allow cpuidle to measure time in nanoseconds, add two
new fields, exit_latency_ns and target_residency_ns, to represent the
exit latency and target residency of an idle state in nanoseconds,
respectively, to struct cpuidle_state and initialize them with the
help of the corresponding values in microseconds provided by drivers.
Additionally, change cpuidle_governor_latency_req() to return the
idle state exit latency constraint in nanoseconds.

Also measure idle state residency (last_residency_ns in struct
cpuidle_device and time_ns in struct cpuidle_driver) in nanoseconds
and update the cpuidle core and governors accordingly.

However, the menu governor still computes typical intervals in
microseconds to avoid integer overflows.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
2019-11-11 21:56:07 +01:00
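
Note on c1d51f684c: the initialization step described in that commit message amounts to a widening multiplication. The sketch below is illustrative only. `idle_state_sketch` and `fill_ns_fields()` are hypothetical names, and the microsecond field names `exit_latency` and `target_residency` are assumptions; only `exit_latency_ns` and `target_residency_ns` are named in the message.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000

/* Hypothetical cut-down idle state with both units side by side. */
struct idle_state_sketch {
	unsigned int exit_latency;	/* microseconds, provided by the driver */
	unsigned int target_residency;	/* microseconds, provided by the driver */
	int64_t exit_latency_ns;
	int64_t target_residency_ns;
};

/*
 * Derive the nanosecond fields once at initialization so that later
 * comparisons against measured times need no division. The cast
 * widens before multiplying to avoid 32-bit overflow.
 */
static void fill_ns_fields(struct idle_state_sketch *s)
{
	s->exit_latency_ns = (int64_t)s->exit_latency * NSEC_PER_USEC;
	s->target_residency_ns = (int64_t)s->target_residency * NSEC_PER_USEC;
}
```
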
Rafael J. Wysocki 99e98d3fb1 cpuidle: Consolidate disabled state checks
There are two reasons why CPU idle states may be disabled: either
because the driver has disabled them or because they have been
disabled by user space via sysfs.

In the former case, the state's "disabled" flag is set once during
the initialization of the driver and it is never cleared later (it
is read-only effectively).  In the latter case, the "disable" field
of the given state's cpuidle_state_usage struct is set and it may be
changed via sysfs.  Thus checking whether or not an idle state has
been disabled involves reading these two flags every time.

In order to avoid the additional check of the state's "disabled" flag
(which is effectively read-only anyway), use the value of it at the
init time to set a (new) flag in the "disable" field of that state's
cpuidle_state_usage structure and use the sysfs interface to
manipulate another (new) flag in it.  This way the state is disabled
whenever the "disable" field of its cpuidle_state_usage structure is
nonzero, whatever the reason, and it is the only place to look into
to check whether or not the state has been disabled.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2019-11-06 13:19:56 +01:00
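
Note on 99e98d3fb1: the consolidated check can be sketched with two bits in a single field. This is a user-space illustration under stated assumptions: the macro names, bit positions, struct and function names below are all hypothetical, chosen only to mirror the description in the commit message.

```c
#include <stdbool.h>

#define SKETCH_DISABLED_BY_USER		(1ULL << 0)	/* hypothetical bit */
#define SKETCH_DISABLED_BY_DRIVER	(1ULL << 1)	/* hypothetical bit */

/* Hypothetical per-CPU usage record holding the combined flags. */
struct state_usage_sketch {
	unsigned long long disable;
};

/* At init time, copy the driver's read-only decision into the field. */
static void usage_init(struct state_usage_sketch *u, bool driver_disabled)
{
	u->disable = driver_disabled ? SKETCH_DISABLED_BY_DRIVER : 0;
}

/* The sysfs store path only ever touches the user bit. */
static void usage_store_disable(struct state_usage_sketch *u, bool value)
{
	if (value)
		u->disable |= SKETCH_DISABLED_BY_USER;
	else
		u->disable &= ~SKETCH_DISABLED_BY_USER;
}

/* Single place to look: nonzero means disabled, whatever the reason. */
static bool state_disabled(const struct state_usage_sketch *u)
{
	return u->disable != 0;
}
```

A state disabled by the driver stays disabled even after user space writes 0, because the user path cannot clear the driver bit.
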
Marcelo Tosatti 259231a045 cpuidle: add poll_limit_ns to cpuidle_device structure
Add a poll_limit_ns variable to cpuidle_device structure.

Calculate and configure it in the new cpuidle_poll_time
function if it is zero.

Individual governors are allowed to override this value.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-07-30 17:27:37 +02:00
Rafael J. Wysocki 04dab58a39 cpuidle: Add 'above' and 'below' idle state metrics
Add two new metrics for CPU idle states, "above" and "below", to count
the number of times the given state had been asked for (or entered
from the kernel's perspective), but the observed idle duration turned
out to be too short or too long for it (respectively).

These metrics help to estimate the quality of the CPU idle governor
in use.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-12-12 23:22:18 +01:00
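
Note on 04dab58a39: one way to picture the two counters is the accounting step below. It is a simplified sketch, not the kernel implementation. The struct and function names are hypothetical, disabled states are ignored, and the rule used for "too long" (the measured duration would also have satisfied the next deeper state's target residency) is an assumption made for illustration.

```c
/* Hypothetical idle state with its threshold and the two counters. */
struct metric_state_sketch {
	long long target_residency_ns;
	unsigned long long above;	/* idle duration was too short */
	unsigned long long below;	/* idle duration was too long */
};

/*
 * Account one idle period. States are ordered from shallowest
 * (index 0) to deepest (index count - 1).
 */
static void account_idle(struct metric_state_sketch *states, int count,
			 int entered, long long measured_ns)
{
	if (measured_ns < states[entered].target_residency_ns) {
		/* A shallower state would have been the better choice. */
		states[entered].above++;
	} else if (entered + 1 < count &&
		   measured_ns >= states[entered + 1].target_residency_ns) {
		/* A deeper state would have been the better choice. */
		states[entered].below++;
	}
}
```

A governor that picks well keeps both counters low relative to the total number of entries for each state.
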
Rafael J. Wysocki 64bdff6980 PM: cpuidle/suspend: Add s2idle usage and time state attributes
Add a new attribute group called "s2idle" under the sysfs directory
of each cpuidle state that supports the ->enter_s2idle callback
and put two new attributes, "usage" and "time", into that group to
represent the number of times the given state was requested for
suspend-to-idle and the total time spent in suspend-to-idle after
requesting that state, respectively.

That will allow diagnostic information related to suspend-to-idle
to be collected without enabling advanced debug features and
analyzing dmesg output.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-29 13:06:08 +02:00
Vaidyanathan Srinivasan ad0a45fd9c cpuidle: Validate cpu_dev in cpuidle_add_sysfs()
If a given cpu is not in cpu_present and cpu hotplug
is disabled, arch can skip setting up the cpu_dev.

The arch cpuidle driver should pass the correct cpu mask
for registration, but if the driver fails to do so, the
error propagates and causes a crash like this:

[   30.076045] Unable to handle kernel paging request for data at address 0x00000048
[   30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
    pc: c0000000007b2f30: kobject_get+0x20/0x70
    lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
    sp: c000003feb18b8f0
   msr: 9000000000009033
   dar: 48
 dsisr: 40000000
  current = 0xc000003fd2ed8300
  paca    = 0xc00000000fbab500   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0
20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[   30.163506] Failed to register cpuidle device for cpu136
[   30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-21 22:26:37 +01:00
Pan Bian 8f6040cebd cpuidle: fix improper return value on error
In function cpuidle_add_state_sysfs(), variable ret takes the return
value. Its value should be negative on errors. Because ret is reset in
the loop, its value will be 0 during the second and later iterations of
the loop. If kzalloc() returns a NULL pointer then, the function will
return 0. It may be better to explicitly assign "-ENOMEM" when the call
to kzalloc() fails.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=188901
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-06 02:24:14 +01:00
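
Note on 8f6040cebd: the stale return value pattern can be sketched in user space. The code below is an illustration under stated assumptions, not the kernel function. `add_slots()`, `setup_slot()`, `alloc_fn` and `flaky_alloc()` are hypothetical, and `malloc()` stands in for kzalloc().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);

/* Hypothetical per-slot setup step; it always succeeds and returns 0. */
static int setup_slot(void *slot)
{
	(void)slot;
	return 0;
}

/*
 * Allocate and set up `count` slots. From the second iteration on,
 * `ret` holds 0 from the previous setup_slot() call, so the error
 * code must be assigned explicitly when the allocation fails.
 */
static int add_slots(void **slots, int count, alloc_fn alloc)
{
	int i, ret = -ENOMEM;

	for (i = 0; i < count; i++) {
		slots[i] = alloc(32);
		if (!slots[i]) {
			ret = -ENOMEM;	/* without this, a stale 0 is returned */
			goto undo;
		}
		ret = setup_slot(slots[i]);
		if (ret) {
			free(slots[i]);
			goto undo;
		}
	}
	return 0;

undo:
	/* Free the slots set up by earlier iterations. */
	while (i-- > 0)
		free(slots[i]);
	return ret;
}

/* Hypothetical allocator for demonstration: fails on the second call only. */
static int alloc_calls;

static void *flaky_alloc(size_t size)
{
	return ++alloc_calls == 2 ? NULL : malloc(size);
}
```

Calling `add_slots()` with `flaky_alloc` and `alloc_calls` reset to 0 fails in the second iteration and yields `-ENOMEM` rather than 0.
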
Bartlomiej Zolnierkiewicz d75e4af14e cpuidle: remove state_count field from struct cpuidle_device
Thomas Schlichter reports the following issue on his Samsung NC20:

"The C-states C1 and C2 to the OS when connected to AC, and additionally
 provides the C3 C-state when disconnected from AC.  However, the number
 of C-states shown in sysfs is fixed to the number of C-states present
 at boot.
   If I boot with AC connected, I always only see the C-states up to C2
   even if I disconnect AC.

   The reason is commit 130a5f6924 (ACPI / cpuidle: remove dev->state_count
   setting).  It removes the update of dev->state_count, but sysfs uses
   exactly this variable to show the C-states.

   The fix is to use drv->state_count in sysfs.  As this is currently the
   last user of dev->state_count, this variable can be completely removed."

Remove dev->state_count as per the above.

Reported-by: Thomas Schlichter <thomas.schlichter@web.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-03 13:15:50 +02:00
Mohammad Merajul Islam Molla 4f8eea9b9f cpuidle: fix permission for driver name sysfs node
cpuidle driver name sysfs node is read-only, so permissions should be 0444.

Signed-off-by: Mohammad Merajul Islam Molla <meraj.enigma@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-07-19 21:43:28 +02:00
Daniel Lezcano 9bc0482fea cpuidle: sysfs: Export target residency information
From user space, there is no way to know the target residency for each idle
state. If we want to write tools to measure the accuracy of the idle state
selection from the governor, we need this info.

As the exit latency is exported through sysfs, exporting the target residency
in the same place makes sense.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-08 12:37:05 +02:00
Viresh Kumar 1f6b9f74ee cpuidle: use drv instead of cpuidle_driver in show_current_driver()
Instances of "struct cpuidle_driver *" are consistently named as "drv"
in the cpuidle core except in show_current_driver().

Make that function use variable naming consistent with the rest of the
code.

[rjw: Changelog]
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-30 01:21:23 +01:00
Viresh Kumar 0d09d31256 cpuidle: call cpuidle_get_driver() from after taking cpuidle_driver_lock
There are a few cpuidle_get_driver() calls that aren't made under
cpuidle_driver_lock which is incorrect.

Fix them by calling cpuidle_get_driver() after taking cpuidle_driver_lock.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-10-30 01:21:23 +01:00
Daniel Lezcano 728ce22b69 cpuidle: Make cpuidle's sysfs directory dynamically allocated
The cpuidle sysfs code is designed to have a single instance of per
CPU cpuidle directory.  It is not possible to remove the sysfs entry
and create it again.  This is not a problem with the current code but
future changes will add CPU hotplug support to enable/disable the
device, so it will need to remove the sysfs entry like other
subsystems do.  That won't be possible without this change, because
the kobj is a static object which can't be reused for
kobj_init_and_add().

Add cpuidle_device_kobj to be allocated dynamically when
adding/removing a sysfs entry which is consistent with the other
cpuidle's sysfs entries.

An added benefit is that the sysfs code is now more self-contained
and the includes needed for sysfs can be moved from cpuidle.h
directly into sysfs.c so as to reduce the total number of headers
dragged along with cpuidle.h.

[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-15 02:09:47 +02:00
Daniel Lezcano f89ae89e27 cpuidle: Fix white space to follow CodingStyle
Fix white space in the cpuidle code to follow the rules described in
CodingStyle.

No changes in behavior should result from this.

[rjw: Changelog]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-15 02:09:47 +02:00
Krzysztof Mazur 392370e7aa cpuidle: fix number of initialized/destroyed states
Commit bf4d1b5ddb (cpuidle: support
multiple drivers) changed the number of initialized state kobjects
in cpuidle_add_state_sysfs() from device->state_count to
drv->state_count, but left device->state_count in
cpuidle_remove_state_sysfs().  The values of these two fields may be
different, in which case a NULL pointer dereference may happen in
cpuidle_remove_state_sysfs(), for example.  Fix this problem by making
cpuidle_add_state_sysfs() use device->state_count too (which restores
the original behavior of it).

[rjw: Changelog]
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-01-11 23:20:09 +01:00
Daniel Lezcano bf4d1b5ddb cpuidle: support multiple drivers
With the new tegra3 and big.LITTLE [1] architectures, several cpus
with different characteristics (latencies and states) can coexist on the
system.

The cpuidle framework has the limitation of handling only identical cpus.

This patch removes this limitation by introducing the multiple driver support
for cpuidle.

This option is configurable at compile time and should be enabled for the
architectures mentioned above, so there is no impact on the other platforms
if the option is disabled. The option defaults to 'n'. Note that the multiple
driver support is also compatible with the existing drivers: even if just one
driver is needed, all the cpus will be tied to this driver using an extra small
chunk of processor memory.

The multiple driver support uses a per-cpu driver pointer instead of a global
variable, and accesses to this variable are made from a cpu context.

In order to keep the compatibility with the existing drivers, the function
'cpuidle_register_driver' and 'cpuidle_unregister_driver' will register
the specified driver for all the cpus.

The semantic for the output of /sys/devices/system/cpu/cpuidle/current_driver
remains the same except the driver name will be related to the current cpu.

The /sys/devices/system/cpu/cpu[0-9]/cpuidle/driver/name files are added,
allowing the per-cpu driver name to be read.

[1] http://lwn.net/Articles/481055/

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:34:23 +01:00
Daniel Lezcano 8f3e9953e1 cpuidle: fixup device.h header in cpuidle.h
The "struct device" is only used in sysfs.c.

The other .c files including the private header "cpuidle.h"
do not need to pull the entire headers tree from there as they
don't manipulate the "struct device".

This patch fixes this by moving the header inclusion to sysfs.c
and adding a forward declaration for the struct device.

The number of lines generated by the preprocessor:
Without this patch : 17269 loc
With this patch : 16446 loc

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:34:21 +01:00
Daniel Lezcano 349631e0e4 cpuidle / sysfs: move structure declaration into the sysfs.c file
The structure cpuidle_state_kobj is not used anywhere except
in the sysfs.c file. The definition of this structure is not
needed in the cpuidle header file. This patch moves it to the
sysfs.c file in order to encapsulate the code a bit more.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:34:21 +01:00
Daniel Lezcano e45a00d679 cpuidle / sysfs: move kobj initialization in the sysfs file
Move the kobj initialization and completion into sysfs.c
to encapsulate the code more.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:34:19 +01:00
Daniel Lezcano 1aef40e288 cpuidle / sysfs: change function parameter
The function needs the cpuidle_device which is initially passed to the
caller.

The current code gets the struct device from the struct cpuidle_device
and passes it to the cpuidle_add_sysfs function. This function calls
per_cpu(cpuidle_devices, cpu) to get the cpuidle_device.

This patch passes the cpuidle_device instead and simplifies the code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:34:19 +01:00
ShuoX Liu dc7fd275ae cpuidle: move field disable from per-driver to per-cpu
Andrew J.Schorr raised a question: when he changes the disable setting on
a single CPU, it affects all the other CPUs.  Currently, the disable field
is per-driver instead of per-cpu, and all the C states of the same driver
are shared by all CPUs in the same machine.

The patch changes the `disable' field to per-cpu, so we could set this
separately for each cpu.

Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reported-by: Andrew J.Schorr <aschorr@telemetry-investments.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-03 19:05:31 +02:00
ShuoX Liu 3a53396b03 cpuidle: add a sysfs entry to disable specific C state for debug purpose.
Some C states of a new CPU might not work well.  One reason is that the BIOS
might configure them incorrectly.  To help developers find the root cause
quickly, the patch adds a new sysfs entry, so developers can disable a
specific C state manually.

In addition, C state might have much impact on performance tuning, as it
takes much time to enter/exit C states, which might delay interrupt
processing.  With the new debug option, developers could check if a deep C
state could impact performance and how much impact it could cause.

Also add this option in Documentation/cpuidle/sysfs.txt.

[akpm@linux-foundation.org: check kstrtol return value]
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Reviewed-and-Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:52:58 -04:00
Kay Sievers 8a25a2fd12 cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:29:42 -08:00
Deepthi Dharwar 46bcfad7a8 cpuidle: Single/Global registration of idle states
This patch makes the cpuidle_states structure global (single copy)
instead of per-cpu. The statistics needed on per-cpu basis
by the governor are kept per-cpu. This simplifies the cpuidle
subsystem as state registration is done by single cpu only.
Having single copy of cpuidle_states saves memory. Rare case
of asymmetric C-states can be handled within the cpuidle driver
and architectures such as POWER do not have asymmetric C-states.

Having single/global registration of all the idle states,
dynamic C-state transitions on x86 are handled by
the boot cpu. Here, the boot cpu  would disable all the devices,
re-populate the states and later enable all the devices,
irrespective of the cpu that would receive the notification first.

Reference:
https://lkml.org/lkml/2011/4/25/83

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:58 -05:00
Deepthi Dharwar 4202735e8a cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
This is the first step towards global registration of cpuidle
states. The statistics used primarily by the governor are per-cpu
and have to be split from rest of the fields inside cpuidle_state,
which would be made global i.e. single copy. The driver_data field
is also per-cpu and moved.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:49 -05:00
Jesper Juhl 42b16b3fbb Kill off warning: ‘inline’ is not at beginning of declaration
Fix a bunch of
	warning: ‘inline’ is not at beginning of declaration
messages when building a 'make allyesconfig' kernel with -Wextra.

These warnings are trivial to kill, yet rather annoying when building with
-Wextra.
The more we can cut down on pointless crap like this the better (IMHO).

A previous patch to do this for a 'allnoconfig' build has already been
merged. This just takes the cleanup a little further.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-01-19 15:43:08 +01:00
Len Brown 752138df0d cpuidle: make cpuidle_curr_driver static
cpuidle_register_driver() sets cpuidle_curr_driver
cpuidle_unregister_driver() clears cpuidle_curr_driver

We shouldn't expose cpuidle_curr_driver to
potential modification except via these interfaces.
So make it static and create cpuidle_get_driver() to observe it.

Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-27 21:06:58 -04:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Emese Revfy 52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing
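
As a rough userspace illustration of the pattern (the `demo_*` names are
hypothetical stand-ins, not the kernel's real definitions), an ops table
declared const can still be called through, but any attempt to assign to
one of its members is rejected at compile time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for struct sysfs_ops (hypothetical, not the kernel type). */
struct demo_sysfs_ops {
	int (*show)(const char *name, char *buf, size_t len);
};

int demo_show(const char *name, char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", name);
}

/* const lets the toolchain place the table in read-only data. */
static const struct demo_sysfs_ops demo_ops = {
	.show = demo_show,
};

/* Simplified stand-in for struct kobj_type: it only holds a pointer to const ops. */
struct demo_ktype {
	const struct demo_sysfs_ops *sysfs_ops;
};

static const struct demo_ktype demo_kt = { .sysfs_ops = &demo_ops };

int demo_read(const struct demo_ktype *kt, const char *name,
	      char *buf, size_t len)
{
	assert(kt && kt->sysfs_ops && kt->sysfs_ops->show);
	/* Reading and calling through the const table is fine;
	 * "kt->sysfs_ops->show = NULL;" would not compile. */
	return kt->sysfs_ops->show(name, buf, len);
}
```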

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Andi Kleen c9be0a36f9 sysdev: Pass attribute in sysdev_class attributes show/store
Passing the attribute to the low level IO functions allows all kinds
of cleanups, by sharing low level IO code without requiring
a separate function for every piece of data.

Also drivers can extend the attributes with their own data fields
and use them in the low level function.

Similar to sysdev_attributes and normal attributes.
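
A minimal userspace sketch of the idea, assuming a hypothetical
`demo_attr` type rather than the real sysdev structures: because the
show function receives the attribute, one function can serve many
attributes by recovering the enclosing, extended structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local equivalent of the kernel's container_of(), for illustration only. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical attribute whose show function is passed the attribute itself. */
struct demo_attr {
	const char *name;
	int (*show)(struct demo_attr *attr, char *buf, size_t len);
};

/* A driver extends the attribute with its own data field. */
struct demo_ext_attr {
	struct demo_attr attr;
	int *value;
};

/* One shared show function for every integer-backed attribute. */
int demo_show_int(struct demo_attr *attr, char *buf, size_t len)
{
	struct demo_ext_attr *ea =
		demo_container_of(attr, struct demo_ext_attr, attr);

	assert(ea->value != NULL);
	return snprintf(buf, len, "%d\n", *ea->value);
}
```

Two attributes backed by different integers can then both point at
demo_show_int() instead of each needing a generated function.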

This is a tree-wide sweep, converting everything in one go.

No functional changes in this patch other than passing the new
argument everywhere.

Tested on x86, the non x86 parts are uncompiled.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:47 -08:00
Rabin Vincent 66198f36aa cpuidle: make sysfs attributes sysdev class attributes
These attributes are really sysdev class attributes.  The incorrect
definition leads to an oops because of recent changes which make sysdev
attributes use a different prototype.

Based on Andi's f718cd4add ("sched: make
scheduler sysfs attributes sysdev class devices")

Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-12 16:07:28 -07:00
Andi Kleen 4a0b2b4dbe sysdev: Pass the attribute to the low level sysdev show/store function
This allows attributes to be generated dynamically and show/store
functions to be shared between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.

I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.

I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.

Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:55:02 -07:00
Yi Yang 8b78cf602f cpuidle: fix cpuidle time and usage overflow
The cpuidle C-state sysfs nodes time and usage overflow very easily because
both are of type unsigned int. time overflows within about two hours; usage
takes longer to overflow, but both keep increasing forever.

This patch converts them to unsigned long long.
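
To illustrate the arithmetic, the sketch below assumes the time counter
is kept in microseconds (an assumption; the message does not state the
unit). A 32-bit counter then wraps after 2^32 us, roughly 4295 seconds
or about 71.6 minutes of accumulated idle time, which a mostly idle CPU
can reach within a couple of hours. The `demo_*` helpers are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Accumulate idle time in a 32-bit counter: wraps modulo 2^32. */
uint32_t demo_accumulate32(uint32_t total_us, uint32_t delta_us)
{
	return total_us + delta_us;
}

/* Accumulate idle time in a 64-bit counter: no practical wrap. */
uint64_t demo_accumulate64(uint64_t total_us, uint32_t delta_us)
{
	return total_us + delta_us;
}

/* Whole seconds of accumulated time before a 32-bit us counter wraps. */
uint64_t demo_wrap_seconds32(void)
{
	return (((uint64_t)UINT32_MAX) + 1) / 1000000u;
}

/* Self-check: one more microsecond past the maximum wraps to zero. */
void demo_check_wrap(void)
{
	assert(demo_accumulate32(UINT32_MAX, 1) == 0);
}
```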

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26 00:45:26 -04:00
Venkatesh Pallipadi 4fcb2fcd4d ACPI, cpuidle: Clarify C-state description in sysfs
Add a new sysfs entry under cpuidle states. desc - can be used by driver to
communicate to userspace any specific information about the state.
This helps in identifying the exact hardware C-states behind the ACPI C-state
definition.

Idea is to export this through powertop, which will help to map the C-state
reported by powertop to actual hardware C-state.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-14 00:09:55 -05:00
Greg Kroah-Hartman c10997f657 Kobject: convert drivers/* from kobject_unregister() to kobject_put()
There is no need for kobject_unregister() anymore, thanks to Kay's
kobject cleanup changes, so replace all instances of it with
kobject_put().


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman 94f57f3368 Kobject: change drivers/cpuidle/sysfs.c to use kobject_init_and_add
Stop using kobject_register, as this way we can control the sending of
the uevent properly, after everything is properly initialized.

Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:28 -08:00
Len Brown 4f86d3a8e2 cpuidle: consolidate 2.6.22 cpuidle branch into one patch
commit e5a16b1f9eec0af7cfa0830304b41c1c0833cf9f
Author: Len Brown <len.brown@intel.com>
Date:   Tue Oct 2 23:44:44 2007 -0400

    cpuidle: shrink diff

    processor_idle.c |  440 +++++++++++++++++++++++++++++++++++++++++--
    1 file changed, 429 insertions(+), 11 deletions(-)

    Signed-off-by: Len Brown <len.brown@intel.com>

commit dfbb9d5aedfb18848a3e0d6f6e3e4969febb209c
Author: Len Brown <len.brown@intel.com>
Date:   Wed Sep 26 02:17:55 2007 -0400

    cpuidle: reduce diff size

    Reduces the cpuidle processor_idle.c diff vs 2.6.22 from this
     processor_idle.c | 2006 ++++++++++++++++++++++++++-----------------
     1 file changed, 1219 insertions(+), 787 deletions(-)

    to this:
     processor_idle.c |  502 +++++++++++++++++++++++++++++++++++++++----
     1 file changed, 458 insertions(+), 44 deletions(-)

    ...for the purpose of making the cpuidle patch less invasive
    and easier to review.

    no functional changes.  build tested only.

    Signed-off-by: Len Brown <len.brown@intel.com>

commit 889172fc915f5a7fe20f35b133cbd205ce69bf6c
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Sep 13 13:40:05 2007 -0700

    cpuidle: Retain old ACPI policy for !CONFIG_CPU_IDLE

    Retain the old policy in processor_idle, so that when CPU_IDLE is not
    configured, old C-state policy will still be used. This provides a
    clean gradual migration path from old ACPI policy to new cpuidle
    based policy.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9544a8181edc7ecc33b3bfd69271571f98ed08bc
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Sep 13 13:39:17 2007 -0700

    cpuidle: Configure governors by default

    Quoting Len "Do not give an option to users to shoot themselves in the foot".

    Remove the configurability of ladder and menu governors as they are
    needed for default policy of cpuidle. That way users will not be able to
    have cpuidle without any policy, losing all C-state power savings.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8975059a2c1e56cfe83d1bcf031bcf4cb39be743
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:27:07 2007 -0400

    CPUIDLE: load ACPI properly when CPUIDLE is disabled

    Change the registration return codes for when CPUIDLE
    support is not compiled into the kernel.  As a result, the ACPI
    processor driver will load properly even if CPUIDLE is unavailable.
    However, it may be possible to clean up the ACPI processor driver further
    and eliminate some dead code paths.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e0322e2b58dd1b12ec669bf84693efe0dc2414a8
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:26:06 2007 -0400

    CPUIDLE: remove cpuidle_get_bm_activity()

    Remove cpuidle_get_bm_activity() and update governors
    accordingly.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 18a6e770d5c82ba26653e53d240caa617e09e9ab
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:58 2007 -0400

    CPUIDLE: max_cstate fix

    Currently max_cstate is limited to 0, resulting in no idle processor
    power management on ACPI platforms.  This patch restores the value to
    the array size.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1fdc0887286179b40ce24bcdbde663172e205ef0
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:40 2007 -0400

    CPUIDLE: handle BM detection inside the ACPI Processor driver

    Update the ACPI processor driver to detect BM activity and
    limit state entry depth internally, rather than exposing such
    requirements to CPUIDLE.  As a result, CPUIDLE can drop this
    ACPI-specific interface and become more platform independent.  BM
    activity is now handled much more aggressively than it was in the
    original implementation, so some testing coverage may be needed to
    verify that this doesn't introduce any DMA buffer under-run issues.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0ef38840db666f48e3cdd2b769da676c57228dd9
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:14 2007 -0400

    CPUIDLE: menu governor updates

    Tweak the menu governor to more effectively handle non-timer
    break events.  Non-timer break events are detected by comparing the
    actual sleep time to the expected sleep time.  In future revisions, it
    may be more reliable to use the timer data structures directly.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit bb4d74fca63fa96cf3ace644b15ae0f12b7df5a1
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:24:40 2007 -0400

    CPUIDLE: fix 'current_governor' sysfs entry

    Allow the "current_governor" sysfs entry to properly handle
    input terminated with '\n'.
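
The kind of handling described can be sketched in userspace as follows.
This is an illustration under assumptions, not the code from the patch;
`demo_parse_governor_name()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a governor name written to a sysfs-style store callback into
 * name[], dropping one trailing '\n' such as the one "echo" appends.
 * Returns the name length, or 0 if the input is empty or too long.
 */
size_t demo_parse_governor_name(const char *buf, size_t count,
				char *name, size_t name_len)
{
	size_t len = count;

	assert(buf && name);
	if (len && buf[len - 1] == '\n')
		len--;
	if (len == 0 || len >= name_len)
		return 0;
	memcpy(name, buf, len);
	name[len] = '\0';
	return len;
}
```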

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit df3c71559bb69b125f1a48971bf0d17f78bbdf47
Author: Len Brown <len.brown@intel.com>
Date:   Sun Aug 12 02:00:45 2007 -0400

    cpuidle: fix IA64 build (again)

    Signed-off-by: Len Brown <len.brown@intel.com>

commit a02064579e3f9530fd31baae16b1fc46b5a7bca8
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:39:27 2007 -0400

    cpuidle: Remove support for runtime changing of max_cstate

    Remove support for runtime changeability of max_cstate. Drivers can use
    the latency APIs instead.

    max_cstate can still be used as a boot time option and dmi override.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0912a44b13adf22f5e3f607d263aed23b4910d7e
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:39:16 2007 -0400

    cpuidle: Remove ACPI cstate_limit calls from ipw2100

    ipw2100 already has code to use acceptable_latency interfaces to limit the
    C-state. Remove the calls to acpi_set_cstate_limit and acpi_get_cstate_limit
    as they are redundant.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c649a76e76be6bff1fd770d0a775798813a3f6e0
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:35:39 2007 -0400

    cpuidle: compile fix for pause and resume functions

    Fix the compilation failure when cpuidle is not compiled in.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Acked-by: Adam Belay <adam.belay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 2305a5920fb8ee6ccec1c62ade05aa8351091d71
Author: Adam Belay <abelay@novell.com>
Date:   Thu Jul 19 00:49:00 2007 -0400

    cpuidle: re-write

    Some portions have been rewritten to make the code cleaner and lighter
    weight.  The following is a list of changes:

    1.) the state name is now included in the sysfs interface
    2.) detection, hotplug, and available state modifications are handled by
    CPUIDLE drivers directly
    3.) the CPUIDLE idle handler is only ever installed when at least one
    cpuidle_device is enabled and ready
    4.) the menu governor BM code no longer overflows
    5.) the sysfs attributes are now printed as unsigned integers, avoiding
    negative values
    6.) a variety of other small cleanups

    Also, Idle drivers are no longer swappable during runtime through the
    CPUIDLE sysfs interface.  On i386 and x86_64 most idle handlers (e.g.
    poll, mwait, halt, etc.) don't benefit from an infrastructure that
    supports multiple states, so I think using a more general case idle
    handler selection mechanism would be cleaner.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Acked-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit df25b6b56955714e6e24b574d88d1fd11f0c3ee5
Author: Len Brown <len.brown@intel.com>
Date:   Tue Jul 24 17:08:21 2007 -0400

    cpuidle: fix IA64 build

    Signed-off-by: Len Brown <len.brown@intel.com>

commit fd6ada4c14488755ff7068860078c437431fbccd
Author: Adrian Bunk <bunk@stusta.de>
Date:   Mon Jul 9 11:33:13 2007 -0700

    cpuidle: static

    make cpuidle_replace_governor() static

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c1d4a2cebcadf2429c0c72e1d29aa2a9684c32e0
Author: Adrian Bunk <bunk@stusta.de>
Date:   Tue Jul 3 00:54:40 2007 -0400

    cpuidle: static

    This patch makes the needlessly global struct menu_governor static.

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit dbf8780c6e8d572c2c273da97ed1cca7608fd999
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Tue Jul 3 00:49:14 2007 -0400

    export symbol tick_nohz_get_sleep_length

    ERROR: "tick_nohz_get_sleep_length" [drivers/cpuidle/governors/menu.ko] undefined!
    ERROR: "tick_nohz_get_idle_jiffies" [drivers/cpuidle/governors/menu.ko] undefined!

    And please be sure to get your changes to core kernel suitably reviewed.

    Cc: Adam Belay <abelay@novell.com>
    Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: john stultz <johnstul@us.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 29f0e248e7017be15f99febf9143a2cef00b2961
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Tue Jul 3 00:43:04 2007 -0400

    tick.h needs hrtimer.h

    It uses hrtimers.

    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e40cede7d63a029e92712a3fe02faee60cc38fb4
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:40:34 2007 -0400

    cpuidle: first round of documentation updates

    Documentation changes based on Pavel's feedback.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 83b42be2efece386976507555c29e7773a0dfcd1
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:39:25 2007 -0400

    cpuidle: add rating to the governors and pick the one with highest rating by default

    Introduce a governor rating scheme to pick the right governor by default.
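
A rating-based default can be sketched as below. The structure and the
rating values used with it are hypothetical; the sketch only shows the
"highest rating wins" selection named in the subject line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified governor descriptor. */
struct demo_governor {
	const char *name;
	unsigned int rating;
};

/* Return the governor with the highest rating, or NULL if none exist. */
const struct demo_governor *
demo_pick_default(const struct demo_governor *govs, size_t n)
{
	const struct demo_governor *best = NULL;
	size_t i;

	assert(govs || n == 0);
	for (i = 0; i < n; i++) {
		if (!best || govs[i].rating > best->rating)
			best = &govs[i];
	}
	return best;
}
```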

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d2a74b8c5e8f22def4709330d4bfc4a29209b71c
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:38:08 2007 -0400

    cpuidle: make cpuidle sysfs driver governor switch off by default

    Make the default cpuidle sysfs show current_governor and current_driver in
    read-only mode.  More elaborate available_governors and available_drivers with
    writeable current_governor and current_driver interface only appear with
    "cpuidle_sysfs_switch" boot parameter.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1f60a0e80bf83cf6b55c8845bbe5596ed8f6307b
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:37:00 2007 -0400

    cpuidle: menu governor: change the early break condition

    Change the C-state early break out algorithm in menu governor.

    We only look at early breakouts that result in wakeups shorter than idle
    state's target_residency.  If such a breakout is frequent enough, eliminate
    the particular idle state up to a timeout period.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 45a42095cf64b003b4a69be3ce7f434f97d7af51
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:35:38 2007 -0400

    cpuidle: fix uninitialized variable in sysfs routine

    Fix the uninitialized usage of ret.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 80dca7cdba3e6ee13eae277660873ab9584eb3be
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:34:16 2007 -0400

    cpuidle: reenable /proc/acpi//power interface for the time being

    Keep /proc/acpi/processor/CPU*/power around for a while as powertop depends
    on it. It will be marked deprecated and removed in future. powertop can use
    cpuidle interfaces instead.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 589c37c2646c5e3813a51255a5ee1159cb4c33fc
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:32:37 2007 -0400

    cpuidle: menu governor and hrtimer compile fix

    Compile fix for menu governor.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0ba80bd9ab3ed304cb4f19b722e4cc6740588b5e
Author: Len Brown <len.brown@intel.com>
Date:   Thu May 31 22:51:43 2007 -0400

    cpuidle: build fix - cpuidle vs ipw2100 module

    ERROR: "acpi_set_cstate_limit" [drivers/net/wireless/ipw2100.ko] undefined!

    Signed-off-by: Len Brown <len.brown@intel.com>

commit d7d8fa7f96a7f7682be7c6cc0cc53fa7a18c3b58
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:47:07 2007 -0400

    cpuidle: add the 'menu' governor

    Here is my first take at implementing an idle PM governor that takes
    full advantage of NO_HZ.  I call it the 'menu' governor because it
    considers the full list of idle states before each entry.

    I've kept the implementation fairly simple.  It attempts to guess the
    next residency time and then chooses a state that would meet at least
    the break-even point between power savings and entry cost.  To this end,
    it selects the deepest idle state that satisfies the following
    constraints:
         1. If the idle time elapsed since bus master activity was detected
            is below a threshold (currently 20 ms), then limit the selection
            to C2-type or above.
         2. Do not choose a state with a break-even residency that exceeds
            the expected time remaining until the next timer interrupt.
         3. Do not choose a state with a break-even residency that exceeds
            the elapsed time between the last pair of break events,
            excluding timer interrupts.
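
The three constraints above can be sketched as a selection loop. This is
a simplified illustration, not the governor's actual code: the
`demo_state` fields are hypothetical, states are assumed to be ordered
from shallowest to deepest, state 0 is assumed to be always allowed, and
constraint 1 is modelled with a per-state flag marking states that must
be avoided shortly after bus master activity.

```c
#include <assert.h>

#define DEMO_BM_HOLDOFF_US 20000u	/* the 20 ms threshold from the text */

/* Hypothetical, simplified idle state description. */
struct demo_state {
	unsigned int break_even_us;	/* residency needed to break even */
	int bm_sensitive;		/* avoid soon after bus master activity */
};

/* Return the index of the deepest state meeting all three constraints. */
int demo_menu_select(const struct demo_state *states, int n,
		     unsigned int us_since_bm,
		     unsigned int us_to_next_timer,
		     unsigned int us_last_break_gap)
{
	int best = 0;
	int i;

	assert(states && n > 0);
	for (i = 1; i < n; i++) {
		/* 1. recent bus master activity rules out sensitive states */
		if (states[i].bm_sensitive && us_since_bm < DEMO_BM_HOLDOFF_US)
			continue;
		/* 2. break-even must fit before the next timer interrupt */
		if (states[i].break_even_us > us_to_next_timer)
			continue;
		/* 3. break-even must fit within the last non-timer break gap */
		if (states[i].break_even_us > us_last_break_gap)
			continue;
		best = i;
	}
	return best;
}
```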

    This governor has an advantage over "ladder" governor because it
    proactively checks how much time remains until the next timer interrupt
    using the tick infrastructure.  Also, it handles device interrupt
    activity more intelligently by not including timer interrupts in break
    event calculations.  Finally, it doesn't make policy decisions using the
    number of state entries, which can have variable residency times (NO_HZ
    makes these potentially very large), and instead only considers sleep
    time deltas.

    The menu governor can be selected during runtime using the cpuidle sysfs
    interface like so:
    "echo "menu" > /sys/devices/system/cpu/cpuidle/current_governor"

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit a4bec7e65aa3b7488b879d971651cc99a6c410fe
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:47:03 2007 -0400

    cpuidle: export time until next timer interrupt using NO_HZ

    Expose information about the time remaining until the next
    timer interrupt expires by utilizing the dynticks infrastructure.
    Also modify the main idle loop to allow dynticks to handle
    non-interrupt break events (e.g. DMA).  Finally, expose sleep ticks
    information to external code.  Thomas Gleixner is responsible for much
    of the code in this patch.  However, I've made some additional changes,
    so I'm probably responsible if there are any bugs or oversights :)

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 2929d8996fbc77f41a5ff86bb67cdde3ca7d2d72
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:46:58 2007 -0400

    cpuidle: governor API changes

    This patch prepares cpuidle for the menu governor.  It adds an optional
    stage after idle state entry to give the governor an opportunity to
    check why the state was exited.  Also it makes sure the idle loop
    returns after each state entry, allowing the appropriate dynticks code
    to run.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 3a7fd42f9825c3b03e364ca59baa751bb350775f
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Apr 26 00:03:59 2007 -0700

    cpuidle: hang fix

    Prevent hang on x86-64, when ACPI processor driver is added as a module on
    a system that does not support C-states.

    x86-64 expects all idle handlers to enable interrupts before returning from
    idle handler.  This is due to enter_idle(), exit_idle() races.  Make
    cpuidle_idle_call() conform to this when there is no pm_idle_old.

    Also, make cpuidle look at the return values of attach_driver() and set
    current_driver to NULL if attach fails on all CPUs.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4893339a142afbd5b7c01ffadfd53d14746e858e
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:09 2007 +0800

    cpuidle: add support for max_cstate limit

    With CPUIDLE framework, the max_cstate (to limit max cpu c-state)
    parameter is ignored. Some systems require it to ignore C2/C3
    and some drivers like ipw require it too.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 43bbbbe1cb998cbd2df656f55bb3bfe30f30e7d1
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:13 2007 +0800

    cpuidle: add cpuidle_force_redetect_devices API

    Add the cpuidle_force_redetect_devices API,
    which forces all CPUs to redetect idle states.
    The next patch will use it.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d1edadd608f24836def5ec483d2edccfb37b1d19
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:01 2007 +0800

    cpuidle: fix sysfs related issue

    Fix the cpuidle sysfs issue.
    a. make kobject dynamically allocated
    b. fixed sysfs init issue to avoid suspend/resume issue

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 7169a5cc0d67b263978859672e86c13c23a5570d
Author: Randy Dunlap <randy.dunlap@oracle.com>
Date:   Wed Mar 28 22:52:53 2007 -0400

    cpuidle: 1-bit field must be unsigned

    A 1-bit bitfield has no room for a sign bit.
    drivers/cpuidle/governors/ladder.c:54:16: error: dubious bitfield without explicit `signed' or `unsigned'
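
The reason can be shown with a small sketch (the `demo_flags` structure
is hypothetical). A signed 1-bit field can only represent 0 and -1, so
storing 1 cannot read back as 1; the exact result is
implementation-defined, and GCC and Clang yield -1.

```c
#include <assert.h>

/* Hypothetical flags structure with one signed and one unsigned 1-bit field. */
struct demo_flags {
	signed int s : 1;	/* representable values: 0 and -1 */
	unsigned int u : 1;	/* representable values: 0 and 1 */
};

/* Store v (0 or 1) into the signed field and read it back. */
int demo_store_signed(int v)
{
	struct demo_flags f = { 0, 0 };

	assert(v == 0 || v == 1);
	f.s = v;
	return f.s;
}

/* Store v (0 or 1) into the unsigned field and read it back. */
int demo_store_unsigned(int v)
{
	struct demo_flags f = { 0, 0 };

	assert(v == 0 || v == 1);
	f.u = v;
	return f.u;
}
```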

    Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4658620158dc2fbd9e4bcb213c5b6fb5d05ba7d4
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed Mar 28 22:52:41 2007 -0400

    cpuidle: fix boot hang

    Patch for cpuidle boot hang reported by Larry Finger here.
    http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/2025.html

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Larry Finger <larry.finger@lwfinger.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c17e168aa6e5fe3851baaae8df2fbc1cf11443a9
Author: Len Brown <len.brown@intel.com>
Date:   Wed Mar 7 04:37:53 2007 -0500

    cpuidle: ladder does not depend on ACPI

    build fix for CONFIG_ACPI=n

    In file included from drivers/cpuidle/governors/ladder.c:21:
    include/acpi/processor.h:88: error: expected specifier-qualifier-list before ‘acpi_integer’
    include/acpi/processor.h:106: error: expected specifier-qualifier-list before ‘acpi_integer’
    include/acpi/processor.h:168: error: expected specifier-qualifier-list before ‘acpi_handle’

    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8c91d958246bde68db0c3f0c57b535962ce861cb
Author: Adrian Bunk <bunk@stusta.de>
Date:   Tue Mar 6 02:29:40 2007 -0800

    cpuidle: make code static

    This patch makes the following needlessly global code static:
    - driver.c: __cpuidle_find_driver()
    - governor.c: __cpuidle_find_governor()
    - ladder.c: struct ladder_governor

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Adam Belay <abelay@novell.com>
    Cc: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0c39dc3187094c72c33ab65a64d2017b21f372d2
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed Mar 7 02:38:22 2007 -0500

    cpu_idle: fix build break

    This patch fixes a build breakage with !CONFIG_HOTPLUG_CPU and
    CONFIG_CPU_IDLE.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8112e3b115659b07df340ef170515799c0105f82
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Mar 6 02:29:39 2007 -0800

    cpuidle: build fix for !CPU_IDLE

    Fix the compile issues when CPU_IDLE is not configured.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Adam Belay <abelay@novell.com>
    Cc: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1eb4431e9599cd25e0d9872f3c2c8986821839dd
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:54:57 2007 -0800

    cpuidle take2: Basic documentation for cpuidle

    Documentation for cpuidle infrastructure

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ef5f15a8b79123a047285ec2e3899108661df779
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:54:03 2007 -0800

    cpuidle take2: Hookup ACPI C-states driver with cpuidle

    Hookup ACPI C-states onto generic cpuidle infrastructure.

    drivers/acpi/processor_idle.c is now an ACPI C-states driver that registers as
    a driver in cpuidle infrastructure and the policy part is removed from
    drivers/acpi/processor_idle.c. We use governor in cpuidle instead.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 987196fa82d4db52c407e8c9d5dec884ba602183
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:52:57 2007 -0800

    cpuidle take2: Core cpuidle infrastructure

    Announcing 'cpuidle', a new CPU power management infrastructure to manage
    idle CPUs in a clean and efficient manner.
    cpuidle separates out the drivers that can provide support for multiple types
    of idle states and policy governors that decide on what idle state to use
    at run time.
    A cpuidle driver can support multiple idle states based on parameters like
    varying power consumption, wakeup latency, etc (ACPI C-states for example).
    A cpuidle governor can be usage model specific (laptop, server,
    laptop on battery etc).
    The main advantage of the infrastructure is that it allows independent
    development of drivers and governors and allows for better CPU power management.

    A huge thanks to Adam Belay and Shaohua Li who were part of this mini-project
    since its beginning and are greatly responsible for this patchset.

    This patch:

    Core cpuidle infrastructure.
    Introduces a new abstraction layer for cpuidle:
    * which manages drivers that can support multiple idle states. Drivers
      can be generic or particular to specific hardware/platform
    * allows plugging in multiple policy governors that can take idle state policy
      decisions
    * The core also has a set of sysfs interfaces with which administrators can learn
      about supported drivers and governors and switch them at run time.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

Signed-off-by: Len Brown <len.brown@intel.com>
2007-10-10 00:12:41 -04:00