Commit Graph

45 Commits

Author SHA1 Message Date
Arnd Bergmann c7a9b09b1a ARM: omap: allow building omap44xx without SMP
The new omap4 cpuidle implementation currently requires
ARCH_NEEDS_CPU_IDLE_COUPLED, which only works on SMP.

This patch makes it possible to build a non-SMP kernel
for that platform. This is not normally desired for
end-users but can be useful for testing.

Without this patch, building rand-0y2jSKT results in:

drivers/cpuidle/coupled.c: In function 'cpuidle_coupled_poke':
drivers/cpuidle/coupled.c:317:3: error: implicit declaration of function '__smp_call_function_single' [-Werror=implicit-function-declaration]

It's not clear if this patch is the best solution for
the problem at hand. I have made sure that we can now
build the kernel in all configurations, but that does
not mean it will actually work on an OMAP44xx.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
2012-08-23 17:16:42 +02:00
Linus Torvalds 476525004a Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & power management update from Len Brown:
 "Re-write of the turbostat tool.
     lower overhead was necessary for measuring very large system when
     they are very idle.

  IVB support in intel_idle
     It's what I run on my IVB, others should be able to also:-)

  ACPICA core update
     We have found some bugs due to divergence between Linux and the
     upstream ACPICA base.  Most of these patches are to reduce that
     divergence to reduce the risk of future bugs.

  Some cpuidle updates, mostly for non-Intel
     More will be coming, as they depend on this part.

  Some thermal management changes needed by non-ACPI systems.

  Some _OST (OS Status Indication) updates for hot ACPI hot-plug."

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (51 commits)
  Thermal: Documentation update
  Thermal: Add Hysteresis attributes
  Thermal: Make Thermal trip points writeable
  ACPI/AC: prevent OOPS on some boxes due to missing check power_supply_register() return value check
  tools/power: turbostat: fix large c1% issue
  tools/power: turbostat v2 - re-write for efficiency
  ACPICA: Update to version 20120711
  ACPICA: AcpiSrc: Fix some translation issues for Linux conversion
  ACPICA: Update header files copyrights to 2012
  ACPICA: Add new ACPI table load/unload external interfaces
  ACPICA: Split file: tbxface.c -> tbxfload.c
  ACPICA: Add PCC address space to space ID decode function
  ACPICA: Fix some comment fields
  ACPICA: Table manager: deploy new firmware error/warning interfaces
  ACPICA: Add new interfaces for BIOS(firmware) errors and warnings
  ACPICA: Split exception code utilities to a new file, utexcep.c
  ACPI: acpi_pad: tune round_robin_time
  ACPICA: Update to version 20120620
  ACPICA: Add support for implicit notify on multiple devices
  ACPICA: Update comments; no functional change
  ...
2012-07-26 14:28:55 -07:00
Rafael J. Wysocki 7791bd230c Merge branch 'pm-domains'
* pm-domains:
  PM / Domains: Fix build warning for CONFIG_PM_RUNTIME unset
  PM / Domains: Replace plain integer with NULL pointer in domain.c file
  PM / Domains: Add missing static storage class specifier in domain.c file
  PM / Domains: Allow device callbacks to be added at any time
  PM / Domains: Add device domain data reference counter
  PM / Domains: Add preliminary support for cpuidle, v2
  PM / Domains: Do not stop devices after restoring their states
  PM / Domains: Use subsystem runtime suspend/resume callbacks by default
2012-07-19 00:03:17 +02:00
Preeti U Murthy 8651f97bd9 PM / cpuidle: System resume hang fix with cpuidle
On certain bios, resume hangs if cpus are allowed to enter idle states
during suspend [1].

This was fixed in apci idle driver [2].But intel_idle driver does not
have this fix. Thus instead of replicating the fix in both the idle
drivers, or in more platform specific idle drivers if needed, the
more general cpuidle infrastructure could handle this.

A suspend callback in cpuidle_driver could handle this fix. But
a cpuidle_driver provides only basic functionalities like platform idle
state detection capability and mechanisms to support entry and exit
into CPU idle states. All other cpuidle functions are found in the
cpuidle generic infrastructure for good reason that all cpuidle
drivers, irrepective of their platforms will support these functions.

One option therefore would be to register a suspend callback in cpuidle
which handles this fix. This could be called through a PM_SUSPEND_PREPARE
notifier. But this is too generic a notfier for a driver to handle.

Also, ideally the job of cpuidle is not to handle side effects of suspend.
It should expose the interfaces which "handle cpuidle 'during' suspend"
or any other operation, which the subsystems call during that respective
operation.

The fix demands that during suspend, no cpus should be allowed to enter
deep C-states. The interface cpuidle_uninstall_idle_handler() in cpuidle
ensures that. Not just that it also kicks all the cpus which are already
in idle out of their idle states which was being done during cpu hotplug
through a CPU_DYING_FROZEN callbacks.

Now the question arises about when during suspend should
cpuidle_uninstall_idle_handler() be called. Since we are dealing with
drivers it seems best to call this function during dpm_suspend().
Delaying the call till dpm_suspend_noirq() does no harm, as long as it is
before cpu_hotplug_begin() to avoid race conditions with cpu hotpulg
operations. In dpm_suspend_noirq(), it would be wise to place this call
before suspend_device_irqs() to avoid ugly interactions with the same.

Ananlogously, during resume.

References:
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/674075.
[2] http://marc.info/?l=linux-pm&m=133958534231884&w=2

Reported-and-tested-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-10 21:34:49 +02:00
Daniel Lezcano 25ac77613a ACPI: intel_idle : break dependency between modules
When the system is booted with some cpus offline, the idle
driver is not initialized. When a cpu is set online, the
acpi code call the intel idle init function. Unfortunately
this code introduce a dependency between intel_idle and acpi.

This patch is intended to remove this dependency by using the
notifier of intel_idle. This patch has the benefit of
encapsulating the intel_idle driver and remove some exported
functions.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-05 22:37:47 +02:00
Rafael J. Wysocki cbc9ef0287 PM / Domains: Add preliminary support for cpuidle, v2
On some systems there are CPU cores located in the same power
domains as I/O devices.  Then, power can only be removed from the
domain if all I/O devices in it are not in use and the CPU core
is idle.  Add preliminary support for that to the generic PM domains
framework.

First, the platform is expected to provide a cpuidle driver with one
extra state designated for use with the generic PM domains code.
This state should be initially disabled and its exit_latency value
should be set to whatever time is needed to bring up the CPU core
itself after restoring power to it, not including the domain's
power on latency.  Its .enter() callback should point to a procedure
that will remove power from the domain containing the CPU core at
the end of the CPU power transition.

The remaining characteristics of the extra cpuidle state, referred to
as the "domain" cpuidle state below, (e.g. power usage, target
residency) should be populated in accordance with the properties of
the hardware.

Next, the platform should execute genpd_attach_cpuidle() on the PM
domain containing the CPU core.  That will cause the generic PM
domains framework to treat that domain in a special way such that:

 * When all devices in the domain have been suspended and it is about
   to be turned off, the states of the devices will be saved, but
   power will not be removed from the domain.  Instead, the "domain"
   cpuidle state will be enabled so that power can be removed from
   the domain when the CPU core is idle and the state has been chosen
   as the target by the cpuidle governor.

 * When the first I/O device in the domain is resumed and
   __pm_genpd_poweron(() is called for the first time after
   power has been removed from the domain, the "domain" cpuidle
   state will be disabled to avoid subsequent surprise power removals
   via cpuidle.

The effective exit_latency value of the "domain" cpuidle state
depends on the time needed to bring up the CPU core itself after
restoring power to it as well as on the power on latency of the
domain containing the CPU core.  Thus the "domain" cpuidle state's
exit_latency has to be recomputed every time the domain's power on
latency is updated, which may happen every time power is restored
to the domain, if the measured power on latency is greater than
the latency stored in the corresponding generic_pm_domain structure.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
2012-07-03 19:07:42 +02:00
Rafael J. Wysocki 6e797a0788 PM / cpuidle: Add driver reference counter
Add a reference counter for the cpuidle driver, so that it can't
be unregistered when it is in use.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-03 19:06:25 +02:00
ShuoX Liu dc7fd275ae cpuidle: move field disable from per-driver to per-cpu
Andrew J.Schorr raises a question.  When he changes the disable setting on
a single CPU, it affects all the other CPUs.  Basically, currently, the
disable field is per-driver instead of per-cpu.  All the C states of the
same driver are shared by all CPU in the same machine.

The patch changes the `disable' field to per-cpu, so we could set this
separately for each cpu.

Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reported-by: Andrew J.Schorr <aschorr@telemetry-investments.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-07-03 19:05:31 +02:00
Colin Cross 20ff51a36b cpuidle: coupled: add parallel barrier function
Adds cpuidle_coupled_parallel_barrier, which can be used by coupled
cpuidle state enter functions to handle resynchronization after
determining if any cpu needs to abort.  The normal use case will
be:

static bool abort_flag;
static atomic_t abort_barrier;

int arch_cpuidle_enter(struct cpuidle_device *dev, ...)
{
	if (arch_turn_off_irq_controller()) {
	   	/* returns an error if an irq is pending and would be lost
		   if idle continued and turned off power */
		abort_flag = true;
	}

	cpuidle_coupled_parallel_barrier(dev, &abort_barrier);

	if (abort_flag) {
	   	/* One of the cpus didn't turn off it's irq controller */
	   	arch_turn_on_irq_controller();
		return -EINTR;
	}

	/* continue with idle */
	...
}

This will cause all cpus to abort idle together if one of them needs
to abort.

Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-02 00:49:36 -04:00
Colin Cross 4126c0197b cpuidle: add support for states that affect multiple cpus
On some ARM SMP SoCs (OMAP4460, Tegra 2, and probably more), the
cpus cannot be independently powered down, either due to
sequencing restrictions (on Tegra 2, cpu 0 must be the last to
power down), or due to HW bugs (on OMAP4460, a cpu powering up
will corrupt the gic state unless the other cpu runs a work
around).  Each cpu has a power state that it can enter without
coordinating with the other cpu (usually Wait For Interrupt, or
WFI), and one or more "coupled" power states that affect blocks
shared between the cpus (L2 cache, interrupt controller, and
sometimes the whole SoC).  Entering a coupled power state must
be tightly controlled on both cpus.

The easiest solution to implementing coupled cpu power states is
to hotplug all but one cpu whenever possible, usually using a
cpufreq governor that looks at cpu load to determine when to
enable the secondary cpus.  This causes problems, as hotplug is an
expensive operation, so the number of hotplug transitions must be
minimized, leading to very slow response to loads, often on the
order of seconds.

This file implements an alternative solution, where each cpu will
wait in the WFI state until all cpus are ready to enter a coupled
state, at which point the coupled state function will be called
on all cpus at approximately the same time.

Once all cpus are ready to enter idle, they are woken by an smp
cross call.  At this point, there is a chance that one of the
cpus will find work to do, and choose not to enter idle.  A
final pass is needed to guarantee that all cpus will call the
power state enter function at the same time.  During this pass,
each cpu will increment the ready counter, and continue once the
ready counter matches the number of online coupled cpus.  If any
cpu exits idle, the other cpus will decrement their counter and
retry.

To use coupled cpuidle states, a cpuidle driver must:

   Set struct cpuidle_device.coupled_cpus to the mask of all
   coupled cpus, usually the same as cpu_possible_mask if all cpus
   are part of the same cluster.  The coupled_cpus mask must be
   set in the struct cpuidle_device for each cpu.

   Set struct cpuidle_device.safe_state to a state that is not a
   coupled state.  This is usually WFI.

   Set CPUIDLE_FLAG_COUPLED in struct cpuidle_state.flags for each
   state that affects multiple cpus.

   Provide a struct cpuidle_state.enter function for each state
   that affects multiple cpus.  This function is guaranteed to be
   called on all cpus at approximately the same time.  The driver
   should ensure that the cpus all abort together if any cpu tries
   to abort once the function is called.

update1:

cpuidle: coupled: fix count of online cpus

online_count was never incremented on boot, and was also counting
cpus that were not part of the coupled set.  Fix both issues by
introducting a new function that counts online coupled cpus, and
call it from register as well as the hotplug notifier.

update2:

cpuidle: coupled: fix decrementing ready count

cpuidle_coupled_set_not_ready sometimes refuses to decrement the
ready count in order to prevent a race condition.  This makes it
unsuitable for use when finished with idle.  Add a new function
cpuidle_coupled_set_done that decrements both the ready count and
waiting count, and call it after idle is complete.

Cc: Amit Kucheria <amit.kucheria@linaro.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Trinabh Gupta <g.trinabh@gmail.com>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-06-02 00:49:09 -04:00
Boris Ostrovsky 02401c06b7 cpuidle: power_usage should be declared signed integer
power_usage is always assigned a negative value and should be declared
a signed integer

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:23:30 -04:00
Boris Ostrovsky 1a022e3f1b idle, x86: Allow off-lined CPU to enter deeper C states
Currently when a CPU is off-lined it enters either MWAIT-based idle or,
if MWAIT is not desired or supported, HLT-based idle (which places the
processor in C1 state). This patch allows processors without MWAIT
support to stay in states deeper than C1.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 03:23:01 -04:00
Daniel Lezcano e07510585a cpuidle: remove unused 'governor_data' field
As far as I can see, this field is never used in the code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:55:40 -04:00
Daniel Lezcano db70b04407 cpuidle: remove useless array definition in cpuidle_structure
All the modules name are ro-data, it is never copied to the array.

eg.

static struct cpuidle_driver intel_idle_driver = {
        .name = "intel_idle",
        .owner = THIS_MODULE,
};

It safe to assign the pointer of this ro-data to a const char *.
By this way we save 12 bytes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:55:22 -04:00
ShuoX Liu 3a53396b03 cpuidle: add a sysfs entry to disable specific C state for debug purpose.
Some C states of new CPU might be not good.  One reason is BIOS might
configure them incorrectly.  To help developers root cause it quickly, the
patch adds a new sysfs entry, so developers could disable specific C state
manually.

In addition, C state might have much impact on performance tuning, as it
takes much time to enter/exit C states, which might delay interrupt
processing.  With the new debug option, developers could check if a deep C
state could impact performance and how much impact it could cause.

Also add this option in Documentation/cpuidle/sysfs.txt.

[akpm@linux-foundation.org: check kstrtol return value]
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Reviewed-by: Yanmin Zhang <yanmin_zhang@intel.com>
Reviewed-and-Tested-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-30 01:52:58 -04:00
Robert Lee e1689795a7 cpuidle: Add common time keeping and irq enabling
Make necessary changes to implement time keeping and irq enabling
in the core cpuidle code.  This will allow the removal of these
functionalities from various platform cpuidle implementations whose
timekeeping and irq enabling follows the form in this common code.

Signed-off-by: Robert Lee <rob.lee@linaro.org>
Tested-by: Jean Pihet <j-pihet@ti.com>
Tested-by: Amit Daniel <amit.kachhap@linaro.org>
Tested-by: Robert Lee <rob.lee@linaro.org>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-03-21 01:59:40 -04:00
Linus Torvalds 507a03c1cb Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
This includes initial support for the recently published ACPI 5.0 spec.
In particular, support for the "hardware-reduced" bit that eliminates
the dependency on legacy hardware.

APEI has patches resulting from testing on real hardware.

Plus other random fixes.

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
  acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
  intel_idle: Split up and provide per CPU initialization func
  ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
  ACPI processor: Remove unneeded cpuidle_unregister_driver call
  intel idle: Make idle driver more robust
  intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
  ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
  intel_idle: remove redundant local_irq_disable() call
  ACPI processor: Fix error path, also remove sysdev link
  ACPI: processor: fix acpi_get_cpuid for UP processor
  intel_idle: fix API misuse
  ACPI APEI: Convert atomicio routines
  ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
  ACPI: Fix possible alignment issues with GAS 'address' references
  ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
  ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
  ACPI: Store SRAT table revision
  ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
  ACPI, Record ACPI NVS regions
  ACPI, APEI, EINJ, Refine the fix of resource conflict
  ...
2012-01-18 15:51:48 -08:00
Thomas Renninger 65b7f839ce intel_idle: Split up and provide per CPU initialization func
Function split up, should have no functional change.

Provides entry point for physically hotplugged CPUs
to initialize and activate cpuidle.

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
CC: Shaohua Li <shaohua.li@intel.com>
CC: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2012-01-17 23:59:12 -05:00
Deepthi Dharwar e179816ce6 powerpc/cpuidle: Enable cpuidle and directly call cpuidle_idle_call() for pSeries
This patch enables cpuidle for pSeries and pSeries_idle is
directly called from the idle loop. As a result of pSeries_idle, cpuidle
driver registered with cpuidle subsystem comes into action. On
failure of loading of the driver or cpuidle framework default idle
is executed as part of the function. This patch
also removes the routines pseries_shared_idle_sleep and
pseries_dedicated_idle_sleep as they are now implemented as part of
pseries_idle cpuidle driver.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 13:57:20 +11:00
Linus Torvalds 3c00303206 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  cpuidle: Single/Global registration of idle states
  cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
  cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
  cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
  ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
  ACPI: Export FADT pm_profile integer value to userspace
  thermal: Prevent polling from happening during system suspend
  ACPI: Drop ACPI_NO_HARDWARE_INIT
  ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
  PNPACPI: Simplify disabled resource registration
  ACPI: Fix possible recursive locking in hwregs.c
  ACPI: use kstrdup()
  mrst pmu: update comment
  tools/power turbostat: less verbose debugging
2011-11-07 10:13:52 -08:00
Deepthi Dharwar 46bcfad7a8 cpuidle: Single/Global registration of idle states
This patch makes the cpuidle_states structure global (single copy)
instead of per-cpu. The statistics needed on per-cpu basis
by the governor are kept per-cpu. This simplifies the cpuidle
subsystem as state registration is done by single cpu only.
Having single copy of cpuidle_states saves memory. Rare case
of asymmetric C-states can be handled within the cpuidle driver
and architectures such as POWER do not have asymmetric C-states.

Having single/global registration of all the idle states,
dynamic C-state transitions on x86 are handled by
the boot cpu. Here, the boot cpu  would disable all the devices,
re-populate the states and later enable all the devices,
irrespective of the cpu that would receive the notification first.

Reference:
https://lkml.org/lkml/2011/4/25/83

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:58 -05:00
Deepthi Dharwar 4202735e8a cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
This is the first step towards global registration of cpuidle
states. The statistics used primarily by the governor are per-cpu
and have to be split from rest of the fields inside cpuidle_state,
which would be made global i.e. single copy. The driver_data field
is also per-cpu and moved.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:49 -05:00
Deepthi Dharwar b25edc42bf cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
The cpuidle_device->prepare() mechanism causes updates to the
cpuidle_state[].flags, setting and clearing CPUIDLE_FLAG_IGNORE
to tell the governor not to chose a state on a per-cpu basis at
run-time. State demotion is now handled by the driver and it returns
the actual state entered. Hence, this mechanism is not required.
Also this removes per-cpu flags from cpuidle_state enabling
it to be made global.

Reference:
https://lkml.org/lkml/2011/3/25/52

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:43 -05:00
Deepthi Dharwar e978aa7d7d cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
Cpuidle governor only suggests the state to enter using the
governor->select() interface, but allows the low level driver to
override the recommended state. The actual entered state
may be different because of software or hardware demotion. Software
demotion is done by the back-end cpuidle driver and can be accounted
correctly. Current cpuidle code uses last_state field to capture the
actual state entered and based on that updates the statistics for the
state entered.

Ideally the driver enter routine should update the counters,
and it should return the state actually entered rather than the time
spent there. The generic cpuidle code should simply handle where
the counters live in the sysfs namespace, not updating the counters.

Reference:
https://lkml.org/lkml/2011/3/25/52

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Tested-by: Jean Pihet <j-pihet@ti.com>
Reviewed-by: Kevin Hilman <khilman@ti.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-11-06 21:13:30 -05:00
Paul Gortmaker de47725421 include: replace linux/module.h with "struct module" wherever possible
The <linux/module.h> pretty much brings in the kitchen sink along
with it, so it should be avoided wherever reasonably possible in
terms of being included from other commonly used <linux/something.h>
files, as it results in a measureable increase on compile times.

The worst culprit was probably device.h since it is used everywhere.
This file also had an implicit dependency/usage of mutex.h which was
masked by module.h, and is also fixed here at the same time.

There are over a dozen other headers that simply declare the
struct instead of pulling in the whole file, so follow their lead
and simply make it a few more.

Most of the implicit dependencies on module.h being present by
these headers pulling it in have been now weeded out, so we can
finally make this change with hopefully minimal breakage.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:32:32 -04:00
Len Brown a0bfa13738 cpuidle: stop depending on pm_idle
cpuidle users should call cpuidle_call_idle() directly
rather than via (pm_idle)() function pointer.

Architecture may choose to continue using (pm_idle)(),
but cpuidle need not depend on it:

  my_arch_cpu_idle()
	...
	if(cpuidle_call_idle())
		pm_idle();

cc: Kevin Hilman <khilman@deeprootsystems.com>
cc: Paul Mundt <lethal@linux-sh.org>
cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 19:06:37 -04:00
Len Brown d91ee5863b cpuidle: replace xen access to x86 pm_idle and default_idle
When a Xen Dom0 kernel boots on a hypervisor, it gets access
to the raw-hardware ACPI tables.  While it parses the idle tables
for the hypervisor's beneift, it uses HLT for its own idle.

Rather than have xen scribble on pm_idle and access default_idle,
have it simply disable_cpuidle() so acpi_idle will not load and
architecture default HLT will be used.

cc: xen-devel@lists.xensource.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-08-03 19:06:36 -04:00
Len Brown 5392083748 cpuidle: CPUIDLE_FLAG_CHECK_BM is omap3_idle specific
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:34 -05:00
Len Brown 956d033fb2 cpuidle: CPUIDLE_FLAG_TLB_FLUSHED is specific to intel_idle
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:33 -05:00
Len Brown 642f11c53f cpuidle: delete unused CPUIDLE_FLAG_SHALLOW, BALANCED, DEEP definitions
Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:32 -05:00
Len Brown d247632c08 cpuidle: delete NOP CPUIDLE_FLAG_POLL
it serves no purpose

Signed-off-by: Len Brown <len.brown@intel.com>
2011-01-12 12:47:31 -05:00
Suresh Siddha 6110a1f43c intel_idle: Voluntary leave_mm before entering deeper
Avoid TLB flush IPIs for the cores in deeper c-states by voluntary leave_mm()
before entering into that state. CPUs tend to flush TLB in those c-states
anyways.

acpi_idle does this with C3-type states, but it was not caried over
when intel_idle was introduced.  intel_idle can apply it
to C-states in addition to those that ACPI might export as C3...

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2010-09-30 21:19:22 -04:00
Ai Li 71abbbf856 cpuidle: extend cpuidle and menu governor to handle dynamic states
On some SoC chips, HW resources may be in use during any particular idle
period.  As a consequence, the cpuidle states that the SoC is safe to
enter can change from idle period to idle period.  In addition, the
latency and threshold of each cpuidle state can vary, depending on the
operating condition when the CPU becomes idle, e.g.  the current cpu
frequency, the current state of the HW blocks, etc.

cpuidle core and the menu governor, in the current form, are geared
towards cpuidle states that are static, i.e.  the availabiltiy of the
states, their latencies, their thresholds are non-changing during run
time.  cpuidle does not provide any hook that cpuidle drivers can use to
adjust those values on the fly for the current idle period before the menu
governor selects the target cpuidle state.

This patch extends cpuidle core and the menu governor to handle states
that are dynamic.  There are three additions in the patch and the patch
maintains backwards-compatibility with existing cpuidle drivers.

1) add prepare() to struct cpuidle_device.  A cpuidle driver can hook
   into the callback and cpuidle will call prepare() before calling the
   governor's select function.  The callback gives the cpuidle driver a
   chance to update the dynamic information of the cpuidle states for the
   current idle period, e.g.  state availability, latencies, thresholds,
   power values, etc.

2) add CPUIDLE_FLAG_IGNORE as one of the state flags.  In the prepare()
   function, a cpuidle driver can set/clear the flag to indicate to the
   menu governor whether a cpuidle state should be ignored, i.e.  not
   available, during the current idle period.

3) add power_specified bit to struct cpuidle_device.  The menu governor
   currently assumes that the cpuidle states are arranged in the order of
   increasing latency, threshold, and power savings.  This is true or can
   be made true for static states.  Once the state parameters are dynamic,
   the latencies, thresholds, and power savings for the cpuidle states can
   increase or decrease by different amounts from idle period to idle
   period.  So the assumption of increasing latency, threshold, and power
   savings from Cn to C(n+1) can no longer be guaranteed.

It can be straightforward to calculate the power consumption of each
available state and to specify it in power_usage for the idle period.
Using the power_usage fields, the menu governor then selects the state
that has the lowest power consumption and that still satisfies all other
critieria.  The power_specified bit defaults to 0.  For existing cpuidle
drivers, cpuidle detects that power_specified is 0 and fills in a dummy
set of power_usage values.

Signed-off-by: Ai Li <aili@codeaurora.org>
Cc: Len Brown <len.brown@intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:04 -07:00
Len Brown 752138df0d cpuidle: make cpuidle_curr_driver static
cpuidle_register_driver() sets cpuidle_curr_driver
cpuidle_unregister_driver() clears cpuidle_curr_driver

We should't expose cpuidle_curr_driver to
potential modification except via these interfaces.
So make it static and create cpuidle_get_driver() to observe it.

Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-27 21:06:58 -04:00
Len Brown 6b2c676bf3 cpuidle: fail to register if !CONFIG_CPU_IDLE
Signed-off-by: Len Brown <len.brown@intel.com>
2010-05-27 01:56:24 -04:00
Venkatesh Pallipadi dcb84f335b cpuidle acpi driver: fix oops on AC<->DC
cpuidle and acpi driver interaction bug with the way cpuidle_register_driver()
is called. Due to this bug, there will be oops on
AC<->DC on some systems, where they support C-states in one DC and not in AC.

The current code does
ON BOOT:
	Look at CST and other C-state info to see whether more than C1 is
	supported. If it is, then acpi processor_idle does a
	cpuidle_register_driver() call, which internally enables the device.

ON CST change notification (AC<->DC) and on suspend-resume:
	acpi driver temporarily disables device, updates the device with
	any new C-states, and reenables the device.

The problem is is on boot, there are no C2, C3 states supported and we skip
the register. Later on AC<->DC, we may get a CST notification and we try
to reevaluate CST and enabled the device, without actually registering it.
This causes breakage as we try to create /sys fs sub directory, without the
parent directory which is created at register time.

Thanks to Sanjeev for reporting the problem here.
http://bugzilla.kernel.org/show_bug.cgi?id=10394

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-06-11 19:13:45 -04:00
Yi Yang 8b78cf602f cpuidle: fix cpuidle time and usage overflow
cpuidle C-state sysfs node time and usage are very easy to overflow because
they are all of unsigned int type, time will overflow within about two hours,
usage will take longer time to overflow, but they are increasing for ever.

This patch will convert them to unsigned long long.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26 00:45:26 -04:00
Venkatesh Pallipadi 4fcb2fcd4d ACPI, cpuidle: Clarify C-state description in sysfs
Add a new sysfs entry under cpuidle states. desc - can be used by driver to
communicate to userspace any specific information about the state.
This helps in identifying the exact hardware C-states behind the ACPI C-state
definition.

Idea is to export this through powertop, which will help to map the C-state
reported by powertop to actual hardware C-state.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-14 00:09:55 -05:00
Len Brown 9b71315421 Revert "cpuidle: build fix for non-x86"
This reverts commit f757397097.
which ironically broke the ia64 build
2008-02-07 04:16:34 -05:00
Len Brown acf63867ae Merge branches 'release', 'cpuidle-2.6.25' and 'idle' into release 2008-02-07 03:11:05 -05:00
venkatesh.pallipadi@intel.com 9a0b841586 cpuidle: Add a poll_idle method
Add a default poll idle state with 0 latency. Provides an option to users
to use poll_idle by using 0 as the latency requirement.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-02-07 02:20:15 -05:00
Harvey Harrison b5556a67f0 cpuidle: dubious one-bit signed bitfield in cpuidle.h
fix these sparse warnings:

  CHECK   arch/x86/kernel/acpi/cstate.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
  CHECK   arch/x86/kernel/acpi/processor.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
  CHECK   arch/x86/kernel/cpu/cpufreq/powernow-k7.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
  CHECK   arch/x86/kernel/cpu/cpufreq/powernow-k8.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
  CHECK   arch/x86/kernel/cpu/cpufreq/longhaul.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield
  CHECK   arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
include/linux/cpuidle.h:82:17: error: dubious one-bit signed bitfield

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-06 22:39:44 +01:00
Kevin Hilman f757397097 cpuidle: build fix for non-x86
Convert cpu_idle_wait() to cpuidle_kick_cpus() macro which is
SMP-only, and gives error on non supported CPU.

Signed-off-by: Kevin Hilman <khilman@mvista.com>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-31 22:50:48 -05:00
Venkatesh Pallipadi ddc081a195 cpuidle: fix HP nx6125 regression
Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9355

cpuidle always used to fallback to C2 if there is some bm activity while
entering C3. But, presence of C2 is not always guaranteed. Change cpuidle
algorithm to detect a safe_state to fallback in case of bm_activity and
use that state instead of C2.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2007-11-19 21:43:22 -05:00
Len Brown 4f86d3a8e2 cpuidle: consolidate 2.6.22 cpuidle branch into one patch
commit e5a16b1f9eec0af7cfa0830304b41c1c0833cf9f
Author: Len Brown <len.brown@intel.com>
Date:   Tue Oct 2 23:44:44 2007 -0400

    cpuidle: shrink diff

    processor_idle.c |  440 +++++++++++++++++++++++++++++++++++++++++--
    1 file changed, 429 insertions(+), 11 deletions(-)

    Signed-off-by: Len Brown <len.brown@intel.com>

commit dfbb9d5aedfb18848a3e0d6f6e3e4969febb209c
Author: Len Brown <len.brown@intel.com>
Date:   Wed Sep 26 02:17:55 2007 -0400

    cpuidle: reduce diff size

    Reduces the cpuidle processor_idle.c diff vs 2.6.22 from this
     processor_idle.c | 2006 ++++++++++++++++++++++++++-----------------
     1 file changed, 1219 insertions(+), 787 deletions(-)

    to this:
     processor_idle.c |  502 +++++++++++++++++++++++++++++++++++++++----
     1 file changed, 458 insertions(+), 44 deletions(-)

    ...for the purpose of making the cpuilde patch less invasive
    and easier to review.

    no functional changes.  build tested only.

    Signed-off-by: Len Brown <len.brown@intel.com>

commit 889172fc915f5a7fe20f35b133cbd205ce69bf6c
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Sep 13 13:40:05 2007 -0700

    cpuidle: Retain old ACPI policy for !CONFIG_CPU_IDLE

    Retain the old policy in processor_idle, so that when CPU_IDLE is not
    configured, old C-state policy will still be used. This provides a
    clean gradual migration path from old ACPI policy to new cpuidle
    based policy.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 9544a8181edc7ecc33b3bfd69271571f98ed08bc
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Sep 13 13:39:17 2007 -0700

    cpuidle: Configure governors by default

    Quoting Len "Do not give an option to users to shoot themselves in the foot".

    Remove the configurability of ladder and menu governors as they are
    needed for default policy of cpuidle. That way users will not be able to
    have cpuidle without any policy loosing all C-state power savings.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8975059a2c1e56cfe83d1bcf031bcf4cb39be743
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:27:07 2007 -0400

    CPUIDLE: load ACPI properly when CPUIDLE is disabled

    Change the registration return codes for when CPUIDLE
    support is not compiled into the kernel.  As a result, the ACPI
    processor driver will load properly even if CPUIDLE is unavailable.
    However, it may be possible to cleanup the ACPI processor driver further
    and eliminate some dead code paths.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e0322e2b58dd1b12ec669bf84693efe0dc2414a8
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:26:06 2007 -0400

    CPUIDLE: remove cpuidle_get_bm_activity()

    Remove cpuidle_get_bm_activity() and updates governors
    accordingly.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 18a6e770d5c82ba26653e53d240caa617e09e9ab
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:58 2007 -0400

    CPUIDLE: max_cstate fix

    Currently max_cstate is limited to 0, resulting in no idle processor
    power management on ACPI platforms.  This patch restores the value to
    the array size.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1fdc0887286179b40ce24bcdbde663172e205ef0
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:40 2007 -0400

    CPUIDLE: handle BM detection inside the ACPI Processor driver

    Update the ACPI processor driver to detect BM activity and
    limit state entry depth internally, rather than exposing such
    requirements to CPUIDLE.  As a result, CPUIDLE can drop this
    ACPI-specific interface and become more platform independent.  BM
    activity is now handled much more aggressively than it was in the
    original implementation, so some testing coverage may be needed to
    verify that this doesn't introduce any DMA buffer under-run issues.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0ef38840db666f48e3cdd2b769da676c57228dd9
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:25:14 2007 -0400

    CPUIDLE: menu governor updates

    Tweak the menu governor to more effectively handle non-timer
    break events.  Non-timer break events are detected by comparing the
    actual sleep time to the expected sleep time.  In future revisions, it
    may be more reliable to use the timer data structures directly.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit bb4d74fca63fa96cf3ace644b15ae0f12b7df5a1
Author: Adam Belay <abelay@novell.com>
Date:   Tue Aug 21 18:24:40 2007 -0400

    CPUIDLE: fix 'current_governor' sysfs entry

    Allow the "current_governor" sysfs entry to properly handle
    input terminated with '\n'.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit df3c71559bb69b125f1a48971bf0d17f78bbdf47
Author: Len Brown <len.brown@intel.com>
Date:   Sun Aug 12 02:00:45 2007 -0400

    cpuidle: fix IA64 build (again)

    Signed-off-by: Len Brown <len.brown@intel.com>

commit a02064579e3f9530fd31baae16b1fc46b5a7bca8
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:39:27 2007 -0400

    cpuidle: Remove support for runtime changing of max_cstate

    Remove support for runtime changeability of max_cstate. Drivers can use
    use latency APIs.

    max_cstate can still be used as a boot time option and dmi override.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0912a44b13adf22f5e3f607d263aed23b4910d7e
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:39:16 2007 -0400

    cpuidle: Remove ACPI cstate_limit calls from ipw2100

    ipw2100 already has code to use accetable_latency interfaces to limit the
    C-state. Remove the calls to acpi_set_cstate_limit and acpi_get_cstate_limit
    as they are redundant.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c649a76e76be6bff1fd770d0a775798813a3f6e0
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Sun Aug 12 01:35:39 2007 -0400

    cpuidle: compile fix for pause and resume functions

    Fix the compilation failure when cpuidle is not compiled in.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Acked-by: Adam Belay <adam.belay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 2305a5920fb8ee6ccec1c62ade05aa8351091d71
Author: Adam Belay <abelay@novell.com>
Date:   Thu Jul 19 00:49:00 2007 -0400

    cpuidle: re-write

    Some portions have been rewritten to make the code cleaner and lighter
    weight.  The following is a list of changes:

    1.) the state name is now included in the sysfs interface
    2.) detection, hotplug, and available state modifications are handled by
    CPUIDLE drivers directly
    3.) the CPUIDLE idle handler is only ever installed when at least one
    cpuidle_device is enabled and ready
    4.) the menu governor BM code no longer overflows
    5.) the sysfs attributes are now printed as unsigned integers, avoiding
    negative values
    6.) a variety of other small cleanups

    Also, Idle drivers are no longer swappable during runtime through the
    CPUIDLE sysfs inteface.  On i386 and x86_64 most idle handlers (e.g.
    poll, mwait, halt, etc.) don't benefit from an infrastructure that
    supports multiple states, so I think using a more general case idle
    handler selection mechanism would be cleaner.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Acked-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit df25b6b56955714e6e24b574d88d1fd11f0c3ee5
Author: Len Brown <len.brown@intel.com>
Date:   Tue Jul 24 17:08:21 2007 -0400

    cpuidle: fix IA64 buid

    Signed-off-by: Len Brown <len.brown@intel.com>

commit fd6ada4c14488755ff7068860078c437431fbccd
Author: Adrian Bunk <bunk@stusta.de>
Date:   Mon Jul 9 11:33:13 2007 -0700

    cpuidle: static

    make cpuidle_replace_governor() static

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c1d4a2cebcadf2429c0c72e1d29aa2a9684c32e0
Author: Adrian Bunk <bunk@stusta.de>
Date:   Tue Jul 3 00:54:40 2007 -0400

    cpuidle: static

    This patch makes the needlessly global struct menu_governor static.

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit dbf8780c6e8d572c2c273da97ed1cca7608fd999
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Tue Jul 3 00:49:14 2007 -0400

    export symbol tick_nohz_get_sleep_length

    ERROR: "tick_nohz_get_sleep_length" [drivers/cpuidle/governors/menu.ko] undefined!
    ERROR: "tick_nohz_get_idle_jiffies" [drivers/cpuidle/governors/menu.ko] undefined!

    And please be sure to get your changes to core kernel suitably reviewed.

    Cc: Adam Belay <abelay@novell.com>
    Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: john stultz <johnstul@us.ibm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 29f0e248e7017be15f99febf9143a2cef00b2961
Author: Andrew Morton <akpm@linux-foundation.org>
Date:   Tue Jul 3 00:43:04 2007 -0400

    tick.h needs hrtimer.h

    It uses hrtimers.

    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit e40cede7d63a029e92712a3fe02faee60cc38fb4
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:40:34 2007 -0400

    cpuidle: first round of documentation updates

    Documentation changes based on Pavel's feedback.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 83b42be2efece386976507555c29e7773a0dfcd1
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:39:25 2007 -0400

    cpuidle: add rating to the governors and pick the one with highest rating by default

    Introduce a governor rating scheme to pick the right governor by default.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d2a74b8c5e8f22def4709330d4bfc4a29209b71c
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:38:08 2007 -0400

    cpuidle: make cpuidle sysfs driver governor switch off by default

    Make default cpuidle sysfs to show current_governor and current_driver in
    read-only mode.  More elaborate available_governors and available_drivers with
    writeable current_governor and current_driver interface only appear with
    "cpuidle_sysfs_switch" boot parameter.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1f60a0e80bf83cf6b55c8845bbe5596ed8f6307b
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:37:00 2007 -0400

    cpuidle: menu governor: change the early break condition

    Change the C-state early break out algorithm in menu governor.

    We only look at early breakouts that result in wakeups shorter than idle
    state's target_residency.  If such a breakout is frequent enough, eliminate
    the particular idle state upto a timeout period.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 45a42095cf64b003b4a69be3ce7f434f97d7af51
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:35:38 2007 -0400

    cpuidle: fix uninitialized variable in sysfs routine

    Fix the uninitialized usage of ret.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 80dca7cdba3e6ee13eae277660873ab9584eb3be
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:34:16 2007 -0400

    cpuidle: reenable /proc/acpi//power interface for the time being

    Keep /proc/acpi/processor/CPU*/power around for a while as powertop depends
    on it. It will be marked deprecated and removed in future. powertop can use
    cpuidle interfaces instead.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 589c37c2646c5e3813a51255a5ee1159cb4c33fc
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Jul 3 00:32:37 2007 -0400

    cpuidle: menu governor and hrtimer compile fix

    Compile fix for menu governor.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0ba80bd9ab3ed304cb4f19b722e4cc6740588b5e
Author: Len Brown <len.brown@intel.com>
Date:   Thu May 31 22:51:43 2007 -0400

    cpuidle: build fix - cpuidle vs ipw2100 module

    ERROR: "acpi_set_cstate_limit" [drivers/net/wireless/ipw2100.ko] undefined!

    Signed-off-by: Len Brown <len.brown@intel.com>

commit d7d8fa7f96a7f7682be7c6cc0cc53fa7a18c3b58
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:47:07 2007 -0400

    cpuidle: add the 'menu' governor

    Here is my first take at implementing an idle PM governor that takes
    full advantage of NO_HZ.  I call it the 'menu' governor because it
    considers the full list of idle states before each entry.

    I've kept the implementation fairly simple.  It attempts to guess the
    next residency time and then chooses a state that would meet at least
    the break-even point between power savings and entry cost.  To this end,
    it selects the deepest idle state that satisfies the following
    constraints:
         1. If the idle time elapsed since bus master activity was detected
            is below a threshold (currently 20 ms), then limit the selection
            to C2-type or above.
         2. Do not choose a state with a break-even residency that exceeds
            the expected time remaining until the next timer interrupt.
         3. Do not choose a state with a break-even residency that exceeds
            the elapsed time between the last pair of break events,
            excluding timer interrupts.

    This governor has an advantage over "ladder" governor because it
    proactively checks how much time remains until the next timer interrupt
    using the tick infrastructure.  Also, it handles device interrupt
    activity more intelligently by not including timer interrupts in break
    event calculations.  Finally, it doesn't make policy decisions using the
    number of state entries, which can have variable residency times (NO_HZ
    makes these potentially very large), and instead only considers sleep
    time deltas.

    The menu governor can be selected during runtime using the cpuidle sysfs
    interface like so:
    "echo "menu" > /sys/devices/system/cpu/cpuidle/current_governor"

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit a4bec7e65aa3b7488b879d971651cc99a6c410fe
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:47:03 2007 -0400

    cpuidle: export time until next timer interrupt using NO_HZ

    Expose information about the time remaining until the next
    timer interrupt expires by utilizing the dynticks infrastructure.
    Also modify the main idle loop to allow dynticks to handle
    non-interrupt break events (e.g. DMA).  Finally, expose sleep ticks
    information to external code.  Thomas Gleixner is responsible for much
    of the code in this patch.  However, I've made some additional changes,
    so I'm probably responsible if there are any bugs or oversights :)

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 2929d8996fbc77f41a5ff86bb67cdde3ca7d2d72
Author: Adam Belay <abelay@novell.com>
Date:   Sat Mar 24 03:46:58 2007 -0400

    cpuidle: governor API changes

    This patch prepares cpuidle for the menu governor.  It adds an optional
    stage after idle state entry to give the governor an opportunity to
    check why the state was exited.  Also it makes sure the idle loop
    returns after each state entry, allowing the appropriate dynticks code
    to run.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 3a7fd42f9825c3b03e364ca59baa751bb350775f
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Apr 26 00:03:59 2007 -0700

    cpuidle: hang fix

    Prevent hang on x86-64, when ACPI processor driver is added as a module on
    a system that does not support C-states.

    x86-64 expects all idle handlers to enable interrupts before returning from
    idle handler.  This is due to enter_idle(), exit_idle() races.  Make
    cpuidle_idle_call() confirm to this when there is no pm_idle_old.

    Also, cpuidle look at the return values of attch_driver() and set
    current_driver to NULL if attach fails on all CPUs.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4893339a142afbd5b7c01ffadfd53d14746e858e
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:09 2007 +0800

    cpuidle: add support for max_cstate limit

    With CPUIDLE framework, the max_cstate (to limit max cpu c-state)
    parameter is ingored. Some systems require it to ignore C2/C3
    and some drivers like ipw require it too.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 43bbbbe1cb998cbd2df656f55bb3bfe30f30e7d1
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:13 2007 +0800

    cpuidle: add cpuidle_fore_redetect_devices API

    add cpuidle_force_redetect_devices API,
    which forces all CPU redetect idle states.
    Next patch will use it.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit d1edadd608f24836def5ec483d2edccfb37b1d19
Author: Shaohua Li <shaohua.li@intel.com>
Date:   Thu Apr 26 10:40:01 2007 +0800

    cpuidle: fix sysfs related issue

    Fix the cpuidle sysfs issue.
    a. make kobject dynamicaly allocated
    b. fixed sysfs init issue to avoid suspend/resume issue

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 7169a5cc0d67b263978859672e86c13c23a5570d
Author: Randy Dunlap <randy.dunlap@oracle.com>
Date:   Wed Mar 28 22:52:53 2007 -0400

    cpuidle: 1-bit field must be unsigned

    A 1-bit bitfield has no room for a sign bit.
    drivers/cpuidle/governors/ladder.c:54:16: error: dubious bitfield without explicit `signed' or `unsigned'

    Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 4658620158dc2fbd9e4bcb213c5b6fb5d05ba7d4
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed Mar 28 22:52:41 2007 -0400

    cpuidle: fix boot hang

    Patch for cpuidle boot hang reported by Larry Finger here.
    http://www.ussg.iu.edu/hypermail/linux/kernel/0703.2/2025.html

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Larry Finger <larry.finger@lwfinger.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit c17e168aa6e5fe3851baaae8df2fbc1cf11443a9
Author: Len Brown <len.brown@intel.com>
Date:   Wed Mar 7 04:37:53 2007 -0500

    cpuidle: ladder does not depend on ACPI

    build fix for CONFIG_ACPI=n

    In file included from drivers/cpuidle/governors/ladder.c:21:
    include/acpi/processor.h:88: error: expected specifier-qualifier-list before ‘acpi_integer’
    include/acpi/processor.h:106: error: expected specifier-qualifier-list before ‘acpi_integer’
    include/acpi/processor.h:168: error: expected specifier-qualifier-list before ‘acpi_handle’

    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8c91d958246bde68db0c3f0c57b535962ce861cb
Author: Adrian Bunk <bunk@stusta.de>
Date:   Tue Mar 6 02:29:40 2007 -0800

    cpuidle: make code static

    This patch makes the following needlessly global code static:
    - driver.c: __cpuidle_find_driver()
    - governor.c: __cpuidle_find_governor()
    - ladder.c: struct ladder_governor

    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Adam Belay <abelay@novell.com>
    Cc: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 0c39dc3187094c72c33ab65a64d2017b21f372d2
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed Mar 7 02:38:22 2007 -0500

    cpu_idle: fix build break

    This patch fixes a build breakage with !CONFIG_HOTPLUG_CPU and
    CONFIG_CPU_IDLE.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adrian Bunk <bunk@stusta.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 8112e3b115659b07df340ef170515799c0105f82
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Tue Mar 6 02:29:39 2007 -0800

    cpuidle: build fix for !CPU_IDLE

    Fix the compile issues when CPU_IDLE is not configured.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Cc: Adam Belay <abelay@novell.com>
    Cc: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 1eb4431e9599cd25e0d9872f3c2c8986821839dd
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:54:57 2007 -0800

    cpuidle take2: Basic documentation for cpuidle

    Documentation for cpuidle infrastructure

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit ef5f15a8b79123a047285ec2e3899108661df779
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:54:03 2007 -0800

    cpuidle take2: Hookup ACPI C-states driver with cpuidle

    Hookup ACPI C-states onto generic cpuidle infrastructure.

    drivers/acpi/procesor_idle.c is now a ACPI C-states driver that registers as
    a driver in cpuidle infrastructure and the policy part is removed from
    drivers/acpi/processor_idle.c. We use governor in cpuidle instead.

    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 987196fa82d4db52c407e8c9d5dec884ba602183
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Thu Feb 22 13:52:57 2007 -0800

    cpuidle take2: Core cpuidle infrastructure

    Announcing 'cpuidle', a new CPU power management infrastructure to manage
    idle CPUs in a clean and efficient manner.
    cpuidle separates out the drivers that can provide support for multiple types
    of idle states and policy governors that decide on what idle state to use
    at run time.
    A cpuidle driver can support multiple idle states based on parameters like
    varying power consumption, wakeup latency, etc (ACPI C-states for example).
    A cpuidle governor can be usage model specific (laptop, server,
    laptop on battery etc).
    Main advantage of the infrastructure being, it allows independent development
    of drivers and governors and allows for better CPU power management.

    A huge thanks to Adam Belay and Shaohua Li who were part of this mini-project
    since its beginning and are greatly responsible for this patchset.

    This patch:

    Core cpuidle infrastructure.
    Introduces a new abstraction layer for cpuidle:
    * which manages drivers that can support multiple idles states. Drivers
      can be generic or particular to specific hardware/platform
    * allows pluging in multiple policy governors that can take idle state policy
      decision
    * The core also has a set of sysfs interfaces with which administrato can know
      about supported drivers and governors and switch them at run time.

    Signed-off-by: Adam Belay <abelay@novell.com>
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

Signed-off-by: Len Brown <len.brown@intel.com>
2007-10-10 00:12:41 -04:00