linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Srivatsa S. Bhat	cedb70afd0	cpufreq: Split __cpufreq_remove_dev() into two parts During CPU offline, the cpufreq core invokes __cpufreq_remove_dev() to perform work such as stopping the cpufreq governor, clearing the CPU from the policy structure etc, and finally cleaning up the kobject. There are certain subtle issues related to the kobject cleanup, and it would be much easier to deal with them if we separate that part from the rest of the cleanup-work in the CPU offline phase. So split the __cpufreq_remove_dev() function into 2 parts: one that handles the kobject cleanup, and the other that handles the rest of the work. Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-10 02:49:46 +02:00
Viresh Kumar	19c763031a	cpufreq: serialize calls to __cpufreq_governor() We can't take a big lock around __cpufreq_governor() as this causes recursive locking for some cases. But calls to this routine must be serialized for every policy. Otherwise we can see some unpredictable events. For example, consider following scenario: __cpufreq_remove_dev() __cpufreq_governor(policy, CPUFREQ_GOV_STOP); policy->governor->governor(policy, CPUFREQ_GOV_STOP); cpufreq_governor_dbs() case CPUFREQ_GOV_STOP: mutex_destroy(&cpu_cdbs->timer_mutex) cpu_cdbs->cur_policy = NULL; <PREEMPT> store() __cpufreq_set_policy() __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS); policy->governor->governor(policy, CPUFREQ_GOV_LIMITS); case CPUFREQ_GOV_LIMITS: mutex_lock(&cpu_cdbs->timer_mutex); <-- Warning (destroyed mutex) if (policy->max < cpu_cdbs->cur_policy->cur) <- cur_policy == NULL And so store() will eventually result in a crash if cur_policy is NULL at this point. Introduce an additional variable which would guarantee serialization here. Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-10 02:49:46 +02:00
Viresh Kumar	f73d393384	cpufreq: don't allow governor limits to be changed when it is disabled __cpufreq_governor() returns with -EBUSY when governor is already stopped and we try to stop it again, but when it is stopped we must not allow calls to CPUFREQ_GOV_LIMITS event as well. This patch adds this check in __cpufreq_governor(). Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-10 02:49:45 +02:00
Li Zhong	5025d628c8	cpufreq: fix bad unlock balance on !CONFIG_SMP This patch tries to fix lockdep complaint attached below. It seems that we should always read acquire the cpufreq_rwsem, whether CONFIG_SMP is enabled or not. And CONFIG_HOTPLUG_CPU depends on CONFIG_SMP, so it seems we don't need CONFIG_SMP for the code enabled by CONFIG_HOTPLUG_CPU. [ 0.504191] ===================================== [ 0.504627] [ BUG: bad unlock balance detected! ] [ 0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted [ 0.504627] ------------------------------------- [ 0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at: [ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] but there are no more locks to release! [ 0.504627] [ 0.504627] other info that might help us debug this: [ 0.504627] 1 lock held by swapper/1: [ 0.504627] #0: (subsys mutex#4){+.+.+.}, at: [<ffffffff8134a7bf>] subsys_interface_register+0x4f/0xe0 [ 0.504627] [ 0.504627] stack backtrace: [ 0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1 [ 0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 0.504627] ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8 [ 0.504627] ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a [ 0.504627] 00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006 [ 0.504627] Call Trace: [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff814c062b>] dump_stack+0x19/0x1b [ 0.504627] [<ffffffff81098bce>] print_unlock_imbalance_bug+0xfe/0x110 [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8109d0be>] lock_release_non_nested+0x1ee/0x310 [ 0.504627] [<ffffffff81099d0e>] ? mark_held_locks+0xae/0x120 [ 0.504627] [<ffffffff811510cb>] ? kfree+0xcb/0x1d0 [ 0.504627] [<ffffffff813d77ea>] ? cpufreq_policy_free+0x4a/0x60 [ 0.504627] [<ffffffff813d927a>] ? cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8109d2a4>] lock_release+0xc4/0x250 [ 0.504627] [<ffffffff8106c9f3>] up_read+0x23/0x40 [ 0.504627] [<ffffffff813d927a>] cpufreq_add_dev+0x13a/0x3e0 [ 0.504627] [<ffffffff8134a809>] subsys_interface_register+0x99/0xe0 [ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12 [ 0.504627] [<ffffffff813d7f0d>] cpufreq_register_driver+0x9d/0x1d0 [ 0.504627] [<ffffffff81b19f3b>] ? cpufreq_gov_dbs_init+0x12/0x12 [ 0.504627] [<ffffffff81b1a039>] acpi_cpufreq_init+0xfe/0x1f8 [ 0.504627] [<ffffffff810002ba>] do_one_initcall+0xda/0x180 [ 0.504627] [<ffffffff81ae301e>] kernel_init_freeable+0x12c/0x1bb [ 0.504627] [<ffffffff81ae2841>] ? do_early_param+0x8c/0x8c [ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140 [ 0.504627] [<ffffffff814b4dde>] kernel_init+0xe/0xf0 [ 0.504627] [<ffffffff814d029a>] ret_from_fork+0x7a/0xb0 [ 0.504627] [<ffffffff814b4dd0>] ? rest_init+0x140/0x140 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-and-tested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-21 02:04:31 +02:00
Viresh Kumar	1b27429446	cpufreq: Use cpufreq_policy_list for iterating over policies To iterate over all policies we currently iterate over all online CPUs and then get the policy for each of them which is suboptimal. Use the newly created cpufreq_policy_list for this purpose instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-20 15:43:50 +02:00
Viresh Kumar	474deff744	cpufreq: remove cpufreq_policy_cpu per-cpu variable cpufreq_policy_cpu per-cpu variables are used for storing the ID of the CPU that manages the given CPU's policy. However, we also store a policy pointer for each cpu in cpufreq_cpu_data, so the cpufreq_policy_cpu information is simply redundant. It is better to use cpufreq_cpu_data to retrieve a policy and get policy->cpu from there, so make that happen everywhere and drop the cpufreq_policy_cpu per-cpu variables which aren't necessary any more. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-20 15:43:50 +02:00
Viresh Kumar	9e9fd80167	cpufreq: remove unnecessary check in __cpufreq_governor() We don't need to check if event is CPUFREQ_GOV_POLICY_INIT and put governor module as we are sure event can only be START/STOP here. Remove the useless check. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-20 15:43:50 +02:00
Viresh Kumar	9515f4d69b	cpufreq: remove policy from cpufreq_policy_list during suspend cpufreq_policy_list is a list of active policies. We do remove policies from this list when all CPUs belonging to that policy are removed. But during system suspend we don't really free a policy struct as it will be used again during resume, so we didn't remove it from cpufreq_policy_list as well.. However, this is incorrect. We are saying this policy isn't valid anymore and must not be referenced (though we haven't freed it), but it can still be used by code that iterates over cpufreq_policy_list. Remove policy from this list during system suspend as well. Of course, we must add it back whenever the first CPU belonging to that policy shows up. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-20 15:43:50 +02:00
Viresh Kumar	edab2fbc21	cpufreq: Fix white space in __cpufreq_remove_dev() Align closing brace '}' of an if block. [rjw: Subject and changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-20 15:43:50 +02:00
Rafael J. Wysocki	878f6e074e	Revert "cpufreq: Use cpufreq_policy_list for iterating over policies" Revert commit `eb60852` (cpufreq: Use cpufreq_policy_list for iterating over policies), because it breaks system suspend/resume on multiple machines. It either causes resume to block indefinitely or causes the BUG_ON() in lock_policy_rwsem_##mode() to trigger on sysfs accesses to cpufreq attributes. Conflicts: drivers/cpufreq/cpufreq.c	2013-08-18 15:35:59 +02:00
Viresh Kumar	3de9bdeb28	cpufreq: improve error checking on return values of __cpufreq_governor() The __cpufreq_governor() function can fail in rare cases especially if there are bugs in cpufreq drivers. Thus we must stop processing as soon as this routine fails, otherwise it may result in undefined behavior. This patch adds error checking code whenever this routine is called from any place. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-10 03:24:48 +02:00
Viresh Kumar	6eed9404ab	cpufreq: Use rwsem for protecting critical sections Critical sections of the cpufreq core are protected with the help of the driver module owner's refcount, which isn't the correct approach, because it causes rmmod to return an error when some routine has updated that refcount. Let's use rwsem for this purpose instead. Only cpufreq_unregister_driver() will use write sem and everybody else will use read sem. [rjw: Subject & changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-10 03:24:47 +02:00
Viresh Kumar	fe492f3f03	cpufreq: Fix broken usage of governor->owner's refcount The cpufreq governor owner refcount usage is broken. We should only increment that refcount when a CPUFREQ_GOV_POLICY_INIT event has come and it should only be decremented if CPUFREQ_GOV_POLICY_EXIT has come. Currently, there can be situations where the governor is in use, but we have allowed it to be unloaded which may result in undefined behavior. Let's fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-10 03:24:47 +02:00
Viresh Kumar	eb608521f1	cpufreq: Use cpufreq_policy_list for iterating over policies To iterate over all policies we currently iterate over all CPUs and then get the policy for each of them. Let's use the newly created cpufreq_policy_list for this purpose. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-10 03:24:46 +02:00
Lukasz Majewski	c88a1f8b96	cpufreq: Store cpufreq policies in a list Policies available in the cpufreq framework are now linked together. They are accessible via cpufreq_policy_list defined in the cpufreq core. [rjw: Fix from Yinghai Lu folded in] Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Signed-off-by: Myungjoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-10 03:24:06 +02:00
Viresh Kumar	d5b73cd870	cpufreq: Use sizeof(ptr) convetion for computing sizes Chapter 14 of Documentation/CodingStyle says: The preferred form for passing a size of a struct is the following: p = kmalloc(sizeof(p), ...); The alternative form where struct name is spelled out hurts readability and introduces an opportunity for a bug when the pointer variable type is changed but the corresponding sizeof that is passed to a memory allocator is not. This wasn't followed consistently in drivers/cpufreq, let's make it more consistent by always following this rule. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:34:10 +02:00
Viresh Kumar	3a3e9e06d0	cpufreq: Give consistent names to cpufreq_policy objects They are called policy, cur_policy, new_policy, data, etc. Just call them policy wherever possible. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:34:10 +02:00
Viresh Kumar	5ff0a26803	cpufreq: Clean up header files included in the core This patch addresses the following issues in the header files in the cpufreq core: - Include headers in ascending order, so that we don't add same many times by mistake. - <asm/> must be included after <linux/>, so that they override whatever they need to. - Remove unnecessary includes. - Don't include files already included by cpufreq.h or cpufreq_governor.h. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:34:09 +02:00
Rafael J. Wysocki	1133bfa6dc	Merge branch 'pm-cpufreq-ondemand' into pm-cpufreq * pm-cpufreq: cpufreq: Remove unused function __cpufreq_driver_getavg() cpufreq: Remove unused APERF/MPERF support cpufreq: ondemand: Change the calculation of target frequency	2013-08-07 23:11:43 +02:00
Viresh Kumar	d8d3b47112	cpufreq: Pass policy to cpufreq_add_policy_cpu() The caller of cpufreq_add_policy_cpu() already has a pointer to the policy structure and there is no need to look it up again in cpufreq_add_policy_cpu(). Let's pass it directly. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:50 +02:00
Rafael J. Wysocki	10659ab7b5	cpufreq: Avoid double kobject_put() for the same kobject in error code path The only case triggering a jump to the err_out_unregister label in __cpufreq_add_dev() is when cpufreq_add_dev_interface() fails. However, if cpufreq_add_dev_interface() fails, it calls kobject_put() for the policy kobject in its error code path and since that causes the kobject's refcount to become 0, the additional kobject_put() for the same kobject under err_out_unregister and the wait_for_completion() following it are pointless, so drop them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2013-08-07 23:02:50 +02:00
Rafael J. Wysocki	71c3461ef7	cpufreq: Do not hold driver module references for additional policy CPUs The cpufreq core is a little inconsistent in the way it uses the driver module refcount. Namely, if __cpufreq_add_dev() is called for a CPU that doesn't share the policy object with any other CPUs, the driver module refcount it grabs to start with will be dropped by it before returning and will be equal to whatever it had been before that function was invoked. However, if the given CPU does share the policy object with other CPUs, either cpufreq_add_policy_cpu() is called to link the new CPU to the existing policy, or cpufreq_add_dev_symlink() is used to link the other CPUs sharing the policy with it to the just created policy object. In that case, because both cpufreq_add_policy_cpu() and cpufreq_add_dev_symlink() call cpufreq_cpu_get() for the given policy (the latter possibly many times) without the balancing cpufreq_cpu_put() (unless there is an error), the driver module refcount will be left by __cpufreq_add_dev() with a nonzero value (different from the initial one). To remove that inconsistency make cpufreq_add_policy_cpu() execute cpufreq_cpu_put() for the given policy before returning, which decrements the driver module refcount so that it will be equal to its initial value after __cpufreq_add_dev() returns. Also remove the cpufreq_cpu_get() call from cpufreq_add_dev_symlink(), since both the policy refcount and the driver module refcount are nonzero when it is called and they don't need to be bumped up by it. Accordingly, drop the cpufreq_cpu_put() from __cpufreq_remove_dev(), since it is only necessary to balance the cpufreq_cpu_get() called by cpufreq_add_policy_cpu() or cpufreq_add_dev_symlink(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2013-08-07 23:02:50 +02:00
Viresh Kumar	308b60e715	cpufreq: Don't pass CPU to cpufreq_add_dev_{symlink\|interface}() Pointer to struct cpufreq_policy is already passed to these routines and we don't need to send policy->cpu to them as well. So, get rid of this extra argument and use policy->cpu everywhere. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:49 +02:00
Viresh Kumar	e8fdde1011	cpufreq: Remove extra variables from cpufreq_add_dev_symlink() We call cpufreq_cpu_get() in cpufreq_add_dev_symlink() to increase usage refcount of policy, but not to get a policy for the given CPU. So, we don't really need to capture the return value of this routine. We can simply use policy passed as an argument to cpufreq_add_dev_symlink(). Moreover debug print is rewritten to make it more clear. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:49 +02:00
Srivatsa S. Bhat	5302c3fb2e	cpufreq: Perform light-weight init/teardown during suspend/resume Now that we have the infrastructure to perform a light-weight init/tear-down, use that in the cpufreq CPU hotplug notifier when invoked from the suspend/resume path. This also ensures that the file permissions of the cpufreq sysfs files are preserved across suspend/resume, something which commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume) originally intended to do, but had to be reverted due to other problems. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:49 +02:00
Srivatsa S. Bhat	8414809c6a	cpufreq: Preserve policy structure across suspend/resume To perform light-weight cpu-init and teardown in the cpufreq subsystem during suspend/resume, we need to separate out the 2 main functionalities of the cpufreq CPU hotplug callbacks, as outlined below: 1. Init/tear-down of core cpufreq and CPU-specific components, which are critical to the correct functioning of the cpufreq subsystem. 2. Init/tear-down of cpufreq sysfs files during suspend/resume. The first part requires accurate updates to the policy structure such as its ->cpus and ->related_cpus masks, whereas the second part requires that the policy->kobj structure is not released or re-initialized during suspend/resume. To handle both these requirements, we need to allow updates to the policy structure throughout suspend/resume, but prevent the structure from getting freed up. Also, we must have a mechanism by which the cpu-up callbacks can restore the policy structure, without allocating things afresh. (That also helps avoid memory leaks). To achieve this, we use 2 schemes: a. Use a fallback per-cpu storage area for preserving the policy structures during suspend, so that they can be restored during resume appropriately. b. Use the 'frozen' flag to determine when to free or allocate the policy structure vs when to restore the policy from the saved fallback storage. Thus we can successfully preserve the structure across suspend/resume. Effectively, this helps us complete the separation of the 'light-weight' and the 'full' init/tear-down sequences in the cpufreq subsystem, so that this can be made use of in the suspend/resume scenario. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:49 +02:00
Srivatsa S. Bhat	a82fab2928	cpufreq: Introduce a flag ('frozen') to separate full vs temporary init/teardown During suspend/resume we would like to do a light-weight init/teardown of CPUs in the cpufreq subsystem and preserve certain things such as sysfs files etc across suspend/resume transitions. Add a flag called 'frozen' to help distinguish the full init/teardown sequence from the light-weight one. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:48 +02:00
Srivatsa S. Bhat	f9ba680d23	cpufreq: Extract the handover of policy cpu to a helper function During cpu offline, when the policy->cpu is going down, some other CPU present in the policy->cpus mask is nominated as the new policy->cpu. Extract this functionality from __cpufreq_remove_dev() and implement it in a helper function. This helps in upcoming code reorganization. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:48 +02:00
Srivatsa S. Bhat	e18f1682bc	cpufreq: Extract non-interface related stuff from cpufreq_add_dev_interface cpufreq_add_dev_interface() includes the work of exposing the interface to the device, as well as a lot of unrelated stuff. Move the latter to cpufreq_add_dev(), where it is more appropriate. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:48 +02:00
Srivatsa S. Bhat	e9698cc5d2	cpufreq: Add helper to perform alloc/free of policy structure Separate out the allocation of the cpufreq policy structure (along with its error handling) to a helper function. This makes the code easier to read and also helps with some upcoming code reorganization. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:48 +02:00
Srivatsa S. Bhat	23d328994b	cpufreq: Fix misplaced call to cpufreq_update_policy() The call to cpufreq_update_policy() is placed in the CPU hotplug callback of cpufreq_stats, which has a higher priority than the CPU hotplug callback of cpufreq-core. As a result, during CPU_ONLINE/CPU_ONLINE_FROZEN, we end up calling cpufreq_update_policy() before calling cpufreq_add_dev() ! And for uninitialized CPUs, it just returns silently, not doing anything. To add to that, cpufreq_stats is not even the right place to call cpufreq_update_policy() to begin with. The cpufreq core ought to handle this in its own callback, from an elegance/relevance perspective. So move the invocation of cpufreq_update_policy() to cpufreq_cpu_callback, and place it after cpufreq_add_dev(). Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-08-07 23:02:48 +02:00
Rafael J. Wysocki	2a99859932	cpufreq: Fix cpufreq driver module refcount balance after suspend/resume Since cpufreq_cpu_put() called by __cpufreq_remove_dev() drops the driver module refcount, __cpufreq_remove_dev() causes that refcount to become negative for the cpufreq driver after a suspend/resume cycle. This is not the only bad thing that happens there, however, because kobject_put() should only be called for the policy kobject at this point if the CPU is not the last one for that policy. Namely, if the given CPU is the last one for that policy, the policy kobject's refcount should be 1 at this point, as set by cpufreq_add_dev_interface(), and only needs to be dropped once for the kobject to go away. This actually happens under the cpu == 1 check, so it need not be done before by cpufreq_cpu_put(). On the other hand, if the given CPU is not the last one for that policy, this means that cpufreq_add_policy_cpu() has been called at least once for that policy and cpufreq_cpu_get() has been called for it too. To balance that cpufreq_cpu_get(), we need to call cpufreq_cpu_put() in that case. Thus, to fix the described problem and keep the reference counters balanced in both cases, move the cpufreq_cpu_get() call in __cpufreq_remove_dev() to the code path executed only for CPUs that share the policy with other CPUs. Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: 3.10+ <stable@vger.kernel.org>	2013-07-30 00:32:00 +02:00
Stratos Karafotis	cffe4e0e74	cpufreq: Remove unused function __cpufreq_driver_getavg() The target frequency calculation method in the ondemand governor has changed and it is now independent of the measured average frequency. Consequently, the __cpufreq_driver_getavg() function and getavg member of struct cpufreq_driver are not used any more, so drop them. [rjw: Changelog] Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-07-26 01:06:44 +02:00
Linus Torvalds	b7356abb9f	Power management and ACPI fixes for 3.11-rc2 - Two cpufreq commits from the 3.10 cycle introduced regressions. The first of them was buggy (it did way much more than it needed to do) and the second one attempted to fix an issue introduced by the first one. Fixes from Srivatsa S Bhat revert both. - If autosleep triggers during system shutdown and the shutdown callbacks of some device drivers have been called already, it may crash the system. Fix from Liu Shuo prevents that from happening by making try_to_suspend() check system_state. - The ACPI memory hotplug driver doesn't clear its driver_data on errors which may cause a NULL poiter dereference to happen later. Fix from Toshi Kani. - The ACPI namespace scanning code should not try to attach scan handlers to device objects that have them already, which may confuse things quite a bit, and it should rescan the whole namespace branch starting at the given node after receiving a bus check notify event even if the device at that particular node has been discovered already. Fixes from Rafael J Wysocki. - New ACPI video blacklist entry for a system whose initial backlight setting from the BIOS doesn't make sense. From Lan Tianyu. - Garbage string output avoindance for ACPI PNP from Liu Shuo. - Two Kconfig fixes for issues introduced recently in the s3c24xx cpufreq driver (when moving the driver to drivers/cpufreq) from Paul Bolle. - Trivial comment fix in pm_wakeup.h from Chanwoo Choi. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJR6Sq+AAoJEKhOf7ml8uNsrGQP/0HRDW+QmTGM8znDTHXngbn9 X3pqlpjEOiCtcmJaSJlD7GwLHMscwWcHKEezteaZ7KUI4mcnysJX6EY5YVbNriDC xlLcDQn9c6Xx1maCSfp+WMygvqItxZwuc8veRjrT3XtOfCNWS/FlX40Voh63BCAe GbfQ/HesmUg5CKplyD8/XypLWh5OFXmHzCe8IhrKGfhsZukXdSgSBjwQZMRrEMsQ kJjDCF8zUu0JisiWqL+xE6IFSKme9i6LBlHpzU0Y1g4RqAqkIbuS0Z3vezOYzoTD oZjBNa9XAgCS3x0l5g3G0ChgDAU+Mpji/imXA7JysrwbirGFbtPHtQYh2HzpAtnw Hkah/0ocBM7/w7VTsUQiRsFPdIJTCBLlm6J38x8yh7n84h4nJgOpK69dBLrMwCUZ f3kid6KIPVLBvnC3QSULrCAKUcUcVVWYtNho+sfXBMjP+cPwTmc3DvATnpru6twa 0KjR5o585UOcciq7EWAoMrCFCfZYF5C4XGaZAxHI/SWooxeCQH84S8vfNLL2epVC ixmLYo4X2ANDsnfbUV+ewhB0/L2905Et6NhPUgPD/1rm15MEZbowbB2K0pzr0QL9 /1hTL61InXx3jLxducJJFKN+HZ0zfDQdTkyafKrR9jb+GsdmnzYJ/vnfDG8MfPjp GZ281YBqVmUeYJh5CPU+ =IUmn -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are fixes collected over the last week, most importnatly two cpufreq reverts fixing regressions introduced in 3.10, an autoseelp fix preventing systems using it from crashing during shutdown and two ACPI scan fixes related to hotplug. Specifics: - Two cpufreq commits from the 3.10 cycle introduced regressions. The first of them was buggy (it did way much more than it needed to do) and the second one attempted to fix an issue introduced by the first one. Fixes from Srivatsa S Bhat revert both. - If autosleep triggers during system shutdown and the shutdown callbacks of some device drivers have been called already, it may crash the system. Fix from Liu Shuo prevents that from happening by making try_to_suspend() check system_state. - The ACPI memory hotplug driver doesn't clear its driver_data on errors which may cause a NULL poiter dereference to happen later. Fix from Toshi Kani. - The ACPI namespace scanning code should not try to attach scan handlers to device objects that have them already, which may confuse things quite a bit, and it should rescan the whole namespace branch starting at the given node after receiving a bus check notify event even if the device at that particular node has been discovered already. Fixes from Rafael J Wysocki. - New ACPI video blacklist entry for a system whose initial backlight setting from the BIOS doesn't make sense. From Lan Tianyu. - Garbage string output avoindance for ACPI PNP from Liu Shuo. - Two Kconfig fixes for issues introduced recently in the s3c24xx cpufreq driver (when moving the driver to drivers/cpufreq) from Paul Bolle. - Trivial comment fix in pm_wakeup.h from Chanwoo Choi" * tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: ignore BIOS initial backlight value for Fujitsu E753 PNP / ACPI: avoid garbage in resource name cpufreq: Revert commit `2f7021a8` to fix CPU hotplug regression cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS PM / Sleep: Fix comment typo in pm_wakeup.h PM / Sleep: avoid 'autosleep' in shutdown progress cpufreq: Revert commit a66b2e to fix suspend/resume regression ACPI / memhotplug: Fix a stale pointer in error path ACPI / scan: Always call acpi_bus_scan() for bus check notifications ACPI / scan: Do not try to attach scan handlers to devices having them	2013-07-19 09:59:06 -07:00
Paul Gortmaker	2760984f65	cpufreq: delete __cpuinit usage from all cpufreq files The __cpuinit type of throwaway sections might have made sense some time ago when RAM was more constrained, but now the savings do not offset the cost and complications. For example, the fix in commit `5e427ec2d0` ("x86: Fix bit corruption at CPU resume time") is a good example of the nasty type of bugs that can be created with improper use of the various __init prefixes. After a discussion on LKML[1] it was decided that cpuinit should go the way of devinit and be phased out. Once all the users are gone, we can then finally remove the macros themselves from linux/init.h. This removes all the drivers/cpufreq uses of the __cpuinit macros from all C files. [1] https://lkml.org/lkml/2013/5/20/589 [v2: leave 2nd lines of args misaligned as requested by Viresh] Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: cpufreq@vger.kernel.org Cc: linux-pm@vger.kernel.org Acked-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2013-07-14 19:36:57 -04:00
Srivatsa S. Bhat	aae760ed21	cpufreq: Revert commit a66b2e to fix suspend/resume regression commit a66b2e (cpufreq: Preserve sysfs files across suspend/resume) has unfortunately caused several things in the cpufreq subsystem to break subtly after a suspend/resume cycle. The intention of that patch was to retain the file permissions of the cpufreq related sysfs files across suspend/resume. To achieve that, the commit completely removed the calls to cpufreq_add_dev() and __cpufreq_remove_dev() during suspend/resume transitions. But the problem is that those functions do 2 kinds of things: 1. Low-level initialization/tear-down that are critical to the correct functioning of cpufreq-core. 2. Kobject and sysfs related initialization/teardown. Ideally we should have reorganized the code to cleanly separate these two responsibilities, and skipped only the sysfs related parts during suspend/resume. Since we skipped the entire callbacks instead (which also included some CPU and cpufreq-specific critical components), cpufreq subsystem started behaving erratically after suspend/resume. So revert the commit to fix the regression. We'll revisit and address the original goal of that commit separately, since it involves quite a bit of careful code reorganization and appears to be non-trivial. (While reverting the commit, note that another commit `f51e1eb` (cpufreq: Fix cpufreq regression after suspend/resume) already reverted part of the original set of changes. So revert only the remaining ones). Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Paul Bolle <pebolle@tiscali.nl> Cc: 3.10+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-07-15 01:31:36 +02:00
Viresh Kumar	266c13d767	cpufreq: Fix serialization of frequency transitions Commit 7c30ed ("cpufreq: make sure frequency transitions are serialized") interacts poorly with systems that have a single core freqency for all cores. On such systems we have a single policy for all cores with several CPUs. When we do a frequency transition the governor calls the pre and post change notifiers which causes cpufreq_notify_transition() per CPU. Since the policy is the same for all of them all CPUs after the first and the warnings added are generated by checking a per-policy flag the warnings will be triggered for all cores after the first. Fix this by allowing notifier to be called for n times. Where n is the number of cpus in policy->cpus. Reported-and-tested-by: Mark Brown <broonie@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-07-04 13:12:44 +02:00
Lan Tianyu	f4fd379784	acpi-cpufreq: Add new sysfs attribute freqdomain_cpus Commits `fcf8058` (cpufreq: Simplify cpufreq_add_dev()) and `aa77a52` (cpufreq: acpi-cpufreq: Don't set policy->related_cpus from .init()) changed the contents of the "related_cpus" sysfs attribute on systems where acpi-cpufreq is used and user space can't get the list of CPUs which are in the same hardware coordination CPU domain (provided by the ACPI AML method _PSD) via "related_cpus" any more. To make up for that loss add a new sysfs attribute "freqdomian_cpus" for the acpi-cpufreq driver which exposes the list of CPUs in the same domain regardless of whether it is coordinated by hardware or software. [rjw: Changelog, documentation] References: https://bugzilla.kernel.org/show_bug.cgi?id=58761 Reported-by: Jean-Philippe Halimi <jean-philippe.halimi@exascale-computing.eu> Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-27 21:51:09 +02:00
Viresh Kumar	7c30ed532c	cpufreq: make sure frequency transitions are serialized Whenever we are changing frequency of a cpu, we are calling PRECHANGE and POSTCHANGE notifiers. They must be serialized. i.e. PRECHANGE or POSTCHANGE shouldn't be called twice contiguously. This can happen due to bugs in users of __cpufreq_driver_target() or actual cpufreq drivers who are sending these notifiers. This patch adds some protection against this. Now, we keep track of the last transaction and see if something went wrong. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-27 21:49:55 +02:00
Viresh Kumar	0956df9c84	cpufreq: make __cpufreq_notify_transition() static __cpufreq_notify_transition() is used only in cpufreq.c, make it static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-21 01:08:16 +02:00
Viresh Kumar	bb176f7d03	cpufreq: Fix minor formatting issues There were a few noticeable formatting issues in core cpufreq code. This cleans them up to make code look better. The changes include: - Whitespace cleanup. - Rearrangements of code. - Multiline comments fixes. - Formatting changes to fit 80 columns. Copyright information in cpufreq.c is also updated to include my name for 2013. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-21 01:06:34 +02:00
Xiaoguang Chen	95731ebb11	cpufreq: Fix governor start/stop race condition Cpufreq governors' stop and start operations should be carried out in sequence. Otherwise, there will be unexpected behavior, like in the example below. Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked to CPU0. The normal sequence is: 1) Current governor is userspace. An application tries to set the governor to ondemand. It will call __cpufreq_set_policy() in which it will stop the userspace governor and then start the ondemand governor. 2) Current governor is userspace. The online of CPU3 runs on CPU0. It will call cpufreq_add_policy_cpu() in which it will first stop the userspace governor, and then start it again. If the sequence of the above two cases interleaves, it becomes: 1) Application stops userspace governor 2) Hotplug stops userspace governor which is a problem, because the governor shouldn't be stopped twice in a row. What happens next is: 3) Application starts ondemand governor 4) Hotplug starts a governor In step 4, the hotplug is supposed to start the userspace governor, but now the governor has been changed by the application to ondemand, so the ondemand governor is started once again, which is incorrect. The solution is to prevent policy governors from being stopped multiple times in a row. A governor should only be stopped once for one policy. After it has been stopped, no more governor stop operations should be executed. Also add a mutex to serialize governor operations. [rjw: Changelog. And you owe me a beverage of my choice.] Signed-off-by: Xiaoguang Chen <chenxg@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-21 00:56:04 +02:00
Viresh Kumar	a262e94cdc	cpufreq: remove unnecessary cpufreq_cpu_{get\|put}() calls struct cpufreq_policy is already passed as argument to some routines like: __cpufreq_driver_getavg() and so we don't really need to do cpufreq_cpu_get() before and cpufreq_cpu_put() in them to get a policy structure. Remove them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-06-04 14:26:27 +02:00
Viresh Kumar	2361be2366	cpufreq: Don't create empty /sys/devices/system/cpu/cpufreq directory When we don't have any file in cpu/cpufreq directory we shouldn't create it. Specially with the introduction of per-policy governor instance patchset, even governors are moved to cpu/cpu*/cpufreq/governor-name directory and so this directory is just not required. Lets have it only when required. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:24:02 +02:00
Viresh Kumar	72a4ce340a	cpufreq: Move get_cpu_idle_time() to cpufreq.c Governors other than ondemand and conservative can also use get_cpu_idle_time() and they aren't required to compile cpufreq_governor.c. So, move these independent routines to cpufreq.c instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:20:56 +02:00
Viresh Kumar	944e9a0316	cpufreq: governors: Move get_governor_parent_kobj() to cpufreq.c get_governor_parent_kobj() can be used by any governor, generic cpufreq governors or platform specific ones and so must be present in cpufreq.c instead of cpufreq_governor.c. This patch moves it to cpufreq.c. This also adds EXPORT_SYMBOL_GPL(get_governor_parent_kobj) so that modules can use this function too. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:20:56 +02:00
Viresh Kumar	3f869d6d41	cpufreq: Add EXPORT_SYMBOL_GPL for have_governor_per_policy This patch adds: EXPORT_SYMBOL_GPL(have_governor_per_policy), so that this routine can be used by modules too. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-27 13:20:55 +02:00
Viresh Kumar	955ef48335	cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT With the rwsem lock around __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT), we get circular dependency when we call sysfs_remove_group(). ====================================================== [ INFO: possible circular locking dependency detected ] 3.9.0-rc7+ #15 Not tainted ------------------------------------------------------- cat/2387 is trying to acquire lock: (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [<c02f6179>] lock_policy_rwsem_read+0x25/0x34 but task is already holding lock: (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (s_active#41){++++.+}: [<c0055a79>] lock_acquire+0x61/0xbc [<c00fabf1>] sysfs_addrm_finish+0xc1/0x128 [<c00f9819>] sysfs_hash_and_remove+0x35/0x64 [<c00fbe6f>] remove_files.isra.0+0x1b/0x24 [<c00fbea5>] sysfs_remove_group+0x2d/0xa8 [<c02f9a0b>] cpufreq_governor_interactive+0x13b/0x35c [<c02f61df>] __cpufreq_governor+0x2b/0x8c [<c02f6579>] __cpufreq_set_policy+0xa9/0xf8 [<c02f6b75>] store_scaling_governor+0x61/0x100 [<c02f6f4d>] store+0x39/0x60 [<c00f9b81>] sysfs_write_file+0xed/0x114 [<c00b3fd1>] vfs_write+0x65/0xd8 [<c00b424b>] sys_write+0x2f/0x50 [<c000cdc1>] ret_fast_syscall+0x1/0x52 -> #0 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}: [<c0055253>] __lock_acquire+0xef3/0x13dc [<c0055a79>] lock_acquire+0x61/0xbc [<c03ee1f5>] down_read+0x25/0x30 [<c02f6179>] lock_policy_rwsem_read+0x25/0x34 [<c02f6edd>] show+0x21/0x58 [<c00f9c0f>] sysfs_read_file+0x67/0xcc [<c00b40a7>] vfs_read+0x63/0xd8 [<c00b41fb>] sys_read+0x2f/0x50 [<c000cdc1>] ret_fast_syscall+0x1/0x52 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(s_active#41); lock(&per_cpu(cpu_policy_rwsem, cpu)); lock(s_active#41); lock(&per_cpu(cpu_policy_rwsem, cpu)); * DEADLOCK * 2 locks held by cat/2387: #0: (&buffer->mutex){+.+.+.}, at: [<c00f9bcd>] sysfs_read_file+0x25/0xcc #1: (s_active#41){++++.+}, at: [<c00f9bf7>] sysfs_read_file+0x4f/0xcc stack backtrace: [<c0011d55>] (unwind_backtrace+0x1/0x9c) from [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) [<c03e9a09>] (print_circular_bug+0x19d/0x1e8) from [<c0055253>] (__lock_acquire+0xef3/0x13dc) [<c0055253>] (__lock_acquire+0xef3/0x13dc) from [<c0055a79>] (lock_acquire+0x61/0xbc) [<c0055a79>] (lock_acquire+0x61/0xbc) from [<c03ee1f5>] (down_read+0x25/0x30) [<c03ee1f5>] (down_read+0x25/0x30) from [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) [<c02f6179>] (lock_policy_rwsem_read+0x25/0x34) from [<c02f6edd>] (show+0x21/0x58) [<c02f6edd>] (show+0x21/0x58) from [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) [<c00f9c0f>] (sysfs_read_file+0x67/0xcc) from [<c00b40a7>] (vfs_read+0x63/0xd8) [<c00b40a7>] (vfs_read+0x63/0xd8) from [<c00b41fb>] (sys_read+0x2f/0x50) [<c00b41fb>] (sys_read+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52) This lock isn't required while calling __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). Remove it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-22 00:23:54 +02:00
Srivatsa S. Bhat	a66b2e503f	cpufreq: Preserve sysfs files across suspend/resume The file permissions of cpufreq per-cpu sysfs files are not preserved across suspend/resume because we internally go through the CPU Hotplug path which reinitializes the file permissions on CPU online. But the user is not supposed to know that we are using CPU hotplug internally within suspend/resume (IOW, the kernel should not silently wreck the user-set file permissions across a suspend cycle). Therefore, we need to preserve the file permissions as they are across suspend/resume. The simplest way to achieve that is to just not touch the sysfs files at all - ie., just ignore the CPU hotplug notifications in the suspend/resume path (_FROZEN) in the cpufreq hotplug callback. Reported-by: Robert Jarzmik <robert.jarzmik@intel.com> Reported-by: Durgadoss R <durgadoss.r@intel.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-15 21:47:17 +02:00
Viresh Kumar	d96038e0fa	cpufreq: Issue CPUFREQ_GOV_POLICY_EXIT notifier before dropping policy refcount We must call __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT) before calling cpufreq_cpu_put(data), so that policy kobject have valid fields. Otherwise, removing last online cpu of policy->cpus causes this crash for ondemand/conservative governor. [<c00fb076>] (sysfs_find_dirent+0xe/0xa8) from [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) [<c00fb1bd>] (sysfs_get_dirent+0x21/0x58) from [<c00fc259>] (sysfs_remove_group+0x85/0xbc) [<c00fc259>] (sysfs_remove_group+0x85/0xbc) from [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) [<c02faad9>] (cpufreq_governor_dbs+0x369/0x4a0) from [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) [<c02f66d7>] (__cpufreq_governor+0x2b/0x8c) from [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) [<c02f6893>] (__cpufreq_remove_dev.isra.12+0x15b/0x250) from [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) [<c03e91c7>] (cpufreq_cpu_callback+0x2f/0x3c) from [<c0036fe1>] (notifier_call_chain+0x45/0x54) [<c0036fe1>] (notifier_call_chain+0x45/0x54) from [<c001e611>] (__cpu_notify+0x1d/0x34) [<c001e611>] (__cpu_notify+0x1d/0x34) from [<c03e5833>] (_cpu_down+0x63/0x1ac) [<c03e5833>] (_cpu_down+0x63/0x1ac) from [<c03e5997>] (cpu_down+0x1b/0x30) [<c03e5997>] (cpu_down+0x1b/0x30) from [<c03e60eb>] (store_online+0x27/0x54) [<c03e60eb>] (store_online+0x27/0x54) from [<c0295629>] (dev_attr_store+0x11/0x18) [<c0295629>] (dev_attr_store+0x11/0x18) from [<c00f9edd>] (sysfs_write_file+0xed/0x114) [<c00f9edd>] (sysfs_write_file+0xed/0x114) from [<c00b42a9>] (vfs_write+0x65/0xd8) [<c00b42a9>] (vfs_write+0x65/0xd8) from [<c00b4523>] (sys_write+0x2f/0x50) [<c00b4523>] (sys_write+0x2f/0x50) from [<c000cdc1>] (ret_fast_syscall+0x1/0x52) Of course this only impacted drivers which have have_governor_per_policy set to true. i.e. big LITTLE cpufreq driver. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-05-12 14:04:16 +02:00
Rafael J. Wysocki	1c3d85dd4e	cpufreq: Revert incorrect commit `5800043` Commit `5800043` (cpufreq: convert cpufreq_driver to using RCU) causes the following call trace to be spit on boot: BUG: sleeping function called from invalid context at /scratch/rafael/work/linux-pm/mm/slab.c:3179 in_atomic(): 0, irqs_disabled(): 0, pid: 292, name: systemd-udevd 2 locks held by systemd-udevd/292: #0: (subsys mutex){+.+.+.}, at: [<ffffffff8146851a>] subsys_interface_register+0x4a/0xe0 #1: (rcu_read_lock){.+.+.+}, at: [<ffffffff81538210>] cpufreq_add_dev_interface+0x60/0x5e0 Pid: 292, comm: systemd-udevd Not tainted 3.9.0-rc8+ #323 Call Trace: [<ffffffff81072c90>] __might_sleep+0x140/0x1f0 [<ffffffff811581c2>] kmem_cache_alloc+0x42/0x2b0 [<ffffffff811e7179>] sysfs_new_dirent+0x59/0x130 [<ffffffff811e63cb>] sysfs_add_file_mode+0x6b/0x110 [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0 [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80 [<ffffffff811e647d>] sysfs_add_file+0xd/0x10 [<ffffffff811e6541>] sysfs_create_file+0x21/0x30 [<ffffffff81538280>] cpufreq_add_dev_interface+0xd0/0x5e0 [<ffffffff81538210>] ? cpufreq_add_dev_interface+0x60/0x5e0 [<ffffffffa000337f>] ? acpi_processor_get_platform_limit+0x32/0xbb [processor] [<ffffffffa022f540>] ? do_drv_write+0x70/0x70 [acpi_cpufreq] [<ffffffff810a3254>] ? __lock_is_held+0x54/0x80 [<ffffffff8106c97e>] ? up_read+0x1e/0x40 [<ffffffff8106e632>] ? __blocking_notifier_call_chain+0x72/0xc0 [<ffffffff81538dbd>] cpufreq_add_dev+0x62d/0xae0 [<ffffffff815389b8>] ? cpufreq_add_dev+0x228/0xae0 [<ffffffff81468569>] subsys_interface_register+0x99/0xe0 [<ffffffffa014d000>] ? 0xffffffffa014cfff [<ffffffff81535d5d>] cpufreq_register_driver+0x9d/0x200 [<ffffffffa014d000>] ? 0xffffffffa014cfff [<ffffffffa014d0e9>] acpi_cpufreq_init+0xe9/0x1000 [acpi_cpufreq] [<ffffffff810002fa>] do_one_initcall+0x11a/0x170 [<ffffffff810b4b87>] load_module+0x1cf7/0x2920 [<ffffffff81322580>] ? ddebug_proc_open+0xb0/0xb0 [<ffffffff816baee0>] ? retint_restore_args+0xe/0xe [<ffffffff810b5887>] sys_init_module+0xd7/0x120 [<ffffffff816bb6d2>] system_call_fastpath+0x16/0x1b which is quite obvious, because that commit put (multiple instances of) sysfs_create_file() under rcu_read_lock()/rcu_read_unlock(), although sysfs_create_file() may cause memory to be allocated with GFP_KERNEL and that may sleep, which is not permitted in RCU read critical section. Revert the buggy commit altogether along with some changes on top of it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-29 00:08:16 +02:00
Viresh Kumar	820c6ca293	cpufreq: Don't call __cpufreq_governor() for drivers without target() Some cpufreq drivers implement their own governor and so don't need us to call generic governors interface via __cpufreq_governor(). Few recent commits haven't obeyed this law well and we saw some regressions. This patch is an attempt to fix the above issue. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com> Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-22 00:48:03 +02:00
Viresh Kumar	e4969ebac8	cpufreq: Call __cpufreq_governor() with correct policy->cpus mask __cpufreq_governor() must be called with a correct policy->cpus mask. In __cpufreq_remove_dev() we initially clear policy->cpus with cpumask_clear_cpu() and then call __cpufreq_governor(policy, CPUFREQ_GOV_POLICY_EXIT). If the governor is doing some per-cpu stuff in EXIT callback, this can create uncertain behavior. Generic governors in drivers/cpufreq/ doesn't do any per-cpu stuff in EXIT callback and so we don't face any issues currently. But its better to keep the code clean, so we don't face any issues in future. Now, we call cpumask_clear_cpu() only when multiple cpus are managed by policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-11 22:50:09 +02:00
Nathan Zimmer	5800043b24	cpufreq: convert cpufreq_driver to using RCU We eventually would like to remove the rwlock cpufreq_driver_lock or convert it back to a spinlock and protect the read sections with RCU. The first step in that direction is to make cpufreq_driver use RCU. I don't see an easy wasy to protect the cpufreq_cpu_data structure with RCU, so I am leaving it with the rwlock for now since under certain configs __cpufreq_cpu_get is a hot spot with 256+ cores. [rjw: Subject, changelog, white space] Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-10 13:19:26 +02:00
Viresh Kumar	b43a7ffbf3	cpufreq: Notify all policy->cpus in cpufreq_notify_transition() policy->cpus contains all online cpus that have single shared clock line. And their frequencies are always updated together. Many SMP system's cpufreq drivers take care of this in individual drivers but the best place for this code is in cpufreq core. This patch modifies cpufreq_notify_transition() to notify frequency change for all cpus in policy->cpus and hence updates all users of this API. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-02 15:24:00 +02:00
Viresh Kumar	4d5dcc4211	cpufreq: governor: Implement per policy instances of governors Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch uses the infrastructure provided by earlier patch and implements init/exit routines for ondemand and conservative governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Viresh Kumar	7bd353a995	cpufreq: Add per policy governor-init/exit infrastructure Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch is inclined towards providing this infrastructure. Because we are required to allocate governor's resources dynamically now, we must do it at policy creation and end. And so got CPUFREQ_GOV_POLICY_INIT/EXIT. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Nathan Zimmer	0d1857a1b9	cpufreq: Convert the cpufreq_driver_lock to a rwlock This eliminates the contention I am seeing in __cpufreq_cpu_get. It also nicely stages the lock to be replaced by the rcu. Signed-off-by: Nathan Zimmer <nzimmer@sgi.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-04-01 01:11:34 +02:00
Rafael J. Wysocki	4419fbd4b4	Merge branch 'pm-cpufreq' * pm-cpufreq: (55 commits) cpufreq / intel_pstate: Fix 32 bit build cpufreq: conservative: Fix typos in comments cpufreq: ondemand: Fix typos in comments cpufreq: exynos: simplify .init() for setting policy->cpus cpufreq: kirkwood: Add a cpufreq driver for Marvell Kirkwood SoCs cpufreq/x86: Add P-state driver for sandy bridge. cpufreq_stats: do not remove sysfs files if frequency table is not present cpufreq: Do not track governor name for scaling drivers with internal governors. cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target() cpufreq: Retrieve current frequency from scaling drivers with internal governors cpufreq: Fix locking issues cpufreq: Create a macro for unlock_policy_rwsem{read,write} cpufreq: Remove unused HOTPLUG_CPU code cpufreq: governors: Fix WARN_ON() for multi-policy platforms cpufreq: ondemand: Replace down_differential tuner with adj_up_threshold cpufreq / stats: Get rid of CPUFREQ_STATDEVICE_ATTR cpufreq: Don't check cpu_online(policy->cpu) cpufreq: add imx6q-cpufreq driver cpufreq: Don't remove sysfs link for policy->cpu cpufreq: Remove unnecessary use of policy->shared_type ...	2013-02-15 13:59:07 +01:00
Dirk Brandewie	fa69e33f7d	cpufreq: Do not track governor name for scaling drivers with internal governors. Scaling drivers that implement internal governors do not have governor structures assocaited with them. Only track the name of the governor associated with the CPU if the driver does not implement cpufreq_driver.setpolicy() Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:53 +01:00
Dirk Brandewie	f6b0515b07	cpufreq: Only call cpufreq_out_of_sync() for driver that implement cpufreq_driver.target() Scaling drivers that implement cpufreq_driver.setpolicy() have internal governors that do not signal changes via cpufreq_notify_transition() so the frequncy in the policy will almost certainly be different than the current frequncy. Only call cpufreq_out_of_sync() when the underlying driver implements cpufreq_driver.target() Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:47 +01:00
Dirk Brandewie	9e21ba8bd8	cpufreq: Retrieve current frequency from scaling drivers with internal governors Scaling drivers that implement the cpufreq_driver.setpolicy() versus the cpufreq_driver.target() interface do not set policy->cur. Normally policy->cur is set during the call to cpufreq_driver.target() when the frequnecy request is made by the governor. If the scaling driver implements cpufreq_driver.setpolicy() and cpufreq_driver.get() interfaces use cpufreq_driver.get() to retrieve the current frequency. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 12:55:03 +01:00
Viresh Kumar	2eaa3e2df1	cpufreq: Fix locking issues cpufreq core uses two locks: - cpufreq_driver_lock: General lock for driver and cpufreq_cpu_data array. - cpu_policy_rwsemfix locking: per CPU reader-writer semaphore designed to cure all cpufreq/hotplug/workqueue/etc related lock issues. These locks were not used properly and are placed against their principle (present before their definition) at various places. This patch is an attempt to fix their use. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:22:57 +01:00
Viresh Kumar	fa1d8af47f	cpufreq: Create a macro for unlock_policy_rwsem{read,write} On the lines of macro: lock_policy_rwsem, we can create another macro for unlock_policy_rwsem. Lets do it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:22:06 +01:00
Viresh Kumar	65922465b5	cpufreq: Remove unused HOTPLUG_CPU code Because the sibling cpu of any online cpu is identified very early in cpufreq_add_dev(), below code is never executed. And so can be removed. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:21:37 +01:00
Viresh Kumar	8e53695f7f	cpufreq: governors: Fix WARN_ON() for multi-policy platforms On multi-policy systems there is a single instance of governor for both the policies (if same governor is chosen for both policies). With the code update from following patches: `8eeed09` cpufreq: governors: Get rid of dbs_data->enable field `b394058` cpufreq: governors: Reset tunables only for cpufreq_unregister_governor() We are creating/removing sysfs directory of governor for for every call to GOV_START and STOP. This would fail for multi-policy system as there is a per-policy call to START/STOP. This patch reuses the governor->initialized variable to detect total users of governor. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:21:13 +01:00
Viresh Kumar	3361b7b173	cpufreq: Don't check cpu_online(policy->cpu) policy->cpu or cpus in policy->cpus can't be offline anymore. And so we don't need to check if they are online or not. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-09 01:18:34 +01:00
Viresh Kumar	73bf0fc2b0	cpufreq: Don't remove sysfs link for policy->cpu "cpufreq" directory in policy->cpu is never created using sysfs_create_link(), but using kobject_init_and_add(). And so we shouldn't call sysfs_remove_link() for policy->cpu(). sysfs stuff for policy->cpu is automatically removed when we call kobject_put() for dying policy. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Dirk Brandewie <dirk.brandewie@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-05 22:21:14 +01:00
Viresh Kumar	b394058f06	cpufreq: governors: Reset tunables only for cpufreq_unregister_governor() Currently, whenever governor->governor() is called for CPUFRREQ_GOV_START event we reset few tunables of governor. Which isn't correct, as this routine is called for every cpu hot-[un]plugging event. We should actually be resetting these only when the governor module is removed and re-installed. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 01:29:31 +01:00
Viresh Kumar	fcf8058296	cpufreq: Simplify cpufreq_add_dev() Currently cpufreq_add_dev() firsts allocates policy, calls driver->init() and then checks if this CPU is already managed or not. And if it is already managed, its policy is freed. We can save all this if we somehow know that CPU is managed or not in advance. policy->related_cpus contains the list of all valid sibling CPUs of policy->cpu. We can check this to see if the current CPU is already managed. From now on, platforms don't really need to set related_cpus from their init() routines, as the same work is done by core too. If a platform driver needs to set the related_cpus mask with some additional CPUs, other than CPUs present in policy->cpus, they are free to do it, though, as we don't override anything. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:16 +01:00
Viresh Kumar	b26f72042e	cpufreq: Revert "cpufreq: Don't use cpu removed during cpufreq_driver_unregister" This reverts commit 956f339 "cpufreq: Don't use cpu removed during cpufreq_driver_unregister". With the addition of the following commit, this change/variable is not required any more: commit b9ba2725343ae57add3f324dfa5074167f48de96 Author: Viresh Kumar <viresh.kumar@linaro.org> Date: Mon Jan 14 13:23:03 2013 +0000 cpufreq: Simplify __cpufreq_remove_dev() [rjw: Subject and changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:16 +01:00
Borislav Petkov	9d95046e5d	cpufreq: Add a get_current_driver helper Add a helper function to return cpufreq_driver->name. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:15 +01:00
Dirk Brandewie	d5aaffa9dd	cpufreq: handle cpufreq being disabled for all exported function. When disable_cpufreq() is called some exported functions are still being used that do not have a check for cpufreq being disabled. Add a disabled check into cpufreq_cpu_get() to return NULL if cpufreq is disabled this covers most of the exported functions. For the exported functions that do not call cpufreq_cpu_get() add an explicit check. Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	b8eed8af94	cpufreq: Simplify __cpufreq_remove_dev() __cpufreq_remove_dev() is called on multiple occasions: cpufreq_driver unregister and cpu removals. Current implementation of this routine is overly complex without much need. If the cpu to be removed is the policy->cpu, we remove the policy first and add all other cpus again from policy->cpus and then finally call __cpufreq_remove_dev() again to remove the cpu to be deleted. Haahhhh.. There exist a simple solution to removal of a cpu: - Simply use the old policy structure - update its fields like: policy->cpu, etc. - notify any users of cpufreq, which depend on changing policy->cpu Hence this patch, which tries to implement the above theory. It is tested well by myself on ARM big.LITTLE TC2 SoC, which has 5 cores (2 A15 and 3 A7). Both A15's share same struct policy and all A7's share same policy structure. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	6954ca9c8b	cpufreq: Don't use cpu removed during cpufreq_driver_unregister This is how the core works: cpufreq_driver_unregister() - subsys_interface_unregister() - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we unregister. cpufreq_remove_dev(): - Remove policy node - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu. i.e. When cpu 0 is removed, we call it for cpu 1. And when called for cpu 2, we call it for cpu 3. - cpufreq_add_dev() would call cpufreq_driver->init() - init would return mask as AND of 2, 3 and 4 for cluster A7. - cpufreq core would do online_cpu && policy->cpus Here is the BUG(). Because cpu hasn't died but we have just unregistered the cpufreq driver, online cpu would still have cpu 2 in it. And so thing go bad again. Solution: Keep cpumask of cpus that are registered with cpufreq core and clear cpus when we get a call from subsys_interface_unregister() via cpufreq_remove_dev(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	f6a7409cab	cpufreq: Notify governors when cpus are hot-[un]plugged Because cpufreq core and governors worry only about the online cpus, if a cpu is hot [un]plugged, we must notify governors about it, otherwise be ready to expect something unexpected. We already have notifiers in the form of CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we just need to call them now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Viresh Kumar	643ae6e81d	cpufreq: Manage only online cpus cpufreq core doesn't manage offline cpus and if driver->init() has returned mask including offline cpus, it may result in unwanted behavior by cpufreq core or governors. We need to get only online cpus in this mask. There are two places to fix this mask, cpufreq core and cpufreq driver. It makes sense to do this at common place and hence is done in core. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-02-02 00:01:14 +01:00
Paul Gortmaker	43720bd601	PM / tracing: remove deprecated power trace API The text in Documentation said it would be removed in 2.6.41; the text in the Kconfig said removal in the 3.1 release. Either way you look at it, we are well past both, so push it off a cliff. Note that the POWER_CSTATE and the POWER_PSTATE are part of the legacy tracing API. Remove all tracepoints which use these flags. As can be seen from context, most already have a trace entry via trace_cpu_idle anyways. Also, the cpufreq/cpufreq.c PSTATE one is actually unpaired, as compared to the CSTATE ones which all have a clear start/stop. As part of this, the trace_power_frequency also becomes orphaned, so it too is deleted. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-01-26 00:39:12 +01:00
Jingoo Han	f55c9c2627	cpufreq: Remove unnecessary initialization of a local variable Remove an unnecessary initializer for the 'ret' variable in __cpufreq_set_policy(). [rjw: Modified the subject and changelog slightly.] Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	7249924e53	cpufreq: Make sure target freq is within limits __cpufreq_driver_target() must not pass target frequency beyond the limits of current policy. Today most of cpufreq platform drivers are doing this check in their target routines. Why not move it to __cpufreq_driver_target()? Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	5a1c022850	cpufreq: Avoid calling cpufreq driver's target() routine if target_freq == policy->cur Avoid calling cpufreq driver's target() routine if new frequency is same as policies current frequency. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:09 +01:00
Viresh Kumar	da58445570	cpufreq: Fix sparse warning by making local function static cpufreq_disabled() is a local function, so should be marked static. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:08 +01:00
Viresh Kumar	0676f7f2e7	cpufreq: return early from __cpufreq_driver_getavg() There is no need to do cpufreq_get_cpu() and cpufreq_put_cpu() for drivers that don't support getavg() routine. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:07 +01:00
Viresh Kumar	db7011516c	cpufreq: Improve debug prints With debug options on, it is difficult to locate cpufreq core's debug prints. Fix this by prefixing debug prints with KBUILD_MODNAME. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:06 +01:00
viresh kumar	4b972f0b04	cpufreq / core: Fix printing of governor and driver name Arrays for governer and driver name are of size CPUFREQ_NAME_LEN or 16. i.e. 15 bytes for name and 1 for trailing '\0'. When cpufreq driver print these names (for sysfs), it includes '\n' or ' ' in the fmt string and still passes length as CPUFREQ_NAME_LEN. If the driver or governor names are using all 15 fields allocated to them, then the trailing '\n' or ' ' will never be printed. And so commands like: root@linaro-developer# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver will print something like: cpufreq_foodrvroot@linaro-developer# Fix this by increasing print length by one character. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:06 +01:00
viresh kumar	8bf1ac7236	cpufreq / core: Fix typo in comment describing show_bios_limit() show_bios_limit is mistakenly written as show_scaling_driver in a comment describing purpose of show_bios_limit() routine. Fix it. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2012-11-15 00:33:05 +01:00
Stephen Boyd	a914443627	cpufreq: Fix sysfs deadlock with concurrent hotplug/frequency switch Running one program that continuously hotplugs and replugs a cpu concurrently with another program that continuously writes to the scaling_setspeed node eventually deadlocks with: ============================================= [ INFO: possible recursive locking detected ] 3.4.0 #37 Tainted: G W --------------------------------------------- filemonkey/122 is trying to acquire lock: (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4 but task is already holding lock: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(s_active#13); lock(s_active#13); * DEADLOCK * May be due to missing lock nesting notation 2 locks held by filemonkey/122: #0: (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140 #1: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 stack backtrace: [<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054) [<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8) [<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38) [<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194) [<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74) [<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140) [<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128) [<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68) [<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c) This is because store() in cpufreq.c indirectly calls kobject_get() via cpufreq_cpu_get() and is the last one to call kobject_put() via cpufreq_cpu_put(). Sysfs code should not call kobject_get() or kobject_put() directly (see the comment around sysfs_schedule_callback() for more information). Fix this deadlock by introducing two new functions: struct cpufreq_policy cpufreq_cpu_get_sysfs(unsigned int cpu) void cpufreq_cpu_put_sysfs(struct cpufreq_policy data) which do the same thing as cpufreq_cpu_{get,put}() but don't call kobject functions. To easily trigger this deadlock you can insert an msleep() with a reasonably large value right after the fail label at the bottom of the store() function in cpufreq.c and then write scaling_setspeed in one task and offline the cpu in another. The first task will hang and be detected by the hung task detector. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2012-07-20 21:39:25 +02:00
Konrad Rzeszutek Wilk	a7b422cda5	provide disable_cpufreq() function to disable the API. useful for disabling cpufreq altogether. The cpu frequency scaling drivers and cpu frequency governors will fail to register. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Jones <davej@redhat.com>	2012-03-14 14:45:03 -04:00
Linus Torvalds	02d929502c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits) [CPUFREQ] EXYNOS: Removed useless headers and codes [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information [CPUFREQ] powernow-k8: Fix indexing issue [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB [CPUFREQ] update lpj only if frequency has changed [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init() [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages cpufreq: OMAP: fixup for omap_device changes, include <linux/module.h> cpufreq: OMAP: fix freq_table leak cpufreq: OMAP: put clk if cpu_init failed cpufreq: OMAP: only supports OPP library cpufreq: OMAP: dont support !freq_table cpufreq: OMAP: deny initialization if no mpudev cpufreq: OMAP: move clk name decision to init cpufreq: OMAP: notify even with bad boot frequency ...	2012-01-11 18:53:33 -08:00
Afzal Mohammed	d08de0c19c	[CPUFREQ] update lpj only if frequency has changed During scaling up of cpu frequency, loops_per_jiffy is updated upon invoking PRECHANGE notifier. If setting to new frequency fails in cpufreq driver, lpj is left at incorrect value. Hence update lpj only if cpu frequency is changed, i.e. upon invoking POSTCHANGE notifier. Penalty would be that during time period between changing cpu frequency & invocation of POSTCHANGE notifier, udelay(x) may not gurantee minimal delay of 'x' us for frequency scaling up operation. Perhaps a better solution would be to define CPUFREQ_ABORTCHANGE & handle accordingly, but then it would be more intrusive (using ABORTCHANGE may help drivers also; if any has registered notifier and expect POST for a PRECHANGE, their needs can be taken care using ABORT) Signed-off-by: Afzal Mohammed <afzal@ti.com> Signed-off-by: Dave Jones <davej@redhat.com>	2012-01-06 10:10:53 -05:00
Kay Sievers	8a25a2fd12	cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem and converts the devices to regular devices. The sysdev drivers are implemented as subsystem interfaces now. After all sysdev classes are ported to regular driver core entities, the sysdev implementation will be entirely removed from the kernel. Userspace relies on events and generic sysfs subsystem infrastructure from sysdev devices, which are made available with this conversion. Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@amd64.org> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Len Brown <lenb@kernel.org> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Dave Jones <davej@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-12-21 14:29:42 -08:00
Jesse Barnes	3d73710880	cpufreq: expose a cpufreq_quick_get_max routine This allows drivers and other code to get the max reported CPU frequency. Initial use is for scaling ring frequency with GPU frequency in the i915 driver. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-28 13:54:26 -07:00
Kees Cook	1a8e1463a4	[CPUFREQ] remove redundant sprintf from request_module call. Since format string handling is part of request_module, there is no need to construct the module name. As such, drop the redundant sprintf and heap usage. Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: Dave Jones <davej@redhat.com>	2011-05-04 11:50:58 -04:00
Dominik Brodowski	2d06d8c49a	[CPUFREQ] use dynamic debug instead of custom infrastructure With dynamic debug having gained the capability to report debug messages also during the boot process, it offers a far superior interface for debug messages than the custom cpufreq infrastructure. As a first step, remove the old cpufreq_debug_printk() function and replace it with a call to the generic pr_debug() function. How can dynamic debug be used on cpufreq? You need a kernel which has CONFIG_DYNAMIC_DEBUG enabled. To enabled debugging during runtime, mount debugfs and $ echo -n 'module cpufreq +p' > /sys/kernel/debug/dynamic_debug/control for debugging the complete "cpufreq" module. To achieve the same goal during boot, append ddebug_query="module cpufreq +p" as a boot parameter to the kernel of your choice. For more detailled instructions, please see Documentation/dynamic-debug-howto.txt Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2011-05-04 11:50:57 -04:00
Jacob Shin	27ecddc2a9	[CPUFREQ] CPU hotplug, re-create sysfs directory and symlinks When we discover CPUs that are affected by each other's frequency/voltage transitions, the first CPU gets a sysfs directory created, and rest of the siblings get symlinks. Currently, when we hotplug off only the first CPU, all of the symlinks and the sysfs directory gets removed. Even though rest of the siblings are still online and functional, they are orphaned, and no longer governed by cpufreq. This patch, given the above scenario, creates a sysfs directory for the first sibling and symlinks for the rest of the siblings. Please note the recursive call, it was rather too ugly to roll it out. And the removal of redundant NULL setting (it is already taken care of near the top of the function). Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> Cc: stable@kernel.org	2011-05-04 11:50:57 -04:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Rafael J. Wysocki	e00e56dfd3	cpufreq: Use syscore_ops for boot CPU suspend/resume (v2) The cpufreq subsystem uses sysdev suspend and resume for executing cpufreq_suspend() and cpufreq_resume(), respectively, during system suspend, after interrupts have been switched off on the boot CPU, and during system resume, while interrupts are still off on the boot CPU. In both cases the other CPUs are off-line at the relevant point (either they have been switched off via CPU hotplug during suspend, or they haven't been switched on yet during resume). For this reason, although it may seem that cpufreq_suspend() and cpufreq_resume() are executed for all CPUs in the system, they are only called for the boot CPU in fact, which is quite confusing. To remove the confusion and to prepare for elimiating sysdev suspend and resume operations from the kernel enirely, convernt cpufreq to using a struct syscore_ops object for the boot CPU suspend and resume and rename the callbacks so that their names reflect their purpose. In addition, put some explanatory remarks into their kerneldoc comments. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-03-23 22:16:32 +01:00
Rafael J. Wysocki	7ca64e2d28	[CPUFREQ] Remove the pm_message_t argument from driver suspend None of the existing cpufreq drivers uses the second argument of its .suspend() callback (which isn't useful anyway), so remove it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Dave Jones <davej@redhat.com>	2011-03-16 17:54:33 -04:00
Jiri Slaby	8f5bc2abfd	[CPUFREQ] fix BUG on cpufreq policy init failure cpufreq_register_driver sets cpufreq_driver to a structure owned (and placed) in the caller's memory. If cpufreq policy fails in its ->init function, sysdev_driver_register returns nonzero in cpufreq_register_driver. Now, cpufreq_register_driver returns an error without setting cpufreq_driver back to NULL. Usually cpufreq policy modules are unloaded because they propagate the error to the module init function and return that. So a later access to any member of cpufreq_driver causes bugs like: BUG: unable to handle kernel paging request at ffffffffa00270a0 IP: [<ffffffff8145eca3>] cpufreq_cpu_get+0x53/0xe0 PGD 1805067 PUD 1809063 PMD 1c3f90067 PTE 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/net/tun0/statistics/collisions CPU 0 Modules linked in: ... Pid: 5677, comm: thunderbird-bin Tainted: G W 2.6.38-rc4-mm1_64+ #1389 To be filled by O.E.M./To Be Filled By O.E.M. RIP: 0010:[<ffffffff8145eca3>] [<ffffffff8145eca3>] cpufreq_cpu_get+0x53/0xe0 RSP: 0018:ffff8801aec37d98 EFLAGS: 00010086 RAX: 0000000000000202 RBX: 0000000000000000 RCX: 0000000000000001 RDX: ffffffffa00270a0 RSI: 0000000000001000 RDI: ffffffff8199ece8 ... Call Trace: [<ffffffff8145f490>] cpufreq_quick_get+0x10/0x30 [<ffffffff8103f12b>] show_cpuinfo+0x2ab/0x300 [<ffffffff81136292>] seq_read+0xf2/0x3f0 [<ffffffff8126c5d3>] ? __strncpy_from_user+0x33/0x60 [<ffffffff8116850d>] proc_reg_read+0x6d/0xa0 [<ffffffff81116e53>] vfs_read+0xc3/0x180 [<ffffffff81116f5c>] sys_read+0x4c/0x90 [<ffffffff81030dbb>] system_call_fastpath+0x16/0x1b ... It's all cause by weird fail path handling in cpufreq_register_driver. To fix that, shuffle the code to do proper handling with gotos. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Dave Jones <davej@redhat.com>	2011-03-01 18:49:44 -05:00
Thomas Renninger	25e41933b5	perf: Clean up power events by introducing new, more generic ones Add these new power trace events: power:cpu_idle power:cpu_frequency power:machine_suspend The old C-state/idle accounting events: power:power_start power:power_end Have now a replacement (but we are still keeping the old tracepoints for compatibility): power:cpu_idle and power:power_frequency is replaced with: power:cpu_frequency power:machine_suspend is newly introduced. Jean Pihet has a patch integrated into the generic layer (kernel/power/suspend.c) which will make use of it. the type= field got removed from both, it was never used and the type is differed by the event type itself. perf timechart userspace tool gets adjusted in a separate patch. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Jean Pihet <jean.pihet@newoldbits.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rjw@sisk.pl LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>	2011-01-04 08:16:54 +01:00
Julia Lawall	bec037aa6c	[CPUFREQ] drivers/cpufreq: Adjust confusing if indentation Indent the body of for_each_cpu. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r disable braces4@ position p1,p2; statement S1,S2; @@ ( if (...) { ... } \| if (...) S1@p1 S2@p2 ) @script:python@ p1 << r.p1; p2 << r.p2; @@ if (p1[0].column == p2[0].column): cocci.print_main("branch",p1) cocci.print_secs("after",p2) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Dave Jones <davej@redhat.com>	2010-10-22 11:44:47 -04:00
Neal Buckendahl	9c36f746d7	[CPUFREQ] fix brace coding style issue. This patch fixes up a brace warning found by the checkpatch.pl tool Signed-off-by: Neal Buckendahl <nealb001@tbcnet.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-08-03 13:47:05 -04:00
Thomas Renninger	6f4f2723d0	[CPUFREQ] x86 cpufreq: Make trace_power_frequency cpufreq driver independent and fix the broken case if a core's frequency depends on others. trace_power_frequency was only implemented in a rather ungeneric way in acpi-cpufreq driver's target() function only. -> Move the call to trace_power_frequency to cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE notifier is triggered. This will support power frequency tracing by all cpufreq drivers trace_power_frequency did not trace frequency changes correctly when the userspace governor was used or when CPU cores' frequency depend on each other. -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu which gets switched automatically fixes this. Robert Schoene provided some important fixes on top of my initial quick shot version which are integrated in this patch: - Forgot some changes in power_end trace (TP_printk/variable names) - Variable dummy in power_end must now be cpu_id - Use static 64 bit variable instead of unsigned int for cpu_id Signed-off-by: Thomas Renninger <trenn@suse.de> CC: davej@redhat.com CC: arjan@infradead.org CC: linux-kernel@vger.kernel.org CC: robert.schoene@tu-dresden.de Tested-by: robert.schoene@tu-dresden.de Signed-off-by: Dave Jones <davej@redhat.com>	2010-08-03 13:47:05 -04:00
Amerigo Wang	226528c610	[CPUFREQ] unexport (un)lock_policy_rwsem* functions lock_policy_rwsem_* and unlock_policy_rwsem_* functions are scheduled to be unexported when 2.6.33. Now there are no other callers of them out of cpufreq.c, unexport them and make them static. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-08-03 13:47:04 -04:00
Xiaotian Feng	cad70a6ae5	[CPUFREQ] fix memory leak in cpufreq_add_dev We didn't free policy->related_cpus in error path err_unlock_policy. This is catched by following kmemleak report: unreferenced object 0xffff88022a0b96d0 (size 512): comm "modprobe", pid 886, jiffies 4294689177 (age 780.694s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8111ebe5>] create_object+0x186/0x281 [<ffffffff814fad4f>] kmemleak_alloc+0x60/0xa7 [<ffffffff8111127a>] kmem_cache_alloc_node_notrace+0x120/0x142 [<ffffffff81262e4f>] alloc_cpumask_var_node+0x2c/0xd7 [<ffffffff81262f0b>] alloc_cpumask_var+0x11/0x13 [<ffffffff81262f1c>] zalloc_cpumask_var+0xf/0x11 [<ffffffff8140fac0>] cpufreq_add_dev+0x11f/0x547 [<ffffffff81334bda>] sysdev_driver_register+0xc2/0x11d [<ffffffff8140e334>] cpufreq_register_driver+0xcb/0x1b8 [<ffffffffa032e040>] 0xffffffffa032e040 [<ffffffff810021ba>] do_one_initcall+0x5e/0x15c [<ffffffff81087f94>] sys_init_module+0xa6/0x1e6 [<ffffffff81009bc2>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Thomas Renninger <trenn@suse.de> Cc: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-08-03 13:47:02 -04:00
Andrej Gelenberg	ffe6275f90	[CPUFREQ] revert "[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)" `395913d0b1` ("[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)") is not needed, because there is no rwsem lock in cpufreq_ondemand and cpufreq_conservative anymore. Lock should not be released until the work done. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=1594 Signed-off-by: Andrej Gelenberg <andrej.gelenberg@udo.edu> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-08-03 13:47:01 -04:00
H. Peter Anvin	d7be0ce6af	Merge commit 'v2.6.34-rc6' into x86/cpu	2010-05-08 14:59:58 -07:00
Borislav Petkov	6dad2a2964	cpufreq: Unify sysfs attribute definition macros Multiple modules used to define those which are with identical functionality and were needlessly replicated among the different cpufreq drivers. Push them into the header and remove duplication. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1270065406-1814-7-git-send-email-bp@amd64.org> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-04-09 14:07:56 -07:00
Amerigo Wang	499bca9b6d	[CPUFREQ] fix a lockdep warning There is no need to do sysfs_remove_link() or kobject_put() etc. when policy_rwsem_write is held, move them after releasing the lock. This fixes the lockdep warning: halt/4071 is trying to acquire lock: (s_active){++++.+}, at: [<c0000000001ef868>] .sysfs_addrm_finish+0x58/0xc0 but task is already holding lock: (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [<c0000000004cd6ac>] .lock_policy_rwsem_write+0x84/0xf4 Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2010-03-31 12:00:21 -04:00
Emese Revfy	52cf25d0ab	Driver core: Constify struct sysfs_ops in struct kobj_type Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:49 -08:00
Linus Torvalds	d0316554d3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c	2009-12-14 09:58:24 -08:00
Thomas Renninger	e2f74f355e	[ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface This interface is mainly intended (and implemented) for ACPI _PPC BIOS frequency limitations, but other cpufreq drivers can also use it for similar use-cases. Why is this needed: Currently it's not obvious why cpufreq got limited. People see cpufreq/scaling_max_freq reduced, but this could have happened by: - any userspace prog writing to scaling_max_freq - thermal limitations - hardware (_PPC in ACPI case) limitiations Therefore export bios_limit (in kHz) to: - Point the user that it's the BIOS (broken or intended) which limits frequency - Export it as a sysfs interface for userspace progs. While this was a rarely used feature on laptops, there will appear more and more server implemenations providing "Green IT" features like allowing the service processor to limit the frequency. People want to know about HW/BIOS frequency limitations. All ACPI P-state driven cpufreq drivers are covered with this patch: - powernow-k8 - powernow-k7 - acpi-cpufreq Tested with a patched DSDT which limits the first two cores (_PPC returns 1) via _PPC, exposed by bios_limit: # echo 2200000 >cpu2/cpufreq/scaling_max_freq # cat cpu/cpufreq/scaling_max_freq 2600000 2600000 2200000 2200000 # #scaling_max_freq shows general user/thermal/BIOS limitations # cat cpu/cpufreq/bios_limit 2600000 2600000 2800000 2800000 # #bios_limit only shows the HW/BIOS limitation CC: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com> CC: Len Brown <lenb@kernel.org> CC: davej@codemonkey.org.uk CC: linux@dominikbrodowski.net Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:34 -05:00
Alex Chiang	cf3289d0e7	[CPUFREQ] make internal cpufreq_add_dev_* static No need to export these symbols; make them static. cpufreq_add_dev_policy cpufreq_add_dev_symlink cpufreq_add_dev_interface Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-24 13:33:34 -05:00
Prarit Bhargava	90e41bac10	[CPUFREQ] Fix stale cpufreq_cpu_governor pointer Dave, Attached is an update of my patch against the cpufreq fixes branch. Before applying the patch I compiled and booted the tree to see if the panic was still there -- to my surprise it was not. This is because of the conversion of cpufreq_cpu_governor to a char[]. While the panic is kaput, the problem of stale data continues and my patch is still valid. It is possible to end up with the wrong governor after hotplug events because CPUFREQ_DEFAULT_GOVERNOR is statically linked to a default, while the cpu siblings may have had a different governor assigned by a user. ie) the patch is still needed in order to keep the governors assigned properly when hotplugging devices Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:04 -05:00
Dmitry Monakhov	e77b89f13a	[CPUFREQ] Fix use after free on governor restore Currently on governer backup/restore path we storing governor's pointer. This is wrong because one may unload governor's module after cpu goes offline. As result use-after-free will take place on restored cpu. It is not easy to exploit this bug, but still we have to close this issue ASAP. Issue was introduced by following commit `084f349394` ##TESTCASE## #!/bin/sh -x modprobe acpi_cpufreq # Any non default governor, in may case it is "ondemand" modprobe cpufreq_ondemand echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor rmmod acpi_cpufreq rmmod cpufreq_ondemand modprobe acpi_cpufreq # << use-after-free here. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Dave Jones <davej@redhat.com>	2009-11-17 23:15:04 -05:00
Tejun Heo	f16250669d	percpu: make percpu symbols in cpufreq unique This patch updates percpu related symbols in cpufreq such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * drivers/cpufreq/cpufreq.c: s/policy_cpu/cpufreq_policy_cpu/ * drivers/cpufreq/freq_table.c: s/show_table/cpufreq_show_table/ * arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: s/drv_data/acfreq_data/ s/old_perf/acfreq_old_perf/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-10-29 22:34:13 +09:00
Mathieu Desnoyers	395913d0b1	[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site) remove rwsem lock from CPUFREQ_GOV_STOP call (second call site) commit `42a06f2166` Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the teardown. To make a long story short, the rwlock write-lock causes a circular dependency with cancel_delayed_work_sync(), because the timer handler takes the read lock. Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs callers (writers) hold the write rwsem at the earliest sysfs calling stage. However, the rwlock write-lock is not needed upon governor stop. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> CC: rjw@sisk.pl CC: mingo@elte.hu CC: Shaohua Li <shaohua.li@intel.com> CC: Pekka Enberg <penberg@cs.helsinki.fi> CC: Dave Young <hidave.darkstar@gmail.com> CC: "Rafael J. Wysocki" <rjw@sisk.pl> CC: Rusty Russell <rusty@rustcorp.com.au> CC: trenn@suse.de CC: sven.wegener@stealer.net CC: cpufreq@vger.kernel.org Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:18 -04:00
Thomas Renninger	8aa84ad8d6	[CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq Currently everything in the cpufreq layer is per core based. This does not reflect reality, for example ondemand on conservative governors have global sysfs variables. Introduce a global cpufreq directory and add the kobject to the governor struct, so that governors can easily access it. The directory is initialized in the cpufreq_core_init initcall and thus will always be created if cpufreq is compiled in, even if no cpufreq driver is active later. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:14 -04:00
Thomas Renninger	4bfa042cd3	[CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created Doing: echo 0 >cpu1/online echo 1 >cpu1/online on a managed CPU will result in: Jul 22 15:15:37 linux kernel: [ 80.013864] WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcf/0xe6() Jul 22 15:15:37 linux kernel: [ 80.013866] Hardware name: To Be Filled By O.E.M. Jul 22 15:15:37 linux kernel: [ 80.013868] sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq' Jul 22 15:15:37 linux kernel: [ 80.013870] Modules linked in: powernow_k8 Jul 22 15:15:37 linux kernel: [ 80.013874] Pid: 5750, comm: bash Not tainted 2.6.31-rc2 #40 Jul 22 15:15:37 linux kernel: [ 80.013876] Call Trace: Jul 22 15:15:37 linux kernel: [ 80.013879] [<ffffffff8112ebda>] ? sysfs_add_one+0xcf/0xe6 Jul 22 15:15:37 linux kernel: [ 80.013884] [<ffffffff81041926>] warn_slowpath_common+0x77/0xa4 Jul 22 15:15:37 linux kernel: [ 80.013888] [<ffffffff810419a0>] warn_slowpath_fmt+0x3c/0x3e Jul 22 15:15:37 linux kernel: [ 80.013891] [<ffffffff8112ebda>] sysfs_add_one+0xcf/0xe6 Jul 22 15:15:37 linux kernel: [ 80.013894] [<ffffffff8112f213>] create_dir+0x58/0x87 Jul 22 15:15:37 linux kernel: [ 80.013898] [<ffffffff8112f27a>] sysfs_create_dir+0x38/0x4f Jul 22 15:15:37 linux kernel: [ 80.013902] [<ffffffff811ffb8a>] kobject_add_internal+0x11f/0x1de Jul 22 15:15:37 linux kernel: [ 80.013905] [<ffffffff811ffd21>] kobject_add_varg+0x41/0x4e Jul 22 15:15:37 linux kernel: [ 80.013908] [<ffffffff811ffd7a>] kobject_init_and_add+0x4c/0x57 Jul 22 15:15:37 linux kernel: [ 80.013913] [<ffffffff810667bc>] ? mark_lock+0x22/0x228 Jul 22 15:15:37 linux kernel: [ 80.013918] [<ffffffff813e8a3b>] cpufreq_add_dev_interface+0x40/0x1e4 ... This bug slipped in by git commit: 150b06f7f223cfd0f808737a5243cceca8ea47fa When splitting up cpufreq_add_dev, the whole cpufreq_add_dev function is not left anymore, only cpufreq_add_dev_policy. This patch should reconstruct the identical functionality again as it was before the split. CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:14 -04:00
Dave Jones	ecf7e4611c	[CPUFREQ] Factor out policy setting from cpufreq_add_dev Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:14 -04:00
Dave Jones	909a694e33	[CPUFREQ] Factor out interface creation from cpufreq_add_dev Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:13 -04:00
Dave Jones	19d6f7ec3e	[CPUFREQ] Factor out symlink creation from cpufreq_add_dev Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:12 -04:00
Dave Jones	059019a3c3	[CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:09 -04:00
Dave Jones	54e6fe167b	[CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:09 -04:00
Dominik Brodowski	ce6c3997c2	[CPUFREQ] Re-enable cpufreq suspend and resume code Commit `4bc5d34135` is broken and causes regressions: (1) cpufreq_driver->resume() and ->suspend() were only called on __powerpc__, but you could set them on all architectures. In fact, ->resume() was defined and used before the PPC-related commit `42d4dc3f4e` complained about in `4bc5d34135`. (2) Therfore, the resume functions in acpi_cpufreq and speedstep-smi would never be called. (3) This means speedstep-smi would be unusuable after suspend or resume. The _real_ problem was calling cpufreq_driver->get() with interrupts off, but it re-enabling interrupts on some platforms. Why is ->get() necessary? Some systems like to change the CPU frequency behind our back, especially during BIOS-intensive operations like suspend or resume. If such systems also use a CPU frequency-dependant timing loop, delays might be off by large factors. Therefore, we need to ascertain as soon as possible that the CPU frequency is indeed at the speed we think it is. You can do this two ways: either setting it anew, or trying to get it. The latter is what was done, the former also has the same IRQ issue. So, let's try something different: defer the checking to after interrupts are re-enabled, by calling cpufreq_update_policy() (via schedule_work()). Timings may be off until this later stage, so let's watch out for resume regressions caused by the deferred handling of frequency changes behind the kernel's back. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2009-09-01 12:45:08 -04:00
Dave Jones	4bc5d34135	[CPUFREQ] Make cpufreq suspend code conditional on powerpc. The suspend code runs with interrupts disabled, and the powerpc workaround we do in the cpufreq suspend hook calls the drivers ->get method. powernow-k8's ->get does an smp_call_function_single which needs interrupts enabled cpufreq's suspend/resume code was added in `42d4dc3f4e` to work around a hardware problem on ppc powerbooks. If we make all this code conditional on powerpc, we avoid the issue above. Signed-off-by: Dave Jones <davej@redhat.com>	2009-08-04 14:32:11 -04:00
Thomas Renninger	d5194decd0	[CPUFREQ] Fix a kobject reference bug related to managed CPUs The first offline/online cycle is successful, the second not. Doing: echo 0 >cpu1/online echo 1 >cpu1/online echo 0 >cpu1/online The last command will trigger: Jul 22 14:39:50 linux kernel: [ 593.210125] ------------[ cut here ]------------ Jul 22 14:39:50 linux kernel: [ 593.210139] WARNING: at lib/kref.c:43 kref_get+0x23/0x2b() Jul 22 14:39:50 linux kernel: [ 593.210144] Hardware name: To Be Filled By O.E.M. Jul 22 14:39:50 linux kernel: [ 593.210148] Modules linked in: powernow_k8 Jul 22 14:39:50 linux kernel: [ 593.210158] Pid: 378, comm: kondemand/2 Tainted: G W 2.6.31-rc2 #38 Jul 22 14:39:50 linux kernel: [ 593.210163] Call Trace: Jul 22 14:39:50 linux kernel: [ 593.210171] [<ffffffff812008e8>] ? kref_get+0x23/0x2b Jul 22 14:39:50 linux kernel: [ 593.210181] [<ffffffff81041926>] warn_slowpath_common+0x77/0xa4 Jul 22 14:39:50 linux kernel: [ 593.210190] [<ffffffff81041962>] warn_slowpath_null+0xf/0x11 Jul 22 14:39:50 linux kernel: [ 593.210198] [<ffffffff812008e8>] kref_get+0x23/0x2b Jul 22 14:39:50 linux kernel: [ 593.210206] [<ffffffff811ffa19>] kobject_get+0x1a/0x22 Jul 22 14:39:50 linux kernel: [ 593.210214] [<ffffffff813e815d>] cpufreq_cpu_get+0x8a/0xcb Jul 22 14:39:50 linux kernel: [ 593.210222] [<ffffffff813e87d1>] __cpufreq_driver_getavg+0x1d/0x67 Jul 22 14:39:50 linux kernel: [ 593.210231] [<ffffffff813ea18f>] do_dbs_timer+0x158/0x27f Jul 22 14:39:50 linux kernel: [ 593.210240] [<ffffffff810529ea>] worker_thread+0x200/0x313 ... The output continues on every do_dbs_timer ondemand freq checking poll. This regression was introduced by git commit: `3f4a782b5c` The policy is released when the cpufreq device is removed in: __cpufreq_remove_dev(): /* if this isn't the CPU which is the parent of the kobj, we * only need to unlink, put and exit */ Not creating the symlink is not sever at all. As long as: sysfs_remove_link(&sys_dev->kobj, "cpufreq"); handles it gracefully that the symlink did not exist. Possibly no error should be returned at all, because ondemand governor would still provide the same functionality. Userspace in userspace gov case might be confused if the link is missing. Resolves http://bugzilla.kernel.org/show_bug.cgi?id=13903 CC: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-08-04 14:32:11 -04:00
Prarit Bhargava	42c74b84c6	[CPUFREQ] Do not set policy for offline cpus Suspend/Resume fails on multi socket, multi core systems because the cpufreq code erroneously sets the per_cpu policy_cpu value when a logical cpu is offline. This most notably results in missing sysfs files that are used to set the cpu frequencies of the various cpus. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-08-04 14:32:10 -04:00
Dave Jones	5e1596f753	[CPUFREQ] Fix compile failure in cpufreq.c managed_policy is out of scope for the non-smp case. Declare it locally where used (twice) Signed-off-by: Dave Jones <davej@redhat.com>	2009-07-08 19:04:23 -04:00
Mathieu Desnoyers	3f4a782b5c	[CPUFREQ] fix (utter) cpufreq_add_dev mess OK, I've tried to clean it up the best I could, but please test this with concurrent cpu hotplug and cpufreq add/remove in loops. I'm sure we will make other interesting findings. This is step one of fixing the overall locking dependency mess in cpufreq. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> CC: rjw@sisk.pl CC: mingo@elte.hu CC: Shaohua Li <shaohua.li@intel.com> CC: Pekka Enberg <penberg@cs.helsinki.fi> CC: Dave Young <hidave.darkstar@gmail.com> CC: "Rafael J. Wysocki" <rjw@sisk.pl> CC: Rusty Russell <rusty@rustcorp.com.au> CC: sven.wegener@stealer.net CC: cpufreq@vger.kernel.org CC: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-07-06 21:38:28 -04:00
venkatesh.pallipadi@intel.com	7d26e2d5e2	[CPUFREQ] Eliminate the recent lockdep warnings in cpufreq Commit `b14893a62c` although it was very much needed to properly cleanup ondemand timer, opened-up a can of worms related to locking dependencies in cpufreq. Patch here defines the need for dbs_mutex and cleans up its usage in ondemand governor. This also resolves the lockdep warnings reported here http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html and few others.. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2009-07-06 21:38:27 -04:00
Yinghai Lu	eaa958402e	cpumask: alloc zeroed cpumask for static cpumask_var_ts These are defined as static cpumask_var_t so if MAXSMP is not used, they are cleared already. Avoid surprises when MAXSMP is enabled. Signed-off-by: Yinghai Lu <yinghai.lu@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-06-09 22:30:27 +09:30
Mathieu Desnoyers	42a06f2166	[CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call * Rafael J. Wysocki (rjw@sisk.pl) wrote: > This message has been generated automatically as a part of a report > of regressions introduced between 2.6.28 and 2.6.29. > > The following bug entry is on the current list of known regressions > introduced between 2.6.28 and 2.6.29. Please verify if it still should > be listed and let me know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186 > Subject : cpufreq timer teardown problem > Submitter : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Date : 2009-04-23 14:00 (24 days old) > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4 > Handled-By : Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> > Patch : http://patchwork.kernel.org/patch/19754/ > http://patchwork.kernel.org/patch/19753/ The patches linked above depend on the following patch to remove circular locking dependency : cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call (the following issue was faced when using cancel_delayed_work_sync() in the timer teardown (which fixes a race). * KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote: > Hi > > my box output following warnings. > it seems regression by commit 7ccc7608b836e58fbacf65ee4f8eefa288e86fac. > > A: work -> do_dbs_timer() -> cpu_policy_rwsem > B: store() -> cpu_policy_rwsem -> cpufreq_governor_dbs() -> work > > Hrm, I think it must be due to my attempt to fix the timer teardown race in ondemand governor mixed with new locking behavior in 2.6.30-rc. The rwlock seems to be taken around the whole call to cpufreq_governor_dbs(), when it should be only taken around accesses to the locked data, and especially not around the call to dbs_timer_exit(). Reverting my fix attempt would put the teardown race back in place (replacing the cancel_delayed_work_sync by cancel_delayed_work). Instead, a proper fix would imply modifying this critical section : cpufreq.c: __cpufreq_remove_dev() ... if (cpufreq_driver->target) __cpufreq_governor(data, CPUFREQ_GOV_STOP); unlock_policy_rwsem_write(cpu); To make sure the __cpufreq_governor() callback is not called with rwsem held. This would allow execution of cancel_delayed_work_sync() without being nested within the rwsem. Applies on top of the 2.6.30-rc5 tree. Required to remove circular dep in teardown of both conservative and ondemande governors so they can use cancel_delayed_work_sync(). CPUFREQ_GOV_STOP does not modify the policy, therefore this locking seemed unneeded. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Greg KH <greg@kroah.com> CC: Ingo Molnar <mingo@elte.hu> CC: "Rafael J. Wysocki" <rjw@sisk.pl> CC: Ben Slusky <sluskyb@paranoiacs.org> CC: Chris Wright <chrisw@sous-sol.org> CC: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Jones <davej@redhat.com>	2009-05-26 12:04:50 -04:00
Linus Torvalds	ada19a31a9	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (35 commits) [CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor. [CPUFREQ] Make cpufreq-nforce2 less obnoxious [CPUFREQ] p4-clockmod reports wrong frequency. [CPUFREQ] powernow-k8: Use a common exit path. [CPUFREQ] Change link order of x86 cpufreq modules [CPUFREQ] conservative: remove 10x from def_sampling_rate [CPUFREQ] conservative: fixup governor to function more like ondemand logic [CPUFREQ] conservative: fix dbs_cpufreq_notifier so freq is not locked [CPUFREQ] conservative: amend author's email address [CPUFREQ] Use swap() in longhaul.c [CPUFREQ] checkpatch cleanups for acpi-cpufreq [CPUFREQ] powernow-k8: Only print error message once, not per core. [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions [CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max} [CPUFREQ] powernow-k8: Always compile powernow-k8 driver with ACPI support [CPUFREQ] Introduce /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_transition_latency [CPUFREQ] checkpatch cleanups for powernow-k8 [CPUFREQ] checkpatch cleanups for ondemand governor. [CPUFREQ] checkpatch cleanups for powernow-k7 [CPUFREQ] checkpatch cleanups for speedstep related drivers. ...	2009-03-26 11:04:08 -07:00
Dave Jones	129f8ae9b1	Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod." This reverts commit `e088e4c9cd`. Removing the sysfs interface for p4-clockmod was flagged as a regression in bug 12826. Course of action: - Find out the remaining causes of overheating, and fix them if possible. ACPI should be doing the right thing automatically. If it isn't, we need to fix that. - mark p4-clockmod ui as deprecated - try again with the removal in six months. It's not really feasible to printk about the deprecation, because it needs to happen at all the sysfs entry points, which means adding a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch. Signed-off-by: Dave Jones <davej@redhat.com>	2009-03-09 15:07:33 -04:00
Thomas Renninger	ed12978453	[CPUFREQ] Introduce /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_transition_latency It's not only useful for the ondemand and conservative governors, but also for userspace daemons to know about the HW transition latency of the CPU. It is especially useful for userspace to know about this value when the ondemand or conservative governors are run. The sampling rate control value depends on it and for userspace being able to set sane tuning values there it has to know about the transition latency. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-24 22:47:31 -05:00
Dave Jones	29464f2813	[CPUFREQ] checkpatch cleanups for cpufreq core Signed-off-by: Dave Jones <davej@redhat.com>	2009-02-24 22:47:29 -05:00
Rusty Russell	835481d9bc	cpumask: convert struct cpufreq_policy to cpumask_var_t Impact: use new cpumask API to reduce memory usage This is part of an effort to reduce structure sizes for machines configured with large NR_CPUS. cpumask_t gets replaced by cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or struct cpumask * (large NR_CPUS). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-06 09:05:31 +01:00
Mike Chan	187d9f4ed4	[CPUFREQ] Fix on resume, now preserves user policy min/max. Previously driver resume would always set the current policy min/max with the cpuinfo min/max, defined by user_policy.min/max. Resulting in a reset of policy settings when policy.min/max != cpuinfo.min/max when coming out of suspend. Now user_policy is saved as the policy instead of cpuinfo to preserve what the user actually set. Signed-off-by: Mike Chan <mike@android.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-12-05 15:20:11 -05:00
Matthew Garrett	e088e4c9cd	[CPUFREQ] Disable sysfs ui for p4-clockmod. p4-clockmod has a long history of abuse. It pretends to be a CPU frequency scaling driver, even though it doesn't actually change the CPU frequency, but instead just modulates the frequency with wait-states. The biggest misconception is that when running at the lower 'frequency' p4-clockmod is saving power. This isn't the case, as workloads running slower take longer to complete, preventing the CPU from entering deep C states. However p4-clockmod does have a purpose. It can prevent overheating. Having it hooked up to the cpufreq interfaces is the wrong way to achieve cooling however. It should instead be hooked up to ACPI. This diff introduces a means for a cpufreq driver to register with the cpufreq core, but not present a sysfs interface. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-12-05 15:20:10 -05:00
venkatesh.pallipadi@intel.com	bf0b90e357	[CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg() Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software cpufreq coordination where policy->cpu may not be same as the CPU on which we want to getavg frequency. A follow-on patch will use this parameter to getavg freq from all cpus in policy->cpus. Change since last patch. Fix the offline/online and suspend/resume oops reported by Youquan Song <youquan.song@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:43 -04:00
Julia Lawall	f1829e4a37	[CPUFREQ] drivers/cpufreq/cpufreq.c: Adjust error handling code involving cpufreq_cpu_put After calling cpufreq_cpu_get, error handling code should call cpufreq_cpu_put. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r@ expression x,E; statement S; position p1,p2,p3; @@ ( if ((x = cpufreq_cpu_get@p1(...)) == NULL \|\| ...) S \| x = cpufreq_cpu_get@p1(...) ... when != x if (x == NULL \|\| ...) S ) <... if@p3 (...) { ... when != cpufreq_cpu_put(x) when != if (x) { ... cpufreq_cpu_put(x); ...} return@p2 ...; } ...> ( return x; \| return 0; \| x = E \| E = x \| cpufreq_cpu_put(x) ) @exists@ position r.p1,r.p2,r.p3; expression x; int ret != 0; statement S; @@ * x = cpufreq_cpu_get@p1(...) <... * if@p3 (...) S ...> * return@p2 $NULL\\|ret$; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Dave Jones <davej@redhat.com>	2008-10-09 13:52:43 -04:00
Thomas Renninger	a1531acd43	cpufreq acpi: only call _PPC after cpufreq ACPI init funcs got called already Ingo Molnar provided a fix to not call _PPC at processor driver initialization time in "[PATCH] ACPI: fix cpufreq regression" (git commit `e4233dec74`) But it can still happen that _PPC is called at processor driver initialization time. This patch should make sure that this is not possible anymore. Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:43 -07:00
Linus Torvalds	26dcce0fab	Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) NR_CPUS: Replace NR_CPUS in speedstep-centrino.c cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP NR_CPUS: Replace NR_CPUS in cpufreq userspace routines NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target cpumask: Provide a generic set of CPUMASK_ALLOC macros cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr Revert "cpumask: introduce new APIs" cpumask: make for_each_cpu_mask a bit smaller net: Pass reference to cpumask variable in net/sunrpc/svc.c ... Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually	2008-07-23 18:37:44 -07:00
Linus Torvalds	6d52dcbe56	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] cpufreq: remove CVS keywords [CPUFREQ] change cpu freq arrays to per_cpu variables	2008-07-21 15:10:37 -07:00
Ingo Molnar	68083e05d7	Merge commit 'v2.6.26-rc9' into cpus4096	2008-07-06 14:23:39 +02:00
Linus Torvalds	cc55875e26	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] Fix format string bug.	2008-06-09 11:27:55 -07:00
Chris Wright	326f6a5c9c	[CPUFREQ] Fix format string bug. Format string bug. Not exploitable, as this is only writable by root, but worth fixing all the same. Spotted-by: Ilja van Sprundel <ilja@netric.org> Signed-off-by: Dave Jones <davej@redhat.com>	2008-06-09 13:24:55 -04:00
CHIKAMA masaki	879000f944	cpufreq: fix null object access on Transmeta CPU If cpu specific cpufreq driver(i.e. longrun) has "setpolicy" function, governor object isn't set into cpufreq_policy object at "__cpufreq_set_policy" function in driver/cpufreq/cpufreq.c . This causes a null object access at "store_scaling_setspeed" and "show_scaling_setspeed" function in driver/cpufreq/cpufreq.c when reading or writing through /sys interface (ex. cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed) Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=10654 https://bugzilla.redhat.com/show_bug.cgi?id=443354 Signed-off-by: CHIKAMA Masaki <masaki.chikama@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Chuck Ebbert <cebbert@redhat.com> Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-06-06 11:29:11 -07:00
Lothar Waßmann	dca0261393	[CPUFREQ] fix double unlock of cpu_policy_rwsem in drivers/cpufreq/cpufreq.c In drivers/cpufreq/cpufreq.c the function cpufreq_add_dev() takes the error exit 'err_out_unregister' from different places once with the 'cpu_policy_rwsem' lock held, once with the lock released: \| if (ret) \| goto err_out_unregister; \| } \| \| policy->governor = NULL; /* to assure that the starting sequence is \| * run in cpufreq_set_policy / \| \| / set default policy */ \| ret = __cpufreq_set_policy(policy, &new_policy); \| policy->user_policy.policy = policy->policy; \| policy->user_policy.governor = policy->governor; \| \| unlock_policy_rwsem_write(cpu); \| \| if (ret) { \| dprintk("setting policy failed\n"); \| goto err_out_unregister; \| } This leads to the following error message in case of a failing __cpufreq_set_policy() call: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at: [<c01b4564>] unlock_policy_rwsem_write+0x30/0x40 but there are no more locks to release! other info that might help us debug this: 1 lock held by swapper/1: #0: (sysdev_drivers_lock){--..}, at: [<c018fd18>] sysdev_driver_register+0x74/0x130 stack backtrace: [<c002f588>] (dump_stack+0x0/0x14) from [<c00692fc>] (print_unlock_inbalance_bug+0xc8/0x104) [<c0069234>] (print_unlock_inbalance_bug+0x0/0x104) from [<c006b7ac>] (lock_release_non_nested+0xc4/0x19c) r6:00000028 r5:c3c1ab80 r4:c01b4564 [<c006b6e8>] (lock_release_non_nested+0x0/0x19c) from [<c006b9e0>] (lock_release+0x15c/0x18c) r8:60000013 r7:00000001 r6:c01b4564 r5:c0541bb4 r4:c3c1ab80 [<c006b884>] (lock_release+0x0/0x18c) from [<c0061ba0>] (up_write+0x24/0x30) r8:c0541b80 r7:00000000 r6:ffffffea r5:c3c34828 r4:c0541b8c [<c0061b7c>] (up_write+0x0/0x30) from [<c01b4564>] (unlock_policy_rwsem_write+0x30/0x40) r4:c3c34884 [<c01b4534>] (unlock_policy_rwsem_write+0x0/0x40) from [<c01b4c40>] (cpufreq_add_dev+0x324/0x398) [<c01b491c>] (cpufreq_add_dev+0x0/0x398) from [<c018fd64>] (sysdev_driver_register+0xc0/0x130) [<c018fca4>] (sysdev_driver_register+0x0/0x130) from [<c01b3574>] (cpufreq_register_driver+0xbc/0x174) Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Dave Jones <davej@redhat.com>	2008-05-29 12:10:12 -04:00

1 2 3 4 5 ...

340 Commits