Commit Graph

10 Commits

Author SHA1 Message Date
Srivatsa S. Bhat cf0485a2ac thermal, x86-pkg-temp: Fix CPU hotplug callback registration
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

	cpu_notifier_register_begin();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);

	cpu_notifier_register_done();

Fix the thermal x86-pkg-temp code by using this latter form of callback
registration.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <eduardo.valentin@ti.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20 13:43:47 +01:00
Jean Delvare 3e4216531c x86_pkg_temp_thermal: Fix the thermal zone type
The thermal zone type should not include an instance number. Otherwise
each zone is considered a different type and the thermal-to-hwmon
bridge fails to group them all in a single hwmon device.

I also changed the type to "x86_pkg_temp", because "pkg" was too
generic, and other thermal drivers use an underscore, not a dash, as
a separator. Or maybe "cpu_pkg_temp" would be better?

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-03-03 23:13:37 +08:00
Jean Delvare 79786880a4 x86_pkg_temp_thermal: Do not expose as a hwmon device
The temperature value reported by x86_pkg_temp_thermal is already
reported by the coretemp driver. So, do not expose this thermal zone
as a hwmon device, because it would be redundant.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <eduardo.valentin@ti.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-03-03 23:13:24 +08:00
Rashika f67fe3c55f drivers: thermal: Mark function as static in x86_pkg_temp_thermal.c
Mark function sys_set_trip_temp() as static in x86_pkg_temp_thermal.c
because it is not used outside this file.

This eliminates the following warning in x86_pkg_temp_thermal.c:
drivers/thermal/x86_pkg_temp_thermal.c:218:5: warning: no previous prototype for ‘sys_set_trip_temp’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-01-02 10:34:37 +08:00
Srinivas Pandruvada 7bed1b3caa Thermal: x86_pkg_temp: change spin lock
x86_pkg_temp receives thermal notifications via a callback from a
therm_throt driver, where thermal interrupts are processed.
This callback is pkg_temp_thermal_platform_thermal_notify. Here to
avoid multiple interrupts from cores in a package, we disable the
source and also set a variable to avoid scheduling delayed work function.
This variable is protected via spin_lock_irqsave. On one buggy platform,
we still receiving interrupts even if the source is disabled. This
can cause deadlock/lockdep warning, when interrupt is generated while under
spinlock in work function.
Change spin_lock to spin_lock_irqsave and spin_unlock to
spin_unlock_irqrestore as the data it is trying to protect can also
be modified in a notification call called from interrupt handler.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-09-25 21:29:39 +08:00
Steven Rostedt ace120dcf2 Thermal: Fix lockup of cpu_down()
Commit f1a18a105 "Thermal: CPU Package temperature thermal" had code
that did a get_online_cpus(), run a loop and then do a
put_online_cpus(). The problem is that the loop had an error exit that
would skip the put_online_cpus() part.

In the error exit part of the function, it also did a get_online_cpus(),
run a loop and then put_online_cpus(). The only way to get to the error
exit part is with get_online_cpus() already performed. If this error
condition is hit, the system will be prevented from taking CPUs offline.
The process taking the CPU offline will lock up hard.

Removing the get_online_cpus() removes the lockup as the hotplug CPU
refcount is back to zero.

This was bisected with ktest.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-07-22 09:34:46 +08:00
Srinivas Pandruvada 94e791f522 Thermal: x86_pkg_temp: Limit number of pkg temp zones
Although it is unlikley that physical package id is set to some
arbitary number, it is better to prevent in anycase. Since package
temp zones use this in thermal zone type and for allocation, added
a limit.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-07-16 08:27:25 +08:00
Wei Yongjun c7c1b3112e Thermal: x86_pkg_temp: fix krealloc() misuse in in pkg_temp_thermal_device_add()
If krealloc() returns NULL, it doesn't free the original. So any code
of the form 'foo = krealloc(foo, ...);' is almost certainly a bug.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-07-15 16:26:33 +08:00
Srinivas Pandruvada f3ed0a17f0 Thermal: x86 package temp thermal crash
On systems with no package MSR support this caused crash as there
is a bug in the logic to check presence of DTHERM and PTS feature
together. Added a change so that when there is no PTS support, module
doesn't get loaded. Even if some CPU comes online with the PTS
feature disabled, and other CPUs has this support, this patch
will still prevent such MSR accesses.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reported-by: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-07-15 16:20:58 +08:00
Srinivas Pandruvada f1a18a1056 Thermal: CPU Package temperature thermal
This driver register CPU digital temperature sensor as a thermal
zone at package level.
Each package will show up as one zone with at max two trip points.
These trip points can be both read and updated. Once a non zero
value is set in the trip point, if the package package temperature
goes above or below this setting, a thermal notification is
generated.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2013-06-18 06:27:47 +08:00