Commit Graph

77 Commits

Author SHA1 Message Date
Michele Di Giorgio d0b7306d20 thermal: fix race condition when updating cooling device
When multiple thermal zones are bound to the same cooling device, multiple
kernel threads may want to update the cooling device state by calling
thermal_cdev_update(). Having cdev not protected by a mutex can lead to a race
condition. Consider the following situation with two kernel threads k1 and k2:

	    Thread k1				Thread k2
                                    ||
                                    ||  call thermal_cdev_update()
                                    ||      ...
                                    ||      set_cur_state(cdev, target);
    call power_actor_set_power()    ||
        ...                         ||
        instance->target = state;   ||
        cdev->updated = false;      ||
                                    ||      cdev->updated = true;
                                    ||      // completes execution
    call thermal_cdev_update()      ||
        // cdev->updated == true    ||
        return;                     ||
                                    \/
                                    time

k2 has already looped through the thermal instances looking for the deepest
cooling device state and is preempted right before setting cdev->updated to
true. Now, k1 runs, modifies the thermal instance state and sets cdev->updated
to false. Then, k1 is preempted and k2 continues the execution by setting
cdev->updated to true, therefore preventing k1 from performing the update.
Notice that this is not an issue if k2 looks at the instance->target modified by
k1 "after" it is assigned by k1. In fact, in this case the update will happen
anyway and k1 can safely return immediately from thermal_cdev_update().

This may lead to a situation where a thermal governor never updates the cooling
device. For example, this is the case for the step_wise governor: when calling
the function thermal_zone_trip_update(), the governor may always get a new state
equal to the old one (which, however, wasn't notified to the cooling device) and
will therefore skip the update.

CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Peter Feuerer <peter@piie.net>
Reported-by: Toby Huang <toby.huang@arm.com>
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-08-08 10:57:39 +08:00
Leo Yan 15333e3af1 thermal: use %d to print S32 parameters
Power allocator's parameters are S32 type, so use %d to print them.

Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-27 15:54:51 -07:00
Wei Ni 1d0fd42fa3 thermal: consistently use int for trip temp
The commit 17e8351a77 consistently use int for temperature,
however it missed a few in trip temperature and thermal_core.

In current codes, the trip->temperature used "unsigned long"
and zone->temperature used"int", if the temperature is negative
value, it will get wrong result when compare temperature with
trip temperature.

This patch can fix it.

Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-04-20 20:31:14 -07:00
Zhang Rui 81ad4276b5 Thermal: Ignore invalid trip points
In some cases, platform thermal driver may report invalid trip points,
thermal core should not take any action for these trip points.

This fixed a regression that bogus trip point starts to screw up thermal
control on some Lenovo laptops, after
commit bb431ba26c
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Fri Oct 30 16:31:47 2015 +0800

    Thermal: initialize thermal zone device correctly

    After thermal zone device registered, as we have not read any
    temperature before, thus tz->temperature should not be 0,
    which actually means 0C, and thermal trend is not available.
    In this case, we need specially handling for the first
    thermal_zone_device_update().

    Both thermal core framework and step_wise governor is
    enhanced to handle this. And since the step_wise governor
    is the only one that uses trends, so it's the only thermal
    governor that needs to be updated.

    Tested-by: Manuel Krause <manuelkrause@netscape.net>
    Tested-by: szegad <szegadlo@poczta.onet.pl>
    Tested-by: prash <prash.n.rao@gmail.com>
    Tested-by: amish <ammdispose-arch@yahoo.com>
    Tested-by: Matthias <morpheusxyz123@yahoo.de>
    Reviewed-by: Javi Merino <javi.merino@arm.com>
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Chen Yu <yu.c.chen@intel.com>

CC: <stable@vger.kernel.org> #3.18+
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1317190
Link: https://bugzilla.kernel.org/show_bug.cgi?id=114551
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2016-03-18 14:10:57 +08:00
Zhang Rui 98d94507e1 Merge branches 'thermal-intel', 'thermal-suspend-fix' and 'thermal-soc' into next 2016-01-23 11:43:27 +08:00
Kuninori Morimoto ad74e46cb3 thermal: trip_point_temp_store() calls thermal_zone_device_update()
trip_point_temp_store() updates trip temperature. It should call
thermal_zone_device_update() immediately.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2016-01-06 18:06:39 -08:00
Chen Yu 4511f7166a Thermal: do thermal zone update after a cooling device registered
When a new cooling device is registered, we need to update the
thermal zone to set the new registered cooling device to a proper
state.

This fixes a problem that the system is cool, while the fan devices
are left running on full speed after boot, if fan device is registered
after thermal zone device.

Here is the history of why current patch looks like this:
https://patchwork.kernel.org/patch/7273041/

CC: <stable@vger.kernel.org> #3.18+
Reference:https://bugzilla.kernel.org/show_bug.cgi?id=92431
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 16:00:00 +08:00
Zhang Rui ff140fea84 Thermal: handle thermal zone device properly during system sleep
Current thermal code does not handle system sleep well because
1. the cooling device cooling state may be changed during suspend
2. the previous temperature reading becomes invalid after resumed because
   it is got before system sleep
3. updating thermal zone device during suspending/resuming
   is wrong because some devices may have already been suspended
   or may have not been resumed.

Thus, the proper way to do this is to cancel all thermal zone
device update requirements during suspend/resume, and after all
the devices have been resumed, reset and update every registered
thermal zone devices.

This also fixes a regression introduced by:
Commit 19593a1fb1 ("ACPI / fan: convert to platform driver")
Because, with above commit applied, all the fan devices are attached
to the acpi_general_pm_domain, and they are turned on by the pm_domain
automatically after resume, without the awareness of thermal core.

CC: <stable@vger.kernel.org> #3.18+
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 15:59:53 +08:00
Zhang Rui bb431ba26c Thermal: initialize thermal zone device correctly
After thermal zone device registered, as we have not read any
temperature before, thus tz->temperature should not be 0,
which actually means 0C, and thermal trend is not available.
In this case, we need specially handling for the first
thermal_zone_device_update().

Both thermal core framework and step_wise governor is
enhanced to handle this. And since the step_wise governor
is the only one that uses trends, so it's the only thermal
governor that needs to be updated.

CC: <stable@vger.kernel.org> #3.18+
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
2015-12-29 15:59:44 +08:00
Javi Merino c973c3bcec thermal: Add a function to get the minimum power
The thermal core already has a function to get the maximum power of a
cooling device: power_actor_get_max_power().  Add a function to get the
minimum power of a cooling device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-09-14 07:39:46 -07:00
Zhang Rui 5a924a07f8 Merge branches 'thermal-core' and 'thermal-intel' of .git into next 2015-09-02 10:08:02 +08:00
Sascha Hauer 934c93b8c1 thermal: Add comment explaining test for critical temperature
The code testing if a temperature should be emulated or not is
not obvious. Add a comment explaining why this test is done.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer 79e5421cf0 thermal: Use IS_ENABLED instead of #ifdef
Use IS_ENABLED(CONFIG_THERMAL_EMULATION) to make the code more readable
and to get rid of the addtional #ifdef around the variable definitions
in thermal_zone_get_temp().

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer dbdf2532b4 thermal: remove unnecessary call to thermal_zone_device_set_polling
When the thermal zone has no get_temp callback then thermal_zone_device_register()
calls thermal_zone_device_set_polling() with a polling delay of 0. This
only cancels the poll_queue. Since the poll_queue hasn't been scheduled this
is a no-op. Remove it.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer f6be058493 thermal: trivial: fix typo in comment
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:51 +08:00
Sascha Hauer 17e8351a77 thermal: consistently use int for temperatures
The thermal code uses int, long and unsigned long for temperatures
in different places.

Using an unsigned type limits the thermal framework to positive
temperatures without need. Also several drivers currently will report
temperatures near UINT_MAX for temperatures below 0°C. This will probably
immediately shut the machine down due to overtemperature if started below
0°C.

'long' is 64bit on several architectures. This is not needed since INT_MAX °mC
is above the melting point of all known materials.

Consistently use a plain 'int' for temperatures throughout the thermal code and
the drivers. This only changes the places in the drivers where the temperature
is passed around as pointer, when drivers internally use another type this is
not changed.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Lukasz Majewski <l.majewski@samsung.com>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Peter Feuerer <peter@piie.net>
Cc: Punit Agrawal <punit.agrawal@arm.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Peter Feuerer <peter@piie.net>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Lukasz Majewski <l.majewski@samsung.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: linux-acpi@vger.kernel.org
Cc: platform-driver-x86@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-omap@vger.kernel.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: lm-sensors@lm-sensors.org
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:50 +08:00
Ni Wade 25a0a5ce16 thermal: add available policies sysfs attribute
The Linux thermal framework support to change thermal governor
policy in userspace, but it can't show what available policies
supported.

This patch adds available_policies attribute to the thermal
framework, it can list the thermal governors which can be
used for a particular zone. This attribute is read only.

Signed-off-by: Wei Ni <wni@nvidia.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-08-03 23:15:50 +08:00
Viresh Kumar 528464eaa4 thermal: remove dangling 'weight_attr' device file
This file isn't getting removed while we unbind a device from thermal
zone. And this causes following messages when the device is registered
again:

WARNING: CPU: 0 PID: 2228 at /home/viresh/linux/fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x70()
sysfs: cannot create duplicate filename '/devices/virtual/thermal/thermal_zone0/cdev0_weight'
Modules linked in: cpufreq_dt(+) [last unloaded: cpufreq_dt]
CPU: 0 PID: 2228 Comm: insmod Not tainted 4.2.0-rc3-00059-g44fffd9473eb #272
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[<c00153e8>] (unwind_backtrace) from [<c0012368>] (show_stack+0x10/0x14)
[<c0012368>] (show_stack) from [<c053a684>] (dump_stack+0x84/0xc4)
[<c053a684>] (dump_stack) from [<c002284c>] (warn_slowpath_common+0x80/0xb0)
[<c002284c>] (warn_slowpath_common) from [<c00228ac>] (warn_slowpath_fmt+0x30/0x40)
[<c00228ac>] (warn_slowpath_fmt) from [<c012d524>] (sysfs_warn_dup+0x60/0x70)
[<c012d524>] (sysfs_warn_dup) from [<c012d244>] (sysfs_add_file_mode_ns+0x13c/0x190)
[<c012d244>] (sysfs_add_file_mode_ns) from [<c012d2d4>] (sysfs_create_file_ns+0x3c/0x48)
[<c012d2d4>] (sysfs_create_file_ns) from [<c03c04a8>] (thermal_zone_bind_cooling_device+0x260/0x358)
[<c03c04a8>] (thermal_zone_bind_cooling_device) from [<c03c2e70>] (of_thermal_bind+0x88/0xb4)
[<c03c2e70>] (of_thermal_bind) from [<c03c10d0>] (__thermal_cooling_device_register+0x17c/0x2e0)
[<c03c10d0>] (__thermal_cooling_device_register) from [<c03c3f50>] (__cpufreq_cooling_register+0x3a0/0x51c)
[<c03c3f50>] (__cpufreq_cooling_register) from [<bf00505c>] (cpufreq_ready+0x44/0x88 [cpufreq_dt])
[<bf00505c>] (cpufreq_ready [cpufreq_dt]) from [<c03d6c30>] (cpufreq_add_dev+0x4a0/0x7dc)
[<c03d6c30>] (cpufreq_add_dev) from [<c02cd3ec>] (subsys_interface_register+0x94/0xd8)
[<c02cd3ec>] (subsys_interface_register) from [<c03d785c>] (cpufreq_register_driver+0x10c/0x1f0)
[<c03d785c>] (cpufreq_register_driver) from [<bf0057d4>] (dt_cpufreq_probe+0x60/0x8c [cpufreq_dt])
[<bf0057d4>] (dt_cpufreq_probe [cpufreq_dt]) from [<c02d03e4>] (platform_drv_probe+0x44/0xa4)
[<c02d03e4>] (platform_drv_probe) from [<c02cead8>] (driver_probe_device+0x174/0x2b4)
[<c02cead8>] (driver_probe_device) from [<c02ceca4>] (__driver_attach+0x8c/0x90)
[<c02ceca4>] (__driver_attach) from [<c02cd078>] (bus_for_each_dev+0x68/0x9c)
[<c02cd078>] (bus_for_each_dev) from [<c02ce2f0>] (bus_add_driver+0x19c/0x214)
[<c02ce2f0>] (bus_add_driver) from [<c02cf490>] (driver_register+0x78/0xf8)
[<c02cf490>] (driver_register) from [<c0009710>] (do_one_initcall+0x8c/0x1d4)
[<c0009710>] (do_one_initcall) from [<c05396b0>] (do_init_module+0x5c/0x1b8)
[<c05396b0>] (do_init_module) from [<c0086490>] (load_module+0xd34/0xed8)
[<c0086490>] (load_module) from [<c0086704>] (SyS_init_module+0xd0/0x120)
[<c0086704>] (SyS_init_module) from [<c000f480>] (ret_fast_syscall+0x0/0x3c)
---[ end trace 3be0e7b7dc6e3c4f ]---

Fixes: db91651311 ("thermal: export weight to sysfs")
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-08-02 19:36:57 -07:00
Eduardo Valentin 9d0be7f481 thermal: support slope and offset coefficients
It is common to have a linear extrapolation from
the current sensor readings and the actual temperature
value. This is specially the case when the sensor
is in use to extrapolate hotspots.

This patch adds slope and offset constants for
single sensor linear extrapolation equation. Because
the same sensor can be use in different locations,
from board to board, these constants are added
as part of thermal_zone_params.

The constants are available through sysfs.

It is up to the device driver to determine
the usage of these values.

Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-11 19:46:52 -07:00
Javi Merino 9f38271c6f thermal: export thermal_zone_parameters to sysfs
It's useful for tuning to be able to edit thermal_zone_parameters from
userspace.  Export them to the thermal_zone sysfs so that they can be
easily changed.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:54 -07:00
Punit Agrawal 35e946447f thermal: core: Add Kconfig option to enable writable trips
Add a Kconfig option to allow system integrators to control whether
userspace tools can change trip temperatures. This option overrides
the thermal zone setup in the driver code and must be enabled for
platform specified writable trips to come into effect.

The original behaviour of requiring root privileges to change trip
temperatures remains unchanged.

Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:53 -07:00
Javi Merino 6b775e870c thermal: introduce the Power Allocator governor
The power allocator governor is a thermal governor that controls system
and device power allocation to control temperature.  Conceptually, the
implementation divides the sustainable power of a thermal zone among
all the heat sources in that zone.

This governor relies on "power actors", entities that represent heat
sources.  They can report current and maximum power consumption and
can set a given maximum power consumption, usually via a cooling
device.

The governor uses a Proportional Integral Derivative (PID) controller
driven by the temperature of the thermal zone.  The output of the
controller is a power budget that is then allocated to each power
actor that can have bearing on the temperature we are trying to
control.  It decides how much power to give each cooling device based
on the performance they are requesting.  The PID controller ensures
that the total power budget does not exceed the control temperature.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino 35b11d2e3a thermal: extend the cooling device API to include power information
Add three optional callbacks to the cooling device interface to allow
them to express power.  In addition to the callbacks, add helpers to
identify cooling devices that implement the power cooling device API.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino e33df1d2f3 thermal: let governors have private data for each thermal zone
A governor may need to store its current state between calls to
throttle().  That state depends on the thermal zone, so store it as
private data in struct thermal_zone_device.

The governors may have two new ops: bind_to_tz() and unbind_from_tz().
When provided, these functions let governors do some initialization
and teardown when they are bound/unbound to a tz and possibly store that
information in the governor_data field of the struct
thermal_zone_device.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:52 -07:00
Javi Merino db91651311 thermal: export weight to sysfs
It's useful to have access to the weights for the cooling devices for
thermal zones and change them if needed.  Export them to sysfs.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:51 -07:00
Kapileshwar Singh 6cd9e9f629 thermal: of: fix cooling device weights in device tree
Currently you can specify the weight of the cooling device in the device
tree but that information is not populated to the
thermal_bind_params where the fair share governor expects it to
be.  The of thermal zone device doesn't have a thermal_bind_params
structure and arguably it's better to pass the weight inside the
thermal_instance as it is specific to the bind of a cooling device to a
thermal zone parameter.

Core thermal code is fixed to populate the weight in the instance from
the thermal_bind_params, so platform code that was passing the weight
inside the thermal_bind_params continue to work seamlessly.

While we are at it, create a default value for the weight parameter for
those thermal zones that currently don't define it and remove the
hardcoded default in of-thermal.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Peter Feuerer <peter@piie.net>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Kapileshwar Singh <kapileshwar.singh@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-05-04 21:27:50 -07:00
Hans de Goede 7e497a7375 thermal: Do not log an error if thermal_zone_get_temp returns -EAGAIN
Some temperature sensors only get updated every few seconds and while
waiting for the first irq reporting a (new) temperature to happen there
get_temp operand will return -EAGAIN as it does not have any data to report
yet.

Not logging an error in this case avoids messages like these from showing
up in dmesg on affected systems:

[    1.219353] thermal thermal_zone0: failed to read out thermal zone 0
[    2.015433] thermal thermal_zone0: failed to read out thermal zone 0
[    2.416737] thermal thermal_zone0: failed to read out thermal zone 0

Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-04-07 13:11:29 -07:00
Matthias Kaehlcke 2dc10f8963 thermal: Make sysfs attributes of cooling devices default attributes
Default attributes are created when the device is registered. Attributes
created after device registration can lead to race conditions, where user space
(e.g. udev) sees the device but not the attributes.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-03-05 01:47:57 -04:00
Johannes Berg 053c095a82 netlink: make nlmsg_end() and genlmsg_end() void
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-18 01:03:45 -05:00
Zhang Rui 32c9edc4e3 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal into thermal-soc 2014-12-21 22:49:12 +08:00
Lukasz Majewski 9a3031dc3e thermal:core:fix: Check return code of the ->get_max_state() callback
The return code from ->get_max_state() callback was not checked during
binding cooling device to thermal zone device.

Signed-off-by: Lukasz Majewski <l.majewski@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-12-08 21:32:56 -04:00
Luis Henriques 9d367e5e7b thermal: Fix error path in thermal_init()
thermal_unregister_governors() and class_unregister() were being called in
the wrong order.

Fixes: 80a26a5c22 ("Thermal: build thermal governors into thermal_sys module")
Cc: stable@vger.kernel.org
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 12:17:25 +08:00
Javi Merino b6cc772f64 thermal: lock the thermal zone when switching governors
Currently, userspace can request a governor change while the governor
itself is running.  Grab the thermal zone lock when changing the
governor to prevent this race.

Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 12:10:44 +08:00
Srinivas Pandruvada 84ffe3ecc2 thermal: core: ignore invalid trip temperature
Ignore invalid trip temperature less or equal to zero. Some
buggy systems have invalid trips, causing system shutdown.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-12-08 11:51:46 +08:00
Yao Dongdong 1401586056 Thermal:Remove usless if(!result) before return tz
result is always zero when comes here.

Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-11-03 18:59:50 -04:00
Linus Torvalds 8264fce6de Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
 "Sorry that I missed the merge window as there is a bug found in the
  last minute, and I have to fix it and wait for the code to be tested
  in linux-next tree for a few days.  Now the buggy patch has been
  dropped entirely from my next branch.  Thus I hope those changes can
  still be merged in 3.18-rc2 as most of them are platform thermal
  driver changes.

  Specifics:

   - introduce ACPI INT340X thermal drivers.

     Newer laptops and tablets may have thermal sensors and other
     devices with thermal control capabilities that are exposed for the
     OS to use via the ACPI INT340x device objects.  Several drivers are
     introduced to expose the temperature information and cooling
     ability from these objects to user-space via the normal thermal
     framework.

     From: Lu Aaron, Lan Tianyu, Jacob Pan and Zhang Rui.

   - introduce a new thermal governor, which just uses a hysteresis to
     switch abruptly on/off a cooling device.  This governor can be used
     to control certain fan devices that can not be throttled but just
     switched on or off.  From: Peter Feuerer.

   - introduce support for some new thermal interrupt functions on
     i.MX6SX, in IMX thermal driver.  From: Anson, Huang.

   - introduce tracing support on thermal framework.  From: Punit
     Agrawal.

   - small fixes in OF thermal and thermal step_wise governor"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
  Thermal: int340x thermal: select ACPI fan driver
  Thermal: int3400_thermal: use acpi_thermal_rel parsing APIs
  Thermal: int340x_thermal: expose acpi thermal relationship tables
  Thermal: introduce int3403 thermal driver
  Thermal: introduce INT3402 thermal driver
  Thermal: move the KELVIN_TO_MILLICELSIUS macro to thermal.h
  ACPI / Fan: support INT3404 thermal device
  ACPI / Fan: add ACPI 4.0 style fan support
  ACPI / fan: convert to platform driver
  ACPI / fan: use acpi_device_xxx_power instead of acpi_bus equivelant
  ACPI / fan: remove no need check for device pointer
  ACPI / fan: remove unused macro
  Thermal: int3400 thermal: register to thermal framework
  Thermal: int3400 thermal: add capability to detect supporting UUIDs
  Thermal: introduce int3400 thermal driver
  ACPI: add ACPI_TYPE_LOCAL_REFERENCE support to acpi_extract_package()
  ACPI: make acpi_create_platform_device() an external API
  thermal: step_wise: fix: Prevent from binary overflow when trend is dropping
  ACPI: introduce ACPI int340x thermal scan handler
  thermal: Added Bang-bang thermal governor
  ...
2014-10-24 11:21:43 -07:00
Rasmus Villemoes 484ac2f32d thermal: replace strnicmp with strncasecmp
The kernel used to contain two functions for length-delimited,
case-insensitive string comparison, strnicmp with correct semantics and
a slightly buggy strncasecmp.  The latter is the POSIX name, so strnicmp
was renamed to strncasecmp, and strnicmp made into a wrapper for the new
strncasecmp to avoid breaking existing users.

To allow the compat wrapper strnicmp to be removed at some point in the
future, and to avoid the extra indirection cost, do
s/strnicmp/strncasecmp/g.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-14 02:18:25 +02:00
Zhang Rui dd63466679 Merge branches 'eduardo-soc' and 'bang-bang-governor' of .git into next 2014-09-18 14:48:40 +08:00
Peter Feuerer e4dbf98f7f thermal: Added Bang-bang thermal governor
The bang-bang thermal governor uses a hysteresis to switch abruptly on
or off a cooling device.  It is intended to control fans, which can
not be throttled but just switched on or off.
Bang-bang cannot be set as default governor as it is intended for
special devices only.  For those special devices the driver needs to
explicitely request it.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Andreas Mohr <andi@lisas.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Javi Merino <javi.merino@arm.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Peter Feuerer <peter@piie.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-08-27 15:45:58 +08:00
Punit Agrawal 208cd822a1 thermal: trace: Trace when temperature is above a trip point
Create a new event to trace when the temperature is above a trip
point. Use the trace-point when handling non-critical and critical
trip pionts.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-07-29 09:28:43 -04:00
Punit Agrawal 39811569e4 thermal: trace: Trace when a cooling device's state is updated
Introduce and use an event to trace when a cooling device's state is
updated. This is useful to follow the effect of governor decisions on
cooling devices.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-07-29 09:27:54 -04:00
Punit Agrawal 100a8fdbf5 thermal: trace: Trace temperature changes
Create a new event to trace the temperature of a thermal zone. Using
this event trace the temperature changes of the thermal zone every-time
it is updated.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2014-07-29 09:27:54 -04:00
Zhang Rui f2234bcd03 Thermal: thermal zone governor fix
This patch does a cleanup about the thermal zone govenor,
setting and make the following rule.
1. For thermal zone devices that are registered w/o tz->tzp,
   they can use the default thermal governor only.
2. For thermal zone devices w/ governor name specified in
   tz->tzp->governor_name, we will use the default govenor
   if the governor specified is not available at the moment,
   and update tz->governor when the matched governor is registered.

This also fixes a problem that OF registered thermal zones
are running with no governor.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Javi Merino <javi.merino@arm.com>
2014-03-03 23:15:57 +08:00
Ni Wade 5ca0cce562 Thermal: Allow first update of cooling device state
In initialization, if the cooling device is initialized at
max cooling state, and the thermal zone temperature is below
the first trip point, then the cooling state can't be updated
to the right state, untill the first trip point be triggered.

To fix this issue, allow first update of cooling device state
during registration, initialized "updated" device field as
"false" (instead of "true").

Signed-off-by: Wei Ni <wni@nvidia.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-03-03 23:15:29 +08:00
Zhang Rui 8c59ecb5c1 Merge branches 'misc' and 'soc' of .git into next 2014-01-03 22:55:04 +08:00
lan,Tianyu 800744bf31 Thermal: update thermal zone device after setting emul_temp
This patch is to update thermal zone device after setting emul_temp
in order to make governor work according to input temperature immediately.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-01-03 22:54:20 +08:00
Zhang Rui 201531c277 Merge branches 'misc', 'soc', 'soc-eduardo' and 'int3404-thermal' of .git into next 2014-01-02 14:22:28 +08:00
Aaron Lu 06475b556c thermal: debug: add debug statement for core and step_wise
To ease debugging thermal problem, add these dynamic debug statements
so that user do not need rebuild kernel to see these info.

Based on a patch from Zhang Rui for debugging on bugzilla:
https://bugzilla.kernel.org/attachment.cgi?id=98671

A sample output after we turn on dynamic debug with the following cmd:
# echo 'module thermal_sys +fp' > /sys/kernel/debug/dynamic_debug/control
is like:

[  355.147627] update_temperature: thermal thermal_zone0: last_temperature=52000, current_temperature=55000
[  355.147636] thermal_zone_trip_update: thermal thermal_zone0: Trip1[type=1,temp=79000]:trend=2,throttle=0
[  355.147644] get_target_state: thermal cooling_device8: cur_state=0
[  355.147647] thermal_zone_trip_update: thermal cooling_device8: old_target=-1, target=-1
[  355.147652] get_target_state: thermal cooling_device7: cur_state=0
[  355.147655] thermal_zone_trip_update: thermal cooling_device7: old_target=-1, target=-1
[  355.147660] get_target_state: thermal cooling_device6: cur_state=0
[  355.147663] thermal_zone_trip_update: thermal cooling_device6: old_target=-1, target=-1
[  355.147668] get_target_state: thermal cooling_device5: cur_state=0
[  355.147671] thermal_zone_trip_update: thermal cooling_device5: old_target=-1, target=-1
[  355.147678] thermal_zone_trip_update: thermal thermal_zone0: Trip2[type=0,temp=90000]:trend=1,throttle=0
[  355.147776] get_target_state: thermal cooling_device0: cur_state=0
[  355.147783] thermal_zone_trip_update: thermal cooling_device0: old_target=-1, target=-1
[  355.147792] thermal_zone_trip_update: thermal thermal_zone0: Trip3[type=0,temp=80000]:trend=1,throttle=0
[  355.147845] get_target_state: thermal cooling_device1: cur_state=0
[  355.147849] thermal_zone_trip_update: thermal cooling_device1: old_target=-1, target=-1
[  355.147856] thermal_zone_trip_update: thermal thermal_zone0: Trip4[type=0,temp=70000]:trend=1,throttle=0
[  355.147904] get_target_state: thermal cooling_device2: cur_state=0
[  355.147908] thermal_zone_trip_update: thermal cooling_device2: old_target=-1, target=-1
[  355.147915] thermal_zone_trip_update: thermal thermal_zone0: Trip5[type=0,temp=60000]:trend=1,throttle=0
[  355.147963] get_target_state: thermal cooling_device3: cur_state=0
[  355.147967] thermal_zone_trip_update: thermal cooling_device3: old_target=-1, target=-1
[  355.147973] thermal_zone_trip_update: thermal thermal_zone0: Trip6[type=0,temp=55000]:trend=1,throttle=1
[  355.148022] get_target_state: thermal cooling_device4: cur_state=0
[  355.148025] thermal_zone_trip_update: thermal cooling_device4: old_target=-1, target=1
[  355.148036] thermal_cdev_update: thermal cooling_device4: zone0->target=1
[  355.169279] thermal_cdev_update: thermal cooling_device4: set to state 1

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Eduardo Valentin <eduardo.valentin@ti.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2014-01-02 10:52:48 +08:00
Eduardo Valentin a116b5d44f thermal: core: introduce thermal_of_cooling_device_register
This patch adds a new API to allow registering cooling devices
in the thermal framework derived from device tree nodes.

This API links the cooling device with the device tree node
so that binding with thermal zones is possible, given
that thermal zones are pointing to cooling device
device tree nodes.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
2013-12-04 09:31:34 -04:00
Eduardo Valentin 4e5e4705bf thermal: introduce device tree parser
This patch introduces a device tree bindings for
describing the hardware thermal behavior and limits.
Also a parser to read and interpret the data and feed
it in the thermal framework is presented.

This patch introduces a thermal data parser for device
tree. The parsed data is used to build thermal zones
and thermal binding parameters. The output data
can then be used to deploy thermal policies.

This patch adds also documentation regarding this
API and how to define tree nodes to use
this infrastructure.

Note that, in order to be able to have control
on the sensor registration on the DT thermal zone,
it was required to allow changing the thermal zone
.get_temp callback. For this reason, this patch
also removes the 'const' modifier from the .ops
field of thermal zone devices.

Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Eduardo Valentin <eduardo.valentin@ti.com>
2013-12-04 09:31:34 -04:00