The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-10-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-9-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-8-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Talel Shenhar <talel@amazon.com>
Link: https://lore.kernel.org/r/20220804224349.1926752-7-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-6-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-5-daniel.lezcano@linexp.org
Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal OF code has a new API allowing to migrate the OF
initialization to a simpler approach. The ops are no longer device
tree specific and are the generic ones provided by the core code.
Convert the ops to the thermal_zone_device_ops format and use the new
API to register the thermal zone with these generic ops.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-4-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
This transient change allows to use old and new OF together until all
the drivers are converted to use the new OF API.
This will go away when the old OF code will be removed.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-3-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The previous version of the OF code was returning -ENODEV if no
thermal zones description was found or if the lookup of the sensor in
the thermal zones was not found.
The backend drivers are expecting this return value as an information
about skipping the sensor initialization and considered as normal.
Fix the return value by replacing -EINVAL by -ENODEV and remove the
error message as this missing is not considered as an error.
Fixes: 3bd52ac87347 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220809085629.509116-2-daniel.lezcano@linaro.org
Currently, if we cannot find the correct thermal zone then this error
path returns NULL and it would lead to an Oops in the caller. Return
ERR_PTR(-EINVAL) instead.
Fixes: 3bd52ac87347 ("thermal/of: Rework the thermal device tree initialization")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YvDzovkMCQecPDjz@kili
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The following changes are reworking entirely the thermal device tree
initialization. The old version is kept until the different drivers
using it are converted to the new API.
The old approach creates the different actors independently. This
approach is the source of the code duplication in the thermal OF
because a thermal zone is created but a sensor is registered
after. The thermal zones are created unconditionnaly with a fake
sensor at init time, thus forcing to provide fake ops and store all
the thermal zone related information in duplicated structures. Then
the sensor is initialized and the code looks up the thermal zone name
using the device tree. Then the sensor is associated to the thermal
zone, and the sensor specific ops are called with a second level of
indirection from the thermal zone ops.
When a sensor is removed (with a module unload), the thermal zone
stays there with the fake sensor.
The cooling device associated with a thermal zone and a trip point is
stored in a list, again duplicating information, using the node name
of the device tree to match afterwards the cooling devices.
The new approach is simpler, it creates a thermal zone when the sensor
is registered and destroys it when the sensor is removed. All the
matching between the cooling device, trip points and thermal zones are
done using the device tree, as well as bindings. The ops are no longer
specific but uses the generic ones provided by the thermal framework.
When the old code won't have any users, it can be removed and the
remaining thermal OF code will be much simpler.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220804224349.1926752-2-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Make usage of the new firmware agnostic API
'devm_of_iio_channel_get_by_name()' to get the IIO channel.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220715122903.332535-7-nuno.sa@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
- Fix NULL pointer dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Make the Intel TCC cooling driver support Alder Lake-N and Raptor
Lake-P (Sumeet Pawnikar).
- Fix possible path truncations in the tmon utility (Florian Fainelli).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLxT1USHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxj5UP/j2uxSAZC0OpyS/wRP9PmzTJKaUdofD8
BhQ6jvVu1wcQ+zG2MGrLUw9mOTEfd6dDfDUMnwiSAoHKvtUpEM7hsRa9wEGjXZoN
mJYzRFtWT8cqNA4WaVV9UjmTLVRvswXQ25ExumydCoXMHzk8ZSwkslq0rIfPlHWa
NpqraiXdmqG5qFlfQR1AhnLF3IXbogrxWbXqJRDatxwb7m4VEqy7TKQYKBozpWu6
0SfnGOrW6WqL5YaNCO2Q2cbRNRlVnS1gaGTfowMGAR5IRRz0MYmoHAvRIQzpigJI
VaTShIh3LQi+ud2dHsyeBZn95r35JgUQB716CHsZkkPB7WJx+Jj68K/u+TpsV8bV
KHTS7H1if2jugv0duC/ZSMqrN9Kqam2rjPnAtjK+Eo6gzWYP77Ha+yjkHTcWfICB
b7WargRB3tqj9Rsczl6bozQ4x6djYZAB4I0dZffdenbuees7U4qsmrSi2Z/3jta1
zZz5yPM15ZaESGAKr0ECiHD45Lgf7+ZCqS0oMMVNIaCXQyDUInpaR28t9YLQeIg/
h8uPIJ+AONdgNs6XKZ96AGpOZHJG+pszSIMVM8bPYCsy56bXBDI8m2NVlKXp58WF
N4fimjv97MCDLxxua6SuwWVhy8FHlv+OW3zCAiGJ2mOHvoO2E/nLoeOQkC//03zn
MIayMXpCTYRM
=8JEE
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These fix an error code path issue leading to a NULL pointer
dereference, drop Kconfig dependencies that are not needed any more
after recent changes, add CPU IDs for new chips to a driver and fix up
the tmon utility.
Specifics:
- Fix NULL pointer dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Make the Intel TCC cooling driver support Alder Lake-N and Raptor
Lake-P (Sumeet Pawnikar).
- Fix possible path truncations in the tmon utility (Florian
Fainelli)"
* tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/thermal: Fix possible path truncations
thermal: Drop obsolete dependency on COMPILE_TEST
thermal: sysfs: Fix cooling_device_stats_setup() error code path
thermal: intel: Add TCC cooling support for Alder Lake-N and Raptor Lake-P
Merge additional changes in the thermal core and thermal tools updates
for 5.20-rc1:
- Fix NULL poiter dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Fix possible path truncations in the tmon utility (Florian Fainelli).
* thermal-core:
thermal: Drop obsolete dependency on COMPILE_TEST
thermal: sysfs: Fix cooling_device_stats_setup() error code path
* thermal-tools:
tools/thermal: Fix possible path truncations
Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text. Also included in here are a few other minor updates,
2 USB files, and one Documentation file update to get the SPDX lines
correct.
All of these have been in the linux-next tree for a very long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
4AKdqbiBNlFbCroQwmeQ
=v1sg
-----END PGP SIGNATURE-----
Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
Since commit 0166dc11be ("of: make CONFIG_OF user selectable"), it
is possible to test-build any driver which depends on OF on any
architecture by explicitly selecting OF. Therefore depending on
COMPILE_TEST as an alternative is no longer needed.
It is actually better to always build such drivers with OF enabled,
so that the test builds are closer to how each driver will actually be
built on its intended target. Building them without OF may not test
much as the compiler will optimize out potentially large parts of the
code. In the worst case, this could even pop false positive warnings.
Dropping COMPILE_TEST here improves the quality of our testing and
avoids wasting time on non-existent issues.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If cooling_device_stats_setup() fails to create the stats object, it
must clear the last slot in cooling_device_attr_groups that was
initially empty (so as to make it possible to add stats attributes to
the cooling device attribute groups).
Failing to do so may cause the stats attributes to be created by
mistake for a device that doesn't have a stats object, because the
slot in question might be populated previously during the registration
of another cooling device.
Fixes: 8ea229511e ("thermal: Add cooling device's statistics in sysfs")
Reported-by: Di Shen <di.shen@unisoc.com>
Tested-by: Di Shen <di.shen@unisoc.com>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add Alder Lake-N and Raptor Lake-P to the list of processor models
supported by the Intel TCC cooling driver.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano).
- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba).
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang).
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski).
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano).
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof).
- Fix array underflow in prep_lookup_table (Dan Carpenter).
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun).
- Fix typos in comments detected on sun8i by Coccinelle (Julia
Lawall).
- Fix typos in comments on rzg2l (Biju Das).
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li).
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov).
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer).
- Fix typo in the comments for the 'tmon' tool (Slark Xiao).
- Make the thermal core use ida_alloc()/free() directly instead of
ida_simple_get()/ida_simple_remove() that have been deprecated
(keliu).
- Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal control
driver (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoK5ASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxa0cQAJsl3wDxkDbvfEENZ1VSdfeH3qXbUSSE
EEo0j4X85JE1F1NwT8R2tb4D/YMJDT3p6I55twrVLvxNUdTnx7ybRfXem24uXkK5
xOybfsuYsSWXxaEfI4260GBzY6ijTR7uWYyDLPN3vvbW3FdMj+nni0D9uTySw7UL
ecIe1ISn3nxbbp0FxYh+n88+718HWKo07BaTE4TyKeUgQHw+v7HHtCZq7Rdoogm8
cp6tTkJ8ymrHoEvAWBIcO58zCx7LkSFeU69oMm4CUzVjxWdFfREb079F5cZ92GXr
ex70r/gKfFAd5GAAdL0WjeS4RwHKta49WKqAMA7w41nIgDj0IA2gJRowfJvKDkF+
JgcQ7OrJ5eo5jCr4pbycgQ9Lh23zBQe/3LH+yV71KlKiLf6/Tl5rhELfBNbZmraZ
HOvD5dAxBLySmANN2VX7DJgtbTcinneL9BDVo6dBTdYaWC4jQxXYm73n66nkZdS7
BDJ0N2P0uZ7NGLawXwrrsMi8xbIApMw4W/o8SN9R4FF1LqIroDg60kLJ9zO+6IhI
xF8ZtcMdyPVa71fSZNwD0+mz2sF6XnTucf88CjxzVdAxbvNVPQEvKufThWTreyuU
pjBPtf1YFOFz9CusBYAplOIu96RqUgL1t1aqqwsCqXoUu4Lgh/pyksIDeam1l0EP
Q5WBUB9bK8q8
=wj9M
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These start a rework of the handling of trip points in the thermal
core, improve the cpufreq/devfreq cooling device handling, update some
thermal control drivers and the tmon utility and clean up code.
Specifics:
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to
be used by the different sensors when registering a thermal zone
(Daniel Lezcano).
- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba).
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3
(Wolfram Sang).
- Convert the QCom tsens DT binding to the dtsformat format
(Krzysztof Kozlowski).
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano).
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof).
- Fix array underflow in prep_lookup_table (Dan Carpenter).
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun).
- Fix typos in comments detected on sun8i by Coccinelle (Julia
Lawall).
- Fix typos in comments on rzg2l (Biju Das).
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li).
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov).
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer).
- Fix typo in the comments for the 'tmon' tool (Slark Xiao).
- Make the thermal core use ida_alloc()/free() directly instead of
ida_simple_get()/ida_simple_remove() that have been deprecated
(keliu).
- Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal
control driver (Rafael Wysocki)"
* tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
thermal/of: Initialize trip points separately
thermal/of: Use thermal trips stored in the thermal zone
thermal/core: Add thermal_trip in thermal_zone
thermal/core: Rename 'trips' to 'num_trips'
thermal/core: Move thermal_set_delay_jiffies to static
thermal/core: Remove unneeded EXPORT_SYMBOLS
thermal/of: Move thermal_trip structure to thermal.h
thermal/of: Remove the device node pointer for thermal_trip
thermal/of: Replace device node match with device node search
thermal/core: Remove duplicate information when an error occurs
thermal/core: Avoid calling ->get_trip_temp() unnecessarily
thermal/tools/tmon: Fix typo 'the the' in comment
thermal/tools/tmon: Include pthread and time headers in tmon.h
thermal/ti-soc-thermal: Fix comment typo
thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
thermal/drivers/u8500: Remove unnecessary print function dev_err()
thermal/drivers/rzg2l: Fix comments
thermal/drivers/sun8i: Fix typo in comment
thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
...
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq policy
which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
8t6ObMxgMhWT
=XeLV
-----END PGP SIGNATURE-----
Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly minor improvements all over including new CPU IDs for
the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
updates, documentation cleanups and a new version of the pm-graph
suite of utilities.
Specifics:
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq
policy which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
function parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello)"
* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
PM: QoS: Add check to make sure CPU freq is non-negative
PM: hibernate: defer device probing when resuming from hibernation
intel_idle: make SPR C1 and C1E be independent
cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
cpufreq: loongson2: fix Kconfig "its" grammar
pm-graph v5.9
cpufreq: Warn users while freeing active policy
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
PM: domains: Ensure genpd_debugfs_dir exists before remove
PM: runtime: Extend support for wakeirq for force_suspend|resume
...
core:
- Make wait_event_hrtimeout() ware of RT/DL tasks
drivers:
- New driver for the R-Car Gen4 timer
- New driver for the Tegra186 timer
- New driver for the Mediatek MT6795 CPUXGPT timer
- Rework suspend/resume handling in timer drivers so it
takes inactive clocks into account.
- The usual device tree compatible add ons
- Small fixed and cleanups all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmLn5vUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoV5HEACj2xGuGCnMBkOddhwAhSh8dsMP0+Sb
gRoC6urvlAxDvehblZJ7BD1o9yZqHEguoPeJLRgPVzSDGADWppzj3N0lGtJnELcQ
XQ93SQYBxpy3eJP7AzkUxdy/pvleIAfmje6VlbKnzNEtM/csXS+kJuL9HSgOFDBr
NWVtTtkQCgFmkI6bRoQqq3fU0mYkHnK+KwBrXoX3QJntwEZ7WM0YySE+U1RbeYuh
bo6K2x+gllJXD7wF4X2quQ7PBbiRLw7/2aEeO10mcSPn1sl/efMg3i4a4cmck0ce
zrjQWC1s0hv4PRJRbEKN9wiwrgYyyYaUeWPWDO+6iF69e2i5qu6wRHJM8tyeEm9R
olEt8o6Go0Yw4KZfOeVgK87vsIV50GDD6b/D2ingY5OjiJT6glwmO6yMQ9USa3B6
BUfSLt1LG/AMJ2kWkOKN4AckaUaSh/IZ150ZjFA7hEy1RW9+XCsf0q0IMgEh43xL
D6W+IHdtOZnVMHztxT9H0hzmUnCrj7vcsyrxagEo9rMeiFhCK/5OCh9HENZYsrNg
zElnG++WwP5i9pOzC3Sl5mj/4nYVj5LNpmt4XZGp+7A3+UAfeFvgq7It0b74M6rW
SfyC6D9X4uL0j2VVLRxV0o9Kqnqecc0S50NrPOkXTVPn2pYjRfqVBp0gj5T3knBO
0Vvrwy8Owc1iAA==
=kPNA
-----END PGP SIGNATURE-----
Merge tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Timers, timekeeping and related drivers update:
Core:
- Make wait_event_hrtimeout() aware of RT/DL tasks
New drivers:
- R-Car Gen4 timer
- Tegra186 timer
- Mediatek MT6795 CPUXGPT timer
Updates:
- Rework suspend/resume handling in timer drivers so it
takes inactive clocks into account.
- The usual device tree compatible add ons
- Small fixed and cleanups all over the place"
* tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
wait: Fix __wait_event_hrtimeout for RT/DL tasks
clocksource/drivers/sun5i: Remove unnecessary (void*) conversions
dt-bindings: timer: allwinner,sun4i-a10-timer: Add D1 compatible
dt-bindings: timer: ingenic,tcu: use absolute path to other schema
clocksource/drivers/sun4i: Remove unnecessary (void*) conversions
dt-bindings: timer: renesas,cmt: Fix R-Car Gen4 fall-out
clocksource/drivers/tegra186: Put Kconfig option 'tristate' to 'bool'
clocksource/drivers/timer-ti-dm: Make driver selection bool for TI K3
clocksource/drivers/timer-ti-dm: Add compatible for am6 SoCs
clocksource/drivers/timer-ti-dm: Make timer selectable for ARCH_K3
clocksource/drivers/timer-ti-dm: Move inline functions to driver for am6
clocksource/drivers/sh_cmt: Add R-Car Gen4 support
dt-bindings: timer: renesas,cmt: R-Car V3U is R-Car Gen4
dt-bindings: timer: renesas,cmt: Add r8a779f0 and generic Gen4 CMT support
clocksource/drivers/timer-microchip-pit64b: Fix compilation warnings
clocksource/drivers/timer-microchip-pit64b: Use mchp_pit64b_{suspend, resume}
clocksource/drivers/timer-microchip-pit64b: Remove suspend/resume ops for ce
thermal/drivers/rcar_gen3_thermal: Add r8a779f0 support
clocksource/drivers/timer-mediatek: Implement CPUXGPT timers
dt-bindings: timer: mediatek: Add CPUX System Timer and MT6795 compatible
...
Load-balancing improvements:
============================
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness metrics (PELT),
and remove the conservative threshold requiring 6% energy savings to
migrate a task. Doing this improves power efficiency for most workloads,
and also increases the reliability of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time searching
for an idle CPU on overloaded systems. There's reports of several
milliseconds spent there on large systems with large workloads ...
[ Since the search logic changed, there might be behavioral side effects. ]
- Improve NUMA imbalance behavior. On certain systems
with spare capacity, initial placement of tasks is non-deterministic,
and such an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which benefit
from the artificial locality of the placement bug(s). Mel Gorman's
conclusion, with which we concur, was that consistency is better than
random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think it's
better to be consistent and use similar logic at both fork time
and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in sched_core_update_cookie() that
caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs for newly
woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup latencies.
ABI improvements/fixes:
=======================
- Do not check capabilities and do not issue capability check denial messages
when a scheduler syscall doesn't require privileges. (Such as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
==============
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
======================
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not being
re-enabled when the last SCHED_RT task is gone from a runqueue but there's
still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn2ywRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iNfxAAhPJMwM4tYCpIM6PhmxKiHl6kkiT2tt42
HhEmiJVLjczLybWaWwmGA2dSFkv1f4+hG7nqdZTm9QYn0Pqat2UTSRcwoKQc+gpB
x85Hwt2IUmnUman52fRl5r1miH9LTdCI6agWaFLQae5ds1XmOugFo52t2ahax+Gn
dB8LxS2fa/GrKj229EhkJSPWAK4Y94asoTProwpKLuKEeXhDkqUNrOWbKhz+wEnA
pVZySpA9uEOdNLVSr1s0VB6mZoh5/z6yQefj5YSNntsG71XWo9jxKCIm5buVdk2U
wjdn6UzoTThOy/5Ygm64eYRexMHG71UamF1JYUdmvDeUJZ5fhG6RD0FECUQNVcJB
Msu2fce6u1AV0giZGYtiooLGSawB/+e6MoDkjTl8guFHi/peve9CezKX1ZgDWPfE
eGn+EbYkUS9RMafXCKuEUBAC1UUqAavGN9sGGN1ufyR4za6ogZplOqAFKtTRTGnT
/Ne3fHTtvv73DLGW9ohO5vSS2Rp7zhAhB6FunhibhxCWlt7W6hA4Ze2vU9hf78Yn
SJDLAJjOEilLaKUkRG/d9uM3FjKJM1tqxuT76+sUbM0MNxdyiKcviQlP1b8oq5Um
xE1KNZUevnr/WXqOTGDKHH/HNPFgwxbwavMiP7dNFn8h/hEk4t9dkf5siDmVHtn4
nzDVOob1LgE=
=xr2b
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Load-balancing improvements:
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness
metrics (PELT), and remove the conservative threshold requiring 6%
energy savings to migrate a task. Doing this improves power
efficiency for most workloads, and also increases the reliability
of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time
searching for an idle CPU on overloaded systems. There's reports of
several milliseconds spent there on large systems with large
workloads ...
[ Since the search logic changed, there might be behavioral side
effects. ]
- Improve NUMA imbalance behavior. On certain systems with spare
capacity, initial placement of tasks is non-deterministic, and such
an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which
benefit from the artificial locality of the placement bug(s). Mel
Gorman's conclusion, with which we concur, was that consistency is
better than random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think
it's better to be consistent and use similar logic at both fork
time and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in
sched_core_update_cookie() that caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs
for newly woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup
latencies.
ABI improvements/fixes:
- Do not check capabilities and do not issue capability check denial
messages when a scheduler syscall doesn't require privileges. (Such
as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the
previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not
being re-enabled when the last SCHED_RT task is gone from a
runqueue but there's still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes"
* tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
rseq: Kill process when unknown flags are encountered in ABI structures
rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags
sched/core: Fix the bug that task won't enqueue into core tree when update cookie
nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
sched/core: Always flush pending blk_plug
sched/fair: fix case with reduced capacity CPU
sched/core: Use try_cmpxchg in set_nr_{and_not,if}_polling
sched/core: add forced idle accounting for cgroups
sched/fair: Remove the energy margin in feec()
sched/fair: Remove task_util from effective utilization in feec()
sched/fair: Use the same cpumask per-PD throughout find_energy_efficient_cpu()
sched/fair: Rename select_idle_mask to select_rq_mask
sched, drivers: Remove max param from effective_cpu_util()/sched_cpu_util()
sched/fair: Decay task PELT values during wakeup migration
sched/fair: Provide u64 read for 32-bits arch helper
sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
sched: only perform capability check on privileged operation
sched: Remove unused function group_first_cpu()
sched/fair: Remove redundant word " *"
selftests/rseq: check if libc rseq support is registered
...
global variable, fix comments and rework the trace information
(Lukasz Luba)
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang)
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski)
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano)
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof)
- Fix array underflow in prep_lookup_table (Dan Carpenter)
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun)
- Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)
- Fix typos in comments on rzg2l (Biju Das)
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li)
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov)
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer)
- Fix typo in the comments for the 'tmon' tool (Slark Xiao)
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmLkDEgACgkQqDIjiipP
6E9PPAf/fZRYgzqgv68lYy2hnJBZEha7z76KyKKxbPATy65VQHzHBqWyPgOnZWx8
xm26tlDJMFEGql/Sy5QetvnFdDqvY33Q0FBhDbmCdCp7vxxirDNKxXhGnxUggCIt
PrloMzC9zjgdNaFTclf/ceCFNwHPnY8l5kxGHhVDn/l5vvGFB869HKMT+13FMCQM
cKVNZY0F3BgmY0ouAMbXT2jwNm/FIYfXC9CFaQo9XhiTAvqU1h4BI08S8JdXsve0
VVBi8MB0sBolWIQ/GVlC1IWj1FhxgMfvcfZAOlyia7I4kQz7K5wAHxiHnhA+GHsZ
NdxVeGTIdmjIInvRxnsT7yR2HcitkA==
=sDfh
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control changes for 5.20-rc1 from Daniel Lezcano:
"- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba)
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang)
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski)
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano)
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof)
- Fix array underflow in prep_lookup_table (Dan Carpenter)
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun)
- Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)
- Fix typos in comments on rzg2l (Biju Das)
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li)
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov)
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer)
- Fix typo in the comments for the 'tmon' tool (Slark Xiao)
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano)"
* tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (36 commits)
thermal/of: Initialize trip points separately
thermal/of: Use thermal trips stored in the thermal zone
thermal/core: Add thermal_trip in thermal_zone
thermal/core: Rename 'trips' to 'num_trips'
thermal/core: Move thermal_set_delay_jiffies to static
thermal/core: Remove unneeded EXPORT_SYMBOLS
thermal/of: Move thermal_trip structure to thermal.h
thermal/of: Remove the device node pointer for thermal_trip
thermal/of: Replace device node match with device node search
thermal/core: Remove duplicate information when an error occurs
thermal/core: Avoid calling ->get_trip_temp() unnecessarily
thermal/tools/tmon: Fix typo 'the the' in comment
thermal/tools/tmon: Include pthread and time headers in tmon.h
thermal/ti-soc-thermal: Fix comment typo
thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
thermal/drivers/u8500: Remove unnecessary print function dev_err()
thermal/drivers/rzg2l: Fix comments
thermal/drivers/sun8i: Fix typo in comment
thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
...
Merge changes that make the thermal core use ida_alloc()/free()
directly instead of ida_simple_get()/ida_simple_remove() that have been
deprecated.
* thermal-core:
thermal: Directly use ida_alloc()/free()
Self contain the trip initialization from the device tree in a single
function for the sake of making the code flow more clear.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-11-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Now that we have the thermal trip stored in the thermal zone in a
generic way, we can rely on them and remove one indirection we found
in the thermal_of code and do one more step forward the removal of the
duplicated structures.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-10-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal trip points are properties of a thermal zone and the
different sub systems should be able to save them in the thermal zone
structure instead of having their own definition.
Give the opportunity to the drivers to create a thermal zone with
thermal trips which will be accessible directly from the thermal core
framework.
As we added the thermal trip points structure in the thermal zone,
let's extend the thermal zone register function to have the thermal
trip structures as a parameter and store it in the 'trips' field of
the thermal zone structure.
The thermal zone contains the trip point, we can store them directly
when registering the thermal zone. That will allow another step
forward to remove the duplicate thermal zone structure we find in the
thermal_of code.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-9-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In order to use thermal trips defined in the thermal structure, rename
the 'trips' field to 'num_trips' to have the 'trips' field containing the
thermal trip points.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-8-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The function 'thermal_set_delay_jiffies' is only used in
thermal_core.c but it is defined and implemented in a separate
file. Move the function to thermal_core.c and make it static.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-7-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Different functions are exporting the symbols but are actually only
used by the thermal framework internals. Remove these EXPORT_SYMBOLS.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-6-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The structure thermal_trip is now generic and will be usable by the
different sensor drivers in place of their own structure.
Move its definition to thermal.h to make it accessible.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-5-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The device node pointer is no longer needed in the thermal trip
structure, remove it.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-4-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal_of code builds a trip array associated with the node
pointer in order to compare the trip point phandle with the list.
The thermal trip is a thermal zone property and should be moved
there. If some sensors have hardcoded trip points, they should use the
exported structure instead of redefining again and again their own
structure and data to describe exactly the same things.
In order to move this to the thermal.h header and allow more cleanup,
we need to remove the node pointer from the structure.
Instead of building storing the device node, we search directly in the
device tree the corresponding node. That results in a simplification
of the code and allows to move the structure to thermal.h
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Link: https://lore.kernel.org/r/20220722200007.1839356-3-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The pr_err already tells it is an error, it is pointless to add the
'Error:' string in the messages. Remove them.
Cc: Alexandre Bailon <abailon@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linexp.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220722200007.1839356-2-daniel.lezcano@linexp.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
As the trip temperature is already available when calling the function
handle_critical_trips(), pass it as a parameter instead of having this
function calling the ops again to retrieve the same data.
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220718145038.1114379-2-daniel.lezcano@linaro.org
Register thermal zones as hwmon sensors to let userspace read
temperatures using standard hwmon interface.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20220719054940.755907-2-dmitry.baryshkov@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Register thermal zones as hwmon sensors to let userspace read
temperatures using standard hwmon interface.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20220719054940.755907-1-dmitry.baryshkov@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The print function dev_err() is redundant because platform_get_irq()
already prints an error.
Eliminate the follow coccicheck warnings:
./drivers/thermal/db8500_thermal.c:162:2-9: line 162 is redundant because platform_get_irq() already prints an error
./drivers/thermal/db8500_thermal.c:176:2-9: line 176 is redundant because platform_get_irq() already prints an error
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220719003556.74460-1-yang.lee@linux.alibaba.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
This patch replaces 'Capture times'->'Total number of ADC data samples' as
the former does not really explain much.
It also fixes the typo
* caliberation->calibration
Lastly, as per the coding style /* should be on a separate line.
This patch fixes this issue.
Reported-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/r/20220718121440.556408-1-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220521111145.81697-36-Julia.Lawall@inria.fr
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Fix sparse warnings:
drivers/thermal/k3_j72xx_bandgap.c:532:36: sparse: sparse: symbol 'k3_j72xx_bandgap_j721e_data' was not declared. Should it be static?
drivers/thermal/k3_j72xx_bandgap.c:536:36: sparse: sparse: symbol 'k3_j72xx_bandgap_j7200_data' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jin Xiaoyun <jinxiaoyun2@huawei.com>
Link: https://lore.kernel.org/r/20220613063111.654893-1-jinxiaoyun2@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
This while loop exits with "i" set to -1 and so then it sets:
derived_table[-1] = derived_table[0] - 300;
There is no need for this assignment at all. Just delete it.
Fixes: 72b3fc61c752 ("thermal: k3_j72xx_bandgap: Add the bandgap driver support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YoetjwcOEzYEFp9b@kili
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
If an error occurs in the k3_j72xx_bandgap_probe() function the memory
allocated to the 'ref_table' will not be released.
Add a err_free_ref_table step to the error path to free 'ref_table'
Fixes: 72b3fc61c752 ("thermal: k3_j72xx_bandgap: Add the bandgap driver support")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bryan Brattlof <bb@ti.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20220525213617.30002-1-bb@ti.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The trends DROP_FULL and RAISE_FULL are not used and were never used
in the past AFAICT. Remove these conditions as they seems to not be
handled anywhere.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220629151012.3115773-2-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The code is actually clampling the next cooling device state using the
lowest and highest states of the thermal instance.
That code can be replaced by the clamp() macro which does exactly the
same. It results in a simpler routine to read.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220629151012.3115773-1-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The get_trend function relies on the interrupt to set the raising or
dropping trend. However the interpolated temperature is already giving
the temperature information to the thermal framework which is able to
deduce the trend.
Remove the trend code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220616202537.303655-3-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The get_trend function does already what the generic framework does.
Remove it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/r/20220616202537.303655-2-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
There is a get_trend function which is a wrapper to call a private
get_trend function. However, this private get_trend function is not
assigned anywhere.
Remove this dead code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Amit Kucheria <amitk@kernel.org>
Link: https://lore.kernel.org/r/20220616202537.303655-1-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
When setting up a new board, a plain "Can't register thermal zone"
didn't help me much because the thermal zones in DT were all fine. I
just had a sensor entry too much in the parent TSC node. Reword the
failure/success messages to contain the sensor number to make it easier
to understand which sensor is affected. Example output now:
rcar_gen3_thermal e6198000.thermal: Sensor 0: Loaded 1 trip points
rcar_gen3_thermal e6198000.thermal: Sensor 1: Loaded 1 trip points
rcar_gen3_thermal e6198000.thermal: Sensor 2: Loaded 1 trip points
rcar_gen3_thermal e6198000.thermal: Sensor 3: Can't register thermal zone
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20220610200500.6727-1-wsa+renesas@sang-engineering.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Remove unneeded global variable devfreq_cooling_ops which is used only
as a copy pattern. Instead, extend the struct devfreq_cooling_device with
the needed ops structure. This also simplifies the allocation/free code
during the setup/cleanup.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220613124327.30766-5-lukasz.luba@arm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The code has moved and left some comments stale. Update them where
there is a need.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220613124327.30766-4-lukasz.luba@arm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Simplify the thermal_power_cpu_get_power trace event by removing
complicated cpumask and variable length array. Now the tools parsing trace
output don't have to hassle to get this power data. The simplified format
version uses 'policy->cpu'. Remove also the 'load' information completely
since there is very little value of it in this trace event. To get the
CPUs' load (or utilization) there are other dedicated trace hooks in the
kernel. This patch also simplifies and speeds-up the main cooling code
when that trace event is enabled.
Rename the trace event to avoid confusion of tools which parse the trace
file.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20220613124327.30766-3-lukasz.luba@arm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
It is very unlikely that one CPU cluster would have the EM and some other
won't have it (because EM registration failed or DT lacks needed entry).
Although, we should avoid modifying global variable with callbacks anyway.
Redesign this and add safety for such situation.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20220613124327.30766-2-lukasz.luba@arm.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Walleij)
- Fix grammar typo in the ARM global timer Kconfig option (Randy
Dunlap)
- Add the tegra186 timer and use it on the tegra234 board (Thierry
Reding)
- Add the 'CPUXGPT' CPU timer for Mediatek MT6795 and implement a
workaround to overcome an ATF bug where the timer is not correctly
initialized (AngeloGioacchino Del Regno)
- Rework the suspend/resume approach to enable the feature on the
timer even it is not an active clock and fix a compilation warning
(Claudiu Beznea)
- Add the Add R-Car Gen4 timer support along with the DT bindings
(Wolfram Sang)
- Add compatible for ti,am654-timer to support AM6 SoC (Tony Lindgren)
- Fix Kconfig option to put it back to 'bool' instead of 'tristate'
for the tegra186 (Daniel Lezcano)
- Sort 'family,type' DT bindings for the Renesas timers (Geert
Uytterhoeven)
- Add compatible 'allwinner,sun20i-d1-timer' for Allwinner D1 (Samuel
Holland)
- Remove unnecessary (void*) conversions for sun4i (XU pengfei)
- Remove unnecessary (void*) conversions for sun5i (Li zeming)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmLhWF0ACgkQqDIjiipP
6E+ziAgArY8BJ/09w4rt63LDO5nynorr0+Uuf0E9hCmc0KzRsSF3hO39EcuLj1zD
ubKKSXwaOucAcRYj7gCeqaSk/IOs9EmwESRVtLLIwEwYsg44iC1vmq4pJ8cX66qN
h5TqkOiq77fPIY3BR/nSFxddmlU2farO8oSUqdLNsVrqoIq2rttaQYADdYGa4LYn
dvQHEmh6jvJRoZM3aR6It0EWM7pyHKxAWFChWv+SbikheDCgIcugz+FUTy9lyhmf
39wLm2srmrCLbjaGdJnyc38/y/UPIJ0aqeSf7XSiHTWQ+dfNBfdbrmPTSJHiyIt+
iPrlx0ShvrMXeOxgbSxse53e+9Q88w==
=ApcN
-----END PGP SIGNATURE-----
Merge tag 'timers-v5.20-rc1' of https://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull clockevent/source updates from Daniel Lezcano:
- Add the missing DT bindings for the MTU nomadik timer (Linus
Walleij)
- Fix grammar typo in the ARM global timer Kconfig option (Randy
Dunlap)
- Add the tegra186 timer and use it on the tegra234 board (Thierry
Reding)
- Add the 'CPUXGPT' CPU timer for Mediatek MT6795 and implement a
workaround to overcome an ATF bug where the timer is not correctly
initialized (AngeloGioacchino Del Regno)
- Rework the suspend/resume approach to enable the feature on the
timer even it is not an active clock and fix a compilation warning
(Claudiu Beznea)
- Add the Add R-Car Gen4 timer support along with the DT bindings
(Wolfram Sang)
- Add compatible for ti,am654-timer to support AM6 SoC (Tony Lindgren)
- Fix Kconfig option to put it back to 'bool' instead of 'tristate'
for the tegra186 (Daniel Lezcano)
- Sort 'family,type' DT bindings for the Renesas timers (Geert
Uytterhoeven)
- Add compatible 'allwinner,sun20i-d1-timer' for Allwinner D1 (Samuel
Holland)
- Remove unnecessary (void*) conversions for sun4i (XU pengfei)
- Remove unnecessary (void*) conversions for sun5i (Li zeming)
Link: https://lore.kernel.org/all/7472984e-f502-5f27-82bf-070127dd85a5@linaro.org
If ACPI_FADT_LOW_POWER_S0 is not set, this doesn't mean that low-power
S0 idle is not usable. It merely means that using S3 on the given
system is more beneficial from the energy saving perspective than using
low-power S0 idle, as long as S3 is supported.
Suspend-to-idle is still a valid suspend mode if ACPI_FADT_LOW_POWER_S0
is not set and the pm_suspend_via_firmware() check in pch_wpt_suspend()
is sufficient to distinguish suspend-to-idle from S3, so drop the
confusing ACPI_FADT_LOW_POWER_S0 check.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
The milli-Watts precision causes rounding errors while calculating
efficiency cost for each OPP. This is especially visible in the 'simple'
Energy Model (EM), where the power for each OPP is provided from OPP
framework. This can cause some OPPs to be marked inefficient, while
using micro-Watts precision that might not happen.
Update all EM users which access 'power' field and assume the value is
in milli-Watts.
Solve also an issue with potential overflow in calculation of energy
estimation on 32bit machine. It's needed now since the power value
(thus the 'cost' as well) are higher.
Example calculation which shows the rounding error and impact:
power = 'dyn-power-coeff' * volt_mV * volt_mV * freq_MHz
power_a_uW = (100 * 600mW * 600mW * 500MHz) / 10^6 = 18000
power_a_mW = (100 * 600mW * 600mW * 500MHz) / 10^9 = 18
power_b_uW = (100 * 605mW * 605mW * 600MHz) / 10^6 = 21961
power_b_mW = (100 * 605mW * 605mW * 600MHz) / 10^9 = 21
max_freq = 2000MHz
cost_a_mW = 18 * 2000MHz/500MHz = 72
cost_a_uW = 18000 * 2000MHz/500MHz = 72000
cost_b_mW = 21 * 2000MHz/600MHz = 70 // <- artificially better
cost_b_uW = 21961 * 2000MHz/600MHz = 73203
The 'cost_b_mW' (which is based on old milli-Watts) is misleadingly
better that the 'cost_b_uW' (this patch uses micro-Watts) and such
would have impact on the 'inefficient OPPs' information in the Cpufreq
framework. This patch set removes the rounding issue.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is an unexpected word 'is' in a comments that need to be dropped
file: ./drivers/thermal/intel/x86_pkg_temp_thermal.c
line: 108
* tj-max is is interesting because threshold is set relative to this
changed to:
* tj-max is interesting because threshold is set relative to this
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new CPU ID to the list of supported processors in the
intel_tcc_cooling driver (Sumeet Pawnikar).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmK/TzESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxwPcQALCM2DfwhD+6lvO2bc0e8gAVYRPwekfa
dZYK6e+b2HBnekzGOpOCSF58qB8jd4hxk2ODJ7RXUSkvtNfLXVDTotZ74Ggr0WN3
JSoSW4OTpds844HiMzaal1WOxc6DEWydCPliSryqyPYLjyxMJjFd/sfsSJRSANs1
q7ySDj+qdSNzSrsFkJ5Ia+W687lBFG3AJBOQ65Fuh8sfJDVsPOcyaWKYcdXUYZVn
bPNm7ll9EWYUYE8lAgWHTgH/keIYTMnyJbfcmqLXfp0jiYAfyKm4559Rf0YiCSX9
Hix5cvhlLBK7exHE7IUBY8k5pC1UxhIWUBlWf96vxGz3XY4hMgXaWUp6w/MOFjqa
ht3aobhvtPv6a3rzCpH3pZTLHuoILwR9pH7kq5aE562K687m2T9vsP+Wvn/5Oj7d
OmThKDuqeWq6FuVsTvBvBFHRtVZ+olUPx0R9aKhFtFDH1bxjwRXX0uEHQu12M8Ww
QwgRmkSSJeCd+iBtSjMXorCS5vjztCQDPZgvD8xH3sg0G5f9Ha/QHJbbpDUv2TaC
P6nghyqeFla8EA21Xg973UxJaajfnal2/FIDmBvoyEhLoQ5M5hs06QjIaY3HA9Js
zNI/4qjLfTR7HZoCy4CDttSA0pgdCHRfNY673QfN6F7sO5kZn9FGJPp7F/qsruO6
3f5kh/HcmQpy
=F4yU
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Add a new CPU ID to the list of supported processors in the
intel_tcc_cooling driver (Sumeet Pawnikar)"
* tag 'thermal-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel_tcc_cooling: Add TCC cooling support for RaptorLake
Add RaptorLake to the list of processor models supported by the Intel
TCC cooling driver.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
effective_cpu_util() already has a `int cpu' parameter which allows to
retrieve the CPU capacity scale factor (or maximum CPU capacity) inside
this function via an arch_scale_cpu_capacity(cpu).
A lot of code calling effective_cpu_util() (or the shim
sched_cpu_util()) needs the maximum CPU capacity, i.e. it will call
arch_scale_cpu_capacity() already.
But not having to pass it into effective_cpu_util() will make the EAS
wake-up code easier, especially when the maximum CPU capacity reduced
by the thermal pressure is passed through the EAS wake-up functions.
Due to the asymmetric CPU capacity support of arm/arm64 architectures,
arch_scale_cpu_capacity(int cpu) is a per-CPU variable read access via
per_cpu(cpu_scale, cpu) on such a system.
On all other architectures it is a a compile-time constant
(SCHED_CAPACITY_SCALE).
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/20220621090414.433602-4-vdonnefort@google.com
Use ida_alloc()/ida_free() instead of deprecated
ida_simple_get()/ida_simple_remove() as recommended.
Signed-off-by: keliu <liuke94@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Based on the normalized pattern:
this program is free software you can redistribute it and/or modify it
under the terms of the gnu general public license version 2 as
published by the free software foundation this program is distributed
as is without any warranty of any kind whether express or implied
without even the implied warranty of merchantability or fitness for a
particular purpose see the gnu general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference.
Reviewed-by: Allison Randal <allison@lohutok.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add Meteor Lake PCI device ID to the int340x thermal control
driver (Sumeet Pawnikar).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKU8b0SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx5iYP/2XlliK93GruPtohHKfvBXA7GWBunDqe
2OWWGXNKOojZxkpVr+ek96MBjjmnqxMt9cXhqCQMyEYJgp1EePdDc3ixt/p8WDC5
oxSYXhNXe5dXz/GHrwUTp5xQdkOvJNs1PqPQmPCdsUWhNMiGJ1wN0v6HRb4qoTce
/4/zCj4LHk71fYh+A4zma7dY1SnE5RG7JlfLe1TIPHMEjx6QOeoywroqjifn6jZ2
9AjSl3crAz6tP72ng/QL3bG6j8p6CKbT6xmBD47SrHOVbwkDe9ZdTD7m1H4kyU2c
sFwGUix5HJK+LR4zWmeYG8kXzbNScaDIsyyxdRCyB+kOl8IrvCEY++gBW+3ALHWp
HLqMN3lzEOi9VO0hYmmbWMbtwndjXqtLKauU9e5WMjW3dKHkslH1seAlY8H3vmal
czu+jZZMlu3XTMAPo9JD4ycBw/pNdk1eLi0KSerSsuHWOz0vK/cdPL4ew4undolz
Y2AwkGmTO7dyFmhG5jqIlYpeUD7QgujARmdxGhDTWlx6eZ7YiU7bFM2jom+dN+fU
HvRPSleGVs1vpCwyqQsXqyf5I8t7AEh/inHok3YNetGe7Su0RgJvPNgbns3qurHy
MGv8WKHAKHW0VTRZfBrPr8ko/b6DzhlBQEyvAMqQN6vjvznKW8ckZpmVDv+cnN/V
quX/Qo+UHuD7
=lIzD
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull additional thermal control update from Rafael Wysocki:
"Add Meteor Lake PCI device ID to the int340x thermal control driver
(Sumeet Pawnikar)"
* tag 'thermal-5.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Add Meteor Lake PCI device ID
Add Meteor Lake PCI ID for processor thermal device.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register() when
device_register() fails by calling thermal_cooling_device_destroy_sysfs()
(Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params (Corentin
Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver and
document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3MESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxgzsQAKj4D0ROHLdqReYeIutS7IU6YQ+ejh3N
krcAZvxQd8B3sVwSHHABPdsHwwfYjP4BwoHHVtha5q+5zFGBq5F3fJHsDeRxN23T
NEOW81HBaLu81N5gNuAYg+NQcdqBR4U4IGKZfxWRyx13OMECXGtJRb/SIW/TuDYk
AQIrqOivVEsVRtn+qp+x0rKHsj6ge+2lU1OyJagr2BDhgLrwzxl6cmNBfali1C3D
sw4SqGwGVjEB0QDDpMbty9BsLEjNRE86kwoPh2u0K9dT9/b/P9pk1lp+pOnltspZ
eOdwr0CtoEHw2x5tuhbO7fmvNIuf5jgSDWsCrP/xYnsozcRiwWsgyeJj4f5soreN
kTJB8S/FH+fd7UuYqdmDIeJpDnkkGt1jIqjkDvfJNL7ffBWIe/meXKF4vBnrdlbd
FMNQwzgc5eug07IFjOE43V1v5Hw1H1leVUwZczOMGNIhqyy0WZyQr7vzwbr16jCO
x0hu3q3Y++5SUAUy9WzsOOrpRHa9JxUwVhZuLG7ajf+6pqPxB8kqs9CJe+sgWKju
T4KqHbljJ2Gnkm9SngT0BdnB+AgparhXf8/8Mj0ZBQvamXw+ylNbYfG7qTZk5Ne/
rUf4YDew6hGDqikyJBe2e0a1lLSoN4zaADC1lLYhC+Zp7m6eK0r3y3mhic0bZe1c
RU/2eeA4kTuZ
=3ltV
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add a thermal library and thermal tools to wrap the netlink
interface into event-based callbacks, improve overheat condition
handling during suspend-to-idle on Intel SoCs, add some new hardware
support, fix bugs and clean up code.
Specifics:
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register()
when device_register() fails by calling
thermal_cooling_device_destroy_sysfs() (Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params
(Corentin Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver
and document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai)"
* tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
thermal: intel: pch: improve the cooling delay log
thermal: intel: pch: enhance overheat handling
thermal: intel: pch: move cooling delay to suspend_noirq phase
PM: wakeup: expose pm_wakeup_pending to modules
thermal: k3_j72xx_bandgap: Add the bandgap driver support
dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
thermal/core: Fix memory leak in __thermal_cooling_device_register()
dt-bindings: thermal: tsens: Add sc8280xp compatible
dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
thermal/drivers/qcom/lmh: Add sc8180x compatible
thermal/drivers/rz2gl: Fix OTP Calibration Register values
dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
thermal: thermal_of: fix typo on __thermal_bind_params
tools/thermal: remove unneeded semicolon
tools/lib/thermal: remove unneeded semicolon
thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
tools/thermal: Add thermal daemon skeleton
tools/thermal: Add a temperature capture tool
tools/thermal: Add util library
...
- Update the Energy Model support code to allow the Energy Model to be
artificial, which means that the power values may not be on a uniform
scale with other devices providing power information, and update the
cpufreq_cooling and devfreq_cooling thermal drivers to support
artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown,
Dan Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen
Yu).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3hsSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxW4oP/RzMh6dclWXs3J/gUCKTqRepq6cb80tq
Q2r9xRRHwy6ZH/PVddGDHmhQ7d3NAv13s4srA9kznZognF3hzuxnGau226ilDqHh
qxVSBRjWY9ijxRBvkcCaa6HZm4Chb91pUX0CLpdYSl9BTgIdk66HZYaMsKhHU/di
j7KKHPdKyyQkssWnMjGEyuaF+UebiEgISCF3+X0eb6c1m7GHXpgLJVxNy0pKkUdK
j+n6+ms12OlVLtg1eIl0J5824w/rkK3ZdqfEXJSq++mNMqSj/KCI3yWpzsLKp9AB
xxhox/tPgJVyON8Vtbb2IkWkiQUKeSrAGIUYXWmnwIZYLPSGD7BPzr82Cxr7S/ez
imMB+1Qd3SsOQ9EdI9rGYgNsEF2vOs1xjMehSdUdmTz148IzBOBt4YyQeb/mfXqH
nh9eVuFCzqH1lAayYt6iP1+V5gQn9as/+rR91k4k4A6OKXomuQUGORLeHfuKMfNH
eBZ72tdXqiq6z+ag3lY3pBAMSm11epCOa3VR6QNaC7hrlY3AZP+o3tIUL6W813b+
V3l1gWApGHZE1hiDM95dll/dIt9IZpTRd3dlqF/YnFW7fPDrz71EGvhrZpO7vdO0
/G6eJcCDjqJVcbCE8Y77I6/AXjpVQ7PRPeNx6aW7jPcQhpVIgcsF2BGjk9anjXDs
3yHJs9R/HMmA
=Hewm
-----END PGP SIGNATURE-----
Merge tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for 'artificial' Energy Models in which power
numbers for different entities may be in different scales, add support
for some new hardware, fix bugs and clean up code in multiple places.
Specifics:
- Update the Energy Model support code to allow the Energy Model to
be artificial, which means that the power values may not be on a
uniform scale with other devices providing power information, and
update the cpufreq_cooling and devfreq_cooling thermal drivers to
support artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz
Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang
Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe
Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown, Dan
Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen Yu)"
* tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (94 commits)
PM: domains: Trust domain-idle-states from DT to be correct by genpd
PM: domains: Measure power-on/off latencies in genpd based on a governor
PM: domains: Allocate governor data dynamically based on a genpd governor
PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
PM: domains: Fix initialization of genpd's next_wakeup
PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
PM: domains: Measure suspend/resume latencies in genpd based on governor
PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
PM: domains: Allocate gpd_timing_data dynamically based on governor
PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
PM: domains: Drop redundant code for genpd always-on governor
PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
powercap: intel_rapl: remove redundant store to value after multiply
cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
cpufreq: CPPC: Enable fast_switch
ACPI: CPPC: Assume no transition latency if no PCCT
ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
ACPI: CPPC: Check _OSC for flexible address space
...
- New drivers
- Driver for the Microchip LAN966x SoC
- PMBus driver for Infineon Digital Multi-phase xdp152 family controllers
- Chip support added to existing drivers
- asus-ec-sensors
- Support for ROG STRIX X570-E GAMING WIFI II, PRIME X470-PRO,
and ProArt X570 Creator WIFI
- External temperature sensor support for ASUS WS X570-ACE
- nct6775
- Support for I2C driver
- Support for ASUS PRO H410T / PRIME H410M-R / ROG X570-E GAMING WIFI II
- lm75
- Support for - Atmel AT30TS74
- pmbus/max16601
- Support for MAX16602
- aquacomputer_d5next
- Support for Aquacomputer Farbwerk
- Support for Aquacomputer Octo
- jc42
- Support for S-34TS04A
- Kernel API changes / clarifications
- The chip parameter of with_info API is now mandatory
- New hwmon_device_register_for_thermal API call for use by the thermal
subsystem
- Improvements
- PMBus and JC42 drivers now register with thermal subsystem
- PMBus drivers now support get_voltage/set_voltage power operations
- The adt7475 driver now supports pin configuration
- The lm90 driver now supports setting extended range temperatures
configuration with a devicetree property
- The dell-smm driver now registers as cooling device
- The OCC driver delays hwmon registration until requested by userspace
- Various other minor fixes and improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAmKKpskACgkQyx8mb86f
mYHmkw/+IsOgkaSwA0PMBSQvPdncDcywchhtJ20UP3aKogy9Lp4HZ9NBRPZKeL7Y
r89LSi3OT27yn+NQ7JXGIA7VLnqftoHREkyq3khJDwqRCMv0/bTxEYuO04Hdte1n
4QrLth4yMfG5domgQn/M1KyS40jsMLPLMg0ui/Zwbm6O9J4D/Jj+P8KiT+Txgdmh
Zm/a2WQEkqueXENv1XEOgZ4DvKxq236pqn9kLVBQSiI74GAtg08pB5K+HyDIcTph
1nnbW/hJclWX96/Dbw87QNV7tu5xTAfno9xN4rbTYNgafx6gtoJoXWXukA9memi4
NzkFiaOdf+47Pr+EEi7SczVf+P+EwisVt4IMahMLIXZMaStHEJFcodR3PjsVPWt/
8R6z6r+byNFjfGJDpvGwUm9zJcaiCs/zrylyrOx2UXdzMrD3A6zngsPtWoli37h0
X5vV5MYEVKSE1m4ZEt0rq8O2gc2Jrb2FyVxhEzaDoM5IwviXSNEGIiav6uPaFI/R
ehmsWV/qbqRp3lfcvwyei4frITHhpgZQC5eaEiN+LFu1XbBxy7TlSp3UAqL0jHj+
qBZxpFgAz9MmEH1NgfSc8hHdz1cKIo9eR8IdteFg3WexcJ9evFwKiVK8yvlMOlVS
CnOhGOTOFHZVASnNQS45Vi9Ofr6Ou2YSss2McyB1eMOYUMC0cxU=
=LA2x
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
"New drivers:
- Driver for the Microchip LAN966x SoC
- PMBus driver for Infineon Digital Multi-phase xdp152 family
controllers
Chip support added to existing drivers:
- asus-ec-sensors:
- Support for ROG STRIX X570-E GAMING WIFI II, PRIME X470-PRO, and
ProArt X570 Creator WIFI
- External temperature sensor support for ASUS WS X570-ACE
- nct6775:
- Support for I2C driver
- Support for ASUS PRO H410T / PRIME H410M-R /
ROG X570-E GAMING WIFI II
- lm75:
- Support for - Atmel AT30TS74
- pmbus/max16601:
- Support for MAX16602
- aquacomputer_d5next:
- Support for Aquacomputer Farbwerk
- Support for Aquacomputer Octo
- jc42:
- Support for S-34TS04A
Kernel API changes / clarifications:
- The chip parameter of with_info API is now mandatory
- New hwmon_device_register_for_thermal API call for use by the
thermal subsystem
Improvements:
- PMBus and JC42 drivers now register with thermal subsystem
- PMBus drivers now support get_voltage/set_voltage power operations
- The adt7475 driver now supports pin configuration
- The lm90 driver now supports setting extended range temperatures
configuration with a devicetree property
- The dell-smm driver now registers as cooling device
- The OCC driver delays hwmon registration until requested by
userspace
... and various other minor fixes and improvements"
* tag 'hwmon-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (71 commits)
hwmon: (aquacomputer_d5next) Fix an error handling path in aqc_probe()
hwmon: (sl28cpld) Fix typo in comment
hwmon: (pmbus) Check PEC support before reading other registers
hwmon: (dimmtemp) Fix bitmap handling
hwmon: (lm90) enable extended range according to DTS node
dt-bindings: hwmon: lm90: add ti,extended-range-enable property
dt-bindings: hwmon: lm90: add missing ti,tmp461
hwmon: (ibmaem) Directly use ida_alloc()/free()
hwmon: Directly use ida_alloc()/free()
hwmon: (asus-ec-sensors) fix Formula VIII definition
dt-bindings: trivial-devices: Add xdp152
hwmon: (sl28cpld-hwmon) Use HWMON_CHANNEL_INFO macro
hwmon: (pwm-fan) Use HWMON_CHANNEL_INFO macro
hwmon: (peci/dimmtemp) Use HWMON_CHANNEL_INFO macro
hwmon: (peci/cputemp) Use HWMON_CHANNEL_INFO macro
hwmon: (mr75203) Use HWMON_CHANNEL_INFO macro
hwmon: (ltc2992) Use HWMON_CHANNEL_INFO macro
hwmon: (as370-hwmon) Use HWMON_CHANNEL_INFO macro
hwmon: Make chip parameter for with_info API mandatory
thermal/drivers/thermal_hwmon: Use hwmon_device_register_for_thermal()
...
Marge Energy Model support updates and cpuidle updates for 5.19-rc1:
- Update the Energy Model support code to allow the Energy Model to be
artificial, which means that the power values may not be on a uniform
scale with other devices providing power information, and update the
cpufreq_cooling and devfreq_cooling thermal drivers to support
artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add AlderLake processor support to the intel_idle driver (Zhang Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
* pm-em:
PM: EM: Decrement policy counter
powercap: DTPM: Check for Energy Model type
thermal: cooling: Check Energy Model type in cpufreq_cooling and devfreq_cooling
Documentation: EM: Add artificial EM registration description
PM: EM: Remove old debugfs files and print all 'flags'
PM: EM: Change the order of arguments in the .active_power() callback
PM: EM: Use the new .get_cost() callback while registering EM
PM: EM: Add artificial EM flag
PM: EM: Add .get_cost() callback
* pm-cpuidle:
cpuidle: riscv-sbi: Fix code to allow a genpd governor to be used
cpuidle: psci: Fix regression leading to no genpd governor
intel_idle: Add AlderLake support
The thermal subsystem registers a hwmon device without providing chip
information or sysfs attribute groups. While undesirable, it would be
difficult to change. On the other side, it abuses the
hwmon_device_register_with_info API by not providing that information.
Use new API specifically created for the thermal subsystem instead to
let us enforce the 'chip' parameter for other callers of
hwmon_device_register_with_info().
Acked-by: Rafael J . Wysocki <rafael@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Previously, during suspend, intel_pch_thermal driver logs for every
cooling iteration, about the current PCH temperature and number of cooling
iterations that have been tried, like below
[ 100.955526] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 1 times for 100 ms duration
[ 101.064156] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 2 times for 100 ms duration
After changing the default delay_cnt to 600, in practice, it is common to
see tens of the above messages if the system is suspended when PCH
overheats. Thus, change this log message from dev_warn to dev_dbg because
it is only useful when we want to check the temperature trend.
At the same time, there is always a one-line message given by the driver
with the patch applied, with below four possibilities.
1. PCH is cool, no cooling delay needed
[ 1791.902853] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C]
2. PCH overheats and becomes cool after the cooling delays
[ 1475.511617] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay
3. PCH still overheats after the overall cooling timeout
[ 2250.157487] intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail
4. PCH aborts cooling because of wakeup event detected during the delay
[ 1933.639509] intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit ef63b043ac ("thermal: intel: pch: fix S0ix failure due to PCH
temperature above threshold") introduces delay loop mechanism that allows
PCH temperature to go down below threshold during suspend so it won't
block S0ix. And the default overall delay timeout is 1 second.
However, in practice, we found that the time it takes to cool the PCH down
below threshold highly depends on the initial PCH temperature when the
delay starts, as well as the ambient temperature.
And in some cases, the 1 second delay is not sufficient. As a result, the
system stays in a shallower power state like PCx instead of S0ix, and
drains the battery power, without user' notice.
To make sure S0ix is not blocked by the PCH overheating, we
1. expand the default overall timeout to 60 seconds.
2. make sure the temperature is below threshold rather than equal to it.
At the same time, as the cooling delay can be much longer and many wakeup
events (ACPI Power Button press, USB mouse move, etc) becomes valid in the
suspend_noirq phase, add detection of wakeup event so that the driver
does not delay blindly when the system suspend is likely to abort soon.
This patch may introduce longer suspend time, but only in the cases when
the system overheats and Linux used to enter a shallower S2idle state,
say, PCx instead of S0ix.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Move the PCH Thermal driver suspend callback to suspend_noirq to do
cooling while the system is more quiescent.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add VTM thermal support. In the Voltage Thermal Management
Module(VTM), K3 J72XX supplies a voltage reference and a temperature
sensor feature that are gathered in the band gap voltage and
temperature sensor (VBGAPTS) module. The band gap provides current and
voltage reference for its internal circuits and other analog IP
blocks. The analog-to-digital converter (ADC) produces an output value
that is proportional to the silicon temperature.
Currently reading temperatures only is supported. There are no
active/passive cooling agent supported.
J721e SoCs have errata i2128: https://www.ti.com/lit/pdf/sprz455
The VTM Temperature Monitors (TEMPSENSORs) are trimmed during production,
with the resulting values stored in software-readable registers. Software
should use these register values when translating the Temperature
Monitor output codes to temperature values.
It has an involved workaround. Software needs to read the error codes for
-40C, 30C, 125C from the efuse for each device & derive a new look up table
for adc to temperature conversion. Involved calculating slopes & constants
using 3 different straight line equations with adc refernce codes as the
y-axis & error codes in the x-axis.
-40C to 30C
30C to 125C
125C to 150C
With the above 2 line equations we derive the full look-up table to
workaround the errata i2128 for j721e SoC.
Tested temperature reading on J721e SoC & J7200 SoC.
[daniel.lezcano@linaro.org: Generate look-up tables run-time]
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20220517172920.10857-3-j-keerthy@ti.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
of_find_node_by_name() returns a node pointer with refcount
incremented, we should use of_node_put() on it when done.
Add missing of_node_put() to avoid refcount leak.
Fixes: e20db70dba ("thermal: imx_sc: add i.MX system controller thermal support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220517055121.18092-1-linmq006@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The LMh instances in the Qualcomm SC8180X platform looks to behave
similar to those in SM8150, add additional compatibles to allow
platform specific behavior to be added if needed.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220502164504.3972938-1-bjorn.andersson@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
As per the latest RZ/G2L Hardware User's Manual (Rev.1.10 Apr, 2022),
the bit 31 of TSU OTP Calibration Register(OTPTSUTRIM) indicates
whether bit [11:0] of OTPTSUTRIM is valid or invalid.
This patch updates the code to reflect this change.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220428093346.7552-1-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add a missing s to __thermal_bind_param kernel doc comment.
This fixes the following sparse warnings:
drivers/thermal/thermal_of.c:50: warning: expecting prototype for struct __thermal_bind_param. Prototype was for struct __thermal_bind_params instead
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Link: https://lore.kernel.org/r/20220426064113.3787826-1-clabbe@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The sensor driver which register through thermal_of interface doesn't
have an option to get thermal zone mode change notification from
thermal core.
Add support for change_mode ops in thermal_of interface so that sensor
driver can use this ops for mode change notification.
Signed-off-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
Link: https://lore.kernel.org/r/1646767586-31908-1-git-send-email-quic_manafm@quicinc.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The thermal sensor on BCM2711 is capable of negative temperatures, so don't
clamp the measurements at zero. Since this was the only use for variable t,
drop it.
This change based on a patch by Dom Cobley, who also tested the fix.
Fixes: 59b781352d ("thermal: Add BCM2711 thermal driver")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220412195423.104511-1-stefan.wahren@i2se.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
On apq8064 (msm8960) platforms the tsens device is created manually by
the gcc driver. Prepare the tsens driver for the qcom,msm8960-tsens
device instantiated from the device tree.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220406002648.393486-3-dmitry.baryshkov@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Expose the thermal sensors on K3 AM654 as hwmon devices, so that
temperatures could be read using lm-sensors.
Signed-off-by: Massimiliano Minella <massimiliano.minella@gmail.com>
Link: https://lore.kernel.org/r/20220401151656.913166-1-massimiliano.minella@se.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add support for PMIC5 Gen2 ADC_TM, used on PMIC7 chips. It is a
close counterpart of PMIC7 ADC and has the same functionality as
PMIC5 ADC_TM, for threshold monitoring and interrupt generation.
It is present on PMK8350 alone, like PMIC7 ADC and can be used
to monitor up to 8 ADC channels, from any of the PMIC7 PMICs
having ADC on a target, through PBS(Programmable Boot Sequence).
Signed-off-by: Jishnu Prakash <quic_jprakash@quicinc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/1648991869-20899-5-git-send-email-quic_jprakash@quicinc.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Refactor code to support multiple generations of ADC_TM devices
by defining gen number, irq name and disable, configure, isr and
init APIs in the individual data structs.
Signed-off-by: Jishnu Prakash <quic_jprakash@quicinc.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/1648991869-20899-4-git-send-email-quic_jprakash@quicinc.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.
In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq_optional().
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20220110144039.5810-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
container_of() will never return NULL, so remove useless code.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
With the new OS handshake introduced by commit: "c7ff29763989 ("thermal:
int340x: Update OS policy capability handshake")", the "enabled" thermal
zone mode doesn't work in the same way as previously.
The "enabled" mode fails with -EINVAL when the new handshake is used.
To address this issue, when the new OS UUID mask is set:
- When the mode is "enabled", return 0 as the firmware already has the
latest policy mask.
- When the mode is "disabled", update the firmware with the UUID mask
of zero.
This way, the firmware can take over the thermal control.
Also reset the OS UUID mask, which allows user space to update with new
set of policies.
Fixes: c7ff297639 ("thermal: int340x: Update OS policy capability handshake")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits, removed unneeded parens ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge a fix for the attr.show callback prototype in the int340x thermal
driver (Kees Cook).
* thermal-int340x:
thermal: int340x: Fix attr.show callback prototype
The userspace governor is still in use on production systems and the
deprecating warning is scary.
Even if we want to get rid of the userspace governor, it is too soon
yet as the alternatives are not yet adopted.
Change the deprecated warning by an information message suggesting to
switch to the netlink thermal events.
Fixes: 0275c9fb0e ("thermal/core: Make the userspace governor deprecated")
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit a67a46af4a.
It has been reported the warning is annoying as the cooling device
state is still needed on some production system.
Meanwhile we provide a way to consolidate the thermal framework to
prevent multiple actors acting on the cooling devices with conflicting
decisions, let's revert this warning.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Control Flow Integrity (CFI) instrumentation of the kernel noticed that
the caller, dev_attr_show(), and the callback, odvp_show(), did not have
matching function prototypes, which would cause a CFI exception to be
raised. Correct the prototype by using struct device_attribute instead
of struct kobj_attribute.
Reported-and-tested-by: Joao Moreira <joao@overdrivepizza.com>
Link: https://lore.kernel.org/lkml/067ce8bd4c3968054509831fa2347f4f@overdrivepizza.com/
Fixes: 006f006f1e ("thermal/int340x_thermal: Export OEM vendor variables")
Cc: 5.8+ <stable@vger.kernel.org> # 5.8+
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix access illegal address problem in following condition:
There are multiple devfreq cooling devices in system, some of them has
EM model but others do not. Energy model ops such as state2power will
append to global devfreq_cooling_ops when the cooling device with
EM model is registered. It makes the cooling device without EM model
also use devfreq_cooling_ops after appending when registered later by
of_devfreq_cooling_register_power() or of_devfreq_cooling_register().
The IPA governor regards the cooling devices without EM model as a power
actor, because they also have energy model ops, and will access illegal
address at dfc->em_pd when execute cdev->ops->get_requested_power,
cdev->ops->state2power or cdev->ops->power2state.
Fixes: 615510fe13 ("thermal: devfreq_cooling: remove old power model and use EM")
Cc: 5.13+ <stable@vger.kernel.org> # 5.13+
Signed-off-by: Kant Fan <kant@allwinnertech.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cleaning up the driver to use pm_sleep_ptr() macro instead of #ifdef
guards is simpler and allows the compiler to remove those functions
if built without CONFIG_PM_SLEEP support.
Signed-off-by: Hesham Almatary <hesham.almatary@huawei.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Energy Model can now be artificial, which means the power values
are mathematically generated to leverage EAS while not expected to be on
an uniform scale with other devices providing power information. If this
EM type is in use, the thermal governor IPA should not be allowed to
operate, since the relation between cooling devices is not properly
defined. Thus, it might be possible that big GPU has lower power values
than a Little CPU. To mitigate a misbehaviour of the thermal control
algorithm, simply do not register the cooling device as IPA's power
actor.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that the UUID is already sanitized by the caller,
lets trivially clean up some of the context arming.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a single point of freeing/exit after ensuring no error in
int3400_setup_gddv().
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is the caller's responsibility to free only upon ACPI_SUCCESS.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge Intel Hardware Feedback Interface (HFI) thermal driver for
5.18-rc1 and update the intel-speed-select utility to support that
driver.
* thermal-hfi:
tools/power/x86/intel-speed-select: v1.12 release
tools/power/x86/intel-speed-select: HFI support
tools/power/x86/intel-speed-select: OOB daemon mode
thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
thermal: intel: hfi: Notify user space for HFI events
thermal: netlink: Add a new event to notify CPU capabilities change
thermal: intel: hfi: Enable notification interrupt
thermal: intel: hfi: Handle CPU hotplug events
thermal: intel: hfi: Minimally initialize the Hardware Feedback Interface
x86/cpu: Add definitions for the Intel Hardware Feedback Interface
x86/Documentation: Describe the Intel Hardware Feedback Interface
Merge powerclamp thermal driver changes, int340x thermal driver changes
and thermal documentation changes for 5.18-rc1:
- Don't use bitmap_weight() in end_power_clamp() in the powerclamp
driver (Yury Norov).
- Update the OS policy capabilities handshake in the int340x thermal
driver (Srinivas Pandruvada).
- Increase the policies bitmap size in int340x (Srinivas Pandruvada).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
int340x thermal driver (Rafael Wysocki).
- Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang).
- Add Intel Dynamic Power and Thermal Framework (DPTF) kernel interface
documentation (Srinivas Pandruvada).
- Fix bullet list warning in the thermal documentation (Randy Dunlap).
* thermal-powerclamp:
thermal: intel_powerclamp: don't use bitmap_weight() in end_power_clamp()
* thermal-int340x:
thermal: int340x: Update OS policy capability handshake
thermal: int340x: Increase bitmap size
thermal: Replace acpi_bus_get_device()
thermal: int340x: Check for NULL after calling kmemdup()
* thermal-docs:
Documentation: thermal: DPTF Documentation
thermal: fix Documentation bullet list warning
Update the firmware with OS supported policies mask, so that firmware can
relinquish its internal controls. Without this update several Tiger Lake
laptops gets performance limited with in few seconds of executing in
turbo region.
The existing way of enumerating firmware policies via IDSP method and
selecting policy by directly writing those policy UUIDS via _OSC method
is not supported in newer generation of hardware.
There is a new UUID "B23BA85D-C8B7-3542-88DE-8DE2FFCFD698" is defined for
updating policy capabilities. As part of ACPI _OSC method:
Arg0 - UUID: B23BA85D-C8B7-3542-88DE-8DE2FFCFD698
Arg1 - Rev ID: 1
Arg2 - Count: 2
Arg3 - Capability buffers: Array of Arg2 DWORDS
DWORD1: As defined in the ACPI 5.0 Specification
- Bit 0: Query Flag
- Bits 1-3: Always 0
- Bits 4-31: Reserved
DWORD2 and beyond:
- Bit0: set to 1 to indicate Intel(R) Dynamic Tuning is active, 0 to
indicate it is disabled and legacy thermal mechanism should
be enabled.
- Bit1: set to 1 to indicate Intel(R) Dynamic Tuning is controlling
active cooling, 0 to indicate bios shall enable legacy thermal
zone with active trip point.
- Bit2: set to 1 to indicate Intel(R) Dynamic Tuning is controlling
passive cooling, 0 to indicate bios shall enable legacy thermal
zone with passive trip point.
- Bit3: set to 1 to indicate Intel(R) Dynamic Tuning is handling
critical trip point, 0 to indicate bios shall enable legacy
thermal zone with critical trip point.
- Bits 4:31: Reserved
From sysfs interface, there is an existing interface to update policy
UUID using attribute "current_uuid". User space can write the same UUID
for ACTIVE, PASSIVE and CRITICAL policy. Driver converts these UUIDs to
DWORD2 Bit 1 to Bit 3. When any of the policy is activated by user
space it is assumed that dynamic tuning is active.
For example
$cd /sys/bus/platform/devices/INTC1040:00/uuids
To support active policy
$echo "3A95C389-E4B8-4629-A526-C52C88626BAE" > current_uuid
To support passive policy
$echo "42A441D6-AE6A-462b-A84B-4A8CE79027D3" > current_uuid
To support critical policy
$echo "97C68AE7-15FA-499c-B8C9-5DA81D606E0A" > current_uuid
To check all the supported policies
$cat current_uuid
3A95C389-E4B8-4629-A526-C52C88626BAE
42A441D6-AE6A-462b-A84B-4A8CE79027D3
97C68AE7-15FA-499c-B8C9-5DA81D606E0A
To match the bit format for DWORD2, rearranged enum int3400_thermal_uuid
and int3400_thermal_uuids[] by swapping current INT3400_THERMAL_ACTIVE
and INT3400_THERMAL_PASSIVE_1.
If the policies are enumerated via IDSP method then legacy method is
used, if not the new method is used to update policy support.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The number of policies are 10, so can't be supported by the bitmap size
of u8.
Even though there are no platfoms with these many policies, but
for correctness increase to u32.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fixes: 16fc8eca19 ("thermal/int340x_thermal: Add additional UUIDs")
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Utilize platform_get_irq_optional() to silence these messages:
brcmstb_thermal a581500.thermal: IRQ index 0 not found
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220301181412.2008044-1-f.fainelli@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The return value from tegra_bpmp_transfer indicates the success or
failure of the IPC transaction with BPMP. If the transaction
succeeded, we also need to check the actual command's result code.
Add code to do this.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210915085517.1669675-1-mperttunen@nvidia.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Expose ti-soc-thermal thermal sensors as HWMON devices.
# sensors
cpu_thermal-virtual-0
Adapter: Virtual device
temp1: +54.2 C (crit = +105.0 C)
dspeve_thermal-virtual-0
Adapter: Virtual device
temp1: +51.4 C (crit = +105.0 C)
gpu_thermal-virtual-0
Adapter: Virtual device
temp1: +54.2 C (crit = +105.0 C)
iva_thermal-virtual-0
Adapter: Virtual device
temp1: +54.6 C (crit = +105.0 C)
core_thermal-virtual-0
Adapter: Virtual device
temp1: +52.6 C (crit = +105.0 C)
Similar to imx_sc_thermal d2bc4dd91d.
No need to take care of thermal_remove_hwmon_sysfs() since
devm_thermal_add_hwmon_sysfs() (a wrapper around devres) is
used. See c7fc403e40.
Signed-off-by: Romain Naour <romain.naour@smile.fr>
Link: https://lore.kernel.org/r/20220218104725.2718904-1-romain.naour@smile.fr
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Add compatible to support LMh for sm8150 SoC.
sm8150 does not require explicit enabling for various LMh subsystems.
Add a variable indicating the same as match data which is set for sdm845.
Execute the piece of code enabling various LMh subsystems only if
enable algorithm match data is present.
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220106173138.411097-2-thara.gopinath@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Register thermal zones as hwmon sensors to let userspace read
CPU temperatures using standard hwmon interface.
Acked-by: Amit Kucheria <amitk@kernel.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220129180750.1882310-1-dmitry.baryshkov@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Do not call get_trip_hyst() from thermal_genl_cmd_tz_get_trip() if
the thermal zone does not define one.
Fixes: 1ce50e7d40 ("thermal: core: genetlink support for events/cmd/sampling")
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
THERMAL_NETLINK depends on NET and since 'select' does not follow
any dependency chain, INTEL_HFI_THERMAL also should depend on NET.
Fix one Kconfig warning and 48 subsequent build errors:
WARNING: unmet direct dependencies detected for THERMAL_NETLINK
Depends on [n]: THERMAL [=y] && NET [=n]
Selected by [y]:
- INTEL_HFI_THERMAL [=y] && THERMAL [=y] && (X86 [=y] || X86_INTEL_QUARK [=n] || COMPILE_TEST [=y]) && CPU_SUP_INTEL [=y] && X86_THERMAL_VECTOR [=y]
Fixes: bd30cdfd9b ("thermal: intel: hfi: Notify user space for HFI events")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When building with CONFIG_THERMAL_NETLINK=n, there is a spew of warnings
along the lines of:
In file included from drivers/thermal/thermal_core.c:27:
In file included from drivers/thermal/thermal_core.h:15:
drivers/thermal/thermal_netlink.h:113:71: warning: declaration of 'struct cpu_capability' will not be visible outside of this function [-Wvisibility]
static inline int thermal_genl_cpu_capability_event(int count, struct cpu_capability *caps)
^
1 warning generated.
'struct cpu_capability' is not forward declared anywhere in the header.
As it turns out, this should really be 'struct thermal_genl_cpu_caps',
which silences the warning and makes the parameter types of the stub
match the full function.
Fixes: e4b1eb24ce ("thermal: netlink: Add a new event to notify CPU capabilities change")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Replace acpi_bus_get_device() that is going to be dropped with
acpi_fetch_acpi_dev().
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Don't call bitmap_weight() if the following code can get by
without it.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As the potential failure of the allocation, kmemdup() may return NULL.
Then, 'bin_attr_data_vault.private' will be NULL, but
'bin_attr_data_vault.size' is not 0, which is not consistent.
Therefore, it is better to check the return value of kmemdup() to
avoid the confusion.
Fixes: 0ba13c763a ("thermal/int340x_thermal: Export GDDV")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When the hardware issues an HFI event, relay a notification to user space.
This allows user space to respond by reading performance and efficiency of
each CPU and take appropriate action.
For example, when the performance and efficiency of a CPU is 0, user space
can either offline the CPU or inject idle. Also, if user space notices a
downward trend in performance, it may proactively adjust power limits to
avoid future situations in which performance drops to 0.
To avoid excessive notifications, the rate is limited by one HZ per event.
To limit the netlink message size, send parameters for up to 16 CPUs in a
single message. If there are more than 16 CPUs, issue as many messages as
needed to notify the status of all CPUs.
In the HFI specification, both performance and efficiency capabilities are
defined in the [0, 255] range. The existing implementations of HFI hardware
do not scale the maximum values to 255. Since userspace cares about
capability values that are either 0 or show a downward/upward trend, this
fact does not matter much. Relative changes in capabilities are enough. To
comply with the thermal netlink ABI, scale both performance and efficiency
capabilities to the [0, 1023] interval.
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new netlink event to notify change in CPU capabilities in terms of
performance and efficiency.
Firmware may change CPU capabilities as a result of thermal events in the
system or to account for changes in the TDP (thermal design power) level.
This notification type will allow user space to avoid running workloads
on certain CPUs or proactively adjust power limits to avoid future events.
The netlink message consists of a nested attribute
(THERMAL_GENL_ATTR_CPU_CAPABILITY) with three attributes:
* THERMAL_GENL_ATTR_CPU_CAPABILITY_ID (type u32):
-- logical CPU number
* THERMAL_GENL_ATTR_CPU_CAPABILITY_PERFORMANCE (type u32):
-- Scaled performance from 0-1023
* THERMAL_GENL_ATTR_CPU_CAPABILITY_EFFICIENCY (type u32):
-- Scaled efficiency from 0-1023
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When hardware wants to inform the operating system about updates in the HFI
table, it issues a package-level thermal event interrupt. For this,
hardware has new interrupt and status bits in the IA32_PACKAGE_THERM_
INTERRUPT and IA32_PACKAGE_THERM_STATUS registers. The existing thermal
throttle driver already handles thermal event interrupts: it initializes
the thermal vector of the local APIC as well as per-CPU and package-level
interrupt reporting. It also provides routines to service such interrupts.
Extend its functionality to also handle HFI interrupts.
The frequency of the thermal HFI interrupt is specific to each processor
model. On some processors, a single interrupt happens as soon as the HFI is
enabled and hardware will never update HFI capabilities afterwards. On
other processors, thermal and power constraints may cause thermal HFI
interrupts every tens of milliseconds.
To not overwhelm consumers of the HFI data, use delayed work to throttle
the rate at which HFI updates are processed. Use a dedicated workqueue to
not overload system_wq if hardware issues many HFI updates.
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All CPUs in a package are represented in an HFI table. There exists an
HFI table per package. Thus, CPUs in a package need to coordinate to
initialize and access the table. Do such coordination during CPU hotplug.
Use the first CPU to come online in a package to initialize the HFI
instance and the data structure representing it. Other CPUs in the same
package need only to register or unregister themselves in that data
structure.
The HFI depends on both the package-level thermal management and the local
APIC thermal local vector. Thus, to ensure that a CPU coming online has an
associated HFI instance when the hardware issues an HFI event, enable the
HFI only after having enabled the local APIC thermal vector. The thermal
throttle driver takes care of the needed package-level initialization.
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Intel Hardware Feedback Interface provides guidance to the operating
system about the performance and energy efficiency capabilities of each
CPU in the system. Capabilities are numbers between 0 and 255 where a
higher number represents a higher capability. For each CPU, energy
efficiency and performance are reported as separate capabilities.
Hardware computes these capabilities based on the operating conditions of
the system such as power and thermal limits. These capabilities are shared
with the operating system in a table resident in memory. Each package in
the system has its own HFI instance. Every logical CPU in the package is
represented in the table. More than one logical CPUs may be represented in
a single table entry. When the hardware updates the table, it generates a
package-level thermal interrupt.
The size and format of the HFI table depend on the supported features and
can only be determined at runtime. To minimally initialize the HFI, parse
its features and allocate one instance per package of a data structure with
the necessary parameters to read and navigate a local copy (i.e., owned by
the driver) of individual HFI tables.
A subsequent changeset will provide per-CPU initialization and interrupt
handling.
Reviewed-by: Len Brown <len.brown@intel.com>
Co-developed by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add Raptor Lake PCI ID for processor thermal device.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add Raptor Lake ACPI IDs for DPTF devices.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add new TSU driver and DT bindings for the Renesas RZ/G2L platform
(Biju Das).
- Fix missing check when calling reset_control_deassert() in the
rz2gl thermal driver (Biju Das).
- In preparation for FORTIFY_SOURCE performing compile-time and
run-time field bounds checking for memcpy(), avoid intentionally
writing across neighboring fields in the int340x thermal control
driver (Kees Cook).
- Fix RFIM mailbox write commands handling in the int340x thermal
control driver (Sumeet Pawnikar).
- Fix PM issue occurring in the iMX thermal control driver during
suspend/resume by implementing PM runtime support in it (Oleksij
Rempel).
- Add 'const' annotation to thermal_cooling_ops in the Intel
powerclamp driver (Rikard Falkeborn).
- Fix missing ADC bit set in the iMX8MP thermal driver to enable the
sensor (Paul Gerber).
- Drop unused local variable definition from tmon (ran jianping).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmHcgsISHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxGJ4P/1S6lW2SUA98YfEVt0V6HnnXRCFXRE+5
OANgcjKqrDy57ltHM56sFqym8KQb9If5vki9yKq9/crmUVg9YMFsaN4RCfuUIRE5
B+v/NHg+s8VTrcDgF1fINXbBtYVgUoXhbjTH6gdCLSiEDhTgFkEQFasrcrTQN7IX
uqaC33ZxoSk7e89Wb/roJushHfjylqpqxlFX8ME3aQyRxIkMVN2p3qddg33XwzMS
lQjtTLpL4fOtOZET+KwKxOG97VBDuGcH6bHdzNWHlcKWjI6cZYMS9OrxyMGNggUi
JF9N4Y4xxwQWLQ32YQPXv+xUS92A2fsGTGqw1g50PmZHh2hvRsODJEPopjO8jTd0
0QtmIT+GzfNaEcBbbS31eb0ePYc/DobbBlpJ8kMoe/WskWANl4I6TrFMTDceZj9X
iuzMzGvpr/3/L7tRDIY7qM0iKX1WNeMuOXNIDuD7cmM6of82hzjwOBvhAOf338nP
InZB6o0fRdM38Wypw0E17CHBAJTgXh2vCAgRvfOuDCqavZM21h1BBwNpKHfxa+5z
LsO8WRifkDHHNctSBsh5UoRUO/L6RnCB9SVaQUb8k19HfIY8ZoNjM+CPMAojCAxs
KOEqHYxF7hG7eEkDAIF624INJlxkntPEuQDI9TSj6wi2wc5cISG1tMi1p8M01yND
XVisUkJoZKVy
=Crok
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add a new driver for Renesas RZ/G2L TSU, update a few existing
thermal control drivers and clean up the tmon utility.
Specifics:
- Add new TSU driver and DT bindings for the Renesas RZ/G2L platform
(Biju Das).
- Fix missing check when calling reset_control_deassert() in the
rz2gl thermal driver (Biju Das).
- In preparation for FORTIFY_SOURCE performing compile-time and
run-time field bounds checking for memcpy(), avoid intentionally
writing across neighboring fields in the int340x thermal control
driver (Kees Cook).
- Fix RFIM mailbox write commands handling in the int340x thermal
control driver (Sumeet Pawnikar).
- Fix PM issue occurring in the iMX thermal control driver during
suspend/resume by implementing PM runtime support in it (Oleksij
Rempel).
- Add 'const' annotation to thermal_cooling_ops in the Intel
powerclamp driver (Rikard Falkeborn).
- Fix missing ADC bit set in the iMX8MP thermal driver to enable the
sensor (Paul Gerber).
- Drop unused local variable definition from tmon (ran jianping)"
* tag 'thermal-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/drivers/int340x: Fix RFIM mailbox write commands
thermal/drivers/rz2gl: Add error check for reset_control_deassert()
thermal/drivers/imx8mm: Enable ADC when enabling monitor
thermal/drivers: Add TSU driver for RZ/G2L
dt-bindings: thermal: Document Renesas RZ/G2L TSU
thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_ops
thermal/drivers/imx: Implement runtime PM support
thermal: tools: tmon: remove unneeded local variable
thermal: int340x: Use struct_group() for memcpy() region
The existing mail mechanism only supports writing of workload types.
However, mailbox command for RFIM (cmd = 0x08) also requires write
operation which is ignored. This results in failing to store RFI
restriction.
Fixint this requires enhancing mailbox writes for non workload
commands too, so remove the check for MBOX_CMD_WORKLOAD_TYPE_WRITE
in mailbox write to allow this other write commands to be supoorted.
At the same time, however, we have to make sure that there is no
impact on read commands, by avoiding to write anything into the
mailbox data register.
To properly implement that, add two separate functions for mbox read
and write commands for the processor thermal workload command type.
This helps to distinguish the read and write workload command types
from each other while sending mbox commands.
Fixes: 5d6fbc96bd ("thermal/drivers/int340x: processor_thermal: Export additional attributes")
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: 5.14+ <stable@vger.kernel.org> # 5.14+
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull ARM cpufreq updates for 5.17-rc1 from Viresh Kumar:
"- Qcom cpufreq driver updates improve irq support (Ard Biesheuvel, Stephen Boyd,
and Vladimir Zapolskiy).
- Fixes double devm_remap for mediatek driver (Hector Yuan).
- Introduces thermal pressure helpers (Lukasz Luba)."
* 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: mediatek-hw: Fix double devm_remap in hotplug case
cpufreq: qcom-hw: Use optional irq API
cpufreq: qcom-hw: Set CPU affinity of dcvsh interrupts
cpufreq: qcom-hw: Fix probable nested interrupt handling
cpufreq: qcom-cpufreq-hw: Avoid stack buffer for IRQ name
arch_topology: Remove unused topology_set_thermal_pressure() and related
cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function
cpufreq: qcom-cpufreq-hw: Update offline CPUs per-cpu thermal pressure
thermal: cpufreq_cooling: Use new thermal pressure update function
arch_topology: Introduce thermal pressure update function
implementing PM runtime support (Oleksij Rempel)
- Add 'const' annotation to the thermal_cooling_ops in the Intel
powerclamp driver (Rikard Falkeborn)
- Add TSU driver and bindings for the RZ/G2L platform (Biju Das)
- Fix the missing ADC bit set on iMX8MP to enable the sensor (Paul
Gerber)
- Fix missing check when calling reset_control_deassert() (Biju Das)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmHEmTsACgkQqDIjiipP
6E/KMgf+MdzK/OTuxP+pf0yVqpC3WpH6lrNhBRC+mO4S0yAe8ZlYCFVVtvMftSrn
M2H8uxw74XlVkD8yi1kSYa1iVeD9jF+5ZViOybc83SKwryYEFI7tjxsjpaPFotxe
ZnL8y55hDC63YpwNeafGgW9M/JFehBP09zafakdVsvzzNGP8XKoYQcgyeofwNIrX
CRwU95K85k0eS2h9gPFPRWaNwZS4ubVEYr857oiGArX4/hEycvJ2UO+cH+zEmGjQ
QKJIEh5gQEBs2lKgcqUwzNbitm5lUG+rYRWD9omQ4QLtHK/fxZw2CMMVnNso/G5z
lslc8pLLwLh5QdR3v9tX4+OAxsV38A==
=uIVF
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.17-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control material for 5.17-rc1 from Daniel Lezcano:
- Fix PM issue on the iMX driver when suspend/resume is happening by
implementing PM runtime support (Oleksij Rempel)
- Add 'const' annotation to the thermal_cooling_ops in the Intel
powerclamp driver (Rikard Falkeborn)
- Add TSU driver and bindings for the RZ/G2L platform (Biju Das)
- Fix missing ADC bit set on iMX8MP to enable the sensor (Paul Gerber)
- Fix missing check when calling reset_control_deassert() (Biju Das)
* tag 'thermal-v5.17-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/rz2gl: Add error check for reset_control_deassert()
thermal/drivers/imx8mm: Enable ADC when enabling monitor
thermal/drivers: Add TSU driver for RZ/G2L
dt-bindings: thermal: Document Renesas RZ/G2L TSU
thermal/drivers/intel_powerclamp: Constify static thermal_cooling_device_ops
thermal/drivers/imx: Implement runtime PM support
If reset_control_deassert() fails, then we won't be able to access
the device registers. Therefore check the return code of
reset_control_deassert() and bail out in case of error.
While at it replace the parameter "&pdev->dev" -> "dev" in
devm_reset_control_get_exclusive().
Suggested-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://lore.kernel.org/r/20211208164010.4130-1-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The VCoRefLow CPU FIVR register definition for Tiger Lake is incorrect.
Current implementation reads it from MMIO offset 0x5A18 and bit
offset [12:14], but the actual correct register definition is from
bit offset [11:13].
Update to fix the bit offset.
Fixes: 473be51142 ("thermal: int340x: processor_thermal: Add RFIM driver")
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: 5.14+ <stable@vger.kernel.org> # 5.14+
[ rjw: New subject, changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The i.MX 8MP has a ADC_PD bit in the TMU_TER register that controls the
operating mode of the ADC:
* 0 means normal operating mode
* 1 means power down mode
When enabling/disabling the TMU, the ADC operating mode must be set
accordingly.
i.MX 8M Mini & Nano are lacking this bit.
Signed-off-by: Paul Gerber <Paul.Gerber@tq-group.com>
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Fixes: 2b8f1f0337 ("thermal: imx8mm: Add i.MX8MP support")
Link: https://lore.kernel.org/r/20211122114225.196280-1-alexander.stein@ew.tq-group.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The RZ/G2L SoC incorporates a thermal sensor unit (TSU) that measures the
temperature inside the LSI.
The thermal sensor in this unit measures temperatures in the range from
−40 degree Celsius to 125 degree Celsius with an accuracy of ±3°C. The
TSU repeats measurement at 20 microseconds intervals and automatically
updates the results of measurement.
The TSU has no interrupts as well as no external pins.
This patch adds Thermal Sensor Unit(TSU) driver for RZ/G2L SoC.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20211130155757.17837-3-biju.das.jz@bp.renesas.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The only usage of powerclamp_cooling_ops is to pass its address to
thermal_cooling_device_register(), which takes a pointer to const struct
thermal_cooling_device_ops. Make it const to allow the compiler to put
it in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20211128214641.30953-1-rikard.falkeborn@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Starting with commit d92ed2c9d3 ("thermal: imx: Use driver's local
data to decide whether to run a measurement") this driver stared using
irq_enabled flag to make decision to power on/off the thermal
core. This triggered a regression, where after reaching critical
temperature, alarm IRQ handler set irq_enabled to false, disabled
thermal core and was not able read temperature and disable cooling
sequence.
In case the cooling device is "CPU/GPU freq", the system will run with
reduce performance until next reboot.
To solve this issue, we need to move all parts implementing hand made
runtime power management and let it handle actual runtime PM framework.
Fixes: d92ed2c9d3 ("thermal: imx: Use driver's local data to decide whether to run a measurement")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Petr Beneš <petr.benes@ysoft.com>
Link: https://lore.kernel.org/r/20211117103426.81813-1-o.rempel@pengutronix.de
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In preparation for FORTIFY_SOURCE performing compile-time and
run-time field bounds checking for memcpy(), avoid intentionally
writing across neighboring fields.
Use struct_group() in struct art around members weight, and
ac[0-9]_max, so they can be referenced together. This will allow
memcpy() and sizeof() to more easily reason about sizes, improve
readability, and avoid future warnings about writing beyond the
end of weight.
"pahole" shows no size nor member offset changes to struct art.
"objdump -d" shows no meaningful object code changes (i.e. only
source line number induced differences).
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thermal pressure provides a new API, which allows to use CPU frequency
as an argument. That removes the need of local conversion to capacity.
Use this new function and remove old conversion code.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
During the suspend is in process, thermal_zone_device_update bails out
thermal zone re-evaluation for any sensor trip violation without
setting next valid trip to that sensor. It assumes during resume
it will re-evaluate same thermal zone and update trip. But when it is
in suspend temperature goes down and on resume path while updating
thermal zone if temperature is less than previously violated trip,
thermal zone set trip function evaluates the same previous high and
previous low trip as new high and low trip. Since there is no change
in high/low trip, it bails out from thermal zone set trip API without
setting any trip. It leads to a case where sensor high trip or low
trip is disabled forever even though thermal zone has a valid high
or low trip.
During thermal zone device init, reset thermal zone previous high
and low trip. It resolves above mentioned scenario.
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
32-bit processors cannot generally access 64-bit MMIO registers
atomically, and it is unknown in which order the two halves of
this registers would need to be read:
drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c: In function 'send_mbox_cmd':
drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c:79:37: error: implicit declaration of function 'readq'; did you mean 'readl'? [-Werror=implicit-function-declaration]
79 | *cmd_resp = readq((void __iomem *) (proc_priv->mmio_base + MBOX_OFFSET_DATA));
| ^~~~~
| readl
The driver already does not build for anything other than x86,
so limit it further to x86-64.
Fixes: aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot
64 bit RFIM responses") started using 'readq()' to read 64-bit status
responses from the int340x hardware.
That's all fine and good, but on 32-bit targets a 64-bit 'readq()' is
ambiguous, since it's no longer an atomic access. Some hardware might
require 64-bit accesses, and other hardware might want low word first or
high word first.
It's quite likely that the driver isn't relevant in a 32-bit environment
any more, and there's a patch floating around to just make it depend on
X86_64, but let's make it buildable on x86-32 anyway.
The driver previously just read the low 32 bits, so the hardware
certainly is ok with 32-bit reads, and in a little-endian environment
the low word first model is the natural one.
So just add the include for the 'io-64-nonatomic-lo-hi.h' version.
Fixes: aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use pr_warn_once() instead of pr_warn() to print the user space
governor deprecation message in user_space_bind() to reduce the
kernel log noise.
Fixes: 0275c9fb0e ("thermal/core: Make the userspace governor deprecated")
Reported-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
of_parse_thermal_zones() parses the thermal-zones node and registers a
thermal_zone device for each subnode. However, if a thermal zone is
consuming a thermal sensor and that thermal sensor device hasn't probed
yet, an attempt to set trip_point_*_temp for that thermal zone device
can cause a NULL pointer dereference. Fix it.
console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
...
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
Call trace:
of_thermal_set_trip_temp+0x40/0xc4
trip_point_temp_store+0xc0/0x1dc
dev_attr_store+0x38/0x88
sysfs_kf_write+0x64/0xc0
kernfs_fop_write_iter+0x108/0x1d0
vfs_write+0x2f4/0x368
ksys_write+0x7c/0xec
__arm64_sys_write+0x20/0x30
el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
do_el0_svc+0x28/0xa0
el0_svc+0x14/0x24
el0_sync_handler+0x88/0xec
el0_sync+0x1c0/0x200
While at it, fix the possible NULL pointer dereference in other
functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
of_thermal_get_trend().
Suggested-by: David Collins <quic_collinsd@quicinc.com>
Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some of the RFIM mail box command returns 64 bit values. So enhance
mailbox interface to return 64 bit values and use them for RFIM
commands.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fixes: 5d6fbc96bd ("thermal/drivers/int340x: processor_thermal: Export additional attributes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The cooling devices have their cooling device set_cur_state
read-writable all the time in the sysfs directory, thus allowing the
userspace to act on it.
The thermal framework is wrongly used by userspace as a power capping
framework by acting on the cooling device opaque state. This one then
competes with the in-kernel governor decision.
We have seen in out-of-tree kernels, a big number of devices which are
abusely declaring themselves as cooling device just to act on their
power.
The role of the thermal framework is to protect the junction
temperature of the silicon. Letting the userspace to play with a
cooling device is invalid and potentially dangerous.
The powercap framework is the right framework to do power capping and
moreover it deals with the aggregation via the dev pm qos.
As the userspace governor is marked deprecated and about to be
removed, there is no point to keep this file writable also in the
future.
Emit a warning and deprecate the interface.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20211019163506.2831454-2-daniel.lezcano@linaro.org
The userspace governor is sending temperature when polling is active
and trip point crossed events. Nothing else.
AFAICT, this governor is used with custom kernels making the userspace
governor co-existing with another governor on the same thermal zone
because there was no notification mechanism, implying a hack in the
framework to support this configuration.
The new netlink thermal notification is able to provide more
information than the userspace governor and give the opportunity to
the users of this governor to replace it by a dedicated notification
framework.
The userspace governor will be removed as its usage is no longer
needed.
Add a warning message to tell the userspace governor is deprecated.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20211019163506.2831454-1-daniel.lezcano@linaro.org
When the driver resumes, the tcc offset is set back to its previous
value. But this only works if the value was user defined as otherwise
the offset isn't saved. This asymmetric logic is harder to maintain and
introduced some issues.
Improve the logic by saving the tcc offset in a suspend op, so the right
value is always restored after a resume.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210909085613.5577-3-atenart@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The tsadc node in rk356x.dtsi has more resets then currently supported
by the rockchip_thermal.c driver, so use
devm_reset_control_array_get() to reset them all.
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Link: https://lore.kernel.org/r/20210930110517.14323-3-jbx6244@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The function can loop and lock the system if for whatever reason the bit
for the target sensor is NEVER valid. This is the case if a sensor is
disabled by the factory and the valid bit is never reported as actually
valid. Add a timeout check and exit if a timeout occurs. As this is
a very rare condition, handle the timeout only if the first read fails.
While at it also rework the function to improve readability and convert
to poll_timeout generic macro.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211007172859.583-1-ansuelsmth@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
When device_register() return failed, program will goto out_kfree_type
to release 'cdev->device' by put_device(). That will call thermal_release()
to free 'cdev'. But the follow-up processes access 'cdev' continually.
That trggers the UAF bug.
====================================================================
BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? __thermal_cooling_device_register+0x75b/0xa90
kasan_report.cold+0x7f/0x11b
? __thermal_cooling_device_register+0x75b/0xa90
__thermal_cooling_device_register+0x75b/0xa90
? memset+0x20/0x40
? __sanitizer_cov_trace_pc+0x1d/0x50
? __devres_alloc_node+0x130/0x180
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa
......
Freed by task 258:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
thermal_release+0xa0/0x110
device_release+0xa7/0x240
kobject_put+0x1ce/0x540
put_device+0x20/0x30
__thermal_cooling_device_register+0x731/0xa90
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa [max6650]
Do not use 'cdev' again after put_device() to fix the problem like doing
in thermal_zone_device_register().
[dlezcano]: as requested by Rafael, change the affectation into two statements.
Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
If both dev_set_name() and device_register() failed, then null pointer
dereference occurs in thermal_release() which will use strncmp() to
compare the name.
So fix it by adding dev_set_name() return value check.
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Link: https://lore.kernel.org/r/20211015083230.67658-1-songyuanzheng@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In production hardware the calibration values used to convert register
values to temperatures can be read from hardware. While pre-production
hardware still depends on pseudo values hard-coded in the driver.
Add support for reading out calibration values from hardware if it's
fused. The presence of fused calibration is indicated in the THSCP
register.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20211014103816.1939782-3-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Prepare for reading the THCODE and PTAT values from hardware fuses by
storing the values used during calculations in the drivers private
data structures.
As the values are now stored directly in the private data structures
there is no need to keep track of the TSC channel id as its only usage
was to lookup the THCODE row, drop it.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20211014103816.1939782-2-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The variant of the ADC Thermal Monitor block found in e.g. PM8998 is
"HC", add support for this variant to the ADC TM5 driver in order to
support using VADC channels as thermal_zones on SDM845 et al.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20211005032531.2251928-3-bjorn.andersson@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The slope of the temperature increase or decrease can be high and when
the temperature crosses the trip point, there could be a significant
difference between the trip temperature and the measured temperatures.
That forces the userspace to read the temperature back right after
receiving a trip violation notification.
In order to be efficient, give the temperature which resulted in the
trip violation.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20211001223323.1836640-1-daniel.lezcano@linaro.org
The only usage of thermal_mmio_ops is to pass its address to
devm_thermal_zone_of_sensor_register(), which has a pointer to const
struct thermal_zone_of_device_ops as argument. Make it const to allow
the compiler to put it in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Acked-by: Talel Shenhar <talel@amazon.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210920203849.32136-1-rikard.falkeborn@gmail.com
This check has a signedness bug and does not work. If "length" is
larger than "PAGE_SIZE" then "PAGE_SIZE - length" is not negative
but instead it is a large unsigned value. Fortunately, Takashi Iwai
changed this code to use scnprint() instead of snprintf() so now
"length" is never larger than "PAGE_SIZE - 1" and the check can be
removed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
'cpu_clamping_mask' is a bitmap. So use 'bitmap_zalloc()' and
'bitmap_free()' to simplify code, improve the semantic of the code and
avoid some open-coded arithmetic in allocator arguments.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some devices can have some thermal sensors disabled from the
factory. The current two irq handler functions check all the sensor by
default and the check if the sensor was actually registered is
wrong. The tzd is actually never set if the registration fails hence
the IS_ERR check is wrong.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210907212543.20220-1-ansuelsmth@gmail.com
After printing the list of thermal governors, then this function prints
a newline character. The problem is that "size" has not been updated
after printing the last governor. This means that it can write one
character (the NUL terminator) beyond the end of the buffer.
Get rid of the "size" variable and just use "PAGE_SIZE - count" directly.
Fixes: 1b4f48494e ("thermal: core: group functions related to governor handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
After upgrading to Linux 5.13.3 I noticed my laptop would shutdown due
to overheat (when it should not). It turned out this was due to commit
fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").
What happens is this drivers uses a global variable to keep track of the
tcc offset (tcc_offset_save) and uses it on resume. The issue is this
variable is initialized to 0, but is only set in
tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
set by userspace. If that does not happen, the resume path will set the
offset to 0 (in my case the h/w default being 3, the offset would become
too low after a suspend/resume cycle).
The issue did not arise before commit fe6a6de669, as the function
setting the offset would return if the offset was 0. This is no longer
the case (rightfully).
Fix this by not applying the offset if it wasn't saved before, reverting
back to the old logic. A better approach will come later, but this will
be easier to apply to stable kernels.
The logic to restore the offset after a resume was there long before
commit fe6a6de669, but as a value of 0 was considered invalid I'm
referencing the commit that made the issue possible in the Fixes tag
instead.
Fixes: fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
Cc: stable@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210909085613.5577-2-atenart@kernel.org
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
(Dmitry Osipenko)
- Fix the error code for the exynos when devm_get_clk() fails (Dan
Carpenter)
- Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
- Add support for hardware trip points for the rcar gen3 thermal
driver and store TSC id as unsigned int (Niklas Söderlund)
- Replace the deprecated CPU-hotplug functions get_online_cpus() and
put_online_cpus (Sebastian Andrzej Siewior)
- Add the thermal tools directory in the MAINTAINERS file (Daniel
Lezcano)
- Fix the Makefile and the cross compilation flags for the userspace
'tmon' tool (Rolf Eike Beer)
- Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
Pawnikar)
- Fix the stub thermal_cooling_device_register() function prototype
which does not match the real function (Arnd Bergmann)
- Make the thermal trip point optional in the DT bindings (Maxime
Ripard)
- Fix a typo in a comment in the core code (Geert Uytterhoeven)
- Reduce the verbosity of the trace in the SoC thermal tegra driver
(Dmitry Osipenko)
- Add the support for the LMh (Limit Management hardware) driver on
the QCom platforms (Thara Gopinath)
- Allow processing of HWP interrupt by adding a weak function in the
Intel driver (Srinivas Pandruvada)
- Prevent an abort of the sensor probe is a channel is not used
(Matthias Kaehlcke)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmE7yAcACgkQqDIjiipP
6E8WKwgAmI+kdOURCz1CtCIv6FnQFx20suxZidfATPULBPZ8zcHqdjxJECXJpvDz
y8T8dGAPRqXmMBJm2vF/qVraHaDJvscHXaflhsurhyEp0tiGptNeugBosso0jDEQ
JFrQ6Rny/NCmNMOkWW4YlSfkik0zQ43xlHocy3UfOKfcshhYWGsPTbIXdA6wH/zh
3tFW8j327tkyenrKZiaioJOSX88Oyy7Ft8l5k0HP+hPYkrvihWDJuEfEEmpfkdA7
Q70Lacf2QtFP8mUfGKpQxURSgndjphDPU7tzP5NqqWvLuT94aTspvJx4ywVwK8Iv
o+jNvR33yBIs5FXXB7B1ymkNE1h8uQ==
=blIQ
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Add the tegra3 thermal sensor and fix the compilation testing on
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
(Dmitry Osipenko)
- Fix the error code for the exynos when devm_get_clk() fails (Dan
Carpenter)
- Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
- Add support for hardware trip points for the rcar gen3 thermal driver
and store TSC id as unsigned int (Niklas Söderlund)
- Replace the deprecated CPU-hotplug functions get_online_cpus() and
put_online_cpus (Sebastian Andrzej Siewior)
- Add the thermal tools directory in the MAINTAINERS file (Daniel
Lezcano)
- Fix the Makefile and the cross compilation flags for the userspace
'tmon' tool (Rolf Eike Beer)
- Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
Pawnikar)
- Fix the stub thermal_cooling_device_register() function prototype
which does not match the real function (Arnd Bergmann)
- Make the thermal trip point optional in the DT bindings (Maxime
Ripard)
- Fix a typo in a comment in the core code (Geert Uytterhoeven)
- Reduce the verbosity of the trace in the SoC thermal tegra driver
(Dmitry Osipenko)
- Add the support for the LMh (Limit Management hardware) driver on the
QCom platforms (Thara Gopinath)
- Allow processing of HWP interrupt by adding a weak function in the
Intel driver (Srinivas Pandruvada)
- Prevent an abort of the sensor probe is a channel is not used
(Matthias Kaehlcke)
* tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
thermal/drivers/intel: Allow processing of HWP interrupt
dt-bindings: thermal: Add dt binding for QCOM LMh
thermal/drivers/qcom: Add support for LMh driver
firmware: qcom_scm: Introduce SCM calls to access LMh
thermal/drivers/tegra-soctherm: Silence message about clamped temperature
thermal: Spelling s/scallbacks/callbacks/
dt-bindings: thermal: Make trips node optional
thermal/core: Fix thermal_cooling_device_register() prototype
thermal/drivers/int340x: Use IMOK independently
tools/thermal/tmon: Add cross compiling support
thermal/tools/tmon: Improve the Makefile
MAINTAINERS: Add missing userspace thermal tools to the thermal section
thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.
thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
thermal/drivers/rcar_gen3_thermal: Add support for hardware trip points
drivers/thermal/intel: Add TCC cooling support for AlderLake platform
thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
thermal/drivers/tegra: Correct compile-testing of drivers
thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
adc_tm5_register_tzd() registers the thermal zone sensors for all
channels of the thermal monitor. If the registration of one channel
fails the function skips the processing of the remaining channels
and returns an error, which results in _probe() being aborted.
One of the reasons the registration could fail is that none of the
thermal zones is using the channel/sensor, which hardly is a critical
error (if it is an error at all). If this case is detected emit a
warning and continue with processing the remaining channels.
Fixes: ca66dca5ed ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210823134726.1.I1dd23ddf77e5b3568625d80d6827653af071ce19@changeid
Add a weak function to process HWP (Hardware P-states) notifications and
move updating HWP_STATUS MSR to this function.
This allows HWP interrupts to be processed by the intel_pstate driver in
HWP mode by overriding the implementation.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210820024006.2347720-1-srinivas.pandruvada@linux.intel.com
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
HZ unit conversion macros are available in units.h, use them and remove
the duplicate definition.
The new macro uses a unsigned long type which is already the type in the
current code via the 'freq' variable.
Link: https://lkml.kernel.org/r/20210816114732.1834145-4-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Christian Eggers <ceggers@arri.de>
Cc: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: MyungJoo Ham <myungjoo.ham@samsung.com>
Cc: Peter Meerwald <pmeerw@pmeerw.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights:
- Move all the Intel drivers into their own subdir(s) (mostly Kate's work)
- New meraki-mx100 platform driver
- Asus WMI driver enhancements, including
/sys/firmware/acpi/platform_profile support
- New BIOS SAR driver for Intel M.2 WWAM modems
- Alder Lake support for the Intel PMC driver
- A whole bunch of cleanups + fixes all over the place
The following is an automated git shortlog grouped by driver:
BIOS SAR driver for Intel M.2 Modem:
- BIOS SAR driver for Intel M.2 Modem
ISST:
- use semi-colons instead of commas
- Fix optimization with use of numa
Replace deprecated CPU-hotplug functions.:
- Replace deprecated CPU-hotplug functions.
Update Mario Limonciello's email address in the docs:
- Update Mario Limonciello's email address in the docs
acer-wmi:
- Add Turbo Mode support for Acer PH315-53
add meraki-mx100 platform driver:
- add meraki-mx100 platform driver
asus-nb-wmi:
- Add tablet_mode_sw=lid-flip quirk for the TP200s
- Allow configuring SW_TABLET_MODE method with a module option
asus-wmi:
- Fix "unsigned 'retval' is never less than zero" smatch warning
- Delete impossible condition
- Add support for platform_profile
- Add egpu enable method
- Add dgpu disable method
- Add panel overdrive functionality
dell-smbios:
- Remove unused dmi_system_id table
dell-smbios-wmi:
- Add missing kfree in error-exit from run_smbios_call
- Avoid false-positive memcpy() warning
dell-smo8800:
- Convert to be a platform driver
dual_accel_detect:
- Use the new i2c_acpi_client_count() helper
gigabyte-wmi:
- add support for B450M S2H V2
- add support for X570 GAMING X
hp_accel:
- Convert to be a platform driver
- Remove _INI method call
i2c:
- acpi: Add an i2c_acpi_client_count() helper function
i2c-multi-instantiate:
- Use the new i2c_acpi_client_count() helper
ideapad-laptop:
- Fix Legion 5 Fn lock LED
intel-hid:
- Move to intel sub-directory
intel-rst:
- Move to intel sub-directory
intel-smartconnect:
- Move to intel sub-directory
intel-uncore-frequency:
- Move to intel sub-directory
intel-vbtn:
- Move to intel sub-directory
intel-wmi-sbl-fw-update:
- Move to intel sub-directory
intel-wmi-thunderbolt:
- Move to intel sub-directory
intel_atomisp2:
- Move to intel sub-directory
intel_bxtwc_tmu:
- Move to intel sub-directory
intel_cht_int33fe:
- Use the new i2c_acpi_client_count() helper
intel_chtdc_ti_pwrbtn:
- Move to intel sub-directory
intel_int0002_vgpio:
- Move to intel sub-directory
intel_mrfld_pwrbtn:
- Move to intel sub-directory
intel_oaktrail:
- Move to intel sub-directory
intel_pmc_core:
- Move to intel sub-directory
- Prevent possibile overflow
intel_pmt_telemetry:
- Ignore zero sized entries
intel_punit_ipc:
- Move to intel sub-directory
intel_scu_ipc:
- Fix doc of intel_scu_ipc_dev_command_with_size()
intel_speed_select_if:
- Move to intel sub-directory
intel_telemetry:
- Move to intel sub-directory
intel_turbo_max_3:
- Move to intel sub-directory
lg-laptop:
- Use correct event for keyboard backlight FN-key
- Use correct event for touchpad toggle FN-key
- Support for battery charge limit on newer models
platform/mellanox:
- mlxbf-pmc: fix kernel-doc notation
platform/surface:
- aggregator: Use y instead of objs in Makefile
- surface3_power: Use i2c_acpi_get_i2c_resource() helper
platform/x86/intel:
- pmc/core: Add GBE Package C10 fix for Alder Lake PCH
- pmc/core: Add Alder Lake low power mode support for pmc core
- pmc/core: Add Latency Tolerance Reporting (LTR) support to Alder Lake
- pmc/core: Add Alderlake support to pmc core driver
- int3472: Use y instead of objs in Makefile
- pmt: Use y instead of objs in Makefile
- int33fe: Use y instead of objs in Makefile
- Move Intel PMT drivers to new subfolder
thermal/drivers/intel:
- Move intel_menlow to thermal drivers
think-lmi:
- add debug_cmd
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmEw1KAUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wk7Qf/dsaaDgx7aC6DfKzdcMgfeLIdTaGm
a6svNXM2t/JFdvhjYzxA+4QlQgco7zkN06iRlWbObEonSUsHlGlwEOHX60VgcopO
qJaqnznmfdXUocFTnA+5acJXabNaw7xkKHS0K61UWgk+mm6aMuygpKxULnNTa4X+
p3HoU6uXFckpoA/Jstzo5UfegNYhg11bflNd7XN4F3rMCbbNHAsWlf4oVr2YsEHa
wECW+1e8wZl4BInUzoXQhilRoybJWXWJ8sLsvQfDXLs9aNoLdDqu9p0MuXEW5QqE
wNt26SNNAP2L49BD6kaJszV5Ry/jNSEtVkWwkrzHbGTZoOEyMYas8pm6Uw==
=iK1W
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
"Highlights:
- Move all the Intel drivers into their own subdir(s) (mostly Kate's
work)
- New meraki-mx100 platform driver
- Asus WMI driver enhancements, including support for
/sys/firmware/acpi/platform_profile
- New BIOS SAR driver for Intel M.2 WWAM modems
- Alder Lake support for the Intel PMC driver
- A whole bunch of cleanups + fixes all over the place"
* tag 'platform-drivers-x86-v5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits)
platform/x86: dell-smbios-wmi: Add missing kfree in error-exit from run_smbios_call
platform/x86: dell-smbios-wmi: Avoid false-positive memcpy() warning
platform/x86: ISST: use semi-colons instead of commas
platform/x86: asus-wmi: Fix "unsigned 'retval' is never less than zero" smatch warning
platform/x86: asus-wmi: Delete impossible condition
platform/x86: hp_accel: Convert to be a platform driver
platform/x86: hp_accel: Remove _INI method call
platform/mellanox: mlxbf-pmc: fix kernel-doc notation
platform/x86/intel: pmc/core: Add GBE Package C10 fix for Alder Lake PCH
platform/x86/intel: pmc/core: Add Alder Lake low power mode support for pmc core
platform/x86/intel: pmc/core: Add Latency Tolerance Reporting (LTR) support to Alder Lake
platform/x86/intel: pmc/core: Add Alderlake support to pmc core driver
platform/x86: intel-wmi-thunderbolt: Move to intel sub-directory
platform/x86: intel-wmi-sbl-fw-update: Move to intel sub-directory
platform/x86: intel-vbtn: Move to intel sub-directory
platform/x86: intel_oaktrail: Move to intel sub-directory
platform/x86: intel_int0002_vgpio: Move to intel sub-directory
platform/x86: intel-hid: Move to intel sub-directory
platform/x86: intel_atomisp2: Move to intel sub-directory
platform/x86: intel_speed_select_if: Move to intel sub-directory
...
Add a weak function to process HWP (Hardware P-states) notifications and
move updating HWP_STATUS MSR to this function.
This allows HWP interrupts to be processed by the intel_pstate driver in
HWP mode by overriding the implementation.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Driver enabling various pieces of Limits Management Hardware(LMh) for cpu
cluster0 and cpu cluster1 namely kick starting monitoring of temperature,
current, battery current violations, enabling reliability algorithm and
setting up various temperature limits.
The following has been explained in the cover letter. I am including this
here so that this remains in the commit message as well.
LMh is a hardware infrastructure on some Qualcomm SoCs that can enforce
temperature and current limits as programmed by software for certain IPs
like CPU. On many newer LMh is configured by firmware/TZ and no programming
is needed from the kernel side. But on certain SoCs like sdm845 the
firmware does not do a complete programming of the h/w. On such soc's
kernel software has to explicitly set up the temperature limits and turn on
various monitoring and enforcing algorithms on the hardware.
Tested-by: Steev Klimaszewski <steev@kali.org> # Lenovo Yoga C630
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210809191605.3742979-3-thara.gopinath@linaro.org
Moved drivers/platform/x86/intel_menlow.c to drivers/thermal/intel.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20210816035356.1955982-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The Tegra soctherm driver prints message about the clamped temperature
trip each time when thermal core disables the low/high trip. The message
is confusing and creates illusion that driver is malfunctioning. Turn that
noisy info message into a debug message.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210712002353.17276-1-digetx@gmail.com
Some chrome platform requires IMOK method in coreboot. But these platforms
don't use GDDV data vault in coreboot. As per current code flow, to enable
and use IMOK only, we need to have GDDV support as well in coreboot. This
patch removes the dependency for IMOK from GDDV to enable and use IMOK
independently.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210716163946.3142-1-sumeet.r.pawnikar@intel.com
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().
Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210803141621.780504-20-bigeasy@linutronix.de
The TSC id and number of TSC ids should be stored as unsigned int as
they can't be negative. Fix the datatype of the loop counter 'i' and
rcar_gen3_thermal_tsc.id to reflect this.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210804091818.2196806-3-niklas.soderlund+renesas@ragnatech.se
All supported hardware except V3U is capable of generating interrupts
to the CPU when the temperature go below or above a set value. Use this
to implement support for the set_trip() feature of the thermal core on
supported hardware.
The V3U have its interrupts routed to the ECM module and therefore can
not be used to implement set_trip() as the driver can't be made aware of
when the interrupt triggers.
Each TSC is capable of tracking up-to three different temperatures while
only two are needed to implement the tracking of the thermal window.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210804091818.2196806-2-niklas.soderlund+renesas@ragnatech.se
This error path return success but it should propagate the negative
error code from devm_clk_get().
Fixes: 6c247393cf ("thermal: exynos: Add TMU support for Exynos7 SoC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210810084413.GA23810@kili
All Tegra thermal drivers support compile-testing, but the drivers are
not available for compile-testing because the whole Kconfig meny entry
depends on ARCH_TEGRA, missing the alternative COMPILE_TEST dependency
option. Correct the Kconfig entry.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210617072403.3487-1-digetx@gmail.com
All NVIDIA Tegra30 SoCs have a two-channel on-chip sensor unit which
monitors temperature and voltage of the SoC. Sensors control CPU frequency
throttling, which is activated by hardware once preprogrammed temperature
level is breached, they also send signal to Power Management controller to
perform emergency shutdown on a critical overheat of the SoC die. Add
driver for the Tegra30 TSENSOR module, exposing it as a thermal sensor.
Tested-by: Andreas Westman Dorcsak <hedmoo@yahoo.com> # Asus TF700T
Tested-by: Maxim Schwalm <maxim.schwalm@gmail.com> # Asus TF700T
Tested-by: Svyatoslav Ryhel <clamor95@gmail.com> # Asus TF201T
Tested-by: Ihor Didenko <tailormoon@rambler.ru> # Asus TF300T
Tested-by: Ion Agorria <ion@agorria.com> # Asus TF201T
Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya
Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210616190417.32214-4-digetx@gmail.com
- Add missing MODULE_DEVICE_TABLE for the Spreadtrum sensor (Chunyan
Zhang)
- Export additionnal attributes for the int340x thermal processor
(Srinivas Pandruvada)
- Add SC7280 compatible for the tsens driver (Rajeshwari Ravindra
Kamble)
- Fix kernel documentation for thermal_zone_device_unregister() and
use devm_platform_get_and_ioremap_resource() (Yang Yingliang)
- Fix coefficient calculations for the rcar_gen3 sensor driver (Niklas
Söderlund)
- Fix shadowing variable rcar_gen3_ths_tj_1 (Geert Uytterhoeven)
- Add missing of_node_put() for the iMX and Spreadtrum sensors
(Krzysztof Kozlowski)
- Add tegra3 thermal sensor DT bindings (Dmitry Osipenko)
- Stop the thermal zone monitoring when unregistering it to prevent a
temperature update without the 'get_temp' callback (Dmitry Osipenko)
- Add rk3568 DT bindings, convert bindings to yaml schemas and add the
corresponding compatible in the Rockchip sensor (Ezequiel Garcia)
- Add the sc8180x compatible for the Qualcomm tsensor (Bjorn Andersson)
- Use the find_first_zero_bit() function instead of custom code (Andy
Shevchenko)
- Fix the kernel doc for the device cooling device (Yang Li)
- Reorg the processor thermal int340x to set the scene for the PCI
mmio driver (Srinivas Pandruvada)
- Add PCI MMIO driver for the int340x processor thermal driver
(Srinivas Pandruvada)
- Add hwmon sensors for the mediatek sensor (Frank Wunderlich)
- Fix warning for return value reported by Smatch for the int340x
thermal processor (Srinivas Pandruvada)
- Fix wrong register access and decoding for the int340x thermal
processor (Srinivas Pandruvada)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmDkbdAACgkQqDIjiipP
6E8pTwgAgc4RMdSBdNMVWP1Lc3Gprmg8uXMhma9ZlmJD+l3h4b4P+Zm7HjU+SPHI
emvvqHgiWGw/ta/Fuhx9XnqTUjZznG3gMYohnfKe7jPgVTxmud+Yf0/E3dRrDNWl
WNolS8rv4dLf1mjR+vGZ63KasK0Rz5Z4YxDV4kOvh+/VUhqg3XpZ1OTKOh/B9IWX
pUTedq7hIZ44ZMINcwObvLNTeaoEhPNpgA4Rs07vQPYugY0V61HszqR/Mu+l8Rgp
LiE8NUBANJ+8+wydHe/vP6lvOthh7YGSx4091rUe+X17qgfBDKCTsOyECnZ/UX+r
aB1MaTAnr1H0dfM49yeoFcMSPc1rGA==
=q0nG
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Add rk3568 sensor support (Finley Xiao)
- Add missing MODULE_DEVICE_TABLE for the Spreadtrum sensor (Chunyan
Zhang)
- Export additionnal attributes for the int340x thermal processor
(Srinivas Pandruvada)
- Add SC7280 compatible for the tsens driver (Rajeshwari Ravindra
Kamble)
- Fix kernel documentation for thermal_zone_device_unregister() and use
devm_platform_get_and_ioremap_resource() (Yang Yingliang)
- Fix coefficient calculations for the rcar_gen3 sensor driver (Niklas
Söderlund)
- Fix shadowing variable rcar_gen3_ths_tj_1 (Geert Uytterhoeven)
- Add missing of_node_put() for the iMX and Spreadtrum sensors
(Krzysztof Kozlowski)
- Add tegra3 thermal sensor DT bindings (Dmitry Osipenko)
- Stop the thermal zone monitoring when unregistering it to prevent a
temperature update without the 'get_temp' callback (Dmitry Osipenko)
- Add rk3568 DT bindings, convert bindings to yaml schemas and add the
corresponding compatible in the Rockchip sensor (Ezequiel Garcia)
- Add the sc8180x compatible for the Qualcomm tsensor (Bjorn Andersson)
- Use the find_first_zero_bit() function instead of custom code (Andy
Shevchenko)
- Fix the kernel doc for the device cooling device (Yang Li)
- Reorg the processor thermal int340x to set the scene for the PCI mmio
driver (Srinivas Pandruvada)
- Add PCI MMIO driver for the int340x processor thermal driver
(Srinivas Pandruvada)
- Add hwmon sensors for the mediatek sensor (Frank Wunderlich)
- Fix warning for return value reported by Smatch for the int340x
thermal processor (Srinivas Pandruvada)
- Fix wrong register access and decoding for the int340x thermal
processor (Srinivas Pandruvada)
* tag 'thermal-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (23 commits)
thermal/drivers/int340x/processor_thermal: Fix tcc setting
thermal/drivers/int340x/processor_thermal: Fix warning for return value
thermal/drivers/mediatek: Add sensors-support
thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver
thermal/drivers/int340x/processor_thermal: Split enumeration and processing part
thermal: devfreq_cooling: Fix kernel-doc
thermal/drivers/intel/intel_soc_dts_iosf: Switch to use find_first_zero_bit()
dt-bindings: thermal: tsens: Add sc8180x compatible
dt-bindings: rockchip-thermal: Support the RK3568 SoC compatible
dt-bindings: thermal: convert rockchip-thermal to json-schema
thermal/core/thermal_of: Stop zone device before unregistering it
dt-bindings: thermal: Add binding for Tegra30 thermal sensor
thermal/drivers/sprd: Add missing of_node_put for loop iteration
thermal/drivers/imx_sc: Add missing of_node_put for loop iteration
thermal/drivers/rcar_gen3_thermal: Do not shadow rcar_gen3_ths_tj_1
thermal/drivers/rcar_gen3_thermal: Fix coefficient calculations
thermal/drivers/st: Use devm_platform_get_and_ioremap_resource()
thermal/core: Correct function name thermal_zone_device_unregister()
dt-bindings: thermal: tsens: Add compatible string to TSENS binding for SC7280
thermal/drivers/int340x: processor_thermal: Export additional attributes
...
The following fixes are done for tcc sysfs interface:
- TCC is 6 bits only from bit 29-24
- TCC of 0 is valid
- When BIT(31) is set, this register is read only
- Check for invalid tcc value
- Error for negative values
Fixes: fdf4f2fb8e ("drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210628215803.75038-1-srinivas.pandruvada@linux.intel.com
Add a new PCI driver which register a thermal zone and allows to get
notification for threshold violation by a RW trip point. These
notifications are delivered from the device using MSI based
interrupt.
The main difference between this new PCI driver and the existing
one is that the temperature and trip points directly use PCI
MMIO instead of using ACPI methods.
This driver registers a thermal zone "TCPU_PCI" in addition to the
legacy processor thermal device, which uses ACPI companion device
to set name, temperature and trips.
This driver is enabled for AlderLake.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210525204811.3793651-3-srinivas.pandruvada@linux.intel.com
Remove enumeration part from the processor_thermal_device to two
different modules. One for ACPI and one for PCI:
ACPI enumeration: int3401_thermal
PCI part: processor_thermal_device_pci_legacy
The current processor_thermal_device now just implements interface
functions to be used by the ACPI and PCI enumeration module. This is
done by:
1. Make functions proc_thermal_add() and proc_thermal_remove() non static
and export them for usage in other processor_thermal_device_pci_legacy.c
and in int3401_thermal.c.
2. Move the sysfs file creation for TCC offset and power limit attribute
group to the proc_thermal_add() from the individual enumeration callbacks
for PCI and ACPI.
3. Create new interface functions proc_thermal_mmio_add() and
proc_thermal_mmio_remove() which will be called from the
processor_thermal_device_pci_legacy module.
4. Export proc_thermal_resume(), so that it can be used by power
management callbacks.
5. Remove special check for double enumeration as it never happens.
While here, fix some cleanup on error conditions in proc_thermal_add().
No functional changes are expected with this change.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210525204811.3793651-2-srinivas.pandruvada@linux.intel.com
Fix function name in devfreq_cooling.c comment to remove a
warning found by kernel-doc.
drivers/thermal/devfreq_cooling.c:479: warning: expecting prototype for
devfreq_cooling_em_register_power(). Prototype was for
devfreq_cooling_em_register() instead.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1623223350-128104-1-git-send-email-yang.lee@linux.alibaba.com
Zone device is enabled after thermal_zone_of_sensor_register() completion,
but it's not disabled before senor is unregistered, leaving temperature
polling active. This results in accessing a disabled zone device and
produces a warning about this problem. Stop zone device before
unregistering it in order to fix this "use-after-free" problem.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210616190417.32214-3-digetx@gmail.com
- Changes to core scheduling facilities:
- Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
coordinated scheduling across SMT siblings. This is a much
requested feature for cloud computing platforms, to allow
the flexible utilization of SMT siblings, without exposing
untrusted domains to information leaks & side channels, plus
to ensure more deterministic computing performance on SMT
systems used by heterogenous workloads.
There's new prctls to set core scheduling groups, which
allows more flexible management of workloads that can share
siblings.
- Fix task->state access anti-patterns that may result in missed
wakeups and rename it to ->__state in the process to catch new
abuses.
- Load-balancing changes:
- Tweak newidle_balance for fair-sched, to improve
'memcache'-like workloads.
- "Age" (decay) average idle time, to better track & improve workloads
such as 'tbench'.
- Fix & improve energy-aware (EAS) balancing logic & metrics.
- Fix & improve the uclamp metrics.
- Fix task migration (taskset) corner case on !CONFIG_CPUSET.
- Fix RT and deadline utilization tracking across policy changes
- Introduce a "burstable" CFS controller via cgroups, which allows
bursty CPU-bound workloads to borrow a bit against their future
quota to improve overall latencies & batching. Can be tweaked
via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
- Rework assymetric topology/capacity detection & handling.
- Scheduler statistics & tooling:
- Disable delayacct by default, but add a sysctl to enable
it at runtime if tooling needs it. Use static keys and
other optimizations to make it more palatable.
- Use sched_clock() in delayacct, instead of ktime_get_ns().
- Misc cleanups and fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZcPoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1g3yw//WfhIqy7Psa9d/MBMjQDRGbTuO4+w22Dj
vmWFU44Q4KJxQHWeIgUlrK+dzvYWvNmflUs2CUUOiDVzxFTHMIyBtL4qCBUbx4Ns
vKAcB9wsWZge2o3WzZqpProRhdoRaSKw8egUr2q7rACVBkckY7eGP/OjWxXU8BdA
b7D0LPWwuIBFfN4pFYeCDLn32Dqr9s6Chyj+ZecabdG7EE6Gu+f1diVcxy7JE/mc
4WWL0D1RqdgpGrBEuMJIxPYekdrZiuy4jtEbztz5gbTBteN1cj3BLfqn0Pc/e6rO
Vyuc5mXCAmzRVi18z6g6bsVl+IA/nrbErENB2OHOhOYtqiZxqGTd4GPWZszMyY17
5AsEO5+5pcaBsy4gyp09qURggBu9zhJnMVmOI3rIHZkmkhwzc6uUJlyhDCTiFWOz
3ZF3LjbZEyCKodMD8qMHbs3axIBpIfZqjzkvSKyFnvfXEGVytVse7NUuWtQ36u92
GnURxVeYY1TDVXvE1Y8owNKMxknKQ6YRlypP7Dtbeo/qG6hShp0xmS7qDLDi0ybZ
ZlK+bDECiVoDf3nvJo+8v5M82IJ3CBt4UYldeRJsa1YCK/FsbK8tp91fkEfnXVue
+U6LPX0AmMpXacR5HaZfb3uBIKRw/QMdP/7RFtBPhpV6jqCrEmuqHnpPQiEVtxwO
UmG7bt94Trk=
=3VDr
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler udpates from Ingo Molnar:
- Changes to core scheduling facilities:
- Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables
coordinated scheduling across SMT siblings. This is a much
requested feature for cloud computing platforms, to allow the
flexible utilization of SMT siblings, without exposing untrusted
domains to information leaks & side channels, plus to ensure more
deterministic computing performance on SMT systems used by
heterogenous workloads.
There are new prctls to set core scheduling groups, which allows
more flexible management of workloads that can share siblings.
- Fix task->state access anti-patterns that may result in missed
wakeups and rename it to ->__state in the process to catch new
abuses.
- Load-balancing changes:
- Tweak newidle_balance for fair-sched, to improve 'memcache'-like
workloads.
- "Age" (decay) average idle time, to better track & improve
workloads such as 'tbench'.
- Fix & improve energy-aware (EAS) balancing logic & metrics.
- Fix & improve the uclamp metrics.
- Fix task migration (taskset) corner case on !CONFIG_CPUSET.
- Fix RT and deadline utilization tracking across policy changes
- Introduce a "burstable" CFS controller via cgroups, which allows
bursty CPU-bound workloads to borrow a bit against their future
quota to improve overall latencies & batching. Can be tweaked via
/sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us.
- Rework assymetric topology/capacity detection & handling.
- Scheduler statistics & tooling:
- Disable delayacct by default, but add a sysctl to enable it at
runtime if tooling needs it. Use static keys and other
optimizations to make it more palatable.
- Use sched_clock() in delayacct, instead of ktime_get_ns().
- Misc cleanups and fixes.
* tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/doc: Update the CPU capacity asymmetry bits
sched/topology: Rework CPU capacity asymmetry detection
sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag
psi: Fix race between psi_trigger_create/destroy
sched/fair: Introduce the burstable CFS controller
sched/uclamp: Fix uclamp_tg_restrict()
sched/rt: Fix Deadline utilization tracking during policy change
sched/rt: Fix RT utilization tracking during policy change
sched: Change task_struct::state
sched,arch: Remove unused TASK_STATE offsets
sched,timer: Use __set_current_state()
sched: Add get_current_state()
sched,perf,kvm: Fix preemption condition
sched: Introduce task_is_running()
sched: Unbreak wakeups
sched/fair: Age the average idle time
sched/cpufreq: Consider reduced CPU capacity in energy calculation
sched/fair: Take thermal pressure into account while estimating energy
thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure
sched/fair: Return early from update_tg_cfs_load() if delta == 0
...
This commit in sched/urgent moved the cfs_rq_is_decayed() function:
a7b359fc6a37: ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
and this fresh commit in sched/core modified it in the old location:
9e077b52d86a: ("sched/pelt: Check that *_avg are null when *_sum are")
Merge the two variants.
Conflicts:
kernel/sched/fair.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The thermal pressure signal gives information to the scheduler about
reduced CPU capacity due to thermal. It is based on a value stored in
a per-cpu 'thermal_pressure' variable. The online CPUs will get the
new value there, while the offline won't. Unfortunately, when the CPU
is back online, the value read from per-cpu variable might be wrong
(stale data). This might affect the scheduler decisions, since it
sees the CPU capacity differently than what is actually available.
Fix it by making sure that all online+offline CPUs would get the
proper value in their per-cpu variable when thermal framework sets
capping.
Fixes: f12e4f66ab ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20210614191030.22241-1-lukasz.luba@arm.com
Early exits from for_each_available_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/thermal/sprd_thermal.c:387:1-23: WARNING:
Function "for_each_child_of_node" should have of_node_put() before goto around lines 391.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210614192230.19248-2-krzysztof.kozlowski@canonical.com
Early exits from for_each_available_child_of_node() should decrement the
node reference counter. Reported by Coccinelle:
drivers/thermal/imx_sc_thermal.c:93:1-33: WARNING:
Function "for_each_available_child_of_node" should have of_node_put() before return around line 97.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210614192230.19248-1-krzysztof.kozlowski@canonical.com
With -Wshadow:
drivers/thermal/rcar_gen3_thermal.c: In function ‘rcar_gen3_thermal_probe’:
drivers/thermal/rcar_gen3_thermal.c:310:13: warning: declaration of ‘rcar_gen3_ths_tj_1’ shadows a global declaration [-Wshadow]
310 | const int *rcar_gen3_ths_tj_1 = of_device_get_match_data(dev);
| ^~~~~~~~~~~~~~~~~~
drivers/thermal/rcar_gen3_thermal.c:246:18: note: shadowed declaration is here
246 | static const int rcar_gen3_ths_tj_1 = 126;
| ^~~~~~~~~~~~~~~~~~
To add to the confusion, the local variable has a different type.
Fix the shadowing by renaming the local variable to ths_tj_1.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/9ea7e65d0331daba96f9a7925cb3d12d2170efb1.1623076804.git.geert+renesas@glider.be
The fixed value of 157 used in the calculations are only correct for
M3-W, on other Gen3 SoC it should be 167. The constant can be derived
correctly from the static TJ_3 constant and the SoC specific TJ_1 value.
Update the calculation be correct on all Gen3 SoCs.
Fixes: 4eb39f79ef ("thermal: rcar_gen3_thermal: Update value of Tj_1")
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210605085211.564909-1-niklas.soderlund+renesas@ragnatech.se
Use devm_platform_get_and_ioremap_resource() to simplify
code and remove error message which within
devm_platform_get_and_ioremap_resource() already.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210605120205.2459578-1-yangyingliang@huawei.com
Fix the following make W=1 kernel build warning:
drivers/thermal/thermal_core.c:1376: warning: expecting prototype for thermal_device_unregister(). Prototype was for thermal_zone_device_unregister() instead
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210517051020.3463536-1-yangyingliang@huawei.com
Export additional attributes:
ddr_data_rate (RO) : Show current DDR (Double Data Rate) data rate.
rfi_restriction (RW) : Show or set current state for RFI (Radio
Frequency Interference) protection.
These attributes use mailbox commands to get/set information. Here
command codes are:
0x0007: Read RFI restriction
0x0107: Read DDR data rate
0x0008: Write RFI restriction
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210517061441.1921901-3-srinivas.pandruvada@linux.intel.com
MODULE_DEVICE_TABLE is used to extract the device information out of the
driver and builds a table when being compiled. If using this macro,
kernel can find the driver if available when the device is plugged in,
and then loads that driver and initializes the device.
Fixes: 554fdbaf19 ("thermal: sprd: Add Spreadtrum thermal driver support")
Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210512093752.243168-1-zhang.lyra@gmail.com
The RK3568 SoCs have two Temperature Sensors, channel 0 is for CPU,
channel 1 is for GPU.
Signed-off-by: Finley Xiao <finley.xiao@rock-chips.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210506175514.168365-5-ezequiel@collabora.com
MSR_AMD64_SEV even though the spec clearly states so, and check CPUID
bits first.
- Send only one signal to a task when it is a SEGV_PKUERR si_code type.
- Do away with all the wankery of reserving X amount of memory in
the first megabyte to prevent BIOS corrupting it and simply and
unconditionally reserve the whole first megabyte.
- Make alternatives NOP optimization work at an arbitrary position
within the patched sequence because the compiler can put single-byte
NOPs for alignment anywhere in the sequence (32-bit retpoline), vs our
previous assumption that the NOPs are only appended.
- Force-disable ENQCMD[S] instructions support and remove update_pasid()
because of insufficient protection against FPU state modification in an
interrupt context, among other xstate horrors which are being addressed
at the moment. This one limits the fallout until proper enablement.
- Use cpu_feature_enabled() in the idxd driver so that it can be
build-time disabled through the defines in .../asm/disabled-features.h.
- Fix LVT thermal setup for SMI delivery mode by making sure the APIC
LVT value is read before APIC initialization so that softlockups during
boot do not happen at least on one machine.
- Mark all legacy interrupts as legacy vectors when the IO-APIC is
disabled and when all legacy interrupts are routed through the PIC.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmC8fdEACgkQEsHwGGHe
VUqO5A/+IbIo8myl8VPjw6HRnHgY8rsYRjxdtmVhbaMi5XOmTMfVA9zJ6QALxseo
Mar8bmWcezEs0/FmNvk1vEOtIgZvRVy5RqXbu3W2EgWICuzRWbj822q+KrkbY0tH
1GWjcZQO8VlgeuQsukyj5QHaBLffpn3Fh1XB8r0cktZvwciM+LRNMnK8d6QjqxNM
ctTX4wdI6kc076pOi7MhKxSe+/xo5Wnf27lClLMOcsO/SS42KqgeRM5psWqxihhL
j6Y3Oe+Nm+7GKF8y841PUSlwjgWmlZa6UkR6DBTP7DGnHDa5hMpzxYvHOquq/SbA
leV9OLqI0iWs56kSzbEcXo7do1kld62KjsA2KtUhJfVAtm+igQLh5G0jESBwrWca
TBWaE5kt6s8wP7LXeg26o4U8XD8vqEH88Tmsjlgqb/t/PKDV9PMGvNpF00dPZFo6
Jhj2yntJYjLQYoAQLuQm5pfnKhZy3KKvk7ViGcnp3iN9i4eU9HzawIiXnliNOrTI
ohQ9KoRhy1Cx0UfLkR+cdK4ks0u26DC2/Ewt0CE5AP/CQ1rX6Zbv2gFLjSpy7yQo
6A99HEpbaLuy3kDt5vn91viPNUlOveuIXIdHp6u+zgFfx88eLUoEvfR135aV/Gyh
p5PJm/BO99KByQzFCnilkp7nBeKtnKYSmUojA6JsZKjzJimSPYo=
=zRI1
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"A bunch of x86/urgent stuff accumulated for the last two weeks so
lemme unload it to you.
It should be all totally risk-free, of course. :-)
- Fix out-of-spec hardware (1st gen Hygon) which does not implement
MSR_AMD64_SEV even though the spec clearly states so, and check
CPUID bits first.
- Send only one signal to a task when it is a SEGV_PKUERR si_code
type.
- Do away with all the wankery of reserving X amount of memory in the
first megabyte to prevent BIOS corrupting it and simply and
unconditionally reserve the whole first megabyte.
- Make alternatives NOP optimization work at an arbitrary position
within the patched sequence because the compiler can put
single-byte NOPs for alignment anywhere in the sequence (32-bit
retpoline), vs our previous assumption that the NOPs are only
appended.
- Force-disable ENQCMD[S] instructions support and remove
update_pasid() because of insufficient protection against FPU state
modification in an interrupt context, among other xstate horrors
which are being addressed at the moment. This one limits the
fallout until proper enablement.
- Use cpu_feature_enabled() in the idxd driver so that it can be
build-time disabled through the defines in disabled-features.h.
- Fix LVT thermal setup for SMI delivery mode by making sure the APIC
LVT value is read before APIC initialization so that softlockups
during boot do not happen at least on one machine.
- Mark all legacy interrupts as legacy vectors when the IO-APIC is
disabled and when all legacy interrupts are routed through the PIC"
* tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Check SME/SEV support in CPUID first
x86/fault: Don't send SIGSEGV twice on SEGV_PKUERR
x86/setup: Always reserve the first 1M of RAM
x86/alternative: Optimize single-byte NOPs at an arbitrary position
x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
dmaengine: idxd: Use cpu_feature_enabled()
x86/thermal: Fix LVT thermal setup for SMI delivery mode
x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing
There are machines out there with added value crap^WBIOS which provide an
SMI handler for the local APIC thermal sensor interrupt. Out of reset,
the BSP on those machines has something like 0x200 in that APIC register
(timestamps left in because this whole issue is timing sensitive):
[ 0.033858] read lvtthmr: 0x330, val: 0x200
which means:
- bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled
- bits [10:8] have 010b which means SMI delivery mode.
Now, later during boot, when the kernel programs the local APIC, it
soft-disables it temporarily through the spurious vector register:
setup_local_APIC:
...
/*
* If this comes from kexec/kcrash the APIC might be enabled in
* SPIV. Soft disable it before doing further initialization.
*/
value = apic_read(APIC_SPIV);
value &= ~APIC_SPIV_APIC_ENABLED;
apic_write(APIC_SPIV, value);
which means (from the SDM):
"10.4.7.2 Local APIC State After It Has Been Software Disabled
...
* The mask bits for all the LVT entries are set. Attempts to reset these
bits will be ignored."
And this happens too:
[ 0.124111] APIC: Switch to symmetric I/O mode setup
[ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0
[ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0
This results in CPU 0 soft lockups depending on the placement in time
when the APIC soft-disable happens. Those soft lockups are not 100%
reproducible and the reason for that can only be speculated as no one
tells you what SMM does. Likely, it confuses the SMM code that the APIC
is disabled and the thermal interrupt doesn't doesn't fire at all,
leading to CPU 0 stuck in SMM forever...
Now, before
4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")
due to how the APIC_LVTTHMR was read before APIC initialization in
mcheck_intel_therm_init(), it would read the value with the mask bit 16
clear and then intel_init_thermal() would replicate it onto the APs and
all would be peachy - the thermal interrupt would remain enabled.
But that commit moved that reading to a later moment in
intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too
late and with its interrupt mask bit set.
Thus, revert back to the old behavior of reading the thermal LVT
register before the APIC gets initialized.
Fixes: 4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")
Reported-by: James Feeney <james@nurealm.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@zn.tnic
Return -EINVAL when args is invalid instead of 'ret' which is set to
zero by a previous successful call to a function.
Fixes: ca66dca5ed ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210527092640.2070555-1-yangyingliang@huawei.com
Fix function name in ti-bandgap.c kernel-doc comment
to remove a warning.
drivers/thermal/ti-soc-thermal/ti-bandgap.c:787: warning: expecting
prototype for ti_bandgap_alert_init(). Prototype was for
ti_bandgap_talert_init() instead.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1621851963-36548-1-git-send-email-yang.lee@linux.alibaba.com
After commit 81ad4276b5 ("Thermal: Ignore invalid trip points") all
user_space governor notifications via RW trip point is broken in intel
thermal drivers. This commits marks trip_points with value of 0 during
call to thermal_zone_device_register() as invalid. RW trip points can be
0 as user space will set the correct trip temperature later.
During driver init, x86_package_temp and all int340x drivers sets RW trip
temperature as 0. This results in all these trips marked as invalid by
the thermal core.
To fix this initialize RW trips to THERMAL_TEMP_INVALID instead of 0.
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210430122343.1789899-1-srinivas.pandruvada@linux.intel.com
- Fix spellos in comments for the tegra and sun8i (Bhaskar Chowdhury)
- Add the missing fifth node on the rcar_gen3 sensor (Niklas
Söderlund)
- Remove duplicate include in ti-bandgap (Zhang Yunkai)
- Assign error code in the error path in the function
thermal_of_populate_bind_params() (Jia-Ju Bai)
- Fix spelling mistake in a comment 'disabed' -> 'disabled' (Colin Ian
King)
- Use the device name instead of auto-numbering for a better
identification of the cooling device (Daniel Lezcano)
- Improve a bit the division accuracy in the power allocator governor
(Jeson Gao)
- Enable the missing third sensor on msm8976 (Konrad Dybcio)
- Add QCom tsens driver co-maintainer (Thara Gopinath)
- Fix memory leak and use after free errors in the core code (Daniel
Lezcano)
- Add the MDM9607 compatible bindings (Konrad Dybcio)
- Fix trivial spello in the copyright name for Hisilicon (Hao Fang)
- Fix negative index array access when converting the frequency to
power in the energy model (Brian-sy Yang)
- Add support for Gen2 new PMIC support for Qcom SPMI (David Collins)
- Update maintainer file for CPU cooling device section (Lukasz Luba)
- Fix missing put_device on error in the Qcom tsens driver (Guangqing
Zhu)
- Add compatible DT binding for sm8350 (Robert Foss)
- Add support for the MDM9607's tsens driver (Konrad Dybcio)
- Remove duplicate error messages in thermal_mmio and the bcm2835
driver (Ruiqi Gong)
- Add the Thermal Temperature Cooling driver (Zhang Rui)
- Remove duplicate error messages in the Hisilicon sensor driver (Ye
Bin)
- Use the devm_platform_ioremap_resource_byname() function instead of
a couple of corresponding calls (dingsenjie)
- Sort the headers alphabetically in the ti-bandgap driver (Zhen Lei)
- Add missing property in the DT thermal sensor binding (Rafał
Miłecki)
- Remove dead code in the ti-bandgap sensor driver (Lin Ruizhe)
- Convert the BRCM DT bindings to the yaml schema (Rafał Miłecki)
- Replace the thermal_notify_framework() call by a call to the
thermal_zone_device_update() function. Remove the function as well
as the corresponding documentation (Thara Gopinath)
- Add support for the ipq8064-tsens sensor along with a set of
cleanups and code preparation (Ansuel Smith)
- Add a lockless __thermal_cdev_update() function to improve the
locking scheme in the core code and governors (Lukasz Luba)
- Fix multiple cooling device notification changes (Lukasz Luba)
- Remove unneeded variable initialization (Colin Ian King)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmCRqDIACgkQqDIjiipP
6E8O2Qf5AQvSVoN9WYRBLo1+a4mkGsJ/wHQMEsOA4FVHft5/QVkRtpMNbSiyq00O
YTpNuoBqiYm/tSTyzK/5Oh+0ucgm/ef4c4dTyPjZYw2GB+3rYNRAXdX/tB6Ggjl/
oUArUCoSQZjOU6Y573B05rcHp1PVM/XL9LgD1uX76tXA1MaGvsyC0cyPRAdOANke
W83BWI0XMhv8B1bZwHVB2Oft5x6HhqWBl3HKbNOmPEMtwkqqBCFAqB0wNEH88ZTf
2hyBjBoZQHdMkJsC0piMvIyAjHZiIjQB47VWz31EvKB3/E28xCqRqPViPq9QbrA5
got0+oDbxI96T024ndXRomc0SSxZnw==
=5THg
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Remove duplicate error message for the amlogic driver (Tang Bin)
- Fix spellos in comments for the tegra and sun8i (Bhaskar Chowdhury)
- Add the missing fifth node on the rcar_gen3 sensor (Niklas Söderlund)
- Remove duplicate include in ti-bandgap (Zhang Yunkai)
- Assign error code in the error path in the function
thermal_of_populate_bind_params() (Jia-Ju Bai)
- Fix spelling mistake in a comment 'disabed' -> 'disabled' (Colin Ian
King)
- Use the device name instead of auto-numbering for a better
identification of the cooling device (Daniel Lezcano)
- Improve a bit the division accuracy in the power allocator governor
(Jeson Gao)
- Enable the missing third sensor on msm8976 (Konrad Dybcio)
- Add QCom tsens driver co-maintainer (Thara Gopinath)
- Fix memory leak and use after free errors in the core code (Daniel
Lezcano)
- Add the MDM9607 compatible bindings (Konrad Dybcio)
- Fix trivial spello in the copyright name for Hisilicon (Hao Fang)
- Fix negative index array access when converting the frequency to
power in the energy model (Brian-sy Yang)
- Add support for Gen2 new PMIC support for Qcom SPMI (David Collins)
- Update maintainer file for CPU cooling device section (Lukasz Luba)
- Fix missing put_device on error in the Qcom tsens driver (Guangqing
Zhu)
- Add compatible DT binding for sm8350 (Robert Foss)
- Add support for the MDM9607's tsens driver (Konrad Dybcio)
- Remove duplicate error messages in thermal_mmio and the bcm2835
driver (Ruiqi Gong)
- Add the Thermal Temperature Cooling driver (Zhang Rui)
- Remove duplicate error messages in the Hisilicon sensor driver (Ye
Bin)
- Use the devm_platform_ioremap_resource_byname() function instead of a
couple of corresponding calls (dingsenjie)
- Sort the headers alphabetically in the ti-bandgap driver (Zhen Lei)
- Add missing property in the DT thermal sensor binding (Rafał Miłecki)
- Remove dead code in the ti-bandgap sensor driver (Lin Ruizhe)
- Convert the BRCM DT bindings to the yaml schema (Rafał Miłecki)
- Replace the thermal_notify_framework() call by a call to the
thermal_zone_device_update() function. Remove the function as well as
the corresponding documentation (Thara Gopinath)
- Add support for the ipq8064-tsens sensor along with a set of cleanups
and code preparation (Ansuel Smith)
- Add a lockless __thermal_cdev_update() function to improve the
locking scheme in the core code and governors (Lukasz Luba)
- Fix multiple cooling device notification changes (Lukasz Luba)
- Remove unneeded variable initialization (Colin Ian King)
* tag 'thermal-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (55 commits)
thermal/drivers/mtk_thermal: Remove redundant initializations of several variables
thermal/core/power allocator: Use the lockless __thermal_cdev_update() function
thermal/core/fair share: Use the lockless __thermal_cdev_update() function
thermal/core/fair share: Lock the thermal zone while looping over instances
thermal/core/power_allocator: Update once cooling devices when temp is low
thermal/core/power_allocator: Maintain the device statistics from going stale
thermal/core: Create a helper __thermal_cdev_update() without a lock
dt-bindings: thermal: tsens: Document ipq8064 bindings
thermal/drivers/tsens: Add support for ipq8064-tsens
thermal/drivers/tsens: Drop unused define for msm8960
thermal/drivers/tsens: Replace custom 8960 apis with generic apis
thermal/drivers/tsens: Fix bug in sensor enable for msm8960
thermal/drivers/tsens: Use init_common for msm8960
thermal/drivers/tsens: Add VER_0 tsens version
thermal/drivers/tsens: Convert msm8960 to reg_field
thermal/drivers/tsens: Don't hardcode sensor slope
Documentation: driver-api: thermal: Remove thermal_notify_framework from documentation
thermal/core: Remove thermal_notify_framework
iwlwifi: mvm: tt: Replace thermal_notify_framework
dt-bindings: thermal: brcm,ns-thermal: Convert to the json-schema
...
Several variables are being initialized with values that is never
read and being updated later with a new value. The initializations
are redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422120412.246291-1-colin.king@canonical.com
Use the new helper function and avoid unnecessery second lock/unlock,
which was present in old approach with thermal_cdev_update().
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422153624.6074-4-lukasz.luba@arm.com
Use the new helper function and avoid unnecessery second lock/unlock,
which was present in old approach with thermal_cdev_update().
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422153624.6074-3-lukasz.luba@arm.com
The tz->lock must be hold during the looping over the instances in that
thermal zone. This lock was missing in the governor code since the
beginning, so it's hard to point into a particular commit.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422153624.6074-2-lukasz.luba@arm.com
The cooling device state change generates an event, also when there is no
need, because temperature is low and device is not throttled. Avoid to
unnecessary update the cooling device which means also not sending event.
The cooling device state has not changed because the temperature is still
below the first activation trip point value, so we can do this.
Add a tracking mechanism to make sure it updates cooling devices only
once - when the temperature dropps below first trip point.
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422114308.29684-4-lukasz.luba@arm.com
When the temperature is below the first activation trip point the cooling
devices are not checked, so they cannot maintain fresh statistics. It
leads into the situation, when temperature crosses first trip point, the
statistics are stale and show state for very long period. This has impact
on IPA algorithm calculation and wrong decisions. Thus, check the cooling
devices even when the temperature is low, to refresh these statistics.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422114308.29684-3-lukasz.luba@arm.com
There is a need to have a helper function which updates cooling device
state from the governors code. With this change governor can use
lock and unlock while calling helper function. This avoid unnecessary
second time lock/unlock which was in previous solution present in
governor implementation. This new helper function must be called
with mutex 'cdev->lock' hold.
The changed been discussed and part of code presented in thread:
https://lore.kernel.org/linux-pm/20210419084536.25000-1-lukasz.luba@arm.com/
Co-developed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20210422114308.29684-2-lukasz.luba@arm.com
Add support for tsens present in ipq806x SoCs based on generic msm8960
tsens driver.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-9-ansuelsmth@gmail.com
Drop unused define for msm8960 replaced by generic api and reg_field.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-8-ansuelsmth@gmail.com
Rework calibrate function to use common function. Derive the offset from
a missing hardcoded slope table and the data from the nvmem calib
efuses.
Drop custom get_temp function and use generic api.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Acked-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-7-ansuelsmth@gmail.com
Device based on tsens VER_0 contains a hardware bug that results in some
problem with sensor enablement. Sensor id 6-11 can't be enabled
selectively and all of them must be enabled in one step.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Acked-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-6-ansuelsmth@gmail.com
Use init_common and drop custom init for msm8960.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-5-ansuelsmth@gmail.com
VER_0 is used to describe device based on tsens version before v0.1.
These device are devices based on msm8960 for example apq8064 or
ipq806x. Add support for VER_0 in tsens.c and set the right tsens feat
in tsens-8960.c file.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-4-ansuelsmth@gmail.com
Convert msm9860 driver to reg_field to use the init_common
function.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Acked-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-3-ansuelsmth@gmail.com
Function compute_intercept_slope hardcode the sensor slope to
SLOPE_DEFAULT. Change this and use the default value only if a slope is
not defined. This is needed for tsens VER_0 that has a hardcoded slope
table.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210420183343.2272-2-ansuelsmth@gmail.com
thermal_notify_framework just updates for a single trip point where as
thermal_zone_device_update does other bookkeeping like updating the
temperature of the thermal zone and setting the next trip point. The only
driver that was using thermal_notify_framework was updated in the previous
patch to use thermal_zone_device_update instead. Since there are no users
for thermal_notify_framework remove it.
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210122023406.3500424-3-thara.gopinath@linaro.org
The function ti_bandgap_restore_ctxt() restores the context at resume
time. It checks if the sensor has a counter, reads the register but
does nothing with the value.
The block was probably omitted by the commit b87ea759a4.
Remove the unused variable as well as the block using it as we can
consider it as dead code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: b87ea759a4 ("staging: omap-thermal: fix context restore function")
Signed-off-by: Lin Ruizhe <linruizhe@huawei.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210421084256.57591-1-linruizhe@huawei.com
For the sake of lisibility, reorder the header files alphabetically.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210406091912.2583-2-thunder.leizhen@huawei.com
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210414063943.96244-1-dingsenjie@163.com
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210409075224.2109503-1-yebin10@huawei.com
On Intel processors, the core frequency can be reduced below OS request,
when the current temperature reaches the TCC (Thermal Control Circuit)
activation temperature.
The default TCC activation temperature is specified by
MSR_IA32_TEMPERATURE_TARGET. However, it can be adjusted by specifying an
offset in degrees C, using the TCC Offset bits in the same MSR register.
This patch introduces a cooling devices driver that utilizes the TCC
Offset feature. The bigger the current cooling state is, the lower the
effective TCC activation temperature is, so that the processors can be
throttled earlier before system critical overheats.
Note that, on different platforms, the behavior might be different on
how fast the setting takes effect, and how much the CPU frequency is
reduced.
This patch has been tested on a KabyLake mobile platform from me, and also
on a CometLake platform from Doug.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210412125901.12549-1-rui.zhang@intel.com
There is a error message within devm_ioremap_resource already, so
remove the dev_err call to avoid redundant error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210408100329.7585-1-gongruiqi1@huawei.com
There is a error message within devm_ioremap_resource already, so
remove the dev_err call to avoid redundant error message.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com>
Acked-by: Talel Shenhar <talel@amazon.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210408100144.7494-1-gongruiqi1@huawei.com
MDM9607 TSENS IP is very similar to the one of MSM8916, with
minor adjustments to various tuning values.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Acked-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210319220802.198215-2-konrad.dybcio@somainline.org
Fixes coccicheck error:
drivers/thermal/qcom/tsens.c:759:4-10: ERROR: missing put_device; call
of_find_device_by_node on line 715, but without a corresponding object
release within this function.
Fixes: a7ff829761 ("drivers: thermal: tsens: Merge tsens-common.c into tsens.c")
Signed-off-by: Guangqing Zhu <zhuguangqing83@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210404125431.12208-1-zhuguangqing83@gmail.com
Add support for TEMP_ALARM GEN2 PMIC peripherals with digital
major revision 1. This revision utilizes a different temperature
threshold mapping than earlier revisions.
Signed-off-by: David Collins <collinsd@codeaurora.org>
Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/69c90a004b3f5b7ae282f5ec5ca2920a48f23e02.1596040416.git.gurus@codeaurora.org
Slab OOB issue is scanned by KASAN in cpu_power_to_freq().
If power is limited below the power of OPP0 in EM table,
it will cause slab out-of-bound issue with negative array
index.
Return the lowest frequency if limited power cannot found
a suitable OPP in EM table to fix this issue.
Backtrace:
[<ffffffd02d2a37f0>] die+0x104/0x5ac
[<ffffffd02d2a5630>] bug_handler+0x64/0xd0
[<ffffffd02d288ce4>] brk_handler+0x160/0x258
[<ffffffd02d281e5c>] do_debug_exception+0x248/0x3f0
[<ffffffd02d284488>] el1_dbg+0x14/0xbc
[<ffffffd02d75d1d4>] __kasan_report+0x1dc/0x1e0
[<ffffffd02d75c2e0>] kasan_report+0x10/0x20
[<ffffffd02d75def8>] __asan_report_load8_noabort+0x18/0x28
[<ffffffd02e6fce5c>] cpufreq_power2state+0x180/0x43c
[<ffffffd02e6ead80>] power_actor_set_power+0x114/0x1d4
[<ffffffd02e6fac24>] allocate_power+0xaec/0xde0
[<ffffffd02e6f9f80>] power_allocator_throttle+0x3ec/0x5a4
[<ffffffd02e6ea888>] handle_thermal_trip+0x160/0x294
[<ffffffd02e6edd08>] thermal_zone_device_check+0xe4/0x154
[<ffffffd02d351cb4>] process_one_work+0x5e4/0xe28
[<ffffffd02d352f44>] worker_thread+0xa4c/0xfac
[<ffffffd02d360124>] kthread+0x33c/0x358
[<ffffffd02d289940>] ret_from_fork+0xc/0x18
Fixes: 371a3bc79c ("thermal/drivers/cpufreq_cooling: Fix wrong frequency converted from power")
Signed-off-by: brian-sy yang <brian-sy.yang@mediatek.com>
Signed-off-by: Michael Kao <michael.kao@mediatek.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Cc: stable@vger.kernel.org #v5.7
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201229050831.19493-1-michael.kao@mediatek.com
When the function successfully finishes it logs an information about
the registration of the cooling device and use its name to build the
message. Unfortunately it was freed right before:
drivers/thermal/cpuidle_cooling.c:218 __cpuidle_cooling_register()
warn: 'name' was already freed.
Fix this by freeing after the message happened.
Fixes: 6fd1b186d9 ("thermal/drivers/cpuidle_cooling: Use device name instead of auto-numbering")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20210319202522.891061-1-daniel.lezcano@linaro.org
The following error is reported by kbuild:
smatch warnings:
drivers/thermal/devfreq_cooling.c:433 of_devfreq_cooling_register_power() warn: passing zero to 'ERR_PTR'
Fix the error code by the setting the 'err' variable instead of 'cdev'.
Fixes: f8d354e821 ("thermal/drivers/devfreq_cooling: Use device name instead of auto-numbering")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210319202424.890968-1-daniel.lezcano@linaro.org
Fix the following error:
smatch warnings:
drivers/thermal/thermal_core.c:1020 __thermal_cooling_device_register() warn: possible memory leak of 'cdev'
by freeing the cdev when exiting the function in the error path.
Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210319202257.890848-1-daniel.lezcano@linaro.org
The sensor *is* in fact used and does report temperature.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Acked-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210225213119.116550-1-konrad.dybcio@somainline.org
There is a possible chance that some cooling device stats buffer
allocation fails due to very high cooling device max state value.
Later cooling device update sysfs can try to access stats data
for the same cooling device. It will lead to NULL pointer
dereference issue.
Add a NULL pointer check before accessing thermal cooling device
stats data. It fixes the following bug
[ 26.812833] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
[ 27.122960] Call trace:
[ 27.122963] do_raw_spin_lock+0x18/0xe8
[ 27.122966] _raw_spin_lock+0x24/0x30
[ 27.128157] thermal_cooling_device_stats_update+0x24/0x98
[ 27.128162] cur_state_store+0x88/0xb8
[ 27.128166] dev_attr_store+0x40/0x58
[ 27.128169] sysfs_kf_write+0x50/0x68
[ 27.133358] kernfs_fop_write+0x12c/0x1c8
[ 27.133362] __vfs_write+0x54/0x160
[ 27.152297] vfs_write+0xcc/0x188
[ 27.157132] ksys_write+0x78/0x108
[ 27.162050] ksys_write+0xf8/0x108
[ 27.166968] __arm_smccc_hvc+0x158/0x4b0
[ 27.166973] __arm_smccc_hvc+0x9c/0x4b0
[ 27.186005] el0_svc+0x8/0xc
Signed-off-by: Manaf Meethalavalappu Pallikunhi <manafm@codeaurora.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1607367181-24589-1-git-send-email-manafm@codeaurora.org
The division is used directly in re-divvying up power, the decimal part will
be discarded, devices will get less than the extra_actor_power - 1.
if using round the division to make the calculation more accurate.
For example:
actor0 received more than its max_power, it has the extra_power 759
actor1 received less than its max_power, it require extra_actor_power 395
actor2 received less than its max_power, it require extra_actor_power 365
actor1 and actor2 require the total capped_extra_power 760
using division in re-divvying up power
actor1 would actually get the extra_actor_power 394
actor2 would actually get the extra_actor_power 364
if using round the division in re-divvying up power
actor1 would actually get the extra_actor_power 394
actor2 would actually get the extra_actor_power 365
Signed-off-by: Jeson Gao <jeson.gao@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1615796737-4688-1-git-send-email-gao.yunxiao6@gmail.com
There is a list with the purpose of grouping the cpufreq cooling
device together as described in the comments but actually it is
unused, the code evolved since 2012 and the list was no longer needed.
Delete the remaining unused list related code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20210314111333.16551-5-daniel.lezcano@linaro.org
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.
For instance:
thermal-idle-0
thermal-idle-1
thermal-idle-2
thermal-idle-3
etc ...
The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal prefix and changes the number by the device
name. So the naming above becomes:
idle-cpu0
idle-cpu1
idle-cpu2
idle-cpu3
etc ...
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20210314111333.16551-4-daniel.lezcano@linaro.org
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.
For instance:
thermal-devfreq-0
thermal-devfreq-1
etc ...
The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal' prefix and changes the number by the device
name. So the naming above becomes:
devfreq-5000000.gpu
devfreq-1d84000.ufshc
etc ...
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20210314111333.16551-3-daniel.lezcano@linaro.org
Currently the naming of a cooling device is just a cooling technique
followed by a number. When there are multiple cooling devices using
the same technique, it is impossible to clearly identify the related
device as this one is just a number.
For instance:
thermal-cpufreq-0
thermal-cpufreq-1
etc ...
The 'thermal' prefix is redundant with the subsystem namespace. This
patch removes the 'thermal' prefix and changes the number by the device
name. So the naming above becomes:
cpufreq-cpu0
cpufreq-cpu4
etc ...
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20210314111333.16551-2-daniel.lezcano@linaro.org
We want to have any kind of name for the cooling devices as we do no
longer want to rely on auto-numbering. Let's replace the cooling
device's fixed array by a char pointer to be allocated dynamically
when registering the cooling device, so we don't limit the length of
the name.
Rework the error path at the same time as we have to rollback the
allocations in case of error.
Tested with a dummy device having the name:
"Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"
A village on the island of Anglesey (Wales), known to have the longest
name in Europe.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20210314111333.16551-1-daniel.lezcano@linaro.org
When kcalloc() returns NULL to __tcbp or of_count_phandle_with_args()
returns zero or -ENOENT to count, no error return code of
thermal_of_populate_bind_params() is assigned.
To fix these bugs, ret is assigned with -ENOMEM and -ENOENT in these
cases, respectively.
Fixes: a92bab8919 ("of: thermal: Allow multiple devices to share cooling map")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210310122423.3266-1-baijiaju1990@gmail.com
Add support for up to five TSC nodes. The new THCODE values are taken
from the example in the datasheet.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210309162419.2621359-1-niklas.soderlund+renesas@ragnatech.se
The function devm_platform_ioremap_resource has already contains error
message, so remove the redundant dev_err here.
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210222061105.6008-1-tangbin@cmss.chinamobile.com
thermal driver (Daniel Lezcano)
- Remove the notify ops as it is no longer used (Daniel Lezcano)
- Remove the 'forced passive' option and the unused bind/unbind
functions (Daniel Lezcano)
- Remove the THERMAL_TRIPS_NONE and the code cleanup around this
macro (Daniel Lezcano)
- Rework the delays to make them pre-computed instead of computing
them again and again at each polling interval (Daniel Lezcano)
- Remove the pointless 'thermal_zone_device_reset' function (Daniel
Lezcano)
- Use the critical and hot ops to prevent an unexpected system
shutdown on int340x (Kai-Heng Feng)
- Make the cooling device state private to the thermal subsystem
(Daniel Lezcano)
- Prevent to use not-power-aware actor devices with the power
allocator governor (Lukasz Luba)
- Remove 'zx' and 'tango' support along with the corresponding
platforms (Arnd Bergman)
- Fix several issues on the Omap thermal driver (Tony Lindgren)
- Add support for adc-tm5 PMIC thermal monitor for Qcom
platforms. Please note those changes rely on an immutable branch:
iio-thermal-5.11-rc1/ib-iio-thermal-5.11-rc1 from the iio tree
(Dmitry Baryshkov)
- Fix an initialization loop in the adc-tm5 (Colin Ian King)
- Fix a return error check in the cpufreq cooling device (Viresh Kumar)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmAvkKMACgkQqDIjiipP
6E9yFggAmXy8t2j1mRvn/KLU+teTGIoSFkZ8mBnY2Sgip97IRZRhCwAZbUKOW0eI
bvpAzBjacxdZHT7OxxvGzCOq/zlAh4UoStI8bMpzdUWPdkAj4ippArLYGvagLym8
WEQysWnrr8V1RCZbQuBNjyLwjf0fcXkzIBU1mbZXA8T8Y6Yn646TdtsrVT4Idg1j
MOg7PAHBcTSY/wOReZKJ5TB1yvo2tNOuGOqUVbrIAHlRkiNTVHirVUq6aZGtTTKp
7ukcu8EI/o7XKBdQ5d9MZaHdwkcyAIJj4jdjmjkUJpa8VYQFPjayNyN3I+Py9lH2
jtWVYHQxZbY166IZP2yeXFjPzd6elw==
=Jmz4
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Use the newly introduced 'hot' and 'critical' ops for the acpi
thermal driver (Daniel Lezcano)
- Remove the notify ops as it is no longer used (Daniel Lezcano)
- Remove the 'forced passive' option and the unused bind/unbind
functions (Daniel Lezcano)
- Remove the THERMAL_TRIPS_NONE and the code cleanup around this macro
(Daniel Lezcano)
- Rework the delays to make them pre-computed instead of computing them
again and again at each polling interval (Daniel Lezcano)
- Remove the pointless 'thermal_zone_device_reset' function (Daniel
Lezcano)
- Use the critical and hot ops to prevent an unexpected system shutdown
on int340x (Kai-Heng Feng)
- Make the cooling device state private to the thermal subsystem
(Daniel Lezcano)
- Prevent to use not-power-aware actor devices with the power allocator
governor (Lukasz Luba)
- Remove 'zx' and 'tango' support along with the corresponding
platforms (Arnd Bergman)
- Fix several issues on the Omap thermal driver (Tony Lindgren)
- Add support for adc-tm5 PMIC thermal monitor for Qcom platforms
(Dmitry Baryshkov)
- Fix an initialization loop in the adc-tm5 (Colin Ian King)
- Fix a return error check in the cpufreq cooling device (Viresh Kumar)
* tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (26 commits)
thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error
thermal: qcom: Fix comparison with uninitialized variable channels_available
thermal: qcom: add support for adc-tm5 PMIC thermal monitor
dt-bindings: thermal: qcom: add adc-thermal monitor bindings
thermal: ti-soc-thermal: Use non-inverted define for omap4
thermal: ti-soc-thermal: Simplify polling with iopoll
thermal: ti-soc-thermal: Fix stuck sensor with continuous mode for 4430
thermal: ti-soc-thermal: Skip pointless register access for dra7
thermal/drivers/zx: Remove zx driver
thermal/drivers/tango: Remove tango driver
thermal: power allocator: fail binding for non-power actor devices
thermal/core: Make cooling device state change private
thermal: intel: pch: Fix unexpected shutdown at critical temperature
thermal: int340x: Fix unexpected shutdown at critical temperature
thermal/core: Remove pointless thermal_zone_device_reset() function
thermal/core: Remove ms based delay fields
thermal/core: Use precomputed jiffies for the polling
thermal/core: Precompute the delays from msecs to jiffies
thermal/core: Remove unused macro THERMAL_TRIPS_NONE
thermal/core: Remove THERMAL_TRIPS_NONE test
...
[ NOTE: unfortunately this tree had to be freshly rebased today,
it's a same-content tree of 82891be90f3c (-next published)
merged with v5.11.
The main reason for the rebase was an authorship misattribution
problem with a new commit, which we noticed in the last minute,
and which we didn't want to be merged upstream. The offending
commit was deep in the tree, and dependent commits had to be
rebased as well. ]
- Core scheduler updates:
- Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the
preempt=none/voluntary/full boot options (default: full),
to allow distros to build a PREEMPT kernel but fall back to
close to PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling
behavior via a boot time selection.
There's also the /debug/sched_debug switch to do this runtime.
This feature is implemented via runtime patching (a new variant of static calls).
The scope of the runtime patching can be best reviewed by looking
at the sched_dynamic_update() function in kernel/sched/core.c.
( Note that the dynamic none/voluntary mode isn't 100% identical,
for example preempt-RCU is available in all cases, plus the
preempt count is maintained in all models, which has runtime
overhead even with the code patching. )
The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast majority
of distributions, are supposed to be unaffected.
- Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that
was found via rcutorture triggering a hang. The bug is that
rcu_idle_enter() may wake up a NOCB kthread, but this happens after
the last generic need_resched() check. Some cpuidle drivers fix it
by chance but many others don't.
In true 2020 fashion the original bug fix has grown into a 5-patch
scheduler/RCU fix series plus another 16 RCU patches to address
the underlying issue of missed preemption events. These are the
initial fixes that should fix current incarnations of the bug.
- Clean up rbtree usage in the scheduler, by providing & using the following
consistent set of rbtree APIs:
partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached
total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found
- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two
- Improve the SMP/NUMA load-balancer: scan for an idle sibling in a single pass.
This is a 4-commit series where each commit improves one aspect of the idle
sibling scan logic.
- Improve the cpufreq cooling driver by getting the effective CPU utilization
metrics from the scheduler
- Improve the fair scheduler's active load-balancing logic by reducing the number
of active LB attempts & lengthen the load-balancing interval. This improves
stress-ng mmapfork performance.
- Fix CFS's estimated utilization (util_est) calculation bug that can result in
too high utilization values
- Misc updates & fixes:
- Fix the HRTICK reprogramming & optimization feature
- Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code
- Reduce dl_add_task_root_domain() overhead
- Fix uprobes refcount bug
- Process pending softirqs in flush_smp_call_function_from_idle()
- Clean up task priority related defines, remove *USER_*PRIO and
USER_PRIO()
- Simplify the sched_init_numa() deduplication sort
- Documentation updates
- Fix EAS bug in update_misfit_status(), which degraded the quality
of energy-balancing
- Smaller cleanups
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmAtHBsRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1itgg/+NGed12pgPjYBzesdou60Lvx7LZLGjfOt
M1F1EnmQGn/hEH2fCY6ZoqIZQTVltm7GIcBNabzYTzlaHZsdtyuDUJBZyj19vTlk
zekcj7WVt+qvfjChaNwEJhQ9nnOM/eohMgEOHMAAJd9zlnQvve7NOLQ56UDM+kn/
9taFJ5ZPvb4avP6C5p3KivvKex6Bjof/Tl0m3utpNyPpI/qK3FyGxwdgCxU0yepT
ABWQX5ZQCufFvo1bgnBPfqyzab4MqhoM3bNKBsLQfuAlssG1xRv4KQOev4dRwrt9
pXJikV5C9yez5d2lGe5p0ltH5IZS/l9x2yI/ZQj3OUDTFyV1ic6WfFAqJgDzVF8E
i/vvA4NPQiI241Bkps+ErcCw4aVOgiY6TWli74cHjLUIX0+As6aHrFWXGSxUmiHB
WR+B8KmdfzRTTlhOxMA+cvlpZcKCfxWkJJmXzr/lDZzIuKPqM3QCE2wD9sixkfVo
JNICT0IvZghWOdbMEfZba8Psh/e2LVI9RzdpEiuYJz1ZrVlt1hO0M6jBxY0hMz9n
k54z81xODw0a8P2FHMtpmB1vhAeqCmvwA6DO8z0Oxs0DFi+KM2bLf2efHsCKafI+
Bm5v9YFaOk/55R76hJVh+aYLlyFgFkKd+P/niJTPDnxOk3SqJuXvTrql1HeGHkNr
kYgQa23dsZk=
=pyaG
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Core scheduler updates:
- Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the
preempt=none/voluntary/full boot options (default: full), to allow
distros to build a PREEMPT kernel but fall back to close to
PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via
a boot time selection.
There's also the /debug/sched_debug switch to do this runtime.
This feature is implemented via runtime patching (a new variant of
static calls).
The scope of the runtime patching can be best reviewed by looking
at the sched_dynamic_update() function in kernel/sched/core.c.
( Note that the dynamic none/voluntary mode isn't 100% identical,
for example preempt-RCU is available in all cases, plus the
preempt count is maintained in all models, which has runtime
overhead even with the code patching. )
The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast
majority of distributions, are supposed to be unaffected.
- Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that
was found via rcutorture triggering a hang. The bug is that
rcu_idle_enter() may wake up a NOCB kthread, but this happens after
the last generic need_resched() check. Some cpuidle drivers fix it
by chance but many others don't.
In true 2020 fashion the original bug fix has grown into a 5-patch
scheduler/RCU fix series plus another 16 RCU patches to address the
underlying issue of missed preemption events. These are the initial
fixes that should fix current incarnations of the bug.
- Clean up rbtree usage in the scheduler, by providing & using the
following consistent set of rbtree APIs:
partial-order; less() based:
- rb_add(): add a new entry to the rbtree
- rb_add_cached(): like rb_add(), but for a rb_root_cached
total-order; cmp() based:
- rb_find(): find an entry in an rbtree
- rb_find_add(): find an entry, and add if not found
- rb_find_first(): find the first (leftmost) matching entry
- rb_next_match(): continue from rb_find_first()
- rb_for_each(): iterate a sub-tree using the previous two
- Improve the SMP/NUMA load-balancer: scan for an idle sibling in a
single pass. This is a 4-commit series where each commit improves
one aspect of the idle sibling scan logic.
- Improve the cpufreq cooling driver by getting the effective CPU
utilization metrics from the scheduler
- Improve the fair scheduler's active load-balancing logic by
reducing the number of active LB attempts & lengthen the
load-balancing interval. This improves stress-ng mmapfork
performance.
- Fix CFS's estimated utilization (util_est) calculation bug that can
result in too high utilization values
Misc updates & fixes:
- Fix the HRTICK reprogramming & optimization feature
- Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code
- Reduce dl_add_task_root_domain() overhead
- Fix uprobes refcount bug
- Process pending softirqs in flush_smp_call_function_from_idle()
- Clean up task priority related defines, remove *USER_*PRIO and
USER_PRIO()
- Simplify the sched_init_numa() deduplication sort
- Documentation updates
- Fix EAS bug in update_misfit_status(), which degraded the quality
of energy-balancing
- Smaller cleanups"
* tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
sched,x86: Allow !PREEMPT_DYNAMIC
entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point
entry: Explicitly flush pending rcuog wakeup before last rescheduling point
rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
rcu/nocb: Perform deferred wake up before last idle's need_resched() check
rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
sched/features: Distinguish between NORMAL and DEADLINE hrtick
sched/features: Fix hrtick reprogramming
sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()
uprobes: (Re)add missing get_uprobe() in __find_uprobe()
smp: Process pending softirqs in flush_smp_call_function_from_idle()
sched: Harden PREEMPT_DYNAMIC
static_call: Allow module use without exposing static_call_key
sched: Add /debug/sched_preempt
preempt/dynamic: Support dynamic preempt with preempt= boot option
preempt/dynamic: Provide irqentry_exit_cond_resched() static call
preempt/dynamic: Provide preempt_schedule[_notrace]() static calls
preempt/dynamic: Provide cond_resched() and might_resched() static calls
preempt: Introduce CONFIG_PREEMPT_DYNAMIC
static_call: Provide DEFINE_STATIC_CALL_RET0()
...
freq_qos_update_request() returns 1 if the effective constraint value
has changed, 0 if the effective constraint value has not changed, or a
negative error code on failures.
The frequency constraints for CPUs can be set by different parts of the
kernel. If the maximum frequency constraint set by other parts of the
kernel are set at a lower value than the one corresponding to cooling
state 0, then we will never be able to cool down the system as
freq_qos_update_request() will keep on returning 0 and we will skip
updating cpufreq_state and thermal pressure.
Fix that by doing the updates even in the case where
freq_qos_update_request() returns 0, as we have effectively set the
constraint to a new value even if the consolidated value of the
actual constraint is unchanged because of external factors.
Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
Fixes: f12e4f66ab ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Thara Gopinath<thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/b2b7e84944937390256669df5a48ce5abba0c1ef.1613540713.git.viresh.kumar@linaro.org
Currently the check of chip->channels[i].channel is against an the
uninitialized variable channels_available. I believe the variable
channels_available needs to be fetched first by the call to adc_tm5_read
before the channels check. Fix the issue swapping the order of the
channels check loop with the call to adc_tm5_read.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: ca66dca5ed ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210216151626.162996-1-colin.king@canonical.com
Add support for Thermal Monitoring part of PMIC5. This part is closely
coupled with ADC, using it's channels directly. ADC-TM support
generating interrupts on ADC value crossing low or high voltage bounds,
which is used to support thermal trip points.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210205000118.493610-3-dmitry.baryshkov@linaro.org
When we set bit 10 high we use continuous mode and not single
mode. Let's correct this to avoid confusion. No functional
changes here, the code does the right thing with bit 10.
Cc: Adam Ford <aford173@gmail.com>
Cc: Carl Philipp Klemm <philipp@uvos.xyz>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210205134534.49200-5-tony@atomide.com
We can use iopoll for checking the EOCZ (end of conversion) bit. And with
this we now also want to handle the timeout errors properly.
For omap3, we need about 1.2ms for the single mode sampling to wait for
EOCZ down, so let's use 1.5ms timeout there. Waiting for sampling to start
is faster and we can use 1ms timeout.
Cc: Adam Ford <aford173@gmail.com>
Cc: Carl Philipp Klemm <philipp@uvos.xyz>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210205134534.49200-4-tony@atomide.com
At least for 4430, trying to use the single conversion mode eventually
hangs the thermal sensor. This can be quite easily seen with errors:
thermal thermal_zone0: failed to read out thermal zone (-5)
Also, trying to read the temperature shows a stuck value with:
$ while true; do cat /sys/class/thermal/thermal_zone0/temp; done
Where the temperature is not rising at all with the busy loop.
Additionally, the EOCZ (end of conversion) bit is not rising on 4430 in
single conversion mode while it works fine in continuous conversion mode.
It is also possible that the hung temperature sensor can affect the
thermal shutdown alert too.
Let's fix the issue by adding TI_BANDGAP_FEATURE_CONT_MODE_ONLY flag and
use it for 4430.
Note that we also need to add udelay to for the EOCZ (end of conversion)
bit polling as otherwise we have it time out too early on 4430. We'll be
changing the loop to use iopoll in the following clean-up patch.
Cc: Adam Ford <aford173@gmail.com>
Cc: Carl Philipp Klemm <philipp@uvos.xyz>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210205134534.49200-3-tony@atomide.com
On dra7, there is no Start of Conversion (SOC) register bit and we have an
empty bgap_soc_mask in the configuration for the thermal driver. Let's not
do pointless reads and writes with the empty mask.
There's also no point waiting for End of Conversion bit (EOCZ) to go high
on dra7. We only care about it going down, and are now mostly timing out
waiting for EOCZ high while it has already gone down.
When we add checking for the timeout errors in a later patch, waiting for
EOCZ high would cause bogus time out errors.
Cc: Adam Ford <aford173@gmail.com>
Cc: Carl Philipp Klemm <philipp@uvos.xyz>
Cc: Eduardo Valentin <edubezval@gmail.com>
Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Adam Ford <aford173@gmail.com> #logicpd-torpedo-37xx-devkit
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210205134534.49200-2-tony@atomide.com
This functionality has nothing to do with MCE, move it to the thermal
framework and untangle it from MCE.
Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lkml.kernel.org/r/20210202121003.GD18075@zn.tnic
The zte zx platform is getting removed, so this driver is no
longer needed.
Cc: Jun Nie <jun.nie@linaro.org>
Cc: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210120162400.4115366-3-arnd@kernel.org
The tango platform is getting removed, so the driver is no
longer needed.
Cc: Marc Gonzalez <marc.w.gonzalez@free.fr>
Cc: Mans Rullgard <mans@mansr.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210120162400.4115366-2-arnd@kernel.org
The thermal zone can have cooling devices which are missing power actor
API. This could be due to missing Energy Model for devfreq or cpufreq
cooling device. In this case it is safe to fail the binding rather than
trying to workaround and control the temperature in such thermal zone.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210119114126.19480-1-lukasz.luba@arm.com
The change of the cooling device state should be used by the governor
or at least by the core code, not by the drivers themselves.
Remove the API usage and move the function declaration to the internal
headers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20210118173824.9970-1-daniel.lezcano@linaro.org
Like previous patch, the intel_pch_thermal device is not in ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.
Override the default .critical callback to prevent surprising thermal
shutdoown.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
We are seeing thermal shutdown on Intel based mobile workstations, the
shutdown happens during the first trip handle in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.
2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."
So a "critical trip" here merely means we should take a more aggressive
cooling method.
As int340x device isn't present under ACPI ThermalZone, override the
default .critical callback to prevent surprising thermal shutdown.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-1-kai.heng.feng@canonical.com
The function thermal_zone_device_reset() is called in the
thermal_zone_device_register() which allocates and initialize the
structure. The passive field is already zero-ed by the allocation, the
function is useless.
Call directly thermal_zone_device_init() instead and
thermal_zone_device_reset().
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201222181110.1231977-1-daniel.lezcano@linaro.org
The code does no longer use the ms unit based fields to set the
delays as they are replaced by the jiffies.
Remove them and replace their user to use the jiffies version instead.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Peter Kästle <peter@piie.net>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20201216220337.839878-3-daniel.lezcano@linaro.org
The delays are also stored in jiffies based unit. Use them instead of
the ms.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201216220337.839878-2-daniel.lezcano@linaro.org
The delays are stored in ms units and when the polling function is
called this delay is converted into jiffies at each call.
Instead of doing the conversion again and again, compute the jiffies
at init time and use the value directly when setting the polling.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201216220337.839878-1-daniel.lezcano@linaro.org
The last site calling the thermal_zone_bind_cooling_device() function
with the THERMAL_TRIPS_NONE parameter was removed.
We can get rid of this test as no user of this function is calling
this function with this parameter.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201214233811.485669-5-daniel.lezcano@linaro.org
The functions thermal_zone_device_rebind_exception and
thermal_zone_device_unbind_exception are not used from anywhere.
Remove that code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201214233811.485669-2-daniel.lezcano@linaro.org
The code was reorganized in 2012 with the commit 0c01ebbfd3.
The main change is a loop on the trip points array and a unconditional
call to the throttle() ops of the governors for each of them even if
the trip temperature is not reached yet.
With this change, the 'forced_passive' is no longer checked in the
thermal_zone_device_update() function but in the step wise governor's
throttle() callback.
As the force_passive does no belong to the trip point array, the
thermal_zone_device_update() can not compare with the specified
passive temperature, thus does not detect the passive limit has been
crossed. Consequently, throttle() is never called and the
'forced_passive' branch is unreached.
In addition, the default processor cooling device is not automatically
bound to the thermal zone if there is not passive trip point, thus the
'forced_passive' can not operate.
If there is an active trip point, then the throttle function will be
called to mitigate at this temperature and the 'forced_passive' will
override the mitigation of the active trip point in this case but with
the default cooling device bound to the thermal zone, so usually a
fan, and that is not a passive cooling effect.
Given the regression exists since more than 8 years, nobody complained
and at the best of my knowledge there is no bug open in
https://bugzilla.kernel.org, it is reasonable to say it is unused.
Remove the 'forced_passive' related code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Link: https://lore.kernel.org/r/20201214233811.485669-1-daniel.lezcano@linaro.org
Several parts of the kernel are already using the effective CPU
utilization (as seen by the scheduler) to get the current load on the
CPU, do the same here instead of depending on the idle time of the CPU,
which isn't that accurate comparatively.
This is also the right thing to do as it makes the cpufreq governor
(schedutil) align better with the cpufreq_cooling driver, as the power
requested by cpufreq_cooling governor will exactly match the next
frequency requested by the schedutil governor since they are both using
the same metric to calculate load.
This was tested on ARM Hikey6220 platform with hackbench, sysbench and
schbench. None of them showed any regression or significant
improvements. Schbench is the most important ones out of these as it
creates the scenario where the utilization numbers provide a better
estimate of the future.
Scenario 1: The CPUs were mostly idle in the previous polling window of
the IPA governor as the tasks were sleeping and here are the details
from traces (load is in %):
Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=203 load={{0x35,0x1,0x0,0x31,0x0,0x0,0x64,0x0}} dynamic_power=1339
New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=600 load={{0x60,0x46,0x45,0x45,0x48,0x3b,0x61,0x44}} dynamic_power=3960
Here, the "Old" line gives the load and requested_power (dynamic_power
here) numbers calculated using the idle time based implementation, while
"New" is based on the CPU utilization from scheduler.
As can be clearly seen, the load and requested_power numbers are simply
incorrect in the idle time based approach and the numbers collected from
CPU's utilization are much closer to the reality.
Scenario 2: The CPUs were busy in the previous polling window of the IPA
governor:
Old: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=800 load={{0x64,0x64,0x64,0x64,0x64,0x64,0x64,0x64}} dynamic_power=5280
New: thermal_power_cpu_get_power: cpus=00000000,000000ff freq=1200000 total_load=708 load={{0x4d,0x5c,0x5c,0x5b,0x5c,0x5c,0x51,0x5b}} dynamic_power=4672
As can be seen, the idle time based load is 100% for all the CPUs as it
took only the last window into account, but in reality the CPUs aren't
that loaded as shown by the utilization numbers.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lkml.kernel.org/r/9c255c83d78d58451abc06848001faef94c87a12.1607400596.git.viresh.kumar@linaro.org
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAl/cYxEACgkQqDIjiipP
6E//swf+NQe5V9ioQ7j4wshFJLwIfu7uV0pNeTx3aHGx2R03hYwtcIS21/77SAgB
3KQikcrblrT49oKyCfkzYRUmdTgweSFHj9wN9OWBsIStus2sTFho9EOmcpyX0KOJ
2QBiOvvD3+xlhSMiKelJ/3OohrHwYE9ZMMW9pL5bZKDXuHXBao2Y1s7ZB9EaaM11
p+Qo9m9E+Pbp4WdJX+AgTHYMyKpVZbbAY+hulSOWDIcOms05MtPF5glPhQQS9yfR
yS+xZVtS2G9LCkuMdbXBGEn7vKJeqjH2aJ21+UkpBQUFLQExXvitx2O3w7i0cbup
Q6j4CHTtDcXBKl9p1L5wjmjEM6hwMA==
=5VgU
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal fixlet from Daniel Lezcano:
"A trivial change which fell through the cracks:
Add Alder Lake support ACPI ids (Srinivas Pandruvada)"
* tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal: int340x: Support Alder Lake
Pull thermal updates from Daniel Lezcano:
- Add upper and lower limits clamps for the cooling device state in the
power allocator governor (Michael Kao)
- Add upper and lower limits support for the power allocator governor
(Lukasz Luba)
- Optimize conditions testing for the trip points (Bernard Zhao)
- Replace spin_lock_irqsave by spin_lock in hard IRQ on the rcar driver
(Tian Tao)
- Add MT8516 dt-bindings and device reset optional support (Fabien
Parent)
- Add a quiescent period to cool down the PCH when entering S0iX
(Sumeet Pawnikar)
- Use bitmap API instead of re-inventing the wheel on sun8i (Yangtao
Li)
- Remove useless NULL check in the hwmon driver (Bernard Zhao)
- Update the current state in the cpufreq cooling device only if the
frequency change is effective (Zhuguangqing)
- Improve the schema validation for the rcar DT bindings (Geert
Uytterhoeven)
- Fix the user time unit in the documentation (Viresh Kumar)
- Add PCI ids for Lewisburg PCH (Andres Freund)
- Add hwmon support on amlogic (Martin Blumenstingl)
- Fix build failure for PCH entering on in S0iX (Randy Dunlap)
- Improve the k_* coefficient for the power allocator governor (Lukasz
Luba)
- Fix missing const on a sysfs attribute (Rikard Falkeborn)
- Remove broken interrupt support on rcar to be replaced by a new one
(Niklas Söderlund)
- Improve the error code handling at init time on imx8mm (Fabio
Estevam)
- Compute interval validity once instead at each temperature reading
iteration on acerhdf (Daniel Lezcano)
- Add r8a779a0 support (Niklas Söderlund)
- Add PCI ids for AlderLake PCH and mmio refactoring (Srinivas
Pandruvada)
- Add RFIM and mailbox support on int340x (Srinivas Pandruvada)
- Use macro for temperature calculation on PCH (Sumeet Pawnikar)
- Simplify return conditions at probe time on Broadcom (Zheng Yongjun)
- Fix workload name on PCH (Srinivas Pandruvada)
- Migrate the devfreq cooling device code to the energy model API
(Lukasz Luba)
- Emit a warning if the thermal_zone_device_update is called without
the .get_temp() ops (Daniel Lezcano)
- Add critical and hot ops for the thermal zone (Daniel Lezcano)
- Remove notification usage when critical is reached on rcar (Daniel
Lezcano)
- Fix devfreq build when ENERGY_MODEL is not set (Lukasz Luba)
* tag 'thermal-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (45 commits)
thermal/drivers/devfreq_cooling: Fix the build when !ENERGY_MODEL
thermal/drivers/rcar: Remove notification usage
thermal/core: Add critical and hot ops
thermal/core: Emit a warning if the thermal zone is updated without ops
drm/panfrost: Register devfreq cooling and attempt to add Energy Model
thermal: devfreq_cooling: remove old power model and use EM
thermal: devfreq_cooling: add new registration functions with Energy Model
thermal: devfreq_cooling: use a copy of device status
thermal: devfreq_cooling: change tracing function and arguments
thermal: int340x: processor_thermal: Correct workload type name
thermal: broadcom: simplify the return expression of bcm2711_thermal_probe()
thermal: intel: pch: use macro for temperature calculation
thermal: int340x: processor_thermal: Add mailbox driver
thermal: int340x: processor_thermal: Add RFIM driver
thermal: int340x: processor_thermal: Add AlderLake PCI device id
thermal: int340x: processor_thermal: Refactor MMIO interface
thermal: rcar_gen3_thermal: Add r8a779a0 support
dt-bindings: thermal: rcar-gen3-thermal: Add r8a779a0 support
platform/x86/drivers/acerhdf: Check the interval value when it is set
platform/x86/drivers/acerhdf: Use module_param_cb to set/get polling interval
...
Prevent build failure if the option CONFIG_ENERGY_MODEL is not set. The
devfreq cooling is able to operate without the Energy Model.
Don't use dev->em_pd directly and use local pointer.
Fixes: 615510fe13 ("thermal: devfreq_cooling: remove old power model and use EM")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201215154221.8828-1-lukasz.luba@arm.com
The ops is only showing a trace telling a critical trip point is
crossed. The same information is given by the thermal framework.
This is redundant, remove the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20201210121514.25760-4-daniel.lezcano@linaro.org
Currently there is no way to the sensors to directly call an ops in
interrupt mode without calling thermal_zone_device_update assuming all
the trip points are defined.
A sensor may want to do something special if a trip point is hot or
critical.
This patch adds the critical and hot ops to the thermal zone device,
so a sensor can directly invoke them or let the thermal framework to
call the sensor specific ones.
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-2-daniel.lezcano@linaro.org
The actual code is silently ignoring a thermal zone update when a
driver is requesting it without a get_temp ops set.
That looks not correct, as the caller should not have called this
function if the thermal zone is unable to read the temperature.
That makes the code less robust as the check won't detect the driver
is inconsistently using the thermal API and that does not help to
improve the framework as these circumvolutions hide the problem at the
source.
In order to detect the situation when it happens, let's add a warning
when the update is requested without the get_temp() ops set.
Any warning emitted will have to be fixed at the source of the
problem: the caller must not call thermal_zone_device_update if there
is not get_temp callback set.
Cc: Thara Gopinath <thara.gopinath@linaro.org>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://lore.kernel.org/r/20201210121514.25760-1-daniel.lezcano@linaro.org