32-bit processors cannot generally access 64-bit MMIO registers
atomically, and it is unknown in which order the two halves of
this registers would need to be read:
drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c: In function 'send_mbox_cmd':
drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.c:79:37: error: implicit declaration of function 'readq'; did you mean 'readl'? [-Werror=implicit-function-declaration]
79 | *cmd_resp = readq((void __iomem *) (proc_priv->mmio_base + MBOX_OFFSET_DATA));
| ^~~~~
| readl
The driver already does not build for anything other than x86,
so limit it further to x86-64.
Fixes: aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot
64 bit RFIM responses") started using 'readq()' to read 64-bit status
responses from the int340x hardware.
That's all fine and good, but on 32-bit targets a 64-bit 'readq()' is
ambiguous, since it's no longer an atomic access. Some hardware might
require 64-bit accesses, and other hardware might want low word first or
high word first.
It's quite likely that the driver isn't relevant in a 32-bit environment
any more, and there's a patch floating around to just make it depend on
X86_64, but let's make it buildable on x86-32 anyway.
The driver previously just read the low 32 bits, so the hardware
certainly is ok with 32-bit reads, and in a little-endian environment
the low word first model is the natural one.
So just add the include for the 'io-64-nonatomic-lo-hi.h' version.
Fixes: aeb58c860d ("thermal/drivers/int340x: processor_thermal: Suppot 64 bit RFIM responses")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some of the RFIM mail box command returns 64 bit values. So enhance
mailbox interface to return 64 bit values and use them for RFIM
commands.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fixes: 5d6fbc96bd ("thermal/drivers/int340x: processor_thermal: Export additional attributes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When the driver resumes, the tcc offset is set back to its previous
value. But this only works if the value was user defined as otherwise
the offset isn't saved. This asymmetric logic is harder to maintain and
introduced some issues.
Improve the logic by saving the tcc offset in a suspend op, so the right
value is always restored after a resume.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210909085613.5577-3-atenart@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
This check has a signedness bug and does not work. If "length" is
larger than "PAGE_SIZE" then "PAGE_SIZE - length" is not negative
but instead it is a large unsigned value. Fortunately, Takashi Iwai
changed this code to use scnprint() instead of snprintf() so now
"length" is never larger than "PAGE_SIZE - 1" and the check can be
removed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
'cpu_clamping_mask' is a bitmap. So use 'bitmap_zalloc()' and
'bitmap_free()' to simplify code, improve the semantic of the code and
avoid some open-coded arithmetic in allocator arguments.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After upgrading to Linux 5.13.3 I noticed my laptop would shutdown due
to overheat (when it should not). It turned out this was due to commit
fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").
What happens is this drivers uses a global variable to keep track of the
tcc offset (tcc_offset_save) and uses it on resume. The issue is this
variable is initialized to 0, but is only set in
tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
set by userspace. If that does not happen, the resume path will set the
offset to 0 (in my case the h/w default being 3, the offset would become
too low after a suspend/resume cycle).
The issue did not arise before commit fe6a6de669, as the function
setting the offset would return if the offset was 0. This is no longer
the case (rightfully).
Fix this by not applying the offset if it wasn't saved before, reverting
back to the old logic. A better approach will come later, but this will
be easier to apply to stable kernels.
The logic to restore the offset after a resume was there long before
commit fe6a6de669, but as a value of 0 was considered invalid I'm
referencing the commit that made the issue possible in the Fixes tag
instead.
Fixes: fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
Cc: stable@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210909085613.5577-2-atenart@kernel.org
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
(Dmitry Osipenko)
- Fix the error code for the exynos when devm_get_clk() fails (Dan
Carpenter)
- Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
- Add support for hardware trip points for the rcar gen3 thermal
driver and store TSC id as unsigned int (Niklas Söderlund)
- Replace the deprecated CPU-hotplug functions get_online_cpus() and
put_online_cpus (Sebastian Andrzej Siewior)
- Add the thermal tools directory in the MAINTAINERS file (Daniel
Lezcano)
- Fix the Makefile and the cross compilation flags for the userspace
'tmon' tool (Rolf Eike Beer)
- Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
Pawnikar)
- Fix the stub thermal_cooling_device_register() function prototype
which does not match the real function (Arnd Bergmann)
- Make the thermal trip point optional in the DT bindings (Maxime
Ripard)
- Fix a typo in a comment in the core code (Geert Uytterhoeven)
- Reduce the verbosity of the trace in the SoC thermal tegra driver
(Dmitry Osipenko)
- Add the support for the LMh (Limit Management hardware) driver on
the QCom platforms (Thara Gopinath)
- Allow processing of HWP interrupt by adding a weak function in the
Intel driver (Srinivas Pandruvada)
- Prevent an abort of the sensor probe is a channel is not used
(Matthias Kaehlcke)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmE7yAcACgkQqDIjiipP
6E8WKwgAmI+kdOURCz1CtCIv6FnQFx20suxZidfATPULBPZ8zcHqdjxJECXJpvDz
y8T8dGAPRqXmMBJm2vF/qVraHaDJvscHXaflhsurhyEp0tiGptNeugBosso0jDEQ
JFrQ6Rny/NCmNMOkWW4YlSfkik0zQ43xlHocy3UfOKfcshhYWGsPTbIXdA6wH/zh
3tFW8j327tkyenrKZiaioJOSX88Oyy7Ft8l5k0HP+hPYkrvihWDJuEfEEmpfkdA7
Q70Lacf2QtFP8mUfGKpQxURSgndjphDPU7tzP5NqqWvLuT94aTspvJx4ywVwK8Iv
o+jNvR33yBIs5FXXB7B1ymkNE1h8uQ==
=blIQ
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Add the tegra3 thermal sensor and fix the compilation testing on
tegra by adding a dependency on ARCH_TEGRA along with COMPILE_TEST
(Dmitry Osipenko)
- Fix the error code for the exynos when devm_get_clk() fails (Dan
Carpenter)
- Add the TCC cooling support for AlderLake platform (Sumeet Pawnikar)
- Add support for hardware trip points for the rcar gen3 thermal driver
and store TSC id as unsigned int (Niklas Söderlund)
- Replace the deprecated CPU-hotplug functions get_online_cpus() and
put_online_cpus (Sebastian Andrzej Siewior)
- Add the thermal tools directory in the MAINTAINERS file (Daniel
Lezcano)
- Fix the Makefile and the cross compilation flags for the userspace
'tmon' tool (Rolf Eike Beer)
- Allow to use the IMOK independently from the GDDV on Int340x (Sumeet
Pawnikar)
- Fix the stub thermal_cooling_device_register() function prototype
which does not match the real function (Arnd Bergmann)
- Make the thermal trip point optional in the DT bindings (Maxime
Ripard)
- Fix a typo in a comment in the core code (Geert Uytterhoeven)
- Reduce the verbosity of the trace in the SoC thermal tegra driver
(Dmitry Osipenko)
- Add the support for the LMh (Limit Management hardware) driver on the
QCom platforms (Thara Gopinath)
- Allow processing of HWP interrupt by adding a weak function in the
Intel driver (Srinivas Pandruvada)
- Prevent an abort of the sensor probe is a channel is not used
(Matthias Kaehlcke)
* tag 'thermal-v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/qcom/spmi-adc-tm5: Don't abort probing if a sensor is not used
thermal/drivers/intel: Allow processing of HWP interrupt
dt-bindings: thermal: Add dt binding for QCOM LMh
thermal/drivers/qcom: Add support for LMh driver
firmware: qcom_scm: Introduce SCM calls to access LMh
thermal/drivers/tegra-soctherm: Silence message about clamped temperature
thermal: Spelling s/scallbacks/callbacks/
dt-bindings: thermal: Make trips node optional
thermal/core: Fix thermal_cooling_device_register() prototype
thermal/drivers/int340x: Use IMOK independently
tools/thermal/tmon: Add cross compiling support
thermal/tools/tmon: Improve the Makefile
MAINTAINERS: Add missing userspace thermal tools to the thermal section
thermal/drivers/intel_powerclamp: Replace deprecated CPU-hotplug functions.
thermal/drivers/rcar_gen3_thermal: Store TSC id as unsigned int
thermal/drivers/rcar_gen3_thermal: Add support for hardware trip points
drivers/thermal/intel: Add TCC cooling support for AlderLake platform
thermal/drivers/exynos: Fix an error code in exynos_tmu_probe()
thermal/drivers/tegra: Correct compile-testing of drivers
thermal/drivers/tegra: Add driver for Tegra30 thermal sensor
Add a weak function to process HWP (Hardware P-states) notifications and
move updating HWP_STATUS MSR to this function.
This allows HWP interrupts to be processed by the intel_pstate driver in
HWP mode by overriding the implementation.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210820024006.2347720-1-srinivas.pandruvada@linux.intel.com
Highlights:
- Move all the Intel drivers into their own subdir(s) (mostly Kate's work)
- New meraki-mx100 platform driver
- Asus WMI driver enhancements, including
/sys/firmware/acpi/platform_profile support
- New BIOS SAR driver for Intel M.2 WWAM modems
- Alder Lake support for the Intel PMC driver
- A whole bunch of cleanups + fixes all over the place
The following is an automated git shortlog grouped by driver:
BIOS SAR driver for Intel M.2 Modem:
- BIOS SAR driver for Intel M.2 Modem
ISST:
- use semi-colons instead of commas
- Fix optimization with use of numa
Replace deprecated CPU-hotplug functions.:
- Replace deprecated CPU-hotplug functions.
Update Mario Limonciello's email address in the docs:
- Update Mario Limonciello's email address in the docs
acer-wmi:
- Add Turbo Mode support for Acer PH315-53
add meraki-mx100 platform driver:
- add meraki-mx100 platform driver
asus-nb-wmi:
- Add tablet_mode_sw=lid-flip quirk for the TP200s
- Allow configuring SW_TABLET_MODE method with a module option
asus-wmi:
- Fix "unsigned 'retval' is never less than zero" smatch warning
- Delete impossible condition
- Add support for platform_profile
- Add egpu enable method
- Add dgpu disable method
- Add panel overdrive functionality
dell-smbios:
- Remove unused dmi_system_id table
dell-smbios-wmi:
- Add missing kfree in error-exit from run_smbios_call
- Avoid false-positive memcpy() warning
dell-smo8800:
- Convert to be a platform driver
dual_accel_detect:
- Use the new i2c_acpi_client_count() helper
gigabyte-wmi:
- add support for B450M S2H V2
- add support for X570 GAMING X
hp_accel:
- Convert to be a platform driver
- Remove _INI method call
i2c:
- acpi: Add an i2c_acpi_client_count() helper function
i2c-multi-instantiate:
- Use the new i2c_acpi_client_count() helper
ideapad-laptop:
- Fix Legion 5 Fn lock LED
intel-hid:
- Move to intel sub-directory
intel-rst:
- Move to intel sub-directory
intel-smartconnect:
- Move to intel sub-directory
intel-uncore-frequency:
- Move to intel sub-directory
intel-vbtn:
- Move to intel sub-directory
intel-wmi-sbl-fw-update:
- Move to intel sub-directory
intel-wmi-thunderbolt:
- Move to intel sub-directory
intel_atomisp2:
- Move to intel sub-directory
intel_bxtwc_tmu:
- Move to intel sub-directory
intel_cht_int33fe:
- Use the new i2c_acpi_client_count() helper
intel_chtdc_ti_pwrbtn:
- Move to intel sub-directory
intel_int0002_vgpio:
- Move to intel sub-directory
intel_mrfld_pwrbtn:
- Move to intel sub-directory
intel_oaktrail:
- Move to intel sub-directory
intel_pmc_core:
- Move to intel sub-directory
- Prevent possibile overflow
intel_pmt_telemetry:
- Ignore zero sized entries
intel_punit_ipc:
- Move to intel sub-directory
intel_scu_ipc:
- Fix doc of intel_scu_ipc_dev_command_with_size()
intel_speed_select_if:
- Move to intel sub-directory
intel_telemetry:
- Move to intel sub-directory
intel_turbo_max_3:
- Move to intel sub-directory
lg-laptop:
- Use correct event for keyboard backlight FN-key
- Use correct event for touchpad toggle FN-key
- Support for battery charge limit on newer models
platform/mellanox:
- mlxbf-pmc: fix kernel-doc notation
platform/surface:
- aggregator: Use y instead of objs in Makefile
- surface3_power: Use i2c_acpi_get_i2c_resource() helper
platform/x86/intel:
- pmc/core: Add GBE Package C10 fix for Alder Lake PCH
- pmc/core: Add Alder Lake low power mode support for pmc core
- pmc/core: Add Latency Tolerance Reporting (LTR) support to Alder Lake
- pmc/core: Add Alderlake support to pmc core driver
- int3472: Use y instead of objs in Makefile
- pmt: Use y instead of objs in Makefile
- int33fe: Use y instead of objs in Makefile
- Move Intel PMT drivers to new subfolder
thermal/drivers/intel:
- Move intel_menlow to thermal drivers
think-lmi:
- add debug_cmd
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmEw1KAUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wk7Qf/dsaaDgx7aC6DfKzdcMgfeLIdTaGm
a6svNXM2t/JFdvhjYzxA+4QlQgco7zkN06iRlWbObEonSUsHlGlwEOHX60VgcopO
qJaqnznmfdXUocFTnA+5acJXabNaw7xkKHS0K61UWgk+mm6aMuygpKxULnNTa4X+
p3HoU6uXFckpoA/Jstzo5UfegNYhg11bflNd7XN4F3rMCbbNHAsWlf4oVr2YsEHa
wECW+1e8wZl4BInUzoXQhilRoybJWXWJ8sLsvQfDXLs9aNoLdDqu9p0MuXEW5QqE
wNt26SNNAP2L49BD6kaJszV5Ry/jNSEtVkWwkrzHbGTZoOEyMYas8pm6Uw==
=iK1W
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
"Highlights:
- Move all the Intel drivers into their own subdir(s) (mostly Kate's
work)
- New meraki-mx100 platform driver
- Asus WMI driver enhancements, including support for
/sys/firmware/acpi/platform_profile
- New BIOS SAR driver for Intel M.2 WWAM modems
- Alder Lake support for the Intel PMC driver
- A whole bunch of cleanups + fixes all over the place"
* tag 'platform-drivers-x86-v5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits)
platform/x86: dell-smbios-wmi: Add missing kfree in error-exit from run_smbios_call
platform/x86: dell-smbios-wmi: Avoid false-positive memcpy() warning
platform/x86: ISST: use semi-colons instead of commas
platform/x86: asus-wmi: Fix "unsigned 'retval' is never less than zero" smatch warning
platform/x86: asus-wmi: Delete impossible condition
platform/x86: hp_accel: Convert to be a platform driver
platform/x86: hp_accel: Remove _INI method call
platform/mellanox: mlxbf-pmc: fix kernel-doc notation
platform/x86/intel: pmc/core: Add GBE Package C10 fix for Alder Lake PCH
platform/x86/intel: pmc/core: Add Alder Lake low power mode support for pmc core
platform/x86/intel: pmc/core: Add Latency Tolerance Reporting (LTR) support to Alder Lake
platform/x86/intel: pmc/core: Add Alderlake support to pmc core driver
platform/x86: intel-wmi-thunderbolt: Move to intel sub-directory
platform/x86: intel-wmi-sbl-fw-update: Move to intel sub-directory
platform/x86: intel-vbtn: Move to intel sub-directory
platform/x86: intel_oaktrail: Move to intel sub-directory
platform/x86: intel_int0002_vgpio: Move to intel sub-directory
platform/x86: intel-hid: Move to intel sub-directory
platform/x86: intel_atomisp2: Move to intel sub-directory
platform/x86: intel_speed_select_if: Move to intel sub-directory
...
Add a weak function to process HWP (Hardware P-states) notifications and
move updating HWP_STATUS MSR to this function.
This allows HWP interrupts to be processed by the intel_pstate driver in
HWP mode by overriding the implementation.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Moved drivers/platform/x86/intel_menlow.c to drivers/thermal/intel.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20210816035356.1955982-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Some chrome platform requires IMOK method in coreboot. But these platforms
don't use GDDV data vault in coreboot. As per current code flow, to enable
and use IMOK only, we need to have GDDV support as well in coreboot. This
patch removes the dependency for IMOK from GDDV to enable and use IMOK
independently.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210716163946.3142-1-sumeet.r.pawnikar@intel.com
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().
Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210803141621.780504-20-bigeasy@linutronix.de
The following fixes are done for tcc sysfs interface:
- TCC is 6 bits only from bit 29-24
- TCC of 0 is valid
- When BIT(31) is set, this register is read only
- Check for invalid tcc value
- Error for negative values
Fixes: fdf4f2fb8e ("drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: stable@vger.kernel.org
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210628215803.75038-1-srinivas.pandruvada@linux.intel.com
Add a new PCI driver which register a thermal zone and allows to get
notification for threshold violation by a RW trip point. These
notifications are delivered from the device using MSI based
interrupt.
The main difference between this new PCI driver and the existing
one is that the temperature and trip points directly use PCI
MMIO instead of using ACPI methods.
This driver registers a thermal zone "TCPU_PCI" in addition to the
legacy processor thermal device, which uses ACPI companion device
to set name, temperature and trips.
This driver is enabled for AlderLake.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210525204811.3793651-3-srinivas.pandruvada@linux.intel.com
Remove enumeration part from the processor_thermal_device to two
different modules. One for ACPI and one for PCI:
ACPI enumeration: int3401_thermal
PCI part: processor_thermal_device_pci_legacy
The current processor_thermal_device now just implements interface
functions to be used by the ACPI and PCI enumeration module. This is
done by:
1. Make functions proc_thermal_add() and proc_thermal_remove() non static
and export them for usage in other processor_thermal_device_pci_legacy.c
and in int3401_thermal.c.
2. Move the sysfs file creation for TCC offset and power limit attribute
group to the proc_thermal_add() from the individual enumeration callbacks
for PCI and ACPI.
3. Create new interface functions proc_thermal_mmio_add() and
proc_thermal_mmio_remove() which will be called from the
processor_thermal_device_pci_legacy module.
4. Export proc_thermal_resume(), so that it can be used by power
management callbacks.
5. Remove special check for double enumeration as it never happens.
While here, fix some cleanup on error conditions in proc_thermal_add().
No functional changes are expected with this change.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210525204811.3793651-2-srinivas.pandruvada@linux.intel.com
Export additional attributes:
ddr_data_rate (RO) : Show current DDR (Double Data Rate) data rate.
rfi_restriction (RW) : Show or set current state for RFI (Radio
Frequency Interference) protection.
These attributes use mailbox commands to get/set information. Here
command codes are:
0x0007: Read RFI restriction
0x0107: Read DDR data rate
0x0008: Write RFI restriction
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210517061441.1921901-3-srinivas.pandruvada@linux.intel.com
MSR_AMD64_SEV even though the spec clearly states so, and check CPUID
bits first.
- Send only one signal to a task when it is a SEGV_PKUERR si_code type.
- Do away with all the wankery of reserving X amount of memory in
the first megabyte to prevent BIOS corrupting it and simply and
unconditionally reserve the whole first megabyte.
- Make alternatives NOP optimization work at an arbitrary position
within the patched sequence because the compiler can put single-byte
NOPs for alignment anywhere in the sequence (32-bit retpoline), vs our
previous assumption that the NOPs are only appended.
- Force-disable ENQCMD[S] instructions support and remove update_pasid()
because of insufficient protection against FPU state modification in an
interrupt context, among other xstate horrors which are being addressed
at the moment. This one limits the fallout until proper enablement.
- Use cpu_feature_enabled() in the idxd driver so that it can be
build-time disabled through the defines in .../asm/disabled-features.h.
- Fix LVT thermal setup for SMI delivery mode by making sure the APIC
LVT value is read before APIC initialization so that softlockups during
boot do not happen at least on one machine.
- Mark all legacy interrupts as legacy vectors when the IO-APIC is
disabled and when all legacy interrupts are routed through the PIC.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmC8fdEACgkQEsHwGGHe
VUqO5A/+IbIo8myl8VPjw6HRnHgY8rsYRjxdtmVhbaMi5XOmTMfVA9zJ6QALxseo
Mar8bmWcezEs0/FmNvk1vEOtIgZvRVy5RqXbu3W2EgWICuzRWbj822q+KrkbY0tH
1GWjcZQO8VlgeuQsukyj5QHaBLffpn3Fh1XB8r0cktZvwciM+LRNMnK8d6QjqxNM
ctTX4wdI6kc076pOi7MhKxSe+/xo5Wnf27lClLMOcsO/SS42KqgeRM5psWqxihhL
j6Y3Oe+Nm+7GKF8y841PUSlwjgWmlZa6UkR6DBTP7DGnHDa5hMpzxYvHOquq/SbA
leV9OLqI0iWs56kSzbEcXo7do1kld62KjsA2KtUhJfVAtm+igQLh5G0jESBwrWca
TBWaE5kt6s8wP7LXeg26o4U8XD8vqEH88Tmsjlgqb/t/PKDV9PMGvNpF00dPZFo6
Jhj2yntJYjLQYoAQLuQm5pfnKhZy3KKvk7ViGcnp3iN9i4eU9HzawIiXnliNOrTI
ohQ9KoRhy1Cx0UfLkR+cdK4ks0u26DC2/Ewt0CE5AP/CQ1rX6Zbv2gFLjSpy7yQo
6A99HEpbaLuy3kDt5vn91viPNUlOveuIXIdHp6u+zgFfx88eLUoEvfR135aV/Gyh
p5PJm/BO99KByQzFCnilkp7nBeKtnKYSmUojA6JsZKjzJimSPYo=
=zRI1
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"A bunch of x86/urgent stuff accumulated for the last two weeks so
lemme unload it to you.
It should be all totally risk-free, of course. :-)
- Fix out-of-spec hardware (1st gen Hygon) which does not implement
MSR_AMD64_SEV even though the spec clearly states so, and check
CPUID bits first.
- Send only one signal to a task when it is a SEGV_PKUERR si_code
type.
- Do away with all the wankery of reserving X amount of memory in the
first megabyte to prevent BIOS corrupting it and simply and
unconditionally reserve the whole first megabyte.
- Make alternatives NOP optimization work at an arbitrary position
within the patched sequence because the compiler can put
single-byte NOPs for alignment anywhere in the sequence (32-bit
retpoline), vs our previous assumption that the NOPs are only
appended.
- Force-disable ENQCMD[S] instructions support and remove
update_pasid() because of insufficient protection against FPU state
modification in an interrupt context, among other xstate horrors
which are being addressed at the moment. This one limits the
fallout until proper enablement.
- Use cpu_feature_enabled() in the idxd driver so that it can be
build-time disabled through the defines in disabled-features.h.
- Fix LVT thermal setup for SMI delivery mode by making sure the APIC
LVT value is read before APIC initialization so that softlockups
during boot do not happen at least on one machine.
- Mark all legacy interrupts as legacy vectors when the IO-APIC is
disabled and when all legacy interrupts are routed through the PIC"
* tag 'x86_urgent_for_v5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Check SME/SEV support in CPUID first
x86/fault: Don't send SIGSEGV twice on SEGV_PKUERR
x86/setup: Always reserve the first 1M of RAM
x86/alternative: Optimize single-byte NOPs at an arbitrary position
x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
dmaengine: idxd: Use cpu_feature_enabled()
x86/thermal: Fix LVT thermal setup for SMI delivery mode
x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing
There are machines out there with added value crap^WBIOS which provide an
SMI handler for the local APIC thermal sensor interrupt. Out of reset,
the BSP on those machines has something like 0x200 in that APIC register
(timestamps left in because this whole issue is timing sensitive):
[ 0.033858] read lvtthmr: 0x330, val: 0x200
which means:
- bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled
- bits [10:8] have 010b which means SMI delivery mode.
Now, later during boot, when the kernel programs the local APIC, it
soft-disables it temporarily through the spurious vector register:
setup_local_APIC:
...
/*
* If this comes from kexec/kcrash the APIC might be enabled in
* SPIV. Soft disable it before doing further initialization.
*/
value = apic_read(APIC_SPIV);
value &= ~APIC_SPIV_APIC_ENABLED;
apic_write(APIC_SPIV, value);
which means (from the SDM):
"10.4.7.2 Local APIC State After It Has Been Software Disabled
...
* The mask bits for all the LVT entries are set. Attempts to reset these
bits will be ignored."
And this happens too:
[ 0.124111] APIC: Switch to symmetric I/O mode setup
[ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0
[ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0
This results in CPU 0 soft lockups depending on the placement in time
when the APIC soft-disable happens. Those soft lockups are not 100%
reproducible and the reason for that can only be speculated as no one
tells you what SMM does. Likely, it confuses the SMM code that the APIC
is disabled and the thermal interrupt doesn't doesn't fire at all,
leading to CPU 0 stuck in SMM forever...
Now, before
4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")
due to how the APIC_LVTTHMR was read before APIC initialization in
mcheck_intel_therm_init(), it would read the value with the mask bit 16
clear and then intel_init_thermal() would replicate it onto the APs and
all would be peachy - the thermal interrupt would remain enabled.
But that commit moved that reading to a later moment in
intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too
late and with its interrupt mask bit set.
Thus, revert back to the old behavior of reading the thermal LVT
register before the APIC gets initialized.
Fixes: 4f432e8bb1 ("x86/mce: Get rid of mcheck_intel_therm_init()")
Reported-by: James Feeney <james@nurealm.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lkml.kernel.org/r/YKIqDdFNaXYd39wz@zn.tnic
After commit 81ad4276b5 ("Thermal: Ignore invalid trip points") all
user_space governor notifications via RW trip point is broken in intel
thermal drivers. This commits marks trip_points with value of 0 during
call to thermal_zone_device_register() as invalid. RW trip points can be
0 as user space will set the correct trip temperature later.
During driver init, x86_package_temp and all int340x drivers sets RW trip
temperature as 0. This results in all these trips marked as invalid by
the thermal core.
To fix this initialize RW trips to THERMAL_TEMP_INVALID instead of 0.
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210430122343.1789899-1-srinivas.pandruvada@linux.intel.com
On Intel processors, the core frequency can be reduced below OS request,
when the current temperature reaches the TCC (Thermal Control Circuit)
activation temperature.
The default TCC activation temperature is specified by
MSR_IA32_TEMPERATURE_TARGET. However, it can be adjusted by specifying an
offset in degrees C, using the TCC Offset bits in the same MSR register.
This patch introduces a cooling devices driver that utilizes the TCC
Offset feature. The bigger the current cooling state is, the lower the
effective TCC activation temperature is, so that the processors can be
throttled earlier before system critical overheats.
Note that, on different platforms, the behavior might be different on
how fast the setting takes effect, and how much the CPU frequency is
reduced.
This patch has been tested on a KabyLake mobile platform from me, and also
on a CometLake platform from Doug.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210412125901.12549-1-rui.zhang@intel.com
thermal driver (Daniel Lezcano)
- Remove the notify ops as it is no longer used (Daniel Lezcano)
- Remove the 'forced passive' option and the unused bind/unbind
functions (Daniel Lezcano)
- Remove the THERMAL_TRIPS_NONE and the code cleanup around this
macro (Daniel Lezcano)
- Rework the delays to make them pre-computed instead of computing
them again and again at each polling interval (Daniel Lezcano)
- Remove the pointless 'thermal_zone_device_reset' function (Daniel
Lezcano)
- Use the critical and hot ops to prevent an unexpected system
shutdown on int340x (Kai-Heng Feng)
- Make the cooling device state private to the thermal subsystem
(Daniel Lezcano)
- Prevent to use not-power-aware actor devices with the power
allocator governor (Lukasz Luba)
- Remove 'zx' and 'tango' support along with the corresponding
platforms (Arnd Bergman)
- Fix several issues on the Omap thermal driver (Tony Lindgren)
- Add support for adc-tm5 PMIC thermal monitor for Qcom
platforms. Please note those changes rely on an immutable branch:
iio-thermal-5.11-rc1/ib-iio-thermal-5.11-rc1 from the iio tree
(Dmitry Baryshkov)
- Fix an initialization loop in the adc-tm5 (Colin Ian King)
- Fix a return error check in the cpufreq cooling device (Viresh Kumar)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmAvkKMACgkQqDIjiipP
6E9yFggAmXy8t2j1mRvn/KLU+teTGIoSFkZ8mBnY2Sgip97IRZRhCwAZbUKOW0eI
bvpAzBjacxdZHT7OxxvGzCOq/zlAh4UoStI8bMpzdUWPdkAj4ippArLYGvagLym8
WEQysWnrr8V1RCZbQuBNjyLwjf0fcXkzIBU1mbZXA8T8Y6Yn646TdtsrVT4Idg1j
MOg7PAHBcTSY/wOReZKJ5TB1yvo2tNOuGOqUVbrIAHlRkiNTVHirVUq6aZGtTTKp
7ukcu8EI/o7XKBdQ5d9MZaHdwkcyAIJj4jdjmjkUJpa8VYQFPjayNyN3I+Py9lH2
jtWVYHQxZbY166IZP2yeXFjPzd6elw==
=Jmz4
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Use the newly introduced 'hot' and 'critical' ops for the acpi
thermal driver (Daniel Lezcano)
- Remove the notify ops as it is no longer used (Daniel Lezcano)
- Remove the 'forced passive' option and the unused bind/unbind
functions (Daniel Lezcano)
- Remove the THERMAL_TRIPS_NONE and the code cleanup around this macro
(Daniel Lezcano)
- Rework the delays to make them pre-computed instead of computing them
again and again at each polling interval (Daniel Lezcano)
- Remove the pointless 'thermal_zone_device_reset' function (Daniel
Lezcano)
- Use the critical and hot ops to prevent an unexpected system shutdown
on int340x (Kai-Heng Feng)
- Make the cooling device state private to the thermal subsystem
(Daniel Lezcano)
- Prevent to use not-power-aware actor devices with the power allocator
governor (Lukasz Luba)
- Remove 'zx' and 'tango' support along with the corresponding
platforms (Arnd Bergman)
- Fix several issues on the Omap thermal driver (Tony Lindgren)
- Add support for adc-tm5 PMIC thermal monitor for Qcom platforms
(Dmitry Baryshkov)
- Fix an initialization loop in the adc-tm5 (Colin Ian King)
- Fix a return error check in the cpufreq cooling device (Viresh Kumar)
* tag 'thermal-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (26 commits)
thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error
thermal: qcom: Fix comparison with uninitialized variable channels_available
thermal: qcom: add support for adc-tm5 PMIC thermal monitor
dt-bindings: thermal: qcom: add adc-thermal monitor bindings
thermal: ti-soc-thermal: Use non-inverted define for omap4
thermal: ti-soc-thermal: Simplify polling with iopoll
thermal: ti-soc-thermal: Fix stuck sensor with continuous mode for 4430
thermal: ti-soc-thermal: Skip pointless register access for dra7
thermal/drivers/zx: Remove zx driver
thermal/drivers/tango: Remove tango driver
thermal: power allocator: fail binding for non-power actor devices
thermal/core: Make cooling device state change private
thermal: intel: pch: Fix unexpected shutdown at critical temperature
thermal: int340x: Fix unexpected shutdown at critical temperature
thermal/core: Remove pointless thermal_zone_device_reset() function
thermal/core: Remove ms based delay fields
thermal/core: Use precomputed jiffies for the polling
thermal/core: Precompute the delays from msecs to jiffies
thermal/core: Remove unused macro THERMAL_TRIPS_NONE
thermal/core: Remove THERMAL_TRIPS_NONE test
...
This functionality has nothing to do with MCE, move it to the thermal
framework and untangle it from MCE.
Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lkml.kernel.org/r/20210202121003.GD18075@zn.tnic
Like previous patch, the intel_pch_thermal device is not in ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.
Override the default .critical callback to prevent surprising thermal
shutdoown.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
We are seeing thermal shutdown on Intel based mobile workstations, the
shutdown happens during the first trip handle in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.
2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."
So a "critical trip" here merely means we should take a more aggressive
cooling method.
As int340x device isn't present under ACPI ThermalZone, override the
default .critical callback to prevent surprising thermal shutdown.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-1-kai.heng.feng@canonical.com
Use macro for temperature calculation
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201210124801.13850-1-sumeet.r.pawnikar@intel.com
Added processor thermal device mail box interface for workload hints
setting. These hints will give indication to hardware to better manage
power and thermals. The supported hints are:
idle
semi_active
burusty
sustained
battery_life
For example when the system is on battery, the hardware can be less
aggressive in power ramp up.
This will create an attribute group at
/sys/bus/pci/devices/0000:00:04.0/workload_request
This folder contains two attributes:
workload_available_types : (RO): This shows available workload types
workload_type: (RW) : Allows to set and get current workload type
setting
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201126171829.945969-4-srinivas.pandruvada@linux.intel.com
Add support for RFIM (Radio Frequency Interference Mitigation) support
via processor thermal PCI device. This drivers allows adjustment of
FIVR (Fully Integrated Voltage Regulator) and DDR (Double Data Rate)
frequencies to avoid RF interference with WiFi and 5G.
Switching voltage regulators (VR) generate radiated EMI or RFI at the
fundamental frequency and its harmonics. Some harmonics may interfere
with very sensitive wireless receivers such as Wi-Fi and cellular that
are integrated into host systems like notebook PCs. One of mitigation
methods is requesting SOC integrated VR (IVR) switching frequency to a
small % and shift away the switching noise harmonic interference from
radio channels. OEM or ODMs can use the driver to control SOC IVR
operation within the range where it does not impact IVR performance.
DRAM devices of DDR IO interface and their power plane can generate EMI
at the data rates. Similar to IVR control mechanism, Intel offers a
mechanism by which DDR data rates can be changed if several conditions
are met: there is strong RFI interference because of DDR; CPU power
management has no other restriction in changing DDR data rates;
PC ODMs enable this feature (real time DDR RFI Mitigation referred to as
DDR-RFIM) for Wi-Fi from BIOS.
This change exports two folders under /sys/bus/pci/devices/0000:00:04.0.
One folder "fivr" contains all attributes exposed for controling FIVR
features. The other folder "dvfs" contains all attributes for DDR
features.
Changes done to implement:
- New module for rfim interfaces
- Two new per processor features for DDR and FIVR
- Enable feature for Tiger Lake (FIVR only) and Alder Lake
The attributes exposed and explanation:
FIVR attributes
vco_ref_code_lo (RW): The VCO reference code is an 11-bit field and
controls the FIVR switching frequency. This is the 3-bit LSB field.
vco_ref_code_hi (RW): The VCO reference code is an 11-bit field and
controls the FIVR switching frequency. This is the 8-bit MSB field.
spread_spectrum_pct (RW): Set the FIVR spread spectrum clocking
percentage
spread_spectrum_clk_enable (RW): Enable/disable of the FIVR spread
spectrum clocking feature
rfi_vco_ref_code (RW): This field is a read only status register which
reflects the current FIVR switching frequency
fivr_fffc_rev (RW): This field indicated the revision of the FIVR HW.
DVFS attributes
rfi_restriction_run_busy (RW): Request the restriction of specific DDR
data rate and set this value 1. Self reset to 0 after operation.
rfi_restriction_err_code (RW): Values: 0 :Request is accepted, 1:Feature
disabled, 2: the request restricts more points than it is allowed
rfi_restriction_data_rate_Delta (RW): Restricted DDR data rate for RFI
protection: Lower Limit
rfi_restriction_data_rate_Base (RW): Restricted DDR data rate for RFI
protection: Upper Limit
ddr_data_rate_point_0 (RO): DDR data rate selection 1st point
ddr_data_rate_point_1 (RO): DDR data rate selection 2nd point
ddr_data_rate_point_2 (RO): DDR data rate selection 3rd point
ddr_data_rate_point_3 (RO): DDR data rate selection 4th point
rfi_disable (RW): Disable DDR rate change feature
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201126171829.945969-3-srinivas.pandruvada@linux.intel.com
Added AlderLake PCI device id to support processor thermal driver. Reuse
the feature set (just includes RAPL) from previous generations.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201126171829.945969-2-srinivas.pandruvada@linux.intel.com
The Processor Thermal PCI device supports multiple features. Currently
we export only RAPL. But we need more features from this device exposed
for Tiger Lake and Alder Lake based platforms. So re-structure the
current MMIO interface, so that more features can be added cleanly.
No functional changes are expected with this change.
Changes done in this patch:
- Using PCI_DEVICE_DATA(), hence names of defines changed
- Move RAPL MMIO code to its own module
- Move the RAPL MMIO offsets to RAPL MMIO module
- Adjust Kconfig dependency of PROC_THERMAL_MMIO_RAPL
- Per processor driver data now contains the supported features
- Moved all the common data structures and defines to a common header
file
- This new header file contains all the processor_thermal_* interfaces
- Based on the features supported the module interface is called
- Each module atleast provides one add and one remove function
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201126171829.945969-1-srinivas.pandruvada@linux.intel.com
The reference to acpi_gbl_FADT causes a build error when ACPI is not
enabled. Fix by making that conditional on CONFIG_ACPI.
../drivers/thermal/intel/intel_pch_thermal.c: In function 'pch_wpt_suspend':
../drivers/thermal/intel/intel_pch_thermal.c:217:8: error: 'acpi_gbl_FADT' undeclared (first use in this function); did you mean 'acpi_get_type'?
if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
^~~~~~~~~~~~~
Fixes: ef63b043ac ("thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201117023807.8266-1-rdunlap@infradead.org
I noticed that I couldn't read the PCH temperature on my workstation
(C620 series chipset, w/ 2x Xeon Gold 5215 CPUs) directly, but had to go
through IPMI. Looking at the data sheet, it looks to me like the
existing intel PCH thermal driver should work without changes for
Lewisburg.
I suspect there's some other PCI IDs missing. But I hope somebody at
Intel would have an easier time figuring that out than I...
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/lkml/20200115184415.1726953-1-andres@anarazel.de/
Signed-off-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201113204916.1144907-1-andres@anarazel.de
When system tries to enter S0ix suspend state, just after active load
scenarios, it fails due to PCH current temperature is higher than set
threshold.
This patch introduces delay loop mechanism that allows PCH temperature
to go down below threshold during suspend so it won't fail to enter S0ix.
Add delay loop timeout and count as module parameters for user to tune it,
if required based on system design. This change notifies the different
warning messages like when PCH temperature above the threshold and
executing delay loop. Also, notify the messages when it success or
failure for S0ix entry.
Previously out of 1000 runs around 3 to 5 times it might fail to enter
S0ix just after heavy workload. With this change, S0ix failures reduced
as PCH cools down below threshold.
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201106170633.20838-1-sumeet.r.pawnikar@intel.com
When firmware requests keep alive response, send an event to user space
to confirm by using imok sysfs entry.
Create a new sysf entry called "imok". User space can write an integer,
which results in execution of IMOK ACPI method of INT3400 thermal zone
device. This results in sending response to firmware request for keep
alive.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200915223650.406046-4-srinivas.pandruvada@linux.intel.com
When we receive ACPI notification for OEM variable change pass the
notification to user space handler. This will avoid polling for
OEM variable change from user space.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200915223650.406046-2-srinivas.pandruvada@linux.intel.com
drivers cleanup (Andrzej Pietrasiewicz)
- Add generic netlink support for userspace notifications: events, temperature
and discovery commands (Daniel Lezcano)
- Fix redundant initialization for a ret variable (Colin Ian King)
- Remove the clock cooling code as it is used nowhere (Amit Kucheria)
- Add the rcar_gen3_thermal's r8a774e1 support (Marian-Cristian Rotariu)
- Replace all references to thermal.txt in the documentation to the
corresponding yaml files (Amit Kucheria)
- Add maintainer entry for the IPA (Lukasz Luba)
- Add support for MSM8939 for the tsens (Shawn Guo)
- Update power allocator and devfreq cooling to SPDX licensing (Lukasz Luba)
- Add Cannon Lake Low Power PCH support (Sumeet Pawnikar)
- Add tsensor support for V2 mediatek thermal system (Henry Yen)
- Fix thermal zone lookup by ID for the core code (Thierry Reding)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAl8q7tsACgkQqDIjiipP
6E+5Rwf7BFEn5YXPvng8cmnAlgvEBc9DdT6mGSo0NpFm9MdUxXlaqvw3WWSGyqWQ
+z0Ka7lmn5XyiMsVN11++Snp+79X17HzZf9SXO3glyIpAn+5prTDRhzzj0/jPrtS
sEeI++DrILsKKMGVljzftLmwNJN9DkUDNcnmWmZdCDbYVEKtP9Pjf2wBjAnXj7sX
JA3CkHRMwYLEQbfaKz37M11cYM+LqbDOlb6U11YWgAGGJ7d7zNYRf2/YSYPM4AN6
iE6j0E+3jIlXesULsap1AzeJaBq+wFxj1FL2TUZ8KscvRrm3AucqzNAT2M/Bc5Az
XLKKzc6Gp9JfqB5KXhX2EDu7VRnDBg==
=cSMN
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal updates from Daniel Lezcano:
- Add support to enable/disable the thermal zones resulting on core
code and drivers cleanup (Andrzej Pietrasiewicz)
- Add generic netlink support for userspace notifications: events,
temperature and discovery commands (Daniel Lezcano)
- Fix redundant initialization for a ret variable (Colin Ian King)
- Remove the clock cooling code as it is used nowhere (Amit Kucheria)
- Add the rcar_gen3_thermal's r8a774e1 support (Marian-Cristian
Rotariu)
- Replace all references to thermal.txt in the documentation to the
corresponding yaml files (Amit Kucheria)
- Add maintainer entry for the IPA (Lukasz Luba)
- Add support for MSM8939 for the tsens (Shawn Guo)
- Update power allocator and devfreq cooling to SPDX licensing (Lukasz
Luba)
- Add Cannon Lake Low Power PCH support (Sumeet Pawnikar)
- Add tsensor support for V2 mediatek thermal system (Henry Yen)
- Fix thermal zone lookup by ID for the core code (Thierry Reding)
* tag 'thermal-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (40 commits)
thermal: intel: intel_pch_thermal: Add Cannon Lake Low Power PCH support
thermal: mediatek: Add tsensor support for V2 thermal system
thermal: mediatek: Prepare to add support for other platforms
thermal: Update power allocator and devfreq cooling to SPDX licensing
MAINTAINERS: update entry to thermal governors file name prefixing
thermal: core: Add thermal zone enable/disable notification
thermal: qcom: tsens-v0_1: Add support for MSM8939
dt-bindings: tsens: qcom: Document MSM8939 compatible
thermal: core: Fix thermal zone lookup by ID
thermal: int340x: processor_thermal: fix: update Jasper Lake PCI id
thermal: imx8mm: Support module autoloading
thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor()
MAINTAINERS: Add maintenance information for IPA
thermal: rcar_gen3_thermal: Do not shadow thcode variable
dt-bindings: thermal: Get rid of thermal.txt and replace references
thermal: core: Move initialization after core initcall
thermal: netlink: Improve the initcall ordering
net: genetlink: Move initialization to core_initcall
thermal: rcar_gen3_thermal: Add r8a774e1 support
thermal/drivers/clock_cooling: Remove clock_cooling code
...
static priority level knowledge from non-scheduler code.
The three APIs for non-scheduler code to set SCHED_FIFO are:
- sched_set_fifo()
- sched_set_fifo_low()
- sched_set_normal()
These are two FIFO priority levels: default (high), and a 'low' priority level,
plus sched_set_normal() to set the policy back to non-SCHED_FIFO.
Since the changes affect a lot of non-scheduler code, we kept this in a separate
tree.
When merging to the latest upstream tree there's a conflict in drivers/spi/spi.c,
which can be resolved via:
sched_set_fifo(ctlr->kworker_task);
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8pPQIRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j0Jw/+LlSyX6gD2ATy3cizGL7DFPZogD5MVKTb
IXbhXH/ACpuPQlBe1+haRLbJj6XfXqbOlAleVKt7eh+jZ1jYjC972RCSTO4566mJ
0v8Iy9kkEeb2TDbYx1H3bnk78lf85t0CB+sCzyKUYFuTrXU04eRj7MtN3vAQyRQU
xJg83x/sT5DGdDTP50sL7lpbwk3INWkD0aDCJEaO/a9yHElMsTZiZBKoXxN/s30o
FsfzW56jqtng771H2bo8ERN7+abwJg10crQU5mIaLhacNMETuz0NZ/f8fY/fydCL
Ju8HAdNKNXyphWkAOmixQuyYtWKe2/GfbHg8hld0jmpwxkOSTgZjY+pFcv7/w306
g2l1TPOt8e1n5jbfnY3eig+9Kr8y0qHkXPfLfgRqKwMMaOqTTYixEzj+NdxEIRX9
Kr7oFAv6VEFfXGSpb5L1qyjIGVgQ5/JE/p3OC3GHEsw5VKiy5yjhNLoSmSGzdS61
1YurVvypSEUAn3DqTXgeGX76f0HH365fIKqmbFrUWxliF+YyflMhtrj2JFtejGzH
Md3RgAzxusE9S6k3gw1ev4byh167bPBbY8jz0w3Gd7IBRKy9vo92h6ZRYIl6xeoC
BU2To1IhCAydIr6hNsIiCSDTgiLbsYQzPuVVovUxNh+l1ZvKV2X+csEHhs8oW4pr
4BRU7dKL2NE=
=/7JH
-----END PGP SIGNATURE-----
Merge tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched/fifo updates from Ingo Molnar:
"This adds the sched_set_fifo*() encapsulation APIs to remove static
priority level knowledge from non-scheduler code.
The three APIs for non-scheduler code to set SCHED_FIFO are:
- sched_set_fifo()
- sched_set_fifo_low()
- sched_set_normal()
These are two FIFO priority levels: default (high), and a 'low'
priority level, plus sched_set_normal() to set the policy back to
non-SCHED_FIFO.
Since the changes affect a lot of non-scheduler code, we kept this in
a separate tree"
* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched,tracing: Convert to sched_set_fifo()
sched: Remove sched_set_*() return value
sched: Remove sched_setscheduler*() EXPORTs
sched,psi: Convert to sched_set_fifo_low()
sched,rcutorture: Convert to sched_set_fifo_low()
sched,rcuperf: Convert to sched_set_fifo_low()
sched,locktorture: Convert to sched_set_fifo()
sched,irq: Convert to sched_set_fifo()
sched,watchdog: Convert to sched_set_fifo()
sched,serial: Convert to sched_set_fifo()
sched,powerclamp: Convert to sched_set_fifo()
sched,ion: Convert to sched_set_normal()
sched,powercap: Convert to sched_set_fifo*()
sched,spi: Convert to sched_set_fifo*()
sched,mmc: Convert to sched_set_fifo*()
sched,ivtv: Convert to sched_set_fifo*()
sched,drm/scheduler: Convert to sched_set_fifo*()
sched,msm: Convert to sched_set_fifo*()
sched,psci: Convert to sched_set_fifo*()
sched,drbd: Convert to sched_set_fifo*()
...
Update PCI device id for Jasper Lake processor thermal device.
With this proc_thermal driver is getting loaded and processor
thermal functionality works on Jasper Lake system.
Fixes: f64a6583d3 ("thermal: int340x: processor_thermal: Add Jasper Lake support")
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1595577146-1221-1-git-send-email-sumeet.r.pawnikar@intel.com
Downgrade "Unsupported event" message from dev_err to dev_dbg to avoid
flooding with this message on some platforms.
Cc: stable@vger.kernel.org # v5.4+
Suggested-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rzhang: fix typo in changelog ]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200615223957.183153-1-alex.hung@canonical.com
Starting from commit "thermal/int340x_thermal: Don't require IDSP to
exist", priv->current_uuid_index is initialized to -1. This value may
be passed to int3400_thermal_run_osc() from int3400_thermal_set_mode,
contributing to page fault when accessing int3400_thermal_uuids array
at index -1.
This commit adds a check on uuid value to int3400_thermal_run_osc.
Fixes: 8d485da0dd ("thermal/int340x_thermal: Don't require IDSP to exist")
Signed-off-by: Bartosz Szczepanek <bsz@semihalf.com>
Reviewed-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com>
[ rzhang: Add Fixes tag ]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200708134613.131555-1-bsz@semihalf.com
set_mode() is only called when tzd's mode is about to change. Actual
setting is performed in thermal_core, in thermal_zone_device_set_mode().
The meaning of set_mode() callback is actually to notify the driver about
the mode being changed and giving the driver a chance to oppose such
change.
To better reflect the purpose of the method rename it to change_mode()
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
[for acerhdf]
Acked-by: Peter Kaestle <peter@piie.net>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-12-andrzej.p@collabora.com