OpenCloudOS-Kernel

Commit Graph

Author	SHA1	Message	Date
Gautham R. Shenoy	71737a6c2a	cpuidle: pseries: Do not cap the CEDE0 latency in fixup_cede0_latency() Currently in fixup_cede0_latency() code, we perform the fixup the CEDE(0) exit latency value only if minimum advertized extended CEDE latency values are less than 10us. This was done so as to not break the expected behaviour on POWER8 platforms where the advertised latency was higher than the default 10us, which would delay the SMT folding on the core. However, after the earlier patch "cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards", we can be sure that the fixup of CEDE0 latency is going to happen only from POWER10 onwards. Hence unconditionally use the minimum exit latency provided by the platform. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1626676399-15975-3-git-send-email-ego@linux.vnet.ibm.com	2021-08-03 22:33:37 +10:00
Gautham R. Shenoy	50741b70b0	cpuidle: pseries: Fixup CEDE0 latency only for POWER10 onwards Commit `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") sets the exit latency of CEDE(0) based on the latency values of the Extended CEDE states advertised by the platform On POWER9 LPARs, the firmwares advertise a very low value of 2us for CEDE1 exit latency on a Dedicated LPAR. The latency advertized by the PHYP hypervisor corresponds to the latency required to wakeup from the underlying hardware idle state. However the wakeup latency from the LPAR perspective should include 1. The time taken to transition the CPU from the Hypervisor into the LPAR post wakeup from platform idle state 2. Time taken to send the IPI from the source CPU (waker) to the idle target CPU (wakee). 1. can be measured via timer idle test, where we queue a timer, say for 1ms, and enter the CEDE state. When the timer fires, in the timer handler we compute how much extra timer over the expected 1ms have we consumed. On a a POWER9 LPAR the numbers are CEDE latency measured using a timer (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 2601 5677 5668.74 5917 6413 9299 455.01 1. and 2. combined can be determined by an IPI latency test where we send an IPI to an idle CPU and in the handler compute the time difference between when the IPI was sent and when the handler ran. We see the following numbers on POWER9 LPAR. CEDE latency measured using an IPI (numbers in ns) N Min Median Avg 90%ile 99%ile Max Stddev 400 711 7564 7369.43 8559 9514 9698 1200.01 Suppose, we consider the 99th percentile latency value measured using the IPI to be the wakeup latency, the value would be 9.5us This is in the ballpark of the default value of 10us. Hence, use the exit latency of CEDE(0) based on the latency values advertized by platform only from POWER10 onwards. The values advertized on POWER10 platforms is more realistic and informed by the latency measurements. For earlier platforms stick to the default value of 10us. The fix was suggested by Michael Ellerman. Fixes: `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") Reported-by: Enrico Joedecke <joedecke@de.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com	2021-08-03 22:33:19 +10:00
Rafael J. Wysocki	ad6b010d81	- Add support for the Qcom MSM8226 (Bartosz Dudziak) -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmDRhckACgkQqDIjiipP 6E98lggAiMzuvF8eSvq+maQc787J8vfteM7UiSyexpdh/0GUqDOVM1CBxUV5MGoE hRZe/hSW8PfgkZ7eGwe2bRnzSs60i6tQT9l2zkgdpuImlpCz/kZrEGkYjmP16zzM 81yVPpan4QgKWRMKI8A1SVdbkRvwmw9HuQxOyBcNaR9qlxjhYwt2Kihga80UjCUs HkaKSP16JNkepmzCHBdaIyMW2cTlN8r+pCe8g5dBZ5hXwVCjcW7CGBxtQBzFgS8O t5V06hkgvG619qfEgvipbCdOREgZgNz95alth5OIE4Ihqyb3N4u9xHC8Ah9AsQ/u UAF1r0KClULdoVvGSIr+CnhOLCWb9A== =doID -----END PGP SIGNATURE----- Merge tag 'cpuidle-v5.14-rc1' of https://git.linaro.org/people/daniel.lezcano/linux Pull ARM cpuidle updates for v5.14 from Daniel Lezcano: "- Add support for Qcom MSM8226 (Bartosz Dudziak)" * tag 'cpuidle-v5.14-rc1' of https://git.linaro.org/people/daniel.lezcano/linux: cpuidle: qcom: Add SPM register data for MSM8226 dt-bindings: arm: msm: Add SAW2 for MSM8226	2021-06-30 14:56:51 +02:00
Linus Torvalds	3563f55ce6	Power management updates for 5.14-rc1 - Make intel_pstate support hybrid processors using abstract performance units in the HWP interface (Rafael Wysocki). - Add Icelake servers and Cometlake support in no-HWP mode to intel_pstate (Giovanni Gherdovich). - Make cpufreq_online() error path be consistent with the CPU device removal path in cpufreq (Rafael Wysocki). - Clean up 3 cpufreq drivers and the statistics code (Hailong Liu, Randy Dunlap, Shaokun Zhang). - Make intel_idle use special idle state parameters for C6 when package C-states are disabled (Chen Yu). - Rework the TEO (timer events oriented) cpuidle governor to address some theoretical shortcomings in it (Rafael Wysocki). - Drop unneeded semicolon from the TEO governor (Wan Jiabing). - Modify the runtime PM framework to accept unassigned suspend and resume callback pointers (Ulf Hansson). - Improve pm_runtime_get_sync() documentation (Krzysztof Kozlowski). - Improve device performance states support in the generic power domains (genpd) framework (Ulf Hansson). - Fix some documentation issues in genpd (Yang Yingliang). - Make the operating performance points (OPP) framework use the required-opps DT property in use cases that are not related to genpd (Hsin-Yi Wang). - Make lazy_link_required_opp_table() use list_del_init instead of list_del/INIT_LIST_HEAD (Yang Yingliang). - Simplify wake IRQs handling in the core system-wide sleep support code and clean up some coding style inconsistencies in it (Tian Tao, Zhen Lei). - Add cooling support to the tegra30 devfreq driver and improve its DT bindings (Dmitry Osipenko). - Fix some assorted issues in the devfreq core and drivers (Chanwoo Choi, Dong Aisheng, YueHaibing). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmDbaFwSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxR/gP/inFFjdwWpq3r6XD5P4XeVvum/MEjqQs rDUDHXgEEZmWGL9CpQ0PxhXyvc7SSqtNLakECTXxrOccfMbo0NKQtjCt8eMO1XMu nrAcp68hr/Rg2TaenRC3j0+oGXPzOw5/Mg5Y9subRymdI3o5HyoJNjBkU+LlkdGs HC7k8zWPqKQaEoFgjOpYPuXKf2bvEm2jIh4dzmtCRWXBUOxDgN0z6Kckhs59xrU+ +CLP/W4XMDrWSYdd2zjPV2IBNsqfePFchZ+t2CMQwYycI+KJr2s8tLbAFnQXfxLz WRqxpZKvMUPthKsK2/vgnCQkQKhGq39NmlHdqRdJm8uivCPx1Q2uuHRYF9782M+o cWuO60VvtUax0RIk1prP2l6JBBU/3Hvln7uf4cBnIeh/3QZKKygIgnNI1YwaqXSq zP4EWY00kKNmKwRUZAkDR9ZavXHiHvtoytT44XU/NxS+YXh6nMLC34CeuDUQaqni JvniXQyZCIWecZhwbOpW+FmAXMBCyvqXarDM0Zw4coWoyFLN7Y8ow9C5T5EWcgQ+ pKyGhS698HiePPJrwDOtOqzfPqxcgXEWUqwmxTeD8MRDSMlamzPnRJh+wxlrIdv9 2c8SAOSD7xvRlQQcsOpEVcKVkjsWDy7tvK6/O2CtoBsUpZOXtKIKJhr7ixnLN3Ej XHX/voFVDIao =mgLK -----END PGP SIGNATURE----- Merge tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add hybrid processors support to the intel_pstate driver and make it work with more processor models when HWP is disabled, make the intel_idle driver use special C6 idle state paremeters when package C-states are disabled, add cooling support to the tegra30 devfreq driver, rework the TEO (timer events oriented) cpuidle governor, extend the OPP (operating performance points) framework to use the required-opps DT property in more cases, fix some issues and clean up a number of assorted pieces of code. Specifics: - Make intel_pstate support hybrid processors using abstract performance units in the HWP interface (Rafael Wysocki). - Add Icelake servers and Cometlake support in no-HWP mode to intel_pstate (Giovanni Gherdovich). - Make cpufreq_online() error path be consistent with the CPU device removal path in cpufreq (Rafael Wysocki). - Clean up 3 cpufreq drivers and the statistics code (Hailong Liu, Randy Dunlap, Shaokun Zhang). - Make intel_idle use special idle state parameters for C6 when package C-states are disabled (Chen Yu). - Rework the TEO (timer events oriented) cpuidle governor to address some theoretical shortcomings in it (Rafael Wysocki). - Drop unneeded semicolon from the TEO governor (Wan Jiabing). - Modify the runtime PM framework to accept unassigned suspend and resume callback pointers (Ulf Hansson). - Improve pm_runtime_get_sync() documentation (Krzysztof Kozlowski). - Improve device performance states support in the generic power domains (genpd) framework (Ulf Hansson). - Fix some documentation issues in genpd (Yang Yingliang). - Make the operating performance points (OPP) framework use the required-opps DT property in use cases that are not related to genpd (Hsin-Yi Wang). - Make lazy_link_required_opp_table() use list_del_init instead of list_del/INIT_LIST_HEAD (Yang Yingliang). - Simplify wake IRQs handling in the core system-wide sleep support code and clean up some coding style inconsistencies in it (Tian Tao, Zhen Lei). - Add cooling support to the tegra30 devfreq driver and improve its DT bindings (Dmitry Osipenko). - Fix some assorted issues in the devfreq core and drivers (Chanwoo Choi, Dong Aisheng, YueHaibing)" * tag 'pm-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits) PM / devfreq: passive: Fix get_target_freq when not using required-opp cpufreq: Make cpufreq_online() call driver->offline() on errors opp: Allow required-opps to be used for non genpd use cases cpuidle: teo: remove unneeded semicolon in teo_select() dt-bindings: devfreq: tegra30-actmon: Add cooling-cells dt-bindings: devfreq: tegra30-actmon: Convert to schema PM / devfreq: userspace: Use DEVICE_ATTR_RW macro PM: runtime: Clarify documentation when callbacks are unassigned PM: runtime: Allow unassigned ->runtime_suspend\|resume callbacks PM: runtime: Improve path in rpm_idle() when no callback PM: hibernate: remove leading spaces before tabs PM: sleep: remove trailing spaces and tabs PM: domains: Drop/restore performance state votes for devices at runtime PM PM: domains: Return early if perf state is already set for the device PM: domains: Split code in dev_pm_genpd_set_performance_state() cpuidle: teo: Use kerneldoc documentation in admin-guide cpuidle: teo: Rework most recent idle duration values treatment cpuidle: teo: Change the main idle state selection logic cpuidle: teo: Cosmetic modification of teo_select() cpuidle: teo: Cosmetic modifications of teo_update() ...	2021-06-29 13:36:06 -07:00
Wan Jiabing	795e0e38de	cpuidle: teo: remove unneeded semicolon in teo_select() Fix following coccicheck warning: drivers/cpuidle/governors/teo.c:315:10-11: Unneeded semicolon Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-17 14:13:53 +02:00
Bartosz Dudziak	0f0ac1e4ee	cpuidle: qcom: Add SPM register data for MSM8226 Add MSM8226 register data to SPM AVS Wrapper 2 (SAW2) power controller driver. Reviewed-by: Stephan Gerhold <stephan@gerhold.net> Signed-off-by: Bartosz Dudziak <bartosz.dudziak@snejp.pl> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210612205335.9730-3-bartosz.dudziak@snejp.pl	2021-06-16 20:03:26 +02:00
Rafael J. Wysocki	154ae8bb3c	cpuidle: teo: Use kerneldoc documentation in admin-guide There are two descriptions of the TEO (Timer Events Oriented) cpuidle governor in the kernel source tree, one in the C file containing its code and one in cpuidle.rst which is part of admin-guide. Instead of trying to keep them both in sync and in order to reduce text duplication, include the governor description from the C file directly into cpuidle.rst. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-11 18:36:45 +02:00
Rafael J. Wysocki	77577558f2	cpuidle: teo: Rework most recent idle duration values treatment The TEO (Timer Events Oriented) cpuidle governor uses several most recent idle duration values for a given CPU to refine the idle state selection in case the previous long-term trends have not been followed recently and a new trend appears to be forming. That is done by computing the average of the most recent idle duration values falling below the time till the next timer event ("sleep length"), provided that they are the majority of the most recent idle duration values taken into account, and using it as the new expected idle duration value. However, idle state selection based on that value may not be optimal, because the average does not really indicate which of the idle states with target residencies less than or equal to it is likely to be the best fit. Thus, instead of computing the average, make the governor carry out computations based on the distribution of the most recent idle duration values among the bins corresponding to different idle states. Namely, if the majority of the most recent idle duration values taken into consideration are less than the current sleep length (which means that the CPU is likely to wake up early), find the idle state closest to the "candidate" one "matching" the sleep length whose target residency is less than or equal to the majority of the most recent idle duration values that have fallen below the current sleep length (which means that it is likely to be "shallow enough" this time). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-11 18:36:45 +02:00
Rafael J. Wysocki	c410a9a142	cpuidle: teo: Change the main idle state selection logic Two aspects of the current main idle state selection logic in the TEO (Timer Events Oriented) cpuidle governor are quite questionable. First of all, the "hits" and "misses" metrics used by it are only updated for a given idle state if the time till the next timer event ("sleep length") is between the target residency of that state and the target residency of the next one. Consequently, they are likely to become stale if the sleep length tends to fall outside that interval which increases the likelihood of subomtimal idle state selection. Second, the decision on whether or not to select the idle state "matching" the sleep length is based on the metrics collected for that state alone, whereas in principle the metrics collected for the other idle states should be taken into consideration when that decision is made. For example, if the measured idle duration is less than the target residency of the idle state "matching" the sleep length, then it is also less than the target residency of any deeper idle state and that should be taken into account when considering whether or not to select any of those states, but currently it is not. In order to address the above shortcomings, modify the main idle state selection logic in the TEO governor to take the metrics collected for all of the idle states into account when deciding whether or not to select the one "matching" the sleep length. Moreover, drop the "misses" metric that becomes redundant after the above change and rename the "early_hits" metric to "intercepts" so that its role is better reflected by its name (the idea being that if a CPU wakes up earlier than indicated by the sleep length, then it must be a result of a non-timer interrupt that "intercepts" the CPU). Also rename the states[] array in struct struct teo_cpu to state_bins[] to avoid confusing it with the states[] array in struct cpuidle_driver and update the documentation to match the new code (and make it more comprehensive while at it). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-11 18:36:45 +02:00
Rafael J. Wysocki	b18e0de1cf	cpuidle: teo: Cosmetic modification of teo_select() Initialize local variables in teo_select() where they are declared. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-11 18:36:45 +02:00
Rafael J. Wysocki	f53cbdab01	cpuidle: teo: Cosmetic modifications of teo_update() Rename a local variable in teo_update() so that its purpose is better reflected by its name and use one more local variable in the loop over the CPU idle states in that function to make the code somewhat easier to read. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-06-11 18:36:45 +02:00
Alexey Dobriyan	8fc2858e57	sched: Make nr_iowait_cpu() return 32-bit value Runqueue ->nr_iowait counters are 32-bit anyway. Propagate 32-bitness into other code, but don't try too hard. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210422200228.1423391-3-adobriyan@gmail.com	2021-05-12 21:34:16 +02:00
Rafael J. Wysocki	71f4dd3441	Merge back earlier cpuidle updates for v5.13.	2021-04-08 20:05:49 +02:00
He Ying	498ba2a8a2	cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration When CONFIG_ARM_QCOM_SPM_CPUIDLE is y and CONFIG_MMU is not set, compiling errors are encountered as follows: drivers/cpuidle/cpuidle-qcom-spm.o: In function `spm_dev_probe': cpuidle-qcom-spm.c:(.text+0x140): undefined reference to `cpu_resume_arm' cpuidle-qcom-spm.c:(.text+0x148): undefined reference to `cpu_resume_arm' Note that cpu_resume_arm is defined when MMU is set. So, add dependency on MMU in ARM_QCOM_SPM_CPUIDLE configuration. Fixes: `a871be6b8e` ("cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: He Ying <heying24@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210406123328.92904-1-heying24@huawei.com	2021-04-08 19:54:14 +02:00
Dmitry Osipenko	2dabed4777	cpuidle: tegra: Remove do_idle firmware call The do_idle firmware call is unused by all Tegra SoCs, hence remove it in order to keep driver's code clean. Tested-by: Anton Bambura <jenneron@protonmail.com> # TF701 T114 Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30 Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30 Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210302095405.28453-2-digetx@gmail.com	2021-04-08 19:54:14 +02:00
Dmitry Osipenko	32c8c34d81	cpuidle: tegra: Fix C7 idling state on Tegra114 Trusted Foundation firmware doesn't implement the do_idle call and in this case suspending should fall back to the common suspend path. In order to fix this issue we will unconditionally set the NOFLUSH_L2 mode via firmware call, which is a NO-OP on Tegra30/124, and then proceed to the C7 idling, like it was done by the older Tegra114 cpuidle driver. Fixes: `14e086baca` ("cpuidle: tegra: Squash Tegra114 driver into the common driver") Cc: stable@vger.kernel.org # 5.7+ Reported-by: Anton Bambura <jenneron@protonmail.com> # TF701 T114 Tested-by: Anton Bambura <jenneron@protonmail.com> # TF701 T114 Tested-by: Matt Merhar <mattmerhar@protonmail.com> # Ouya T30 Tested-by: Peter Geis <pgwipeout@gmail.com> # Ouya T30 Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20210302095405.28453-1-digetx@gmail.com	2021-04-08 19:54:14 +02:00
Rafael J. Wysocki	060e3535ad	cpuidle: menu: Take negative "sleep length" values into account Make the menu governor check the tick_nohz_get_next_hrtimer() return value so as to avoid dealing with negative "sleep length" values and make it use that value directly when the tick is stopped. While at it, rename local variable delta_next in menu_select() to delta_tick which better reflects its purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-04-07 19:26:44 +02:00
Rafael J. Wysocki	030adec9f6	cpuidle: teo: Take negative "sleep length" values into account Modify the TEO governor to take possible negative return values of tick_nohz_get_next_hrtimer() into account by changing the data type of some variables used by it to s64 which allows it to carry out computations without potentially problematic data type conversions into u64. Also change the computations in teo_select() so that the negative values themselves are handled in a natural way to avoid adding extra negative value checks to that function. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-04-07 19:26:44 +02:00
Rafael J. Wysocki	d3c33be1f3	cpuidle: teo: Adjust handling of very short idle times If the time till the next timer event is shorter than the target residency of the first idle state (state 0), the TEO governor does not update its metrics for any idle states, but arguably it should record a "hit" for idle state 0 in that case, so modify it to do that. Accordingly, also make it record an "early hit" for idle state 0 if the measured idle duration is less than its target residency, which allows one branch more to be dropped from teo_update(). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-04-07 19:26:44 +02:00
Rafael J. Wysocki	2ab80d46fe	cpuidle: Use s64 as exit_latency_ns and target_residency_ns data type Subsequent changes will cause the exit_latency_ns and target_residency_ns fields in struct cpuidle_state to be used in computations in which data type conversions to u64 may turn a negative number close to zero into a verly large positive number leading to incorrect results. In preparation for that, change the data type of the fields mentioned above to s64, but ensure that they will not be negative themselves. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2021-04-07 19:26:44 +02:00
Linus Torvalds	48c1c40ab4	ARM: SoC drivers for v5.11 There are a couple of subsystems maintained by other people that merge their drivers through the SoC tree, those changes include: - The SCMI firmware framework gains support for sensor notifications and for controlling voltage domains. - A large update for the Tegra memory controller driver, integrating it better with the interconnect framework - The memory controller subsystem gains support for Mediatek MT8192 - The reset controller framework gains support for sharing pulsed resets For Soc specific drivers in drivers/soc, the main changes are - The Allwinner/sunxi MBUS gets a rework for the way it handles dma_map_ops and offsets between physical and dma address spaces. - An errata fix plus some cleanups for Freescale Layerscape SoCs - A cleanup for renesas drivers regarding MMIO accesses. - New SoC specific drivers for Mediatek MT8192 and MT8183 power domains - New SoC specific drivers for Aspeed AST2600 LPC bus control and SoC identification. - Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660 and SDX55. - A rework of the TI AM33xx 'genpd' power domain support to use information from DT instead of platform data - Support for TI AM64x SoCs - Allow building some Amlogic drivers as modules instead of built-in Finally, there are numerous cleanups and smaller bug fixes for Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips, Renesas, and Xilinx SoCs. There is a trivial conflict in the cedrus driver, with two branches adding the same CEDRUS_CAPABILITY_H265_DEC flag, and another trivial remove/remove conflict in linux/dma-mapping.h. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/alSUACgkQmmx57+YA GNm7GRAAlNMVi7F0f4Ixf1bEh+J2QUonYIpZfrdxOLFwISGQ+nstGrFW2He/OeQv KAi027tZLl6Sdzjy809cLDPA4Z2IKwjVWhEbBHybvy1+irPYjnixtLd0x3YvPhjH iadlcjQ3uaGue8PvubK6CVnBEy82A+Pp29n9i4A4wX/8w+BVIhVsxwQWUBF8pFXE 3La2UZYZMVMvVZMrpTOqwCgdmLDCk+RLMVZ1IiRqBEBq5/DVq03uIXgjGEOrq8tl PXC89w7K510Is891mbBdBThQf+pZkU1vwORuknDcEJKWs9ngbEha7ebVgp32kbFl pi8DEK205d106WQgfn0Zxkpbsp8XD058wDILwkhBcteXlBaUEL6btGVLDTUCJZuv /pkH8tL4lNGpThQFbCEXC8oHZBp2xk55P+SW9RRZOoA5tAp+sz7hlf3y3YKdCSxv 4xybeeVOAgjl01WtbEC7CuIkqcKVSQ7njhLhC8r5ASteNywDThqxLT6nd0VegcQc YH3Eu9QRXpvFwQ35zMkTMWa27bMG5d60fp90bWT0R5amXZpxJJot87w8trFCxv74 mE5KvCbefCRNsTt8GOBA/WR7hVaG369g07qOvs7g4LjJEM3Nl2h/A4/zVFef9O0t yq3Nm4YCGfDSAQXzGR2SJ3nxiqbDknzJTAtZPf4BmbaMuPOIJ5k= =BjJf -----END PGP SIGNATURE----- Merge tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "There are a couple of subsystems maintained by other people that merge their drivers through the SoC tree, those changes include: - The SCMI firmware framework gains support for sensor notifications and for controlling voltage domains. - A large update for the Tegra memory controller driver, integrating it better with the interconnect framework - The memory controller subsystem gains support for Mediatek MT8192 - The reset controller framework gains support for sharing pulsed resets For Soc specific drivers in drivers/soc, the main changes are - The Allwinner/sunxi MBUS gets a rework for the way it handles dma_map_ops and offsets between physical and dma address spaces. - An errata fix plus some cleanups for Freescale Layerscape SoCs - A cleanup for renesas drivers regarding MMIO accesses. - New SoC specific drivers for Mediatek MT8192 and MT8183 power domains - New SoC specific drivers for Aspeed AST2600 LPC bus control and SoC identification. - Core Power Domain support for Qualcomm MSM8916, MSM8939, SDM660 and SDX55. - A rework of the TI AM33xx 'genpd' power domain support to use information from DT instead of platform data - Support for TI AM64x SoCs - Allow building some Amlogic drivers as modules instead of built-in Finally, there are numerous cleanups and smaller bug fixes for Mediatek, Tegra, Samsung, Qualcomm, TI OMAP, Amlogic, Rockchips, Renesas, and Xilinx SoCs" * tag 'arm-soc-drivers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (222 commits) soc: mediatek: mmsys: Specify HAS_IOMEM dependency for MTK_MMSYS firmware: xilinx: Properly align function parameter firmware: xilinx: Add a blank line after function declaration firmware: xilinx: Remove additional newline firmware: xilinx: Fix kernel-doc warnings firmware: xlnx-zynqmp: fix compilation warning soc: xilinx: vcu: add missing register NUM_CORE soc: xilinx: vcu: use vcu-settings syscon registers dt-bindings: soc: xlnx: extract xlnx, vcu-settings to separate binding soc: xilinx: vcu: drop useless success message clk: samsung: mark PM functions as __maybe_unused soc: samsung: exynos-chipid: initialize later - with arch_initcall soc: samsung: exynos-chipid: order list of SoCs by name memory: jz4780_nemc: Fix potential NULL dereference in jz4780_nemc_probe() memory: ti-emif-sram: only build for ARMv7 memory: tegra30: Support interconnect framework memory: tegra20: Support hardware versioning and clean up OPP table initialization dt-bindings: memory: tegra20-emc: Document opp-supported-hw property soc: rockchip: io-domain: Fix error return code in rockchip_iodomain_probe() reset-controller: ti: force the write operation when assert or deassert ...	2020-12-16 16:38:41 -08:00
Linus Torvalds	b4ec805464	Power management updates for 5.11-rc1 - Use local_clock() instead of jiffies in the cpufreq statistics to improve accuracy (Viresh Kumar). - Fix up OPP usage in the cpufreq-dt and qcom-cpufreq-nvmem cpufreq drivers (Viresh Kumar). - Clean up the cpufreq core, the intel_pstate driver and the schedutil cpufreq governor (Rafael Wysocki). - Fix up error code paths in the sti-cpufreq and mediatek cpufreq drivers (Yangtao Li, Qinglang Miao). - Fix cpufreq_online() to return error codes instead of success (0) in all cases when it fails (Wang ShaoBo). - Add mt8167 support to the mediatek cpufreq driver and blacklist mt8516 in the cpufreq-dt-platdev driver (Fabien Parent). - Modify the tegra194 cpufreq driver to always return values from the frequency table as the current frequency and clean up that driver (Sumit Gupta, Jon Hunter). - Modify the arm_scmi cpufreq driver to allow it to discover the power scale present in the performance protocol and provide this information to the Energy Model (Lukasz Luba). - Add missing MODULE_DEVICE_TABLE to several cpufreq drivers (Pali Rohár). - Clean up the CPPC cpufreq driver (Ionela Voinescu). - Fix NVMEM_IMX_OCOTP dependency in the imx cpufreq driver (Arnd Bergmann). - Rework the poling interval selection for the polling state in cpuidle (Mel Gorman). - Enable suspend-to-idle for PSCI OSI mode in the PSCI cpuidle driver (Ulf Hansson). - Modify the OPP framework to support empty (node-less) OPP tables in DT for passing dependency information (Nicola Mazzucato). - Fix potential lockdep issue in the OPP core and clean up the OPP core (Viresh Kumar). - Modify dev_pm_opp_put_regulators() to accept a NULL argument and update its users accordingly (Viresh Kumar). - Add frequency changes tracepoint to devfreq (Matthias Kaehlcke). - Add support for governor feature flags to devfreq, make devfreq sysfs file permissions depend on the governor and clean up the devfreq core (Chanwoo Choi). - Clean up the tegra20 devfreq driver and deprecate it to allow another driver based on EMC_STAT to be used instead of it (Dmitry Osipenko). - Add interconnect support to the tegra30 devfreq driver, allow it to take the interconnect and OPP information from DT and clean it up ((Dmitry Osipenko). - Add interconnect support to the exynos-bus devfreq driver along with interconnect properties documentation (Sylwester Nawrocki). - Add suport for AMD Fam17h and Fam19h processors to the RAPL power capping driver (Victor Ding, Kim Phillips). - Fix handling of overly long constraint names in the powercap framework (Lukasz Luba). - Fix the wakeup configuration handling for bridges in the ACPI device power management core (Rafael Wysocki). - Add support for using an abstract scale for power units in the Energy Model (EM) and document it (Lukasz Luba). - Add em_cpu_energy() micro-optimization to the EM (Pavankumar Kondeti). - Modify the generic power domains (genpd) framwework to support suspend-to-idle (Ulf Hansson). - Fix creation of debugfs nodes in genpd (Thierry Strudel). - Clean up genpd (Lina Iyer). - Clean up the core system-wide suspend code and make it print driver flags for devices with debug enabled (Alex Shi, Patrice Chotard, Chen Yu). - Modify the ACPI system reboot code to make it prepare for system power off to avoid confusing the platform firmware (Kai-Heng Feng). - Update the pm-graph (multiple changes, mostly usability-related) and cpupower (online and offline CPU information support) PM utilities (Todd Brandt, Brahadambal Srinivasan). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl/Y8mcSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxjY4QAKsNFJeEtjGCxq7MxQIML3QLAsdJM9of 9kkY9skMEw4v1TRmyy7sW9jZW2pLSRcLJwWRKWu4143qUS3YUp2DQ0lqX4WyXoWu BhnkhkMUl6iCeBO8CWnt8zsTuqSa20A13sL9LyqN1+7OZKHD8StbT4hKjBncdNNN 4aDj+1uAPyOgj2iCUZuHQ8DtpBvOLjgTh367vbhbufjeJ//8/9+R7s4Xzrj7wtmv JlE0LDgvge9QeGTpjhxQJzn0q2/H5fg9jbmjPXUfbHJNuyKhrqnmjGyrN5m256JI 8DqGqQtJpmFp7Ihrur3uKTk3gWO05YwJ1FdeEooAKEjEMObm5xuYhKVRoDhmlJAu G6ui+OAUvNR0FffJtbzvWe/pLovLGOEOHdvTrZxUF8Abo6br3untTm8rKTi1fhaF wWndSMw0apGsPzCx5T+bE7AbJz2QHFpLhaVAutenuCzNI8xoMlxNKEzsaVz/+FqL Pq/PdFaM4vNlMbv7hkb/fujkCs/v3EcX2ihzvt7I2o8dBS0D1X8A4mnuWJmiGslw 1ftbJ6M9XacwkPBTHPgeXxJh2C1yxxe5VQ9Z5fWWi7sPOUeJnUwxKaluv+coFndQ sO6JxsPQ4hQihg8yOxLEkL6Wn68sZlmp+u2Oj+TPFAsAGANIA8rJlBPo1ppJWvdQ j1OCIc/qzwpH =BVdX -----END PGP SIGNATURE----- Merge tag 'pm-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update cpufreq (core and drivers), cpuidle (polling state implementation and the PSCI driver), the OPP (operating performance points) framework, devfreq (core and drivers), the power capping RAPL (Running Average Power Limit) driver, the Energy Model support, the generic power domains (genpd) framework, the ACPI device power management, the core system-wide suspend code and power management utilities. Specifics: - Use local_clock() instead of jiffies in the cpufreq statistics to improve accuracy (Viresh Kumar). - Fix up OPP usage in the cpufreq-dt and qcom-cpufreq-nvmem cpufreq drivers (Viresh Kumar). - Clean up the cpufreq core, the intel_pstate driver and the schedutil cpufreq governor (Rafael Wysocki). - Fix up error code paths in the sti-cpufreq and mediatek cpufreq drivers (Yangtao Li, Qinglang Miao). - Fix cpufreq_online() to return error codes instead of success (0) in all cases when it fails (Wang ShaoBo). - Add mt8167 support to the mediatek cpufreq driver and blacklist mt8516 in the cpufreq-dt-platdev driver (Fabien Parent). - Modify the tegra194 cpufreq driver to always return values from the frequency table as the current frequency and clean up that driver (Sumit Gupta, Jon Hunter). - Modify the arm_scmi cpufreq driver to allow it to discover the power scale present in the performance protocol and provide this information to the Energy Model (Lukasz Luba). - Add missing MODULE_DEVICE_TABLE to several cpufreq drivers (Pali Rohár). - Clean up the CPPC cpufreq driver (Ionela Voinescu). - Fix NVMEM_IMX_OCOTP dependency in the imx cpufreq driver (Arnd Bergmann). - Rework the poling interval selection for the polling state in cpuidle (Mel Gorman). - Enable suspend-to-idle for PSCI OSI mode in the PSCI cpuidle driver (Ulf Hansson). - Modify the OPP framework to support empty (node-less) OPP tables in DT for passing dependency information (Nicola Mazzucato). - Fix potential lockdep issue in the OPP core and clean up the OPP core (Viresh Kumar). - Modify dev_pm_opp_put_regulators() to accept a NULL argument and update its users accordingly (Viresh Kumar). - Add frequency changes tracepoint to devfreq (Matthias Kaehlcke). - Add support for governor feature flags to devfreq, make devfreq sysfs file permissions depend on the governor and clean up the devfreq core (Chanwoo Choi). - Clean up the tegra20 devfreq driver and deprecate it to allow another driver based on EMC_STAT to be used instead of it (Dmitry Osipenko). - Add interconnect support to the tegra30 devfreq driver, allow it to take the interconnect and OPP information from DT and clean it up (Dmitry Osipenko). - Add interconnect support to the exynos-bus devfreq driver along with interconnect properties documentation (Sylwester Nawrocki). - Add suport for AMD Fam17h and Fam19h processors to the RAPL power capping driver (Victor Ding, Kim Phillips). - Fix handling of overly long constraint names in the powercap framework (Lukasz Luba). - Fix the wakeup configuration handling for bridges in the ACPI device power management core (Rafael Wysocki). - Add support for using an abstract scale for power units in the Energy Model (EM) and document it (Lukasz Luba). - Add em_cpu_energy() micro-optimization to the EM (Pavankumar Kondeti). - Modify the generic power domains (genpd) framwework to support suspend-to-idle (Ulf Hansson). - Fix creation of debugfs nodes in genpd (Thierry Strudel). - Clean up genpd (Lina Iyer). - Clean up the core system-wide suspend code and make it print driver flags for devices with debug enabled (Alex Shi, Patrice Chotard, Chen Yu). - Modify the ACPI system reboot code to make it prepare for system power off to avoid confusing the platform firmware (Kai-Heng Feng). - Update the pm-graph (multiple changes, mostly usability-related) and cpupower (online and offline CPU information support) PM utilities (Todd Brandt, Brahadambal Srinivasan)" * tag 'pm-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits) cpufreq: Fix cpufreq_online() return value on errors cpufreq: Fix up several kerneldoc comments cpufreq: stats: Use local_clock() instead of jiffies cpufreq: schedutil: Simplify sugov_update_next_freq() cpufreq: intel_pstate: Simplify intel_cpufreq_update_pstate() PM: domains: create debugfs nodes when adding power domains opp: of: Allow empty opp-table with opp-shared dt-bindings: opp: Allow empty OPP tables media: venus: dev_pm_opp_put_() accepts NULL argument drm/panfrost: dev_pm_opp_put_() accepts NULL argument drm/lima: dev_pm_opp_put_() accepts NULL argument PM / devfreq: exynos: dev_pm_opp_put_() accepts NULL argument cpufreq: qcom-cpufreq-nvmem: dev_pm_opp_put_() accepts NULL argument cpufreq: dt: dev_pm_opp_put_regulators() accepts NULL argument opp: Allow dev_pm_opp_put_() APIs to accept NULL opp_table opp: Don't create an OPP table from dev_pm_opp_get_opp_table() cpufreq: dt: Don't (ab)use dev_pm_opp_get_opp_table() to create OPP table opp: Reduce the size of critical section in _opp_kref_release() PM / EM: Micro optimization in em_cpu_energy cpufreq: arm_scmi: Discover the power scale in performance protocol ...	2020-12-15 16:30:31 -08:00
Mel Gorman	7a25759eaa	cpuidle: Select polling interval based on a c-state with a longer target residency It was noted that a few workloads that idle rapidly regressed when commit `36fcb42924` ("cpuidle: use first valid target residency as poll time") was merged. The workloads in question were heavy communicators that idle rapidly and were impacted by the c-state exit latency as the active CPUs were not polling at the time of wakeup. As they were not particularly realistic workloads, it was not considered to be a major problem. Unfortunately, a bug was reported for a real workload in a production environment that relied on large numbers of threads operating in a worker pool pattern. These threads would idle for periods of time longer than the C1 target residency and so incurred the c-state exit latency penalty. The application is very sensitive to wakeup latency and indirectly relying on behaviour prior to commit on `a37b969a61` ("cpuidle: poll_state: Add time limit to poll_idle()") to poll for long enough to avoid the exit latency cost. The target residency of C1 is typically very short. On some x86 machines, it can be as low as 2 microseconds. In poll_idle(), the clock is checked every POLL_IDLE_RELAX_COUNT interations of cpu_relax() and even one iteration of that loop can be over 1 microsecond so the polling interval is very close to the granularity of what poll_idle() can detect. Furthermore, a basic ping pong workload like perf bench pipe has a longer round-trip time than the 2 microseconds meaning that the CPU will almost certainly not be polling when the ping-pong completes. This patch selects a polling interval based on an enabled c-state that has an target residency longer than 10usec. If there is no enabled-cstate then polling will be up to a TICK_NSEC/16 similar to what it was up until kernel 4.20. Polling for a full tick is unlikely (rescheduling event) and is much longer than the existing target residencies for a deep c-state. As an example, consider a CPU with the following c-state information from an Intel CPU; residency exit_latency C1 2 2 C1E 20 10 C3 100 33 C6 400 133 The polling interval selected is 20usec. If booted with intel_idle.max_cstate=1 then the polling interval is 250usec as the deeper c-states were not available. On an AMD EPYC machine, the c-state information is more limited and looks like residency exit_latency C1 2 1 C2 800 400 The polling interval selected is 250usec. While C2 was considered, the polling interval was clamped by CPUIDLE_POLL_MAX. Note that it is not expected that polling will be a universal win. As well as potentially trading power for performance, the performance is not guaranteed if the extra polling prevented a turbo state being reached. Making it a tunable was considered but it's driver-specific, may be overridden by a governor and is not a guaranteed polling interval making it difficult to describe without knowledge of the implementation. tbench4 vanilla polling Hmean 1 497.89 ( 0.00%) 543.15 * 9.09%* Hmean 2 975.88 ( 0.00%) 1059.73 * 8.59%* Hmean 4 1953.97 ( 0.00%) 2081.37 * 6.52%* Hmean 8 3645.76 ( 0.00%) 4052.95 * 11.17%* Hmean 16 6882.21 ( 0.00%) 6995.93 * 1.65%* Hmean 32 10752.20 ( 0.00%) 10731.53 * -0.19%* Hmean 64 12875.08 ( 0.00%) 12478.13 * -3.08%* Hmean 128 21500.54 ( 0.00%) 21098.60 * -1.87%* Hmean 256 21253.70 ( 0.00%) 21027.18 * -1.07%* Hmean 320 20813.50 ( 0.00%) 20580.64 * -1.12%* Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-12-01 17:59:18 +01:00
Ingo Molnar	a787bdaff8	Merge branch 'linus' into sched/core, to resolve semantic conflict Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-11-27 11:10:50 +01:00
Peter Zijlstra	545b8c8df4	smp: Cleanup smp_call_function*() Get rid of the __call_single_node union and cleanup the API a little to avoid external code relying on the structure layout as much. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>	2020-11-24 16:47:49 +01:00
Rafael J. Wysocki	0f6e2cb45b	Merge back cpuidle changes for v5.11.	2020-11-23 12:49:28 +01:00
Dmitry Osipenko	c39de538a0	cpuidle: tegra: Annotate tegra_pm_set_cpu_in_lp2() with RCU_NONIDLE Annotate tegra_pm_set[clear]_cpu_in_lp2() with RCU_NONIDLE in order to fix lockdep warning about suspicious RCU usage of a spinlock during late idling phase. WARNING: suspicious RCU usage ... include/trace/events/lock.h:13 suspicious rcu_dereference_check() usage! ... (dump_stack) from (lock_acquire) (lock_acquire) from (_raw_spin_lock) (_raw_spin_lock) from (tegra_pm_set_cpu_in_lp2) (tegra_pm_set_cpu_in_lp2) from (tegra_cpuidle_enter) (tegra_cpuidle_enter) from (cpuidle_enter_state) (cpuidle_enter_state) from (cpuidle_enter_state_coupled) (cpuidle_enter_state_coupled) from (cpuidle_enter) (cpuidle_enter) from (do_idle) ... Tested-by: Peter Geis <pgwipeout@gmail.com> Reported-by: Peter Geis <pgwipeout@gmail.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-11-16 13:24:32 +01:00
Ulf Hansson	670c90def0	cpuidle: psci: Enable suspend-to-idle for PSCI OSI mode To select domain idlestates for cpuidle-psci when OSI mode has been enabled, the PM domains via genpd are being managed through runtime PM. This works fine for the regular idlepath, but it doesn't during system wide suspend. More precisely, the domain idlestates becomes temporarily disabled, which is because the PM core disables runtime PM for devices during system wide suspend. Later in the system suspend phase, genpd intends to deal with this from its ->suspend_noirq() callback, but this doesn't work as expected for a device corresponding to a CPU, because the domain idlestates needs to be selected on a per CPU basis (the PM core doesn't invoke the callbacks like that). To address this problem, let's enable the syscore flag for the corresponding CPU device that becomes successfully attached to its PM domain (applicable only in OSI mode). This informs the PM core to skip invoke the system wide suspend/resume callbacks for the device, thus also prevents genpd from screwing up its internal state of it. Moreover, to properly select a domain idlestate for the CPUs during suspend-to-idle, let's assign a specific ->enter_s2idle() callback for the corresponding domain idlestate (applicable only in OSI mode). From that callback, let's invoke dev_pm_genpd_suspend\|resume(), as this allows a domain idlestate to be selected for the current CPU by genpd. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-11-10 20:42:01 +01:00
Marek Szyprowski	f1118a28be	cpuidle: big.LITTLE: enable driver only on Peach-Pit/Pi Chromebooks This driver always worked properly only on the Exynos 5420/5800 based Chromebooks (Peach-Pit/Pi), so change the required compatible string to the 'google,peach', to avoid enabling it on the other Exynos 542x/5800 boards, which hangs in such case. The main difference between Peach-Pit/Pi and other Exynos 542x/5800 boards is the firmware - Peach platform doesn't use secure firmware at all. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>	2020-10-26 10:15:24 +01:00
Linus Torvalds	96685f8666	powerpc updates for 5.10 - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for powerpc, as well as a related fix for sparc. - Remove support for PowerPC 601. - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA v3.1 (Power10) watchpoint features. - A fix for kernels using 4K pages and the hash MMU on bare metal Power9 systems with > 16TB of RAM, or RAM on the 2nd node. - A basic idle driver for shallow stop states on Power10. - Tweaks to our sched domains code to better inform the scheduler about the hardware topology on Power9/10, where two SMT4 cores can be presented by firmware as an SMT8 core. - A series doing further reworks & cleanups of our EEH code. - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to prevent root from overwriting kernel memory. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt, Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang Yingliang, zhengbin. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl+JBQoTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgJJAD/0e3tsFP+9rFlxKSJlDcMW3w7kXDRXE tG40F1ubYFLU8wtFVR0De3njTRsz5HyaNU6SI8CwPq48mCa7OFn1D1OeHonHXDX9 w6v3GE2S1uXXQnjm+czcfdjWQut0IwWBLx007/S23WcPff3Abc2irupKLNu+Gx29 b/yxJHZSRJVX59jSV94HkdJS75mDHQ3oUOlFGXtuGcUZDufpD1ynRcQOjr0V/8JU F4WAblFSe7hiczHGqIvfhFVJ+OikEhnj2aEMAL8U7vxzrAZ7RErKCN9s/0Tf0Ktx FzNEFNLHZGqh+qNDpKKmM+RnaeO2Lcoc9qVn7vMHOsXPzx9F5LJwkI/DgPjtgAq/ mFvGnQB/FapATnQeMluViC/qhEe5bQXLUfPP5i2+QOjK0QqwyFlUMgaVNfsY8jRW 0Q/sNA72Opzst4WUTveCd4SOInlUuat09e5nLooCRLW7u7/jIiXNRSFNvpOiwkfF EcIPJsi6FUQ4SNbqpRSNEO9fK5JZrrUtmr0pg8I7fZhHYGcxEjqPR6IWCs3DTsak 4/KhjhhTnP/IWJRw6qKAyNhEyEwpWqYZ97SIQbvSb1g/bS47AIdQdJRb0eEoRjhx sbbnnYFwPFkG4c1yQSIFanT9wNDQ2hFx/c/mRfbd7J+ordx9JsoqXjqrGuhsU/pH GttJLmkJ5FH+pQ== =akeX -----END PGP SIGNATURE----- Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for powerpc, as well as a related fix for sparc. - Remove support for PowerPC 601. - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA v3.1 (Power10) watchpoint features. - A fix for kernels using 4K pages and the hash MMU on bare metal Power9 systems with > 16TB of RAM, or RAM on the 2nd node. - A basic idle driver for shallow stop states on Power10. - Tweaks to our sched domains code to better inform the scheduler about the hardware topology on Power9/10, where two SMT4 cores can be presented by firmware as an SMT8 core. - A series doing further reworks & cleanups of our EEH code. - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to prevent root from overwriting kernel memory. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt, Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang Yingliang, zhengbin. * tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits) Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed" selftests/powerpc: Fix eeh-basic.sh exit codes cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier powerpc/time: Make get_tb() common to PPC32 and PPC64 powerpc/time: Make get_tbl() common to PPC32 and PPC64 powerpc/time: Remove get_tbu() powerpc/time: Avoid using get_tbl() and get_tbu() internally powerpc/time: Make mftb() common to PPC32 and PPC64 powerpc/time: Rename mftbl() to mftb() powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S powerpc/32s: Rename head_32.S to head_book3s_32.S powerpc/32s: Setup the early hash table at all time. powerpc/time: Remove ifdef in get_dec() and set_dec() powerpc: Remove get_tb_or_rtc() powerpc: Remove __USE_RTC() powerpc: Tidy up a bit after removal of PowerPC 601. powerpc: Remove support for PowerPC 601 powerpc: Remove PowerPC 601 powerpc: Drop SYNC_601() ISYNC_601() and SYNC() powerpc: Remove CONFIG_PPC601_SYNC_FIX ...	2020-10-16 12:21:15 -07:00
Rafael J. Wysocki	f3643b5b77	Merge back cpuidle material for 5.10.	2020-09-28 16:31:25 +02:00
Lina Iyer	f49735f497	cpuidle: record state entry rejection statistics CPUs may fail to enter the chosen idle state if there was a pending interrupt, causing the cpuidle driver to return an error value. Record that and export it via sysfs along with the other idle state statistics. This could prove useful in understanding behavior of the governor and the system during usecases that involve multiple CPUs. Signed-off-by: Lina Iyer <ilina@codeaurora.org> [ rjw: Changelog and documentation edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-23 14:10:31 +02:00
Ulf Hansson	bd80527457	cpuidle: Drop misleading comments about RCU usage The commit `1098582a0f` ("sched,idle,rcu: Push rcu_idle deeper into the idle path"), moved the calls rcu_idle_enter\|exit() into the cpuidle core. However, it forgot to remove a couple of comments in enter_s2idle_proper() about why RCU_NONIDLE earlier was needed. So, let's drop them as they have become a bit misleading. Fixes: `1098582a0f` ("sched,idle,rcu: Push rcu_idle deeper into the idle path") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-22 19:32:03 +02:00
Ulf Hansson	70c179b498	cpuidle: psci: Allow PM domain to be initialized even if no OSI mode If the PSCI OSI mode isn't supported or fails to be enabled, the PM domain topology with the genpd providers isn't initialized. This is perfectly fine from cpuidle-psci point of view. However, since the PM domain topology in the DTS files is a description of the HW, no matter of whether the PSCI OSI mode is supported or not, other consumers besides the CPUs may rely on it. Therefore, let's always allow the initialization of the PM domain topology to succeed, independently of whether the PSCI OSI mode is supported. Consequentially we need to track if we succeed to enable the OSI mode, as to know when a domain idlestate can be selected. Note that, CPU devices are still not being attached to the PM domain topology, unless the PSCI OSI mode is supported. Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-22 17:50:32 +02:00
Ulf Hansson	1094201904	firmware: psci: Extend psci_set_osi_mode() to allow reset to PC mode The current user (cpuidle-psci) of psci_set_osi_mode() only needs to enable the PSCI OSI mode. Although, as subsequent changes shows, there is a need to be able to reset back into the PSCI PC mode. Therefore, let's extend psci_set_osi_mode() to take a bool as in-parameter, to let the user indicate whether to enable OSI or to switch back to PC mode. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-22 17:50:32 +02:00
Dmitry Osipenko	1170433e66	cpuidle: tegra: Correctly handle result of arm_cpuidle_simple_enter() The enter() callback of CPUIDLE drivers returns index of the entered idle state on success or a negative value on failure. The negative value could any negative value, i.e. it doesn't necessarily needs to be a error code. That's because CPUIDLE core only cares about the fact of failure and not about the reason of the enter() failure. Like every other enter() callback, the arm_cpuidle_simple_enter() returns the entered idle-index on success. Unlike some of other drivers, it never fails. It happened that TEGRA_C1=index=err=0 in the code of cpuidle-tegra driver, and thus, there is no problem for the cpuidle-tegra driver created by the typo in the code which assumes that the arm_cpuidle_simple_enter() returns a error code. The arm_cpuidle_simple_enter() also may return a -ENODEV error if CPU_IDLE is disabled in a kernel's config, but all CPUIDLE drivers are disabled if CPU_IDLE is disabled, including the cpuidle-tegra driver. So we can't ever see the error code from arm_cpuidle_simple_enter() today. Of course the code may get some changes in the future and then the typo may transform into a real bug, so let's correct the typo! The tegra_cpuidle_state_enter() is now changed to make it return the entered idle-index on success and negative error code on fail, which puts it on par with the arm_cpuidle_simple_enter(), making code consistent in regards to the error handling. This patch fixes a minor typo in the code, it doesn't fix any bugs. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-21 17:44:54 +02:00
Ulf Hansson	36050d8984	cpuidle: psci: Fix suspicious RCU usage The commit `eb1f00237a` ("lockdep,trace: Expose tracepoints"), started to expose us for tracepoints. This lead to the following RCU splat on an ARM64 Qcom board. [ 5.529634] WARNING: suspicious RCU usage [ 5.537307] sdhci-pltfm: SDHCI platform and OF driver helper [ 5.541092] 5.9.0-rc3 #86 Not tainted [ 5.541098] ----------------------------- [ 5.541105] ../include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage! [ 5.541110] [ 5.541110] other info that might help us debug this: [ 5.541110] [ 5.541116] [ 5.541116] rcu_scheduler_active = 2, debug_locks = 1 [ 5.541122] RCU used illegally from extended quiescent state! [ 5.541129] no locks held by swapper/0/0. [ 5.541134] [ 5.541134] stack backtrace: [ 5.541143] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc3 #86 [ 5.541149] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [ 5.541157] Call trace: [ 5.568185] sdhci_msm 7864900.sdhci: Got CD GPIO [ 5.574186] dump_backtrace+0x0/0x1c8 [ 5.574206] show_stack+0x14/0x20 [ 5.574229] dump_stack+0xe8/0x154 [ 5.574250] lockdep_rcu_suspicious+0xd4/0xf8 [ 5.574269] lock_acquire+0x3f0/0x460 [ 5.574292] _raw_spin_lock_irqsave+0x80/0xb0 [ 5.574314] __pm_runtime_suspend+0x4c/0x188 [ 5.574341] psci_enter_domain_idle_state+0x40/0xa0 [ 5.574362] cpuidle_enter_state+0xc0/0x610 [ 5.646487] cpuidle_enter+0x38/0x50 [ 5.650651] call_cpuidle+0x18/0x40 [ 5.654467] do_idle+0x228/0x278 [ 5.657678] cpu_startup_entry+0x24/0x70 [ 5.661153] rest_init+0x1a4/0x278 [ 5.665061] arch_call_rest_init+0xc/0x14 [ 5.668272] start_kernel+0x508/0x540 Following the path in pm_runtime_put_sync_suspend() from psci_enter_domain_idle_state(), it seems like we end up using the RCU. Therefore, let's simply silence the splat by informing the RCU about it with RCU_NONIDLE. Note that, this is a temporary solution. Instead we should strive to avoid using RCU_NONIDLE (and similar), but rather push rcu_idle_enter\|exit() further down, closer to the arch specific code. However, as the CPU PM notifiers are also using the RCU, additional rework is needed. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-21 15:56:53 +02:00
Linus Torvalds	5a55d36f71	powerpc fixes for 5.9 #5 Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing crashes. Fix a long standing bug in our DMA mask handling that was hidden until recently, and which caused problems with some drivers. Fix a boot failure on systems with large amounts of RAM, and no hugepage support and using Radix MMU, only seen in the lab. A few other minor fixes. Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy, Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav Jain, Vaidyanathan Srinivasan. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl9kk4UTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgLM5D/42Wuuq6hOpGEfE2XjBjsOxCR07SCw9 CQ/c72jw/1tDLe0YclPVYAZc8BmT8uLo6tE2Ot+0vI+1Y06rRvC5g5uQBwp1zD/t MOwC0d4zf8a7WzBCcVaBv9HMHVOaKaTBwQc2R4k6NzYtARIf5m0evMPOWINioRsv /x4+Np8aeQd1WiVn6PBdqL8w1yRhk8LsVDvX35lFzQlgZvH09umXSGjw9K442xdE lr1PrV9GKd4DeudwLHPkMNs8Ul1QTxmY5vKIAklsJ5g3dBySfM7+GMrTzYOHYkWr aqGfGH6ojdFSQZRo7QwFhO52Kni7JN7AIoUPEBDqLb1fR10w8wesdjCs8JQMXIMc 8Eo210EbiSsq6kG/LzqZuStLUAup3rQd20+wWua7jo8HbcZOLDH7pPGwNtJInPTr gwH7sALYhTzFUAO4LzqVVE+yA8wndFPHoz+QSkO6LZJKuszON2LJ0r+IEx1l9Wmr ClZYujK36/N+ih42xBFcBZi2lL0VKzlkn7u7NDeZQBONgoJMCoWkGDkEHnMI9zdT 1iYcVZyY+IpJLcwxH+NP8weJKHIZbP4kvuFXR/QIpHdrDWKaTT95BvBhAK1KMYBo lLo81zMTzJCz2wWDg/+3sLuEzHdGr//uxcVXxjqE2vyEckeuZjiCyz0dRNEY4Sw3 vB2p3Zl7L0J4pQ== =cj6B -----END PGP SIGNATURE----- Merge tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 5.9: - Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing crashes. - Fix a long standing bug in our DMA mask handling that was hidden until recently, and which caused problems with some drivers. - Fix a boot failure on systems with large amounts of RAM, and no hugepage support and using Radix MMU, only seen in the lab. - A few other minor fixes. Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy, Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav Jain, and Vaidyanathan Srinivasan" * tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute cpuidle: pseries: Fix CEDE latency conversion from tb to us powerpc/dma: Fix dma_map_ops::get_required_mask Revert "powerpc/build: vdso linker warning for orphan sections" powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc selftests/powerpc: Skip PROT_SAO test in guests/LPARS powerpc/book3s64/radix: Fix boot failure with large amount of guest memory	2020-09-18 11:48:25 -07:00
Peter Zijlstra	8747f2022f	cpuidle: Allow cpuidle drivers to take over RCU-idle Some drivers have to do significant work, some of which relies on RCU still being active. Instead of using RCU_NONIDLE in the drivers and flipping RCU back on, allow drivers to take over RCU-idle duty. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-09-16 19:36:26 +02:00
Nicholas Piggin	ffd2961bb4	powerpc/powernv/idle: add a basic stop 0-3 driver for POWER10 This driver does not restore stop > 3 state, so it limits itself to states which do not lose full state or TB. The POWER10 SPRs are sufficiently different from P9 that it seems easier to split out the P10 code. The POWER10 deep sleep code (e.g., the BHRB restore) has been taken out, but it can be re-added when stop > 3 support is added. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Pratik Rajesh Sampat<psampat@linux.ibm.com> Tested-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Reviewed-by: Pratik Rajesh Sampat<psampat@linux.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819094700.493399-1-npiggin@gmail.com	2020-09-15 22:13:38 +10:00
Gautham R. Shenoy	1d3ee7df00	cpuidle: pseries: Fix CEDE latency conversion from tb to us Commit `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") sets the exit latency of CEDE(0) based on the latency values of the Extended CEDE states advertised by the platform. The values advertised by the platform are in timebase ticks. However the cpuidle framework requires the latency values in microseconds. If the tb-ticks value advertised by the platform correspond to a value smaller than 1us, during the conversion from tb-ticks to microseconds, in the current code, the result becomes zero. This is incorrect as it puts a CEDE state on par with the snooze state. This patch fixes this by rounding up the result obtained while converting the latency value from tb-ticks to microseconds. It also prints a warning in case we discover an extended-cede state with wakeup latency to be 0. In such a case, ensure that CEDE(0) has a non-zero wakeup latency. Fixes: `d947fb4c96` ("cpuidle: pseries: Fixup exit latency for CEDE(0)") Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1599125247-28488-1-git-send-email-ego@linux.vnet.ibm.com	2020-09-08 17:14:42 +10:00
Peter Zijlstra	bf9282dc26	cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic This allows moving the leave_mm() call into generic code before rcu_idle_enter(). Gets rid of more trace_*_rcuidle() users. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.369441600@infradead.org	2020-08-26 12:41:53 +02:00
Peter Zijlstra	1098582a0f	sched,idle,rcu: Push rcu_idle deeper into the idle path Lots of things take locks, due to a wee bug, rcu_lockdep didn't notice that the locking tracepoints were using RCU. Push rcu_idle_{enter,exit}() as deep as possible into the idle paths, this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage. Specifically, sched_clock_idle_wakeup_event() will use ktime which will use seqlocks which will tickle lockdep, and stop_critical_timings() uses lock. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.310943801@infradead.org	2020-08-26 12:41:53 +02:00
Peter Zijlstra	49d9c59363	cpuidle: Fixup IRQ state Match the pattern elsewhere in this file. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.251340558@infradead.org	2020-08-26 12:41:53 +02:00
Linus Torvalds	25d8d4eeca	powerpc updates for 5.9 - Add support for (optionally) using queued spinlocks & rwlocks. - Support for a new faster system call ABI using the scv instruction on Power9 or later. - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on Power10 and future processors, leaving us with no way to implement the functionality it requests. This risks breaking userspace, though we believe it is unused in practice. - A bug fix for, and then the removal of, our custom stack expansion checking. We now allow stack expansion up to the rlimit, like other architectures. - Remove the remnants of our (previously disabled) topology update code, which tried to react to NUMA layout changes on virtualised systems, but was prone to crashes and other problems. - Add PMU support for Power10 CPUs. - A change to our signal trampoline so that we don't unbalance the link stack (branch return predictor) in the signal delivery path. - Lots of other cleanups, refactorings, smaller features and so on as usual. Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong, YueHaibing. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl8tOxATHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDQfEAClXHWf6hnxB84bEu39D51NkVotL1IG BRWFvyix+xHuUkHIouBPAAMl6ngY5X6wkYd+Z+CY9zHNtdSDoVlJE30YXdMQA/dE L/rYxR1884yGR/uU/3wusboO68ReXwcKQPmKOymUfh0zH7ujyJsSWLpXFK1YDC5d 2TVVTi0Q+P5ucMHDh0L+AHirIxZvtZSp43+J7xLtywsj+XAxJWCTGo5WCJbdgbCA Qbv3aOkVyUa3EgsbdM/STPpv82ebqT+PHxeSIO4Jw6ZODtKRH0R5YsWCApuY9eZ+ ebY9RLmgv9ZAhJqB2fv9A5NDcMoGpZNmjM7HrWpXwULKQpkBGHCzJ9FcSdHVMOx8 nbVMFjt4uzLwV1w8lFYslQ2tNH/uH2o9BlryV1RLpiiKokDAJO/NOsWN9y0u/I4J EmAM5DSX2LgVvvas96IlGK8KX4xkOkf8FLX/H5UDvvAfloH8J4CZXk/CWCab/nqY KEHPnMmYvQZ1w9SzyZg9sO/1p6Bl1Gmm75Jv2F1lBiRW/42VcGBI/qLsJ4lC59Fc KbwufYNYYG38wbxDLW1HAPJhRonxIcaZj3EEqk7aTiLZ55nNbu8e2k32CpNXTGqt npOhzJHimcq7L6+878ZW+xpbZwogIEUdRSsmwb6aT8za3ShnYwSA2Q3LYxh9xyGH j3GifvPq6Efp3Q== =QMY1 -----END PGP SIGNATURE----- Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add support for (optionally) using queued spinlocks & rwlocks. - Support for a new faster system call ABI using the scv instruction on Power9 or later. - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on Power10 and future processors, leaving us with no way to implement the functionality it requests. This risks breaking userspace, though we believe it is unused in practice. - A bug fix for, and then the removal of, our custom stack expansion checking. We now allow stack expansion up to the rlimit, like other architectures. - Remove the remnants of our (previously disabled) topology update code, which tried to react to NUMA layout changes on virtualised systems, but was prone to crashes and other problems. - Add PMU support for Power10 CPUs. - A change to our signal trampoline so that we don't unbalance the link stack (branch return predictor) in the signal delivery path. - Lots of other cleanups, refactorings, smaller features and so on as usual. Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong, YueHaibing. * tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits) selftests/powerpc: Fix pkey syscall redefinitions powerpc: Fix circular dependency between percpu.h and mmu.h powerpc/powernv/sriov: Fix use of uninitialised variable selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs powerpc/40x: Fix assembler warning about r0 powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric powerpc/papr_scm: Fetch nvdimm performance stats from PHYP cpuidle: pseries: Fixup exit latency for CEDE(0) cpuidle: pseries: Add function to parse extended CEDE records cpuidle: pseries: Set the latency-hint before entering CEDE selftests/powerpc: Fix online CPU selection powerpc/perf: Consolidate perf_callchain_user_[64\|32]() powerpc/pseries/hotplug-cpu: Remove double free in error path powerpc/pseries/mobility: Add pr_debug() for device tree changes powerpc/pseries/mobility: Set pr_fmt() powerpc/cacheinfo: Warn if cache object chain becomes unordered powerpc/cacheinfo: Improve diagnostics about malformed cache lists powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages powerpc/cacheinfo: Set pr_fmt() powerpc: fix function annotations to avoid section mismatch warnings with gcc-10 ...	2020-08-07 10:33:50 -07:00
Gautham R. Shenoy	d947fb4c96	cpuidle: pseries: Fixup exit latency for CEDE(0) We are currently assuming that CEDE(0) has exit latency 10us, since there is no way for us to query from the platform. However, if the wakeup latency of an Extended CEDE state is smaller than 10us, then we can be sure that the exit latency of CEDE(0) cannot be more than that. In this patch, we fix the exit latency of CEDE(0) if we discover an Extended CEDE state with wakeup latency smaller than 10us. Benchmark results: On POWER8, this patch does not have any impact since the advertized latency of Extended CEDE (1) is 30us which is higher than the default latency of CEDE (0) which is 10us. On POWER9 we see improvement the single-threaded performance of ebizzy, and no regression in the wakeup latency or the number of context-switches. ebizzy: 2 ebizzy threads bound to the same big-core. 25% improvement in the avg records/s with patch. x without_patch * with_patch N Min Max Median Avg Stddev x 10 2491089 5834307 5398375 4244335 1596244.9 * 10 2893813 5834474 5832448 5327281.3 1055941.4 context_switch2: There is no major regression observed with this patch as seen from the context_switch2 benchmark. context_switch2 across CPU0 CPU1 (Both belong to same big-core, but different small cores). We observe a minor 0.14% regression in the number of context-switches (higher is better). x without_patch * with_patch N Min Max Median Avg Stddev x 500 348872 362236 354712 354745.69 2711.827 * 500 349422 361452 353942 354215.4 2576.9258 Difference at 99.0% confidence -530.288 +/- 430.963 -0.149484% +/- 0.121485% (Student's t, pooled s = 2645.24) context_switch2 across CPU0 CPU8 (Different big-cores). We observe a 0.37% improvement in the number of context-switches (higher is better). x without_patch * with_patch N Min Max Median Avg Stddev x 500 287956 294940 288896 288977.23 646.59295 * 500 288300 294646 289582 290064.76 1161.9992 Difference at 99.0% confidence 1087.53 +/- 153.194 0.376337% +/- 0.0530125% (Student's t, pooled s = 940.299) schbench: No major difference could be seen until the 99.9th percentile. Without-patch: Latency percentiles (usec) 50.0th: 29 75.0th: 39 90.0th: 49 95.0th: 59 99.0th: 13104 99.5th: 14672 99.9th: 15824 min=0, max=17993 With-patch: Latency percentiles (usec) 50.0th: 29 75.0th: 40 90.0th: 50 95.0th: 61 99.0th: 13648 99.5th: 14768 99.9th: 15664 min=0, max=29812 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> [mpe: Minor formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1596087177-30329-4-git-send-email-ego@linux.vnet.ibm.com	2020-07-30 22:53:50 +10:00
Gautham R. Shenoy	054e44ba99	cpuidle: pseries: Add function to parse extended CEDE records Currently we use CEDE with latency-hint 0 as the only other idle state on a dedicated LPAR apart from the polling "snooze" state. The platform might support additional extended CEDE idle states, which can be discovered through the "ibm,get-system-parameter" rtas-call made with CEDE_LATENCY_TOKEN. This patch adds a function to obtain information about the extended CEDE idle states from the platform and parse the contents to populate an array of extended CEDE states. These idle states thus discovered will be added to the cpuidle framework in the next patch. dmesg on a POWER8 and POWER9 LPAR, demonstrating the output of parsing the extended CEDE latency parameters are as follows POWER8 [ 10.093279] xcede : xcede_record_size = 10 [ 10.093285] xcede : Record 0 : hint = 1, latency = 0x3c00 tb ticks, Wake-on-irq = 1 [ 10.093291] xcede : Record 1 : hint = 2, latency = 0x4e2000 tb ticks, Wake-on-irq = 0 [ 10.093297] cpuidle : Skipping the 2 Extended CEDE idle states POWER9 [ 5.913180] xcede : xcede_record_size = 10 [ 5.913183] xcede : Record 0 : hint = 1, latency = 0x400 tb ticks, Wake-on-irq = 1 [ 5.913188] xcede : Record 1 : hint = 2, latency = 0x3e8000 tb ticks, Wake-on-irq = 0 [ 5.913193] cpuidle : Skipping the 2 Extended CEDE idle states Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> [mpe: Make space for 16 records, drop memset, minor cleanup & formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1596087177-30329-3-git-send-email-ego@linux.vnet.ibm.com	2020-07-30 22:53:50 +10:00
Gautham R. Shenoy	3af0ada7dd	cpuidle: pseries: Set the latency-hint before entering CEDE As per the PAPR, each H_CEDE call is associated with a latency-hint to be passed in the VPA field "cede_latency_hint". The CEDE states that we were implicitly entering so far is CEDE with latency-hint = 0. This patch explicitly sets the latency hint corresponding to the CEDE state that we are currently entering. While at it, we save the previous hint, to be restored once we wakeup from CEDE. This will be required in the future when we expose extended-cede states through the cpuidle framework, where each of them will have a different cede-latency hint. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> [mpe: Make cede_latency_hint static] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1596087177-30329-2-git-send-email-ego@linux.vnet.ibm.com	2020-07-30 22:53:50 +10:00
Neal Liu	efe9711214	cpuidle: change enter_s2idle() prototype Control Flow Integrity(CFI) is a security mechanism that disallows changes to the original control flow graph of a compiled binary, making it significantly harder to perform such attacks. init_state_node() assign same function callback to different function pointer declarations. static int init_state_node(struct cpuidle_state idle_state, const struct of_device_id matches, struct device_node state_node) { ... idle_state->enter = match_id->data; ... idle_state->enter_s2idle = match_id->data; } Function declarations: struct cpuidle_state { ... int (enter) (struct cpuidle_device dev, struct cpuidle_driver drv, int index); void (enter_s2idle) (struct cpuidle_device dev, struct cpuidle_driver *drv, int index); }; In this case, either enter() or enter_s2idle() would cause CFI check failed since they use same callee. Align function prototype of enter() since it needs return value for some use cases. The return value of enter_s2idle() is no need currently. Signed-off-by: Neal Liu <neal.liu@mediatek.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Ulf Hansson	81f94ddfec	cpuidle: psci: Prevent domain idlestates until consumers are ready Depending on the SoC/platform, additional devices may be part of the PSCI PM domain topology. This is the case with 'qcom,rpmh-rsc' device, for example, even if this is not yet visible in the corresponding DTS-files. Without going into too much details, a device like the 'qcom,rpmh-rsc' may have HW constraints that needs to be obeyed to, before a domain idlestate can be picked. Therefore, let's implement the ->sync_state() callback to receive a notification when all consumers of the PSCI PM domain providers have been attached/probed to it. In this way, we can make sure all constraints from all relevant devices, are taken into account before allowing a domain idlestate to be picked. Acked-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Ulf Hansson	ee7c34caac	cpuidle: psci: Convert PM domain to platform driver To enable support for deferred probing and to allow implementation of the ->sync_state() callback from subsequent changes, let's convert into a platform driver. Reviewed-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Ulf Hansson	166bf83529	cpuidle: psci: Fix error path via converting to a platform driver The current error paths for the cpuidle-psci driver, may leak memory or possibly leave CPU devices attached to their PM domains. These are quite harmless issues, but still deserves to be taken care of. Although, rather than fixing them by keeping track of allocations that needs to be freed, which tends to become a bit messy, let's convert into a platform driver. In this way, it gets easier to fix the memory leaks as we can rely on the devm_* functions. Moreover, converting to a platform driver also enables support for deferred probe, which subsequent changes takes benefit from. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Ulf Hansson	4b072cd6ac	cpuidle: psci: Fail cpuidle registration if set OSI mode failed Currently we allow the cpuidle driver registration to succeed, even if we failed to enable the OSI mode when the hierarchical DT layout is used. This means running in a degraded mode, by using the available idle states per CPU, while also preventing the domain idle states. Moving forward, this behaviour looks quite questionable to maintain, as complexity seems to grow around it, especially when trying to add support for deferred probe, for example. Therefore, let's make the cpuidle driver registration to fail in this situation, thus relying on the default architectural cpuidle backend for WFI to be used. Reviewed-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Ulf Hansson	0317561912	cpuidle: psci: Split into two separate build objects The combined build object for the PSCI cpuidle driver and the PSCI PM domain, is a bit messy. Therefore let's split it up by adding a new Kconfig ARM_PSCI_CPUIDLE_DOMAIN and convert into two separate objects. Reviewed-by: Lina Iyer <ilina@codeaurora.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-07-29 18:38:30 +02:00
Wei Yongjun	92fe8483b1	cpuidle/pseries: Make symbol 'pseries_idle_driver' static The sparse tool complains as follows: drivers/cpuidle/cpuidle-pseries.c:25:23: warning: symbol 'pseries_idle_driver' was not declared. Should it be static? 'pseries_idle_driver' is not used outside of this file, so marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200714142424.66648-1-weiyongjun1@huawei.com	2020-07-16 13:12:45 +10:00
Abhishek Goel	c339f9be30	cpuidle/powernv : Remove dead code block Commit `1961acad2f` removes usage of function "validate_dt_prop_sizes". This patch removes this unused function. Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200706053258.121475-1-huntbag@linux.vnet.ibm.com	2020-07-15 11:07:20 +10:00
Rafael J. Wysocki	10e8b11eb3	cpuidle: Rearrange s2idle-specific idle state entry code Implement call_cpuidle_s2idle() in analogy with call_cpuidle() for the s2idle-specific idle state entry and invoke it from cpuidle_idle_call() to make the s2idle-specific idle entry code path look more similar to the "regular" idle entry one. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Chen Yu <yu.c.chen@intel.com>	2020-06-25 13:52:53 +02:00
Chen Yu	81e6737518	PM: s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle Suspend to idle was found to not work on Goldmont CPU recently. The issue happens due to: 1. On Goldmont the CPU in idle can only be woken up via IPIs, not POLLING mode, due to commit `08e237fa56` ("x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs") 2. When the CPU is entering suspend to idle process, the _TIF_POLLING_NRFLAG remains on, because cpuidle_enter_s2idle() doesn't match call_cpuidle() exactly. 3. Commit `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to idle CPUs. 4. As a result, some IPIs related functions might not work well during suspend to idle on Goldmont. For example, one suspected victim: tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume() -> clock_was_set() -> on_each_cpu() might wait forever, because the IPIs will not be sent to the CPUs which are sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU could not be woken up by only setting _TIF_NEED_RESCHED on the monitor address. To avoid that, clear the _TIF_POLLING_NRFLAG flag before invoking enter_s2idle_proper() in cpuidle_enter_s2idle() in analogy with the call_cpuidle() code flow. Fixes: `b2a02fc43a` ("smp: Optimize send_call_function_single_ipi()") Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> [ rjw: Subject / changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-06-23 17:06:55 +02:00
Linus Torvalds	df2fbf5bfa	- Add the hwmon support on the i.MX SC (Anson Huang) - Thermal framework cleanups (self-encapsulation, pointless stubs, private structures) (Daniel Lezcano) - Use the PM QoS frequency changes for the devfreq cooling device (Matthias Kaehlcke) - Remove duplicate error messages from platform_get_irq() error handling (Markus Elfring) - Add support for the bandgap sensors (Keerthy) - Statically initialize .get_mode/.set_mode ops (Andrzej Pietrasiewicz) - Add Renesas R-Car maintainer entry (Niklas Söderlund) - Fix error checking after calling ti_bandgap_get_sensor_data() for the TI SoC thermal (Sudip Mukherjee) - Add latency constraint for the idle injection, the DT binding and the change the registering function (Daniel Lezcano) - Convert the thermal framework binding to the Yaml schema (Amit Kucheria) - Replace zero-length array with flexible-array on i.MX 8MM (Gustavo A. R. Silva) - Thermal framework cleanups (alphabetic order for heads, replace module.h by export.h, make file naming consistent) (Amit Kucheria) - Merge tsens-common into the tsens driver (Amit Kucheria) - Fix platform dependency for the Qoriq driver (Geert Uytterhoeven) - Clean up the rcar_thermal_update_temp() function in the rcar thermal driver (Niklas Söderlund) - Fix the TMSAR register for the TMUv2 on the Qoriq platform (Yuantian Tang) - Export GDDV, OEM vendor variables, and don't require IDSP for the int340x thermal driver - trivial conflicts fixed (Matthew Garrett) -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAl7jra8ACgkQqDIjiipP 6E+ugAgApBF6FsHoonWIvoSrzBrrbU2oqhEJA42Mx+iY/UnXi01I79vZ/8WpZt7M D1J01Kf0PUhRbywoKaoCX3Oh9ZO9PKq4N9ZC8yqdoD6GLl+rC9Wmr7Ui+c80klcv M9rYhpPYfNXTFj0saSbbFWNNhP4TvhzGsNj8foYVQDKyhjbSmNE5ipZlbmP23jlr O53SmJAwS5zxLOd8QA5nfSWP9FYYMuCR2AHj8BUCmxiAjXZLPNB/Hz2RRBr7q0MF zRo/4HJ04mSQYp0kluP/EBhz9g2wM/htIPyWRveB/ByKEYt3UNKjB++PJmPbu5UG dS3aXZhRfaPqpdsWrMB9fY7ll+oyfw== =T+RI -----END PGP SIGNATURE----- Merge tag 'thermal-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux Pull thermal updates from Daniel Lezcano: - Add the hwmon support on the i.MX SC (Anson Huang) - Thermal framework cleanups (self-encapsulation, pointless stubs, private structures) (Daniel Lezcano) - Use the PM QoS frequency changes for the devfreq cooling device (Matthias Kaehlcke) - Remove duplicate error messages from platform_get_irq() error handling (Markus Elfring) - Add support for the bandgap sensors (Keerthy) - Statically initialize .get_mode/.set_mode ops (Andrzej Pietrasiewicz) - Add Renesas R-Car maintainer entry (Niklas Söderlund) - Fix error checking after calling ti_bandgap_get_sensor_data() for the TI SoC thermal (Sudip Mukherjee) - Add latency constraint for the idle injection, the DT binding and the change the registering function (Daniel Lezcano) - Convert the thermal framework binding to the Yaml schema (Amit Kucheria) - Replace zero-length array with flexible-array on i.MX 8MM (Gustavo A. R. Silva) - Thermal framework cleanups (alphabetic order for heads, replace module.h by export.h, make file naming consistent) (Amit Kucheria) - Merge tsens-common into the tsens driver (Amit Kucheria) - Fix platform dependency for the Qoriq driver (Geert Uytterhoeven) - Clean up the rcar_thermal_update_temp() function in the rcar thermal driver (Niklas Söderlund) - Fix the TMSAR register for the TMUv2 on the Qoriq platform (Yuantian Tang) - Export GDDV, OEM vendor variables, and don't require IDSP for the int340x thermal driver - trivial conflicts fixed (Matthew Garrett) * tag 'thermal-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (48 commits) thermal/int340x_thermal: Don't require IDSP to exist thermal/int340x_thermal: Export OEM vendor variables thermal/int340x_thermal: Export GDDV thermal: qoriq: Update the settings for TMUv2 thermal: rcar_thermal: Clean up rcar_thermal_update_temp() thermal: qoriq: Add platform dependencies drivers: thermal: tsens: Merge tsens-common.c into tsens.c thermal/of: Rename of-thermal.c thermal/governors: Prefix all source files with gov_ thermal/drivers/user_space: Sort headers alphabetically thermal/drivers/of-thermal: Sort headers alphabetically thermal/drivers/cpufreq_cooling: Replace module.h with export.h thermal/drivers/cpufreq_cooling: Sort headers alphabetically thermal/drivers/clock_cooling: Include export.h thermal/drivers/clock_cooling: Sort headers alphabetically thermal/drivers/thermal_hwmon: Include export.h thermal/drivers/thermal_hwmon: Sort headers alphabetically thermal/drivers/thermal_helpers: Include export.h thermal/drivers/thermal_helpers: Sort headers alphabetically thermal/core: Replace module.h with export.h ...	2020-06-12 14:10:21 -07:00
Linus Torvalds	7ae77150d9	powerpc updates for 5.8 - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9. - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation. - A series of fixes & enhancements to make our machine check handling more robust. - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10. - Support for using huge pages for the linear mapping on 8xx (32-bit). - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver. - Removal of some obsolete 40x platforms and associated cruft. - Initial support for booting on Power10. - Lots of other small features, cleanups & fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7aYZ8THG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgPiKD/9zNCuZLFMAFrIdbm0HlYA2RGYZFT75 GUHsqYyei1pxA7PgM3KwJiXELVODsBv0eQbgNh1tbecKrxPRegN/cywd1KLjPZ7I v5/qweQP8MvR0RhzjbhvUcO0jq/f8u2LbJr5mUfVzjU6tAvrvcWo3oZqDElsekCS kgyOH3r1vZ2PLTMiGFhb0gWi2iqc+6BHU1AFCGPCMjB1Vu5d5+54VvZ/6lllGsOF yg9CBXmmVvQ+Bn6tH4zdEB78FYxnAIwBqlbmL79i5ca+HQJ0Sw6HuPRy9XYq35p6 2EiXS4Wrgp7i7+1TN3HO362u5Onb8TSyQU7NS6yCFPoJ6JQxcJMBIw6mHhnXOPuZ CrjgcdwUMjx8uDoKmX1Epbfuex2w+AysW+4yBHPFiSgl3klKC3D0wi95mR485w2F rN8uzJtrDeFKcYZJG7IoB/cgFCCPKGf9HaXr8q0S/jBKMffx91ul3cfzlfdIXOCw FDNw/+ZX7UD6ddFEG12ZTO+vdL8yf1uCRT/DIZwUiDMIA0+M6F4nc7j3lfyZfoO1 65f9UlhoLxScq7VH2fKH4UtZatO9cPID2z1CmiY4UbUIPtFDepSuYClgLF+Duf4b rkfxhKU0+Ja1zNH5XNc+L+Bc5/W4lFiJXz02dYIjtHoUpWkc1aToOETVwzggYFNM G3PXIBOI0jRgRw== =o0WU -----END PGP SIGNATURE----- Merge tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Support for userspace to send requests directly to the on-chip GZIP accelerator on Power9. - Rework of our lockless page table walking (__find_linux_pte()) to make it safe against parallel page table manipulations without relying on an IPI for serialisation. - A series of fixes & enhancements to make our machine check handling more robust. - Lots of plumbing to add support for "prefixed" (64-bit) instructions on Power10. - Support for using huge pages for the linear mapping on 8xx (32-bit). - Remove obsolete Xilinx PPC405/PPC440 support, and an associated sound driver. - Removal of some obsolete 40x platforms and associated cruft. - Initial support for booting on Power10. - Lots of other small features, cleanups & fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Andrey Abramov, Aneesh Kumar K.V, Balamuruhan S, Bharata B Rao, Bulent Abali, Cédric Le Goater, Chen Zhou, Christian Zigotzky, Christophe JAILLET, Christophe Leroy, Dmitry Torokhov, Emmanuel Nicolet, Erhard F., Gautham R. Shenoy, Geoff Levand, George Spelvin, Greg Kurz, Gustavo A. R. Silva, Gustavo Walbon, Haren Myneni, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Kees Cook, Leonardo Bras, Madhavan Srinivasan., Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Michal Simek, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Paul Mackerras, Pingfan Liu, Qian Cai, Ram Pai, Raphael Moreira Zinsly, Ravi Bangoria, Sam Bobroff, Sandipan Das, Segher Boessenkool, Stephen Rothwell, Sukadev Bhattiprolu, Tyrel Datwyler, Wolfram Sang, Xiongfeng Wang. * tag 'powerpc-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (299 commits) powerpc/pseries: Make vio and ibmebus initcalls pseries specific cxl: Remove dead Kconfig options powerpc: Add POWER10 architected mode powerpc/dt_cpu_ftrs: Add MMA feature powerpc/dt_cpu_ftrs: Enable Prefixed Instructions powerpc/dt_cpu_ftrs: Advertise support for ISA v3.1 if selected powerpc: Add support for ISA v3.1 powerpc: Add new HWCAP bits powerpc/64s: Don't set FSCR bits in INIT_THREAD powerpc/64s: Save FSCR to init_task.thread.fscr after feature init powerpc/64s: Don't let DT CPU features set FSCR_DSCR powerpc/64s: Don't init FSCR_DSCR in __init_FSCR() powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG powerpc/module_64: Use special stub for _mcount() with -mprofile-kernel powerpc/module_64: Simplify check for -mprofile-kernel ftrace relocations powerpc/module_64: Consolidate ftrace code powerpc/32: Disable KASAN with pages bigger than 16k powerpc/uaccess: Don't set KUEP by default on book3s/32 powerpc/uaccess: Don't set KUAP by default on book3s/32 powerpc/8xx: Reduce time spent in allow_user_access() and friends ...	2020-06-05 12:39:30 -07:00
Linus Torvalds	828f3e18e1	ARM/SoC: drivers for v5.7 These are updates to SoC specific drivers that did not have another subsystem maintainer tree to go through for some reason: - Some bus and memory drivers for the MIPS P5600 based Baikal-T1 SoC that is getting added through the MIPS tree. - There are new soc_device identification drivers for TI K3, Qualcomm MSM8939 - New reset controller drivers for NXP i.MX8MP, Renesas RZ/G1H, and Hisilicon hi6220 - The SCMI firmware interface can now work across ARM SMC/HVC as a transport. - Mediatek platforms now use a new driver for their "MMSYS" hardware block that controls clocks and some other aspects in behalf of the media and gpu drivers. - Some Tegra processors have improved power management support, including getting woken up by the PMIC and cluster power down during idle. - A new v4l staging driver for Tegra is added. - Cleanups and minor bugfixes for TI, NXP, Hisilicon, Mediatek, and Tegra. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl7XvgAACgkQmmx57+YA GNmj/hAAnAJ/hYehLfgCe711HUntgeRkaoTVpCt8BJNMdxsa23sn3V6k5+WYn1uG PtlgpefZEMHLUEEVDegR4nZXLG0Pzu1SR12KW34YPcQKkNo/+vlQ9zYUajnJ/KX6 10zdLSIzHfk1VtXKvvQQ8xFyE+S/trGmjC57E6gfoCUT3rl1maD+ccVXUBaz9oob wuMxGXQAl57mio5yT1OfSk6Fev39xRE2dN1hzP7KUYhsemZajBwBBW5wVJZCsCB8 LCGmxVkavM7BV4r2NokbBDs5rlfedBl/P/IPd9Is5a5tuGUkSsVRG9zqShxYLGM3 S06az6POQFwXKFJoUKW0dK/Koy0D7BK+vhUBPzFv4HZ8iDCVf6Jju2MJ02GMqHPj OOrXaCbLYrvN/edVUWeeFywqwMbYTRwC4DxyTq5m7HxEB004xTOhs0rX5aR0u4n1 bbsR97LguolwH9iEMzd3F3jCiKBcMecH3lAh5WcrtwlFIRrNhbWoGDoA/4TuORFS b11rgsTRIJ5Vc++D1HnSnx0ZZvUzyluMvygdALnSgVah6xYe6KVw9Kg/wioAJ04G uSTidqP3qRhsyET2HQo7CxdVfZbKfP25iKCKrdhQziztKvhF8qrUmZKloXOodRw+ ewYSRmv8c324OYYit1X43oAdW8dntq1XbSIauaqxEb4JC7x/xRY= =44Jc -----END PGP SIGNATURE----- Merge tag 'arm-drivers-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM/SoC driver updates from Arnd Bergmann: "These are updates to SoC specific drivers that did not have another subsystem maintainer tree to go through for some reason: - Some bus and memory drivers for the MIPS P5600 based Baikal-T1 SoC that is getting added through the MIPS tree. - There are new soc_device identification drivers for TI K3, Qualcomm MSM8939 - New reset controller drivers for NXP i.MX8MP, Renesas RZ/G1H, and Hisilicon hi6220 - The SCMI firmware interface can now work across ARM SMC/HVC as a transport. - Mediatek platforms now use a new driver for their "MMSYS" hardware block that controls clocks and some other aspects in behalf of the media and gpu drivers. - Some Tegra processors have improved power management support, including getting woken up by the PMIC and cluster power down during idle. - A new v4l staging driver for Tegra is added. - Cleanups and minor bugfixes for TI, NXP, Hisilicon, Mediatek, and Tegra" * tag 'arm-drivers-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (155 commits) clk: sprd: fix compile-testing bus: bt1-axi: Build the driver into the kernel bus: bt1-apb: Build the driver into the kernel bus: bt1-axi: Use sysfs_streq instead of strncmp bus: bt1-axi: Optimize the return points in the driver bus: bt1-apb: Use sysfs_streq instead of strncmp bus: bt1-apb: Use PTR_ERR_OR_ZERO to return from request-regs method bus: bt1-apb: Fix show/store callback identations bus: bt1-apb: Include linux/io.h dt-bindings: memory: Add Baikal-T1 L2-cache Control Block binding memory: Add Baikal-T1 L2-cache Control Block driver bus: Add Baikal-T1 APB-bus driver bus: Add Baikal-T1 AXI-bus driver dt-bindings: bus: Add Baikal-T1 APB-bus binding dt-bindings: bus: Add Baikal-T1 AXI-bus binding staging: tegra-video: fix V4L2 dependency tee: fix crypto select drivers: soc: ti: knav_qmss_queue: Make knav_gp_range_ops static soc: ti: add k3 platforms chipid module driver dt-bindings: soc: ti: add binding for k3 platforms chipid module ...	2020-06-04 19:56:20 -07:00
Qiushi Wu	c343bf1ba5	cpuidle: Fix three reference count leaks kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit "b8eb718348b8" fixed a similar problem. Signed-off-by: Qiushi Wu <wu000273@umn.edu> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-29 18:07:18 +02:00
Stephan Gerhold	a871be6b8e	cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver The Qualcomm SPM cpuidle driver seems to be the last driver still using the generic ARM CPUidle infrastructure. Converting it actually allows us to simplify the driver, and we end up being able to remove more lines than adding new ones: - We can parse the CPUidle states in the device tree directly with dt_idle_states (and don't need to duplicate that functionality into the spm driver). - Each "saw" device managed by the SPM driver now directly registers its own cpuidle driver, removing the need for any global (per cpu) state. The device tree binding is the same, so the driver stays compatible with all old device trees. Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Reviewed-by: Lina Iyer <ilina@codeaurora.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-26 10:46:01 +02:00
Hanjun Guo	cce55cc902	cpuidle: sysfs: Remove sysfs_switch and switch attributes Since the cpuidle governor can be switched via sysfs in default, remove sysfs_switch and cpuidle_switch_attrs. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-19 17:41:17 +02:00
Hanjun Guo	b52e93e4e8	cpuidle: Make cpuidle governor switchable to be the default behaviour For now cpuidle governor can be switched via sysfs only when the boot option "cpuidle_sysfs_switch" is passed, but it's important to switch the governor to adapt to different workloads, especially after TEO and haltpoll governor were introduced. Add available_governors and current_governor into the default attributes, but reserve the current_governor_ro for compatiblity. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-19 17:41:17 +02:00
Hanjun Guo	ef7e7d65eb	cpuidle: sysfs: Accept governor name with 15 characters CPUIDLE_NAME_LEN is 16, so it's possible to accept governor name with 15 characters, but now store_current_governor() rejects governor name with 15 characters as it returns -EINVAL if count equals CPUIDLE_NAME_LEN. Refactor the code to accept such case and simplify the code. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-19 17:41:17 +02:00
Hanjun Guo	3f9f8daad3	cpuidle: sysfs: Fix the overlap for showing available governors When showing the available governors, it's "%s " in scnprintf(), not "%s", so if the governor name has 15 characters, it will overlap with the later one, fix it by adding one more for the size. While we are at it, fix the minor coding style issue and remove the "/sizeof(char)" since sizeof(char) always equals 1. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-19 17:41:17 +02:00
Daniel Lezcano	fc7a3d9e9c	thermal: cpuidle: Register cpuidle cooling device The cpuidle driver can be used as a cooling device by injecting idle cycles. When the property is set, register the cpuidle driver with the idle state node pointer as a cooling device. The thermal framework will do the association automatically with the thermal zone via the cooling-device defined in the device tree cooling-maps section. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20200429103644.5492-4-daniel.lezcano@linaro.org	2020-05-19 12:58:07 +02:00
Ulf Hansson	8b7ce5e490	cpuidle: psci: Fixup execution order when entering a domain idle state Moving forward, platforms are going to need to execute specific "last-man" operations before a domain idle state can be entered. In one way or the other, these operations needs to be triggered while walking the hierarchical topology via runtime PM and genpd, as it's at that point the last-man becomes known. Moreover, executing last-man operations needs to be done after the CPU PM notifications are sent through cpu_pm_enter(), as otherwise it's likely that some notifications would fail. Therefore, let's re-order the sequence in psci_enter_domain_idle_state(), so cpu_pm_enter() gets called prior pm_runtime_put_sync(). Fixes: `ce85aef570` ("cpuidle: psci: Manage runtime PM in the idle path") Reported-by: Lina Iyer <ilina@codeaurora.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-05-15 18:37:36 +02:00
Dmitry Osipenko	fafd62e768	cpuidle: tegra: Support CPU cluster power-down state on Tegra30 The new Tegra CPU Idle driver now has a unified code path for the coupled CC6 (LP2) state, this allows to enable the deepest idling state on Tegra30 SoC where the whole CPU cluster is power-gated. Tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Jasper Korten <jja2000@gmail.com> Tested-by: David Heidelberg <david@ixit.cz> Tested-by: Peter Geis <pgwipeout@gmail.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-05-06 18:42:55 +02:00
Gautham R. Shenoy	c4019198cf	powerpc/idle: Store PURR snapshot in a per-cpu global variable Currently when CPU goes idle, we take a snapshot of PURR via pseries_idle_prolog() which is used at the CPU idle exit to compute the idle PURR cycles via the function pseries_idle_epilog(). Thus, the value of idle PURR cycle thus read before pseries_idle_prolog() and after pseries_idle_epilog() is always correct. However, if we were to read the idle PURR cycles from an interrupt context between pseries_idle_prolog() and pseries_idle_epilog() (this will be done in a future patch), then, the value of the idle PURR thus read will not include the cycles spent in the most recent idle period. Thus, in that interrupt context, we will need access to the snapshot of the PURR before going idle, in order to compute the idle PURR cycles for the latest idle duration. In this patch, we save the snapshot of PURR in pseries_idle_prolog() in a per-cpu variable, instead of on the stack, so that it can be accessed from an interrupt context. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1586249263-14048-3-git-send-email-ego@linux.vnet.ibm.com	2020-04-30 12:35:26 +10:00
Gautham R. Shenoy	e4a884cc28	powerpc: Move idle_loop_prolog()/epilog() functions to header file Currently prior to entering an idle state on a Linux Guest, the pseries cpuidle driver implement an idle_loop_prolog() and idle_loop_epilog() functions which ensure that idle_purr is correctly computed, and the hypervisor is informed that the CPU cycles have been donated. These prolog and epilog functions are also required in the default idle call, i.e pseries_lpar_idle(). Hence move these accessor functions to a common header file and call them from pseries_lpar_idle(). Since the existing header files such as asm/processor.h have enough clutter, create a new header file asm/idle.h. Finally rename idle_loop_prolog() and idle_loop_epilog() to pseries_idle_prolog() and pseries_idle_epilog() as they are only relavent for on pseries guests. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1586249263-14048-2-git-send-email-ego@linux.vnet.ibm.com	2020-04-30 12:35:26 +10:00
Hanjun Guo	eba933ceeb	cpuidle: sysfs: Minor coding style corrections Fix two minor coding style issues. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-04-29 13:33:24 +02:00
Hanjun Guo	2f516e7cbe	cpuidle: sysfs: Remove the unused define_one_r(o/w) macros The define_one_ro and define_one_rw macros are not used, remove it. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-04-29 13:33:24 +02:00
Rafael J. Wysocki	a31434bcd4	Merge branch 'pm-cpuidle' * pm-cpuidle: cpuidle-haltpoll: Fix small typo	2020-04-10 11:32:22 +02:00
Yihao Wu	4902f7fcb3	cpuidle-haltpoll: Fix small typo Fix a spelling typo in cpuidle-haltpoll.c. Signed-off-by: Yihao Wu <wuyihao@linux.alibaba.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-04-08 14:35:45 +02:00
Linus Torvalds	0e8fb69f28	ARM: SoC updates The code changes are mostly for 32-bit platforms and include: - Lots of updates for the Nvidia Tegra platform, including cpuidle, pmc, and dt-binding changes - Microchip at91 power management updates for the recently added sam9x60 SoC - Treewide setup_irq deprecation by afzal mohammed - STMicroelectronics stm32 gains earlycon support - Renesas platforms with Cortex-A9 can now use the global timer - Some TI OMAP2+ platforms gain cpuidle support - Various cleanups for the i.MX6 and Orion platforms, as well as Kconfig files across all platforms Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl6EaLsACgkQmmx57+YA GNkg/Q/9EmLGT/bznqi6P/8V75qPY6j4swife5HlI43EdCEyN+iME1w1rFfrNilf A/QyXzxhYgUZoyIt7K4qjWc8fPNlJZ8X10UqeEftQFxi82cmX2+OaT2fy6OdHVRJ SAGb1pw7463TQ6LKA7LC7rztFjahsjnE/9szlgXQT4v5OzayGyxd3OKy2Q1zASEi rwN+85Lh+xJvWhhenPKVvs2Dei+up8Y9uyPfJkj2QudCB+zx5k5stkk4EiQLBd1W SHzhijbSU3MrAUz2Cnp9Qa+86DdGPvFPfpzQy3pSYU9nNrC0aKpS8YmHT99SIWVN 6R1YJF7Htmui5I9+O0baejJMEqvzGUysqe+rQdCofD7ooVKWU7WKYj5HxZxyBCEN dvlN3KRmS6l5KLsZARSxuBUw2MPTgjsxczZ84NKMLj8czw6yXyrePZ1RWiEZ4HIu 4GiFNLYSMmAynD/dLuC9USDsjlPsQKnQ3e8hDf3a6oK5OHUIkr3uvguhqWa5WJsi cy2DUeWJkXgJDhlxcfr8MiPpPJRo3N/8O8PYci8dkdDFRs32j/5Qf22vywDdUHaZ I9Pl+VOGOSGiqRc9gmay6DNXpJusfuv72omrz4rL4kGMahpk0LeV5w+a9TdrkreY sLM67wtshOjOVMc/ebQ5Q8RR17Go+MFK+Co9Yc1ybbQ6lkSzeuY= =UTk1 -----END PGP SIGNATURE----- Merge tag 'arm-soc-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC updates from Arnd Bergmann: "The code changes are mostly for 32-bit platforms and include: - Lots of updates for the Nvidia Tegra platform, including cpuidle, pmc, and dt-binding changes - Microchip at91 power management updates for the recently added sam9x60 SoC - Treewide setup_irq deprecation by afzal mohammed - STMicroelectronics stm32 gains earlycon support - Renesas platforms with Cortex-A9 can now use the global timer - Some TI OMAP2+ platforms gain cpuidle support - Various cleanups for the i.MX6 and Orion platforms, as well as Kconfig files across all platforms" * tag 'arm-soc-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (75 commits) ARM: qcom: Add support for IPQ40xx ARM: mmp: replace setup_irq() by request_irq() ARM: cns3xxx: replace setup_irq() by request_irq() ARM: spear: replace setup_irq() by request_irq() ARM: ep93xx: Replace setup_irq() by request_irq() ARM: iop32x: replace setup_irq() by request_irq() arm: mach-dove: Mark dove_io_desc as __maybe_unused ARM: orion: replace setup_irq() by request_irq() ARM: debug: stm32: add UART early console support for STM32MP1 ARM: debug: stm32: add UART early console support for STM32H7 ARM: debug: stm32: add UART early console configuration for STM32F7 ARM: debug: stm32: add UART early console configuration for STM32F4 cpuidle: tegra: Disable CC6 state if LP2 unavailable cpuidle: tegra: Squash Tegra114 driver into the common driver cpuidle: tegra: Squash Tegra30 driver into the common driver cpuidle: Refactor and move out NVIDIA Tegra20 driver into drivers/cpuidle ARM: tegra: cpuidle: Remove unnecessary memory barrier ARM: tegra: cpuidle: Make abort_flag atomic ARM: tegra: cpuidle: Handle case where secondary CPU hangs on entering LP2 ARM: tegra: Make outer_disable() open-coded ...	2020-04-03 15:02:35 -07:00
Rafael J. Wysocki	ada0629bd3	Merge branches 'pm-core', 'pm-sleep', 'pm-acpi' and 'pm-domains' * pm-core: PM: runtime: Add pm_runtime_get_if_active() * pm-sleep: PM: sleep: wakeup: Skip wakeup_source_sysfs_remove() if device is not there PM / hibernate: Remove unnecessary compat ioctl overrides PM: hibernate: fix docs for ioctls that return loff_t via pointer PM: sleep: wakeup: Use built-in RCU list checking PM: sleep: core: Use built-in RCU list checking * pm-acpi: ACPI: PM: s2idle: Refine active GPEs check ACPICA: Allow acpi_any_gpe_status_set() to skip one GPE ACPI: PM: s2idle: Fix comment in acpi_s2idle_prepare_late() * pm-domains: cpuidle: psci: Split psci_dt_cpu_init_idle() PM / Domains: Allow no domain-idle-states DT property in genpd when parsing	2020-03-30 14:46:58 +02:00
Rafael J. Wysocki	be4f65405a	Merge branch 'pm-cpuidle' * pm-cpuidle: cpuidle: haltpoll: allow force loading on hosts without the REALTIME hint intel_idle: Update copyright notice, known limitations and version intel_idle: Define CPUIDLE_FLAG_TLB_FLUSHED as BIT(16) intel_idle: Clean up kerneldoc comments for multiple functions intel_idle: Reorder declarations of static variables intel_idle: Annotate init time data structures intel_idle: Add __initdata annotations to init time variables intel_idle: Relocate definitions of cpuidle callbacks intel_idle: Clean up definitions of cpuidle callbacks intel_idle: Simplify LAPIC timer reliability checks	2020-03-30 14:46:17 +02:00
Ulf Hansson	7fbee48ea0	cpuidle: psci: Split psci_dt_cpu_init_idle() To make the code a bit more readable, let's move the OSI specific initialization out of the psci_dt_cpu_init_idle() and into a separate function. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-03-14 12:03:05 +01:00
Maciej S. Szmigiero	dd52551fb7	cpuidle: haltpoll: allow force loading on hosts without the REALTIME hint Before commit `1328edca4a` ("cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available") the cpuidle-haltpoll driver could also be used in scenarios when the host does not advertise the KVM_HINTS_REALTIME hint. While the behavior introduced by the aforementioned commit makes sense as the default there are cases where the old behavior is desired, for example, when other kernel changes triggered by presence by this hint are unwanted, for some workloads where the latency benefit from polling overweights the loss from idle CPU capacity that otherwise would be available, or just when running under older Qemu versions that lack this hint. Let's provide a typical "force" module parameter that allows restoring the old behavior. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-03-14 11:39:07 +01:00
Dmitry Osipenko	382ac8e22b	cpuidle: tegra: Disable CC6 state if LP2 unavailable LP2 suspending could be unavailable, for example if it is disabled in a device-tree. CC6 cpuidle state won't work in that case. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	14e086baca	cpuidle: tegra: Squash Tegra114 driver into the common driver Tegra20/30/114/124 SoCs have common idling states, thus there is no much point in having separate drivers for a similar hardware. This patch moves Tegra114/124 arch/ drivers into the common driver without any functional changes. The CC6 state is kept disabled on Tegra114/124 because the core Tegra PM code needs some more work in order to support that state. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	19461a499c	cpuidle: tegra: Squash Tegra30 driver into the common driver Tegra20 and Terga30 SoCs have common C1 and CC6 idling states and thus share the same code paths, there is no point in having separate drivers for a similar hardware. This patch merely moves functionality of the old driver into the new, although the CC6 state is kept disabled for now since old driver had a rudimentary support for this state (allowing to enter into CC6 only when secondary CPUs are put offline), while new driver can provide a full-featured support. The new feature will be enabled by another patch. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Tested-by: Peter Geis <pgwipeout@gmail.com> Tested-by: Jasper Korten <jja2000@gmail.com> Tested-by: David Heidelberg <david@ixit.cz> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:32:01 +01:00
Dmitry Osipenko	860fbde438	cpuidle: Refactor and move out NVIDIA Tegra20 driver into drivers/cpuidle The driver's code is refactored in a way that will make it easy to support Tegra30/114/124 SoCs by this unified driver later on. The current functionality is equal to the old Tegra20 driver, only the code's structure changed a tad. This is also a proper platform driver now. Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Thierry Reding <treding@nvidia.com>	2020-03-13 11:31:58 +01:00
Rafael J. Wysocki	f60ccc3558	cpuidle: Call cpu_latency_qos_limit() instead of pm_qos_request() Call cpu_latency_qos_limit() instead of pm_qos_request(), because the latter is going to be dropped. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org>	2020-02-13 11:26:57 +01:00
Rafael J. Wysocki	3a4a004222	PM: QoS: Drop PM_QOS_CPU_DMA_LATENCY notifier chain Notice that pm_qos_remove_notifier() is not used at all and the only caller of pm_qos_add_notifier() is the cpuidle core, which only needs the PM_QOS_CPU_DMA_LATENCY notifier to invoke wake_up_all_idle_cpus() upon changes of the PM_QOS_CPU_DMA_LATENCY target value. First, to ensure that wake_up_all_idle_cpus() will be called whenever the PM_QOS_CPU_DMA_LATENCY target value changes, modify the pm_qos_add/update/remove_request() family of functions to check if the effective constraint for the PM_QOS_CPU_DMA_LATENCY has changed and call wake_up_all_idle_cpus() directly in that case. Next, drop the PM_QOS_CPU_DMA_LATENCY notifier from cpuidle as it is not necessary any more. Finally, drop both pm_qos_add_notifier() and pm_qos_remove_notifier(), as they have no callers now, along with cpu_dma_lat_notifier which is only used by them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Tested-by: Amit Kucheria <amit.kucheria@linaro.org>	2020-02-13 11:26:27 +01:00
Linus Torvalds	eab3540562	ARM: SoC-related driver updates Various driver updates for platforms: - Nvidia: Fuse support for Tegra194, continued memory controller pieces for Tegra30 - NXP/FSL: Refactorings of QuickEngine drivers to support ARM/ARM64/PPC - NXP/FSL: i.MX8MP SoC driver pieces - TI Keystone: ring accelerator driver - Qualcomm: SCM driver cleanup/refactoring + support for new SoCs. - Xilinx ZynqMP: feature checking interface for firmware. Mailbox communication for power management - Overall support patch set for cpuidle on more complex hierarchies (PSCI-based) + Misc cleanups, refactorings of Marvell, TI, other platforms. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl4+lTYPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3nQcQAJm91+6hZbmMjlBySGS7ISjYvOcrI/hMgiOl uhhEP0Dcylvf9A9x3wcIbLwixe+2pvie9DQh2u5F80ShYimidtFi/2xCfuTb9fKu sxxKjrXWyVKhkpW0z+tedY08ftVhkwwcyD4m2C7uVl6AwTP7c367vFeU7XjF2APn drfgmgbjm8U3XbSyAqv+k6z6tyqaCnFM7vbPupSKHgHJ3mfByxOa+XyBN2RdgBbs 0KrVfbXGv80zFIFrMPwaWG7G52bu7K68nVdgy44MpKdRZ6QTjhnR+kerFxHsYgV4 bM55Fya52nTCSTGdKaQakDtKwbAUdCDTSkxgOHGcQoyFi0R/VaEUJtcysnvLbI6c +n/yFIzGyEdXcvIzfv2SoDYhogw19I6RR/M9K5Ni29eazkDVYx2z3rI+2QYeqCiF u7cq52gW6JLP0SI/9kuUrRFiR8v19Ixap7qokAxgqQwYB3NzT8a7WsYPkzdpDZGQ ETSDFMyBWT6UvBe/HWkQluBabbet53rG8BF0OHFrQuMK0u/ieKgSGuTB9XN2djEW PHMOMz2vhi+8XTfpkskhF2tTxlA/k4R6QwCdIMpIkMRVnVQCh1XdPr3Fi2NrgB+S kIXHD4vV6zLYh04zHyKewSPHAXWgraFpg2qKnvL5+KWMTnW6QH+RNjOt9xKDNXOd +iDXpOad =ONtb -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC-related driver updates from Olof Johansson: "Various driver updates for platforms: - Nvidia: Fuse support for Tegra194, continued memory controller pieces for Tegra30 - NXP/FSL: Refactorings of QuickEngine drivers to support ARM/ARM64/PPC - NXP/FSL: i.MX8MP SoC driver pieces - TI Keystone: ring accelerator driver - Qualcomm: SCM driver cleanup/refactoring + support for new SoCs. - Xilinx ZynqMP: feature checking interface for firmware. Mailbox communication for power management - Overall support patch set for cpuidle on more complex hierarchies (PSCI-based) and misc cleanups, refactorings of Marvell, TI, other platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (166 commits) drivers: soc: xilinx: Use mailbox IPI callback dt-bindings: power: reset: xilinx: Add bindings for ipi mailbox drivers: soc: ti: knav_qmss_queue: Pass lockdep expression to RCU lists MAINTAINERS: Add brcmstb PCIe controller entry soc/tegra: fuse: Unmap registers once they are not needed anymore soc/tegra: fuse: Correct straps' address for older Tegra124 device trees soc/tegra: fuse: Warn if straps are not ready soc/tegra: fuse: Cache values of straps and Chip ID registers memory: tegra30-emc: Correct error message for timed out auto calibration memory: tegra30-emc: Firm up hardware programming sequence memory: tegra30-emc: Firm up suspend/resume sequence soc/tegra: regulators: Do nothing if voltage is unchanged memory: tegra: Correct reset value of xusb_hostr soc/tegra: fuse: Add APB DMA dependency for Tegra20 bus: tegra-aconnect: Remove PM_CLK dependency dt-bindings: mediatek: add MT6765 power dt-bindings soc: mediatek: cmdq: delete not used define memory: tegra: Add support for the Tegra194 memory controller memory: tegra: Only include support for enabled SoCs memory: tegra: Support DVFS on Tegra186 and later ...	2020-02-08 14:04:19 -08:00
Rafael J. Wysocki	e6cf623ba3	Merge branch 'intel_idle+acpi' Merge changes updating the ACPI processor driver in order to export acpi_processor_evaluate_cst() to the code outside of it and adding ACPI support to the intel_idle driver based on that. * intel_idle+acpi: Documentation: admin-guide: PM: Add intel_idle document intel_idle: Use ACPI _CST on server systems intel_idle: Add module parameter to prevent ACPI _CST from being used intel_idle: Allow ACPI _CST to be used for selected known processors cpuidle: Allow idle states to be disabled by default intel_idle: Use ACPI _CST for processor models without C-state tables intel_idle: Refactor intel_idle_cpuidle_driver_init() ACPI: processor: Export acpi_processor_evaluate_cst() ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR ACPI: processor: Clean up acpi_processor_evaluate_cst() ACPI: processor: Introduce acpi_processor_evaluate_cst() ACPI: processor: Export function to claim _CST control	2020-01-23 00:35:50 +01:00
Benjamin Gaignard	cefb9409ff	cpuidle: fix cpuidle_find_deepest_state() kerneldoc warnings Fix cpuidle_find_deepest_state() kernel documentation to avoid warnings when compiling with W=1. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:33:31 +01:00
Benjamin Gaignard	a09da3fbc1	cpuidle: sysfs: fix warnings when compiling with W=1 Fix kernel documentation comments to remove warnings when compiling with W=1. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:31:31 +01:00
Benjamin Gaignard	32014c86d4	cpuidle: coupled: fix warnings when compiling with W=1 Fix warnings that show up when compiling with W=1 Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-23 00:30:28 +01:00
Rafael J. Wysocki	f7d50a1534	Merge back cpuidle material for v5.6.	2020-01-17 00:18:31 +01:00
Krzysztof Kozlowski	53eb82b097	cpuidle: arm: Enable compile testing for some of drivers Some of cpuidle drivers for ARMv7 can be compile tested on this architecture because they do not depend on mach-specific bits. Enable compile testing for big.LITTLE, Kirkwood, Zynq, AT91, Exynos and mvebu cpuidle drivers. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-14 23:48:46 +01:00
Ikjoon Jang	57388a2ccb	cpuidle: teo: Fix intervals[] array indexing bug Fix a simple bug in rotating array index. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Ikjoon Jang <ikjn@chromium.org> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-01-13 11:14:58 +01:00
Rafael J. Wysocki	577a2f41f4	cpuidle: Drop unused cpuidle_driver_ref/unref() functions The cpuidle_driver_ref() and cpuidle_driver_unref() functions are not used and the refcnt field in struct cpuidle_driver operated by them is not updated anywhere else (so it is permanently equal to 0), so drop both of them along with refcnt. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>	2020-01-09 16:47:22 +01:00
Ulf Hansson	a65a397f24	cpuidle: psci: Add support for PM domains by using genpd When the hierarchical CPU topology layout is used in DT and the PSCI OSI mode is supported by the PSCI FW, let's initialize a corresponding PM domain topology by using genpd. This enables a CPU and a group of CPUs, when attached to the topology, to be power-managed accordingly. To trigger the attempt to initialize the genpd data structures let's use a subsys_initcall, which should be early enough to allow CPUs, but also other devices to be attached. The initialization consists of parsing the PSCI OF node for the topology and the "domain idle states" DT bindings. In case the idle states are compatible with "domain-idle-state", the initialized genpd becomes responsible of selecting an idle state for the PM domain, via assigning it a genpd governor. Note that, a successful initialization of the genpd data structures, is followed by a call to psci_set_osi_mode(), as to try to enable the OSI mode in the PSCI FW. In case this fails, we fall back into a degraded mode rather than bailing out and returning error codes. Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:57 +01:00
Ulf Hansson	9c6ceecb65	cpuidle: psci: Support CPU hotplug for the hierarchical model When the hierarchical CPU topology is used and when a CPU is put offline, that CPU prevents its PM domain from being powered off, which is because genpd observes the corresponding attached device as being active from a runtime PM point of view. Furthermore, any potential master PM domains are also prevented from being powered off. To address this limitation, let's add add a new CPU hotplug state (CPUHP_AP_CPU_PM_STARTING) and register up/down callbacks for it, which allows us to deal with runtime PM accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:18 +01:00
Ulf Hansson	ce85aef570	cpuidle: psci: Manage runtime PM in the idle path In case we have succeeded to attach a CPU to its PM domain, let's deploy runtime PM support for the corresponding attached device, to allow the CPU to be powered-managed accordingly. The triggering point for when runtime PM reference counting should be done, has been selected to the deepest idle state for the CPU. However, from the hierarchical point view, there may be good reasons to do runtime PM reference counting even on shallower idle states, but at this point this isn't supported, mainly due to limitations set by the generic PM domain. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:52:06 +01:00
Ulf Hansson	a0cf319460	cpuidle: psci: Prepare to use OS initiated suspend mode via PM domains The per CPU variable psci_power_state, contains an array of fixed values, which reflects the corresponding arm,psci-suspend-param parsed from DT, for each of the available CPU idle states. This isn't sufficient when using the hierarchical CPU topology in DT, in combination with having PSCI OS initiated (OSI) mode enabled. More precisely, in OSI mode, Linux is responsible of telling the PSCI FW what idle state the cluster (a group of CPUs) should enter, while in PSCI Platform Coordinated (PC) mode, each CPU independently votes for an idle state of the cluster. For this reason, introduce a per CPU variable called domain_state and implement two helper functions to read/write its value. Then let the domain_state take precedence over the regular selected state, when entering and idle state. To avoid executing the above OSI specific code in the ->enter() callback, while operating in the default PSCI Platform Coordinated mode, let's also add a new enter-function and use it for OSI. Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:34 +01:00
Ulf Hansson	8554951a4d	cpuidle: psci: Attach CPU devices to their PM domains In order to enable a CPU to be power managed through its PM domain, let's try to attach it by calling psci_dt_attach_cpu() during the cpuidle initialization. psci_dt_attach_cpu() returns a pointer to the attached struct device, which later should be used for runtime PM, hence we need to store it somewhere. Rather than adding yet another per CPU variable, let's create a per CPU struct to collect the relevant per CPU variables. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:28 +01:00
Ulf Hansson	a5e0454cf3	cpuidle: psci: Add a helper to attach a CPU to its PM domain Introduce a PSCI DT helper function, psci_dt_attach_cpu(), which takes a CPU number as an in-parameter and tries to attach the CPU's struct device to its corresponding PM domain. Let's makes use of dev_pm_domain_attach_by_name(), as it allows us to specify "psci" as the "name" of the PM domain to attach to. Additionally, let's also prepare the attached device to be power managed via runtime PM. Note that, the implementation of the new helper function is in a new separate c-file, which may seems a bit too much at this point. However, subsequent changes that implements the remaining part of the PM domain support for cpuidle-psci, helps to justify this split. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:24 +01:00
Ulf Hansson	f08cfbfa4f	cpuidle: psci: Support hierarchical CPU idle states Currently CPU's idle states are represented using the flattened model. Let's add support for the hierarchical layout, via converting to use of_get_cpu_state_node(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:19 +01:00
Ulf Hansson	1595e4b09b	cpuidle: psci: Simplify OF parsing of CPU idle state nodes Iterating through the idle state nodes in DT, to find out the number of states that needs to be allocated is unnecessary, as it has already been done from dt_init_idle_driver(). Therefore, drop the iteration and use the number we already have at hand. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:15 +01:00
Lina Iyer	778f173eb4	cpuidle: dt: Support hierarchical CPU idle states Currently CPU's idle states are represented using the flattened model. Let's add support for the hierarchical layout, via converting to use of_get_cpu_state_node(). Suggested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Co-developed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:50:08 +01:00
Sudeep Holla	4386aa866d	cpuidle: psci: Align psci_power_state count with idle state count Instead of allocating 'n-1' states in psci_power_state to manage 'n' idle states which include "ARM WFI" state, it would be simpler to have 1:1 mapping between psci_power_state and cpuidle driver states. ARM WFI state(i.e. idx == 0) is handled specially in the generic macro CPU_PM_CPU_IDLE_ENTER_PARAM and hence state[-1] is not possible. However for sake of code readability, it is better to have 1:1 mapping and not use [idx - 1] to access psci_power_state corresponding to driver cpuidle state for idx. psci_power_state[0] is default initialised to 0 and is never accessed while entering WFI state. Reported-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org>	2020-01-02 16:42:13 +01:00
Rafael J. Wysocki	75a8026741	cpuidle: Allow idle states to be disabled by default In certain situations it may be useful to prevent some idle states from being used by default while allowing user space to enable them later on. For this purpose, introduce a new state flag, CPUIDLE_FLAG_OFF, to mark idle states that should be disabled by default, make the core set CPUIDLE_STATE_DISABLED_BY_USER for those states at the initialization time and add a new state attribute in sysfs, "default_status", to inform user space of the initial status of the given idle state ("disabled" if CPUIDLE_FLAG_OFF is set for it, "enabled" otherwise). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-27 11:02:08 +01:00
Yangtao Li	85c3ebd4a0	cpuidle: kirkwood: convert to devm_platform_ioremap_resource() Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-20 10:05:44 +01:00
Yangtao Li	22c48a439d	cpuidle: clps711x: convert to devm_platform_ioremap_resource() Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-20 10:04:27 +01:00
Rafael J. Wysocki	d4d8140176	cpuidle: Drop unnecessary type cast in cpuidle_poll_time() The data type of the target_residency_ns field in struct cpuidle_state is u64, so it does not need to be cast into u64. Get rid of the unnecessary type cast. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-12 17:56:08 +01:00
Rafael J. Wysocki	b0142d66f4	cpuidle: Fix cpuidle_driver_state_disabled() It turns out that cpuidle_driver_state_disabled() can be called before registering the cpufreq driver on some platforms, which was not expected when it was introduced and which leads to a NULL pointer dereference when trying to walk the CPUs associated with the given cpuidle driver. Fix the problem by making cpuidle_driver_state_disabled() check if the driver's mask of CPUs associated with it is present and to set CPUIDLE_FLAG_UNUSABLE for the given idle state in the driver's states list if that is not the case to cause __cpuidle_register_device() to set CPUIDLE_STATE_DISABLED_BY_DRIVER for that state for all cpuidle devices registered by it later. Fixes: `cbda56d5fe` ("cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks") Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org> Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-10 23:41:20 +01:00
Marcelo Tosatti	36fcb42924	cpuidle: use first valid target residency as poll time Commit `259231a045` ("cpuidle: add poll_limit_ns to cpuidle_device structure") changed, by mistake, the target residency from the first available sleep state to the last available sleep state (which should be longer). This might cause excessive polling. Fixes: `259231a045` ("cpuidle: add poll_limit_ns to cpuidle_device structure") Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Cc: 5.4+ <stable@vger.kernel.org> # 5.4+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-12-09 10:37:16 +01:00
Randy Dunlap	4d30d4a044	cpuidle: minor Kconfig help text fixes End sentences in help text with a period (aka full stop). Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:49:32 +01:00
Rafael J. Wysocki	ba1e78a1dc	cpuidle: Drop disabled field from struct cpuidle_state After recent cpuidle updates the "disabled" field in struct cpuidle_state is only used by two drivers (intel_idle and shmobile cpuidle) for marking unusable idle states, but that may as well be achieved with the help of a state flag, so define an "unusable" idle state flag, CPUIDLE_FLAG_UNUSABLE, make the drivers in question use it instead of the "disabled" field and make the core set CPUIDLE_STATE_DISABLED_BY_DRIVER for the idle states with that flag set. After the above changes, the "disabled" field in struct cpuidle_state is not used any more, so drop it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:48:39 +01:00
Krzysztof Kozlowski	656b4e6398	cpuidle: Fix Kconfig indentation Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-29 11:47:35 +01:00
Daniel Lezcano	5aa9ba6312	cpuidle: Pass exit latency limit to cpuidle_use_deepest_state() Modify cpuidle_use_deepest_state() to take an additional exit latency limit argument to be passed to find_deepest_idle_state() and make cpuidle_idle_call() pass dev->forced_idle_latency_limit_ns to it for forced idle. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Rebase and rearrange code, subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-20 11:46:18 +01:00
Daniel Lezcano	c55b51a06b	cpuidle: Allow idle injection to apply exit latency limit In some cases it may be useful to specify an exit latency limit for the idle state to be used during CPU idle time injection. Instead of duplicating the information in struct cpuidle_device or propagating the latency limit in the call stack, replace the use_deepest_state field with forced_latency_limit_ns to represent that limit, so that the deepest idle state with exit latency within that limit is forced (i.e. no governors) when it is set. A zero exit latency limit for forced idle means to use governors in the usual way (analogous to use_deepest_state equal to "false" before this change). Additionally, add play_idle_precise() taking two arguments, the duration of forced idle and the idle state exit latency limit, both in nanoseconds, and redefine play_idle() as a wrapper around that new function. This change is preparatory, no functional impact is expected. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> [ rjw: Subject, changelog, cpuidle_use_deepest_state() kerneldoc, whitespace ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-20 11:32:55 +01:00
Rafael J. Wysocki	cbda56d5fe	cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks Commit `99e98d3fb1` ("cpuidle: Consolidate disabled state checks") overlooked the fact that the imx6q and tegra20 cpuidle drivers use the "disabled" field in struct cpuidle_state for quirks which trigger after the initialization of cpuidle, so reading the initial value of that field is not sufficient for those drivers. In order to allow them to implement the quirks without using the "disabled" field in struct cpuidle_state, introduce a new helper function and modify them to use it. Fixes: `99e98d3fb1` ("cpuidle: Consolidate disabled state checks") Reported-by: Len Brown <lenb@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-19 10:35:13 +01:00
Rafael J. Wysocki	85f6a17f24	cpuidle: teo: Avoid code duplication in conditionals There are three places in teo_select() where a given amount of time is compared with TICK_NSEC if tick_nohz_tick_stopped() returns true, which is a bit of duplicated code. Avoid that code duplication by defining a helper function to do the check and using it in all of the places in question. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-15 00:54:33 +01:00
Rafael J. Wysocki	63f202e5ed	cpuidle: teo: Avoid using "early hits" incorrectly If the current state with the maximum "early hits" metric in teo_select() is also the one "matching" the expected idle duration, it will be used as the candidate one for selection even if its "misses" metric is greater than its "hits" metric, which is not correct. In that case, the candidate state should be shallower than the current one and its "early hits" metric should be the maximum among the idle states shallower than the current one. To make that happen, modify teo_select() to save the index of the state whose "early hits" metric is the maximum for the range of states below the current one and go back to that state if it turns out that the current one should be rejected. Fixes: `159e48560f` ("cpuidle: teo: Fix "early hits" handling for disabled idle states") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-13 22:31:17 +01:00
Rafael J. Wysocki	b6495b7f00	cpuidle: teo: Exclude cpuidle overhead from computations One purpose of the computations in teo_update() is to determine whether or not the (saved) time till the next timer event and the measured idle duration fall into the same "bin", so avoid using values that include the cpuidle overhead to obtain the latter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-11-13 22:31:17 +01:00
Rafael J. Wysocki	c1d51f684c	cpuidle: Use nanoseconds as the unit of time Currently, the cpuidle subsystem uses microseconds as the unit of time which (among other things) causes the idle loop to incur some integer division overhead for no clear benefit. In order to allow cpuidle to measure time in nanoseconds, add two new fields, exit_latency_ns and target_residency_ns, to represent the exit latency and target residency of an idle state in nanoseconds, respectively, to struct cpuidle_state and initialize them with the help of the corresponding values in microseconds provided by drivers. Additionally, change cpuidle_governor_latency_req() to return the idle state exit latency constraint in nanoseconds. Also meeasure idle state residency (last_residency_ns in struct cpuidle_device and time_ns in struct cpuidle_driver) in nanoseconds and update the cpuidle core and governors accordingly. However, the menu governor still computes typical intervals in microseconds to avoid integer overflows. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net>	2019-11-11 21:56:07 +01:00
Rafael J. Wysocki	99e98d3fb1	cpuidle: Consolidate disabled state checks There are two reasons why CPU idle states may be disabled: either because the driver has disabled them or because they have been disabled by user space via sysfs. In the former case, the state's "disabled" flag is set once during the initialization of the driver and it is never cleared later (it is read-only effectively). In the latter case, the "disable" field of the given state's cpuidle_state_usage struct is set and it may be changed via sysfs. Thus checking whether or not an idle state has been disabled involves reading these two flags every time. In order to avoid the additional check of the state's "disabled" flag (which is effectively read-only anyway), use the value of it at the init time to set a (new) flag in the "disable" field of that state's cpuidle_state_usage structure and use the sysfs interface to manipulate another (new) flag in it. This way the state is disabled whenever the "disable" field of its cpuidle_state_usage structure is nonzero, whatever the reason, and it is the only place to look into to check whether or not the state has been disabled. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2019-11-06 13:19:56 +01:00
Zhenzhong Duan	918c1fe9fb	cpuidle: Do not unset the driver if it is there already Fix __cpuidle_set_driver() to check if any of the CPUs in the mask has a driver different from drv already and, if so, return -EBUSY before updating any cpuidle_drivers per-CPU pointers. Fixes: `82467a5a88` ("cpuidle: simplify multiple driver support") Cc: 3.11+ <stable@vger.kernel.org> # 3.11+ Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-10-24 23:22:33 +02:00
Rafael J. Wysocki	2c2a83d329	Merge back earlier cpuidle material for v5.5.	2019-10-24 23:12:55 +02:00
Zhenzhong Duan	31d851407f	cpuidle: haltpoll: Take 'idle=' override into account Currenly haltpoll isn't aware of the 'idle=' override, the priority is 'idle=poll' > haltpoll > 'idle=halt'. When 'idle=poll' is used, cpuidle driver is bypassed but current_driver in sys still shows 'haltpoll'. When 'idle=halt' is used, haltpoll takes precedence and makes 'idle=halt' have no effect. Add a check to prevent the haltpoll driver from loading if 'idle=' is present. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Co-developed-by: Joao Martins <joao.m.martins@oracle.com> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-10-22 11:43:17 +02:00
Rafael J. Wysocki	159e48560f	cpuidle: teo: Fix "early hits" handling for disabled idle states The TEO governor uses idle duration "bins" defined in accordance with the CPU idle states table provided by the driver, so that each "bin" covers the idle duration range between the target residency of the idle state corresponding to it and the target residency of the closest deeper idle state. The governor collects statistics for each bin regardless of whether or not the idle state corresponding to it is currently enabled. In particular, the "early hits" metric measures the likelihood of a situation in which the idle duration measured after wakeup falls into to given bin, but the time till the next timer (sleep length) falls into a bin corresponding to one of the deeper idle states. It is used when the "hits" and "misses" metrics indicate that the state "matching" the sleep length should not be selected, so that the state with the maximum "early hits" value is selected instead of it. If the idle state corresponding to the given bin is disabled, it cannot be selected and if it turns out to be the one that should be selected, a shallower idle state needs to be used instead of it. Nevertheless, the metrics collected for the bin corresponding to it are still valid and need to be taken into account as though that state had not been disabled. As far as the "early hits" metric is concerned, teo_select() tries to take disabled states into account, but the state index corresponding to the maximum "early hits" value computed by it may be incorrect. Namely, it always uses the index of the previous maximum "early hits" state then, but there may be enabled idle states closer to the disabled one in question. In particular, if the current candidate state (whose index is the idx value) is closer to the disabled one and the "early hits" value of the disabled state is greater than the current maximum, the index of the current candidate state (idx) should replace the "maximum early hits state" index. Modify the code to handle that case correctly. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	e43dcf2021	cpuidle: teo: Consider hits and misses metrics of disabled states The TEO governor uses idle duration "bins" defined in accordance with the CPU idle states table provided by the driver, so that each "bin" covers the idle duration range between the target residency of the idle state corresponding to it and the target residency of the closest deeper idle state. The governor collects statistics for each bin regardless of whether or not the idle state corresponding to it is currently enabled. In particular, the "hits" and "misses" metrics measure the likelihood of a situation in which both the time till the next timer (sleep length) and the idle duration measured after wakeup fall into the given bin. Namely, if the "hits" value is greater than the "misses" one, that situation is more likely than the one in which the sleep length falls into the given bin, but the idle duration measured after wakeup falls into a bin corresponding to one of the shallower idle states. If the idle state corresponding to the given bin is disabled, it cannot be selected and if it turns out to be the one that should be selected, a shallower idle state needs to be used instead of it. Nevertheless, the metrics collected for the bin corresponding to it are still valid and need to be taken into account as though that state had not been disabled. For this reason, make teo_select() always use the "hits" and "misses" values of the idle duration range that the sleep length falls into even if the specific idle state corresponding to it is disabled and if the "hits" values is greater than the "misses" one, select the closest enabled shallower idle state in that case. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	4f690bb8ce	cpuidle: teo: Rename local variable in teo_select() Rename a local variable in teo_select() in preparation for subsequent code modifications, no intentional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:33 +02:00
Rafael J. Wysocki	069ce2ef1a	cpuidle: teo: Ignore disabled idle states that are too deep Prevent disabled CPU idle state with target residencies beyond the anticipated idle duration from being taken into account by the TEO governor. Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 5.1+ <stable@vger.kernel.org> # 5.1+	2019-10-14 10:40:32 +02:00
Linus Torvalds	77dcfe2b9e	Power management updates for 5.4-rc1 - Rework the main suspend-to-idle control flow to avoid repeating "noirq" device resume and suspend operations in case of spurious wakeups from the ACPI EC and decouple the ACPI EC wakeups support from the LPS0 _DSM support (Rafael Wysocki). - Extend the wakeup sources framework to expose wakeup sources as device objects in sysfs (Tri Vo, Stephen Boyd). - Expose system suspend statistics in sysfs (Kalesh Singh). - Introduce a new haltpoll cpuidle driver and a new matching governor for virtualized guests wanting to do guest-side polling in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell). - Fix the menu and teo cpuidle governors to allow the scheduler tick to be stopped if PM QoS is used to limit the CPU idle state exit latency in some cases (Rafael Wysocki). - Increase the resolution of the play_idle() argument to microseconds for more fine-grained injection of CPU idle cycles (Daniel Lezcano). - Switch over some users of cpuidle notifiers to the new QoS-based frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events (Viresh Kumar). - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li). - Add support for MT8183 and MT8516 to the mediatek cpufreq driver (Andrew-sh.Cheng, Fabien Parent). - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson Huang). - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz). - Update the qcom cpufreq driver (among other things, to make it easier to extend and to use kryo cpufreq for other nvmem-based SoCs) and add qcs404 support to it (Niklas Cassel, Douglas RAILLARD, Sibi Sankar, Sricharan R). - Fix assorted issues and make assorted minor improvements in the cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli, Gustavo Silva, Hariprasad Kelam). - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd Bergmann). - Add new Exynos PPMU events to devfreq events and extend that mechanism (Lukasz Luba). - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny). - Improve devfreq documentation and governor code, fix spelling typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez, MyungJoo Ham, Gaël PORTAY). - Add regulators enable and disable to the OPP (operating performance points) framework (Kamil Konieczny). - Update the OPP framework to support multiple opp-suspend properties (Anson Huang). - Fix assorted issues and make assorted minor improvements in the OPP code (Niklas Cassel, Viresh Kumar, Yue Hu). - Clean up the generic power domains (genpd) framework (Ulf Hansson). - Clean up assorted pieces of power management code and documentation (Akinobu Mita, Amit Kucheria, Chuhong Yuan). - Update the pm-graph tool to version 5.5 including multiple fixes and improvements (Todd Brandt). - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven, Sébastien Szymanski). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl2ArZ4SHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxgfYQAK80hs43vWQDmp7XKrN4pQe8+qYULAGO fBfrFl+NG9y/cnuqnt3NtA8MoyNsMMkMLkpkEDMfSbYqqH5ehEzX5+uGJWiWx8+Y oH5KU8MH7Tj/utYaalGzDt0AHfHZDIGC0NCUNQJVtE/4mOANFabwsCwscp4MrD5Q WjFN8U4BrsmWgJdZ/U9QIWcDZ0I+1etCF+rZG2yxSv31FMq2Zk/Qm4YyobqCvQFl TR9rxl08wqUmIYIz5cDjt/3AKH7NLLDqOTstbCL7cmufM5XPFc1yox69xc89UrIa 4AMgmDp7SMwFG/gdUPof0WQNmx7qxmiRAPleAOYBOZW/8jPNZk2y+RhM5NeF72m7 AFqYiuxqatkSb4IsT8fLzH9IUZOdYr8uSmoMQECw+MHdApaKFjFV8Lb/qx5+AwkD y7pwys8dZSamAjAf62eUzJDWcEwkNrujIisGrIXrVHb7ISbweskMOmdAYn9p4KgP dfRzpJBJ45IaMIdbaVXNpg3rP7Apfs7X1X+/ZhG6f+zHH3zYwr8Y81WPqX8WaZJ4 qoVCyxiVWzMYjY2/1lzjaAdqWojPWHQ3or3eBaK52DouyG3jY6hCDTLwU7iuqcCX jzAtrnqrNIKufvaObEmqcmYlIIOFT7QaJCtGUSRFQLfSon8fsVSR7LLeXoAMUJKT JWQenuNaJngK =TBDQ -----END PGP SIGNATURE----- Merge tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These include a rework of the main suspend-to-idle code flow (related to the handling of spurious wakeups), a switch over of several users of cpufreq notifiers to QoS-based limits, a new devfreq driver for Tegra20, a new cpuidle driver and governor for virtualized guests, an extension of the wakeup sources framework to expose wakeup sources as device objects in sysfs, and more. Specifics: - Rework the main suspend-to-idle control flow to avoid repeating "noirq" device resume and suspend operations in case of spurious wakeups from the ACPI EC and decouple the ACPI EC wakeups support from the LPS0 _DSM support (Rafael Wysocki). - Extend the wakeup sources framework to expose wakeup sources as device objects in sysfs (Tri Vo, Stephen Boyd). - Expose system suspend statistics in sysfs (Kalesh Singh). - Introduce a new haltpoll cpuidle driver and a new matching governor for virtualized guests wanting to do guest-side polling in the idle loop (Marcelo Tosatti, Joao Martins, Wanpeng Li, Stephen Rothwell). - Fix the menu and teo cpuidle governors to allow the scheduler tick to be stopped if PM QoS is used to limit the CPU idle state exit latency in some cases (Rafael Wysocki). - Increase the resolution of the play_idle() argument to microseconds for more fine-grained injection of CPU idle cycles (Daniel Lezcano). - Switch over some users of cpuidle notifiers to the new QoS-based frequency limits and drop the CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events (Viresh Kumar). - Add new cpufreq driver based on nvmem for sun50i (Yangtao Li). - Add support for MT8183 and MT8516 to the mediatek cpufreq driver (Andrew-sh.Cheng, Fabien Parent). - Add i.MX8MN support to the imx-cpufreq-dt cpufreq driver (Anson Huang). - Add qcs404 to cpufreq-dt-platdev blacklist (Jorge Ramirez-Ortiz). - Update the qcom cpufreq driver (among other things, to make it easier to extend and to use kryo cpufreq for other nvmem-based SoCs) and add qcs404 support to it (Niklas Cassel, Douglas RAILLARD, Sibi Sankar, Sricharan R). - Fix assorted issues and make assorted minor improvements in the cpufreq code (Colin Ian King, Douglas RAILLARD, Florian Fainelli, Gustavo Silva, Hariprasad Kelam). - Add new devfreq driver for NVidia Tegra20 (Dmitry Osipenko, Arnd Bergmann). - Add new Exynos PPMU events to devfreq events and extend that mechanism (Lukasz Luba). - Fix and clean up the exynos-bus devfreq driver (Kamil Konieczny). - Improve devfreq documentation and governor code, fix spelling typos in devfreq (Ezequiel Garcia, Krzysztof Kozlowski, Leonard Crestez, MyungJoo Ham, Gaël PORTAY). - Add regulators enable and disable to the OPP (operating performance points) framework (Kamil Konieczny). - Update the OPP framework to support multiple opp-suspend properties (Anson Huang). - Fix assorted issues and make assorted minor improvements in the OPP code (Niklas Cassel, Viresh Kumar, Yue Hu). - Clean up the generic power domains (genpd) framework (Ulf Hansson). - Clean up assorted pieces of power management code and documentation (Akinobu Mita, Amit Kucheria, Chuhong Yuan). - Update the pm-graph tool to version 5.5 including multiple fixes and improvements (Todd Brandt). - Update the cpupower utility (Benjamin Weis, Geert Uytterhoeven, Sébastien Szymanski)" * tag 'pm-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (126 commits) cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available cpuidle-haltpoll: do not set an owner to allow modunload cpuidle-haltpoll: return -ENODEV on modinit failure cpuidle-haltpoll: set haltpoll as preferred governor cpuidle: allow governor switch on cpuidle_register_driver() PM: runtime: Documentation: add runtime_status ABI document pm-graph: make setVal unbuffered again for python2 and python3 powercap: idle_inject: Use higher resolution for idle injection cpuidle: play_idle: Increase the resolution to usec cpuidle-haltpoll: vcpu hotplug support cpufreq: Add qcs404 to cpufreq-dt-platdev blacklist cpufreq: qcom: Add support for qcs404 on nvmem driver cpufreq: qcom: Refactor the driver to make it easier to extend cpufreq: qcom: Re-organise kryo cpufreq to use it for other nvmem based qcom socs dt-bindings: opp: Add qcom-opp bindings with properties needed for CPR dt-bindings: opp: qcom-nvmem: Support pstates provided by a power domain Documentation: cpufreq: Update policy notifier documentation cpufreq: Remove CPUFREQ_ADJUST and CPUFREQ_NOTIFY policy notifier events PM / Domains: Verify PM domain type in dev_pm_genpd_set_performance_state() PM / Domains: Simplify genpd_lookup_dev() ...	2019-09-17 19:15:14 -07:00
Wanpeng Li	1328edca4a	cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available The downside of guest side polling is that polling is performed even with other runnable tasks in the host. However, even if poll in kvm can aware whether or not other runnable tasks in the same pCPU, it can still incur extra overhead in over-subscribe scenario. Now we can just enable guest polling when dedicated pCPUs are available. Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:46:15 +02:00
Joao Martins	472f263660	cpuidle-haltpoll: do not set an owner to allow modunload cpuidle-haltpoll can be built as a module to allow optional late load. Given we are setting @owner to THIS_MODULE, cpuidle will attempt to grab a module reference every time a cpuidle_device is registered -- so essentially all online cpus get a reference. This prevents for the module to be unloaded later, which makes the module_exit callback entirely unused. Thus remove the @owner and allow module to be unloaded. Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	5cc59f597c	cpuidle-haltpoll: return -ENODEV on modinit failure When a user loads cpuidle-haltpoll on a non KVM guest the module will successfully load, even though idle driver registration didn't take place. We should instead return -ENODEV signaling the user that the driver can't be loaded, like other error paths in haltpoll_init(). An example of such error paths is when we return -EBUSY when attempting to register an idle driver when it had one already (e.g. intel_idle loads at boot and then we attempt to insert module cpuidle-haltpoll). Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	7321440829	cpuidle-haltpoll: set haltpoll as preferred governor Right now, guest current governors have the following ratings: * ladder -> 10 * teo -> 19 * menu -> 20 * haltpoll -> 21 * ladder + nohz=off -> 25 haltpoll governor got introduced and it is now the default governor given its highest rating -- with ladder+nohz being the exception -- regardless of idle driver in the guest. An example of an undesirable case is x86 KVM guests with MWAIT which have intel_idle registered first, and consequently will have haltpoll be used as governor which would get limited to a poll state and state 1 and the other states wouldn't get used. To keep the previous defaults we decrease rating of governor to 9 (below current lowest rating) and thus rely on @governor switch on cpuidle_register_driver() to tie in haltpoll idle driver and governor together. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	cb5d8c45ab	cpuidle: allow governor switch on cpuidle_register_driver() The recently introduced haltpoll driver is largely only useful with haltpoll governor. To allow drivers to associate with a particular idle behaviour, add a @governor property to 'struct cpuidle_driver' and thus allow a cpuidle driver to switch to a preferred governor on idle driver registration. We save the previous governor, and when an idle driver is unregistered we switch back to that. The @governor can be overridden by cpuidle.governor= boot param or alternatively be ignored if the governor doesn't exist. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-11 17:36:30 +02:00
Joao Martins	97d3eb9da8	cpuidle-haltpoll: vcpu hotplug support When cpus != maxcpus cpuidle-haltpoll will fail to register all vcpus past the online ones and thus fail to register the idle driver. This is because cpuidle_add_sysfs() will return with -ENODEV as a consequence from get_cpu_device() return no device for a non-existing CPU. Instead switch to cpuidle_register_driver() and manually register each of the present cpus through cpuhp_setup_state() callbacks and future ones that get onlined or offlined. This mimmics similar logic that intel_idle does. Fixes: `fa86ee90eb` ("add cpuidle-haltpoll driver") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-09-03 09:36:36 +02:00
Rafael J. Wysocki	b7e7fffd3e	cpuidle: teo: Get rid of redundant check in teo_update() Notice that setting measured_us to UINT_MAX in teo_update() earlier doesn't change the behavior of the following code, so do that and eliminate a redundant check used for setting measured_us to UINT_MAX. This change is not expected to alter functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-10 14:34:28 +02:00
Lorenzo Pieralisi	9ffeb6d08c	PSCI: cpuidle: Refactor CPU suspend power_state parameter handling Current PSCI code handles idle state entry through the psci_cpu_suspend_enter() API, that takes an idle state index as a parameter and convert the index into a previously initialized power_state parameter before calling the PSCI.CPU_SUSPEND() with it. This is unwieldly, since it forces the PSCI firmware layer to keep track of power_state parameter for every idle state so that the index->power_state conversion can be made in the PSCI firmware layer instead of the CPUidle driver implementations. Move the power_state handling out of drivers/firmware/psci into the respective ACPI/DT PSCI CPUidle backends and convert the psci_cpu_suspend_enter() API to get the power_state parameter as input, which makes it closer to its firmware interface PSCI.CPU_SUSPEND() API. A notable side effect is that the PSCI ACPI/DT CPUidle backends now can directly handle (and if needed update) power_state parameters before handing them over to the PSCI firmware interface to trigger PSCI.CPU_SUSPEND() calls. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	788961462f	ARM: psci: cpuidle: Enable PSCI CPUidle driver Allow selection of the PSCI CPUidle in the kernel by updating the respective Kconfig entry. Remove PSCI callbacks from ARM/ARM64 generic CPU ops to prevent the PSCI idle driver from clashing with the generic ARM CPUidle driver initialization, that relies on CPU ops to initialize and enter idle states. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: Will Deacon <will@kernel.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	81d549e0c8	ARM: psci: cpuidle: Introduce PSCI CPUidle driver PSCI firmware is the standard power management control for all ARM64 based platforms and it is also deployed on some ARM 32 bit platforms to date. Idle state entry in PSCI is currently achieved by calling arm_cpuidle_init() and arm_cpuidle_suspend() in a generic idle driver, which in turn relies on ARM/ARM64 CPUidle back-end to relay the call into PSCI firmware if PSCI is the boot method. Given that PSCI is the standard idle entry method on ARM64 systems (which means that no other CPUidle driver are expected on ARM64 platforms - so PSCI is already a generic idle driver), in order to simplify idle entry and code maintenance, it makes sense to have a PSCI specific idle driver so that idle code that it is currently living in drivers/firmware directory can be hoisted out of it and moved where it belongs, into a full-fledged PSCI driver, leaving PSCI code in drivers/firmware as a pure firmware interface, as it should be. Implement a PSCI CPUidle driver. By default it is a silent Kconfig entry which is left unselected, since it selection would clash with the generic ARM CPUidle driver that provides a PSCI based idle driver through the arm/arm64 arches back-ends CPU operations. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	6460d7ba48	ARM: cpuidle: Remove overzealous error logging CPUidle back-end operations are not implemented in some platforms but this should not be considered an error serious enough to be logged. Check the arm_cpuidle_init() return value to detect whether the failure must be reported or not in the kernel log and do not log it if the platform does not support CPUidle operations. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Lorenzo Pieralisi	63e3ee6154	ARM: cpuidle: Remove useless header include The generic ARM CPUidle driver includes <linux/topology.h> by mistake. Remove the topology header include. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will@kernel.org>	2019-08-09 17:51:39 +01:00
Rafael J. Wysocki	cab09f3d2d	cpuidle: teo: Allow tick to be stopped if PM QoS is used The TEO goveror prevents the scheduler tick from being stopped (unless stopped already) if there is a PM QoS latency constraint for the given CPU and the target residency of the deepest idle state matching that constraint is below the tick boundary. However, that is problematic if CPUs with PM QoS latency constraints are idle for long times, because it effectively causes the tick to run on them all the time which is wasteful. [It is also confusing and questionable if they are full dynticks CPUs.] To address that issue, modify the TEO governor to carry out the entire search for the most suitable idle state (from the target residency perspective) even if a latency constraint is present, to allow it to determine the expected idle duration in all cases. Also, when using the last several measured idle duration values to refine the idle state selection, make it compare those values with the current expected idle duration value (instead of comparing them with the target residency of the idle state selected so far) which should prevent the tick from being retained when it makes sense to stop it sometimes (especially in the presence of PM QoS latency constraints). Fixes: `b26bf6ab71` ("cpuidle: New timer events oriented governor for tickless systems") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-05 11:02:44 +02:00
Rafael J. Wysocki	32b91ca153	cpuidle: menu: Allow tick to be stopped if PM QoS is used After commit `554c8aa8ec` ("sched: idle: Select idle state before stopping the tick") the menu governor prevents the scheduler tick from being stopped (unless stopped already) if there is a PM QoS latency constraint for the given CPU and the target residency of the deepest idle state matching that constraint is below the tick boundary. However, that is problematic if CPUs with PM QoS latency constraints are idle for long times, because it effectively causes the tick to run on them all the time which is wasteful. [It is also confusing and questionable if they are full dynticks CPUs.] To address that issue, make the menu governor allow the tick to be stopped only if the idle duration predicted by it is beyond the tick boundary, except when the shallowest idle state is selected upfront and it is not a "polling" one. Fixes: `554c8aa8ec` ("sched: idle: Select idle state before stopping the tick") Link: https://lore.kernel.org/lkml/79b247b3-e056-610e-9a07-e685dfdaa6c9@gmail.com/ Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com> Tested-by: Thomas Lindroth <thomas.lindroth@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-08-05 11:02:44 +02:00
Marcelo Tosatti	a1c4423b02	cpuidle-haltpoll: disable host side polling when kvm virtualized When performing guest side polling, it is not necessary to also perform host side polling. So disable host side polling, via the new MSR interface, when loading cpuidle-haltpoll driver. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	2cffe9f6b9	cpuidle: add haltpoll governor The cpuidle_haltpoll governor, in conjunction with the haltpoll cpuidle driver, allows guest vcpus to poll for a specified amount of time before halting. This provides the following benefits to host side polling: 1) The POLL flag is set while polling is performed, which allows a remote vCPU to avoid sending an IPI (and the associated cost of handling the IPI) when performing a wakeup. 2) The VM-exit cost can be avoided. The downside of guest side polling is that polling is performed even with other runnable tasks in the host. Results comparing halt_poll_ns and server/client application where a small packet is ping-ponged: host --> 31.33 halt_poll_ns=300000 / no guest busy spin --> 33.40 (93.8%) halt_poll_ns=0 / guest_halt_poll_ns=300000 --> 32.73 (95.7%) For the SAP HANA benchmarks (where idle_spin is a parameter of the previous version of the patch, results should be the same): hpns == halt_poll_ns idle_spin=0/ idle_spin=800/ idle_spin=0/ hpns=200000 hpns=0 hpns=800000 DeleteC06T03 (100 thread) 1.76 1.71 (-3%) 1.78 (+1%) InsertC16T02 (100 thread) 2.14 2.07 (-3%) 2.18 (+1.8%) DeleteC00T01 (1 thread) 1.34 1.28 (-4.5%) 1.29 (-3.7%) UpdateC00T03 (1 thread) 4.72 4.18 (-12%) 4.53 (-5%) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	7d4daeedd5	governors: unify last_state_idx Since this field is shared by all governors, move it to cpuidle device structure. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	259231a045	cpuidle: add poll_limit_ns to cpuidle_device structure Add a poll_limit_ns variable to cpuidle_device structure. Calculate and configure it in the new cpuidle_poll_time function, in case its zero. Individual governors are allowed to override this value. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00
Marcelo Tosatti	fa86ee90eb	add cpuidle-haltpoll driver Add a cpuidle driver that calls the architecture default_idle routine. To be used in conjunction with the haltpoll governor. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-07-30 17:27:37 +02:00

1 2 3 4 5 ...

810 Commits