linux-sg2042

Commit Graph

Author	SHA1	Message	Date
Rafael J. Wysocki	b8bd1581aa	cpufreq: intel_pstate: Rework iowait boosting to be less aggressive The current iowait boosting mechanism in intel_pstate_update_util() is quite aggressive, as it goes to the maximum P-state right away, and may cause excessive amounts of energy to be used, which is not desirable and arguably isn't necessary too. Follow commit `a5a0809bc5` ("cpufreq: schedutil: Make iowait boost more energy efficient") that reworked the analogous iowait boost mechanism in the schedutil governor and make the iowait boosting in intel_pstate_update_util() work along the same lines. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-18 11:34:32 +01:00
Rafael J. Wysocki	a8e1942d97	cpufreq: intel_pstate: Eliminate intel_pstate_get_base_pstate() There is only one caller of intel_pstate_get_base_pstate() and it is more straightforward to carry out the computation directly in the caller, so do that and drop intel_pstate_get_base_pstate(). No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-18 11:34:32 +01:00
Rafael J. Wysocki	fa93b51c55	cpufreq: intel_pstate: Avoid redundant initialization of local vars After commit `1a4fe38add` ("cpufreq: intel_pstate: Remove max/min fractions to limit performance") the initial value of the pstate local variable in intel_pstate_max_within_limits() and the initial value of the max_pstate local variable in intel_pstate_prepare_request() are both immediately discarded, so initialize both these variables to their target values upfront. No intentional changes of behavior. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-18 11:34:32 +01:00
Erwan Velu	076b862c7e	cpufreq: intel_pstate: Add reasons for failure and debug messages The init code path has several exceptions where the driver can decide not to load. As CONFIG_X86_INTEL_PSTATE is generally set to Y, the return code is not reachable. The initialization code also does not report why it chose to exit prematurely, so it is difficult for a user to determine, on a given platform, why the driver didn't load properly. This patch is about reporting to the user the reason/context of why the driver failed to load. That is a valuable hint when debugging a platform. Signed-off-by: Erwan Velu <e.velu@criteo.com> [ rjw: Subject & changelog, minor fixups ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-13 12:32:10 +01:00
Viresh Kumar	625c85a62c	cpufreq: Use struct kobj_attribute instead of struct global_attr The cpufreq_global_kobject is created using kobject_create_and_add() helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store routines are set to kobj_attr_show() and kobj_attr_store(). These routines pass struct kobj_attribute as an argument to the show/store callbacks. But all the cpufreq files created using the cpufreq_global_kobject expect the argument to be of type struct attribute. Things work fine currently as no one accesses the "attr" argument. We may not see issues even if the argument is used, as struct kobj_attribute has struct attribute as its first element and so they will both get same address. But this is logically incorrect and we should rather use struct kobj_attribute instead of struct global_attr in the cpufreq core and drivers and the show/store callbacks should take struct kobj_attribute as argument instead. This bug is caught using CFI CLANG builds in android kernel which catches mismatch in function prototypes for such callbacks. Reported-by: Donghee Han <dh.han@samsung.com> Reported-by: Sangkyu Kim <skwith.kim@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-01-29 11:44:30 +01:00
Linus Torvalds	792bf4d871	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The biggest RCU changes in this cycle were: - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar. - Replace calls of RCU-bh and RCU-sched update-side functions to their vanilla RCU counterparts. This series is a step towards complete removal of the RCU-bh and RCU-sched update-side functions. ( Note that some of these conversions are going upstream via their respective maintainers. ) - Documentation updates, including a number of flavor-consolidation updates from Joel Fernandes. - Miscellaneous fixes. - Automate generation of the initrd filesystem used for rcutorture testing. - Convert spin_is_locked() assertions to instead use lockdep. ( Note that some of these conversions are going upstream via their respective maintainers. ) - SRCU updates, especially including a fix from Dennis Krein for a bag-on-head-class bug. - RCU torture-test updates" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits) rcutorture: Don't do busted forward-progress testing rcutorture: Use 100ms buckets for forward-progress callback histograms rcutorture: Recover from OOM during forward-progress tests rcutorture: Print forward-progress test age upon failure rcutorture: Print time since GP end upon forward-progress failure rcutorture: Print histogram of CB invocation at OOM time rcutorture: Print GP age upon forward-progress failure rcu: Print per-CPU callback counts for forward-progress failures rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings rcutorture: Dump grace-period diagnostics upon forward-progress OOM rcutorture: Prepare for asynchronous access to rcu_fwd_startat torture: Remove unnecessary "ret" variables rcutorture: Affinity forward-progress test to avoid housekeeping CPUs rcutorture: Break up too-long rcu_torture_fwd_prog() function rcutorture: Remove 
cbflood facility torture: Bring any extra CPUs online during kernel startup rcutorture: Add call_rcu() flooding forward-progress tests rcutorture/formal: Replace synchronize_sched() with synchronize_rcu() tools/kernel.h: Replace synchronize_sched() with synchronize_rcu() net/decnet: Replace rcu_barrier_bh() with rcu_barrier() ...	2018-12-26 13:07:19 -08:00
Srinivas Pandruvada	af3b7379e2	cpufreq: intel_pstate: Force HWP min perf before offline Force HWP Request MAX = HWP Request MIN = HWP Capability MIN and EPP to 0xFF. In this way the performance limits on the offlined CPU will not influence performance limits on its sibling CPU, which is still online. If the sibling CPU is calling for higher performance, it will impact the max core performance. Here core performance will follow higher of the performance requests from each sibling. Reported-and-tested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-11-29 22:31:58 +01:00
Paul E. McKenney	09659af308	cpufreq/intel_pstate: Replace synchronize_sched() with synchronize_rcu() Now that synchronize_rcu() waits for preempt-disable regions of code as well as RCU read-side critical sections, synchronize_sched() can be replaced by synchronize_rcu(). This commit therefore makes this change. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: <linux-pm@vger.kernel.org>	2018-11-27 09:21:38 -08:00
Dominik Brodowski	5906056e52	cpufreq: intel_pstate: Fix compilation for !CONFIG_ACPI While at it, add a few comments which config options #ifdef and #else statements refer to. Fixes: `86d333a8cc` (cpufreq: intel_pstate: Add base_frequency attribute) Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-10-25 18:37:06 +02:00
Linus Torvalds	c05f3642f4	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main updates in this cycle were: - Lots of perf tooling changes too voluminous to list (big perf trace and perf stat improvements, lots of libtraceevent reorganization, etc.), so I'll list the authors and refer to the changelog for details: Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa. ... with the bulk of the changes written by Jiri Olsa, Tzvetomir Stoyanov and Arnaldo Carvalho de Melo. - Continued intel_rdt work with a focus on playing well with perf events. This also imported some non-perf RDT work due to dependencies. (Reinette Chatre) - Implement counter freezing for Arch Perfmon v4 (Skylake and newer). This allows to speed up the PMI handler by avoiding unnecessary MSR writes and make it more accurate. (Andi Kleen) - kprobes cleanups and simplification (Masami Hiramatsu) - Intel Goldmont PMU updates (Kan Liang) - ... 
plus misc other fixes and updates" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits) kprobes/x86: Use preempt_enable() in optimized_callback() x86/intel_rdt: Prevent pseudo-locking from using stale pointers kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack perf/x86/intel: Export mem events only if there's PEBS support x86/cpu: Drop pointless static qualifier in punit_dev_state_show() x86/intel_rdt: Fix initial allocation to consider CDP x86/intel_rdt: CBM overlap should also check for overlap with CDP peer x86/intel_rdt: Introduce utility to obtain CDP peer tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file tools lib traceevent: Separate out tep_strerror() for strerror_r() issues perf python: More portable way to make CFLAGS work with clang perf python: Make clang_has_option() work on Python 3 perf tools: Free temporary 'sys' string in read_event_files() perf tools: Avoid double free in read_event_file() perf tools: Free 'printk' string in parse_ftrace_printk() perf tools: Cleanup trace-event-info 'tdata' leak perf strbuf: Match va_{add,copy} with va_end perf test: S390 does not support watchpoints in test 22 perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG tools include: Adopt linux/bits.h ...	2018-10-23 13:32:18 +01:00
Srinivas Pandruvada	86d333a8cc	cpufreq: intel_pstate: Add base_frequency attribute Expose base_frequency to user space via cpufreq sysfs when HWP is in use. This HWP base frequency is read from the ACPI _CPC object if present, or from the HWP Capabilities MSR otherwise. On the majority of the HWP platforms the _CPC object will point to the HWP Capabilities MSR using the "Functional Fixed Hardware" address space type. The address space type also can simply be ACPI_TYPE_INTEGER, however, in which case the platform firmware can set its value at the initialization time based on the system constraints. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-10-16 10:33:39 +02:00
Peter Zijlstra	f2c4db1bd8	x86/cpu: Sanitize FAM6_ATOM naming Going primarily by: https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors with additional information gleaned from other related pages; notably: - Bonnell shrink was called Saltwell - Moorefield is the Merriefield refresh which makes it Airmont The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE for i in `git grep -l FAM6_ATOM` ; do sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \ -e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \ -e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \ -e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \ -e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \ -e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \ -e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \ -e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \ -e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \ -e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \ -e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i} done Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: dave.hansen@linux.intel.com Cc: len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-10-02 10:14:32 +02:00
Srinivas Pandruvada	d3264f752a	cpufreq: intel_pstate: Ignore turbo active ratio in HWP When HWP is active, the turbo activation ratio is not used, so we should allow a policy max frequency above the turbo activation ratio to be set. When HWP is not active, any policy max frequency above the turbo activation ratio can result in up to the max one-core turbo frequency. This fix helps achieve better thermal control in the turbo region when other methods like "Running Average Power Limit" are not available. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-08-06 10:22:27 +02:00
Rafael J. Wysocki	6ccbe1dcdd	Merge back cpufreq changes for 4.19.	2018-08-06 10:09:52 +02:00
Srinivas Pandruvada	01e61a42a5	cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms Dynamic boosting of HWP performance on IO wake showed significant improvement to IO workloads. This series was intended for Skylake Xeon platforms only and feature was enabled by default based on CPU model number. But some Xeon platforms reused the Skylake desktop CPU model number. This caused some undesirable side effects to some graphics workloads. Since they are heavily IO bound, the increase in CPU performance decreased the power available for GPU to do its computing and hence decrease in graphics benchmark performance. For example on a Skylake desktop, GpuTest benchmark showed average FPS reduction from 529 to 506. This change makes sure that HWP boost feature is only enabled for Skylake server platforms by using ACPI FADT preferred PM Profile. If some desktop users wants to get benefit of boost, they can still enable boost from intel_pstate sysfs attribute "hwp_dynamic_boost". Fixes: `41ab43c9c8` (cpufreq: intel_pstate: enable boost for Skylake Xeon) Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410 Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-31 10:39:58 +02:00
Rafael J. Wysocki	6e926363fc	Merge back cpufreq material for 4.19.	2018-07-25 13:28:01 +02:00
Srinivas Pandruvada	eea033d075	cpufreq: intel_pstate: Show different max frequency with turbo 3 and HWP On HWP platforms with Turbo 3.0, the HWP capability max ratio shows the maximum ratio of that core, which can be different than other cores. If we show the correct maximum frequency in cpufreq sysfs via cpuinfo_max_freq and scaling_max_freq then, user can know which cores can run faster for pinning some high priority tasks. Currently the max turbo frequency is shown as max frequency, which is the max of all cores, even if some cores can't reach that frequency even for single threaded workload. But it is possible that max ratio in HWP capabilities is set as 0xFF or some high invalid value (E.g. One KBL NUC). Since the actual performance can never exceed 1 core turbo frequency from MSR TURBO_RATIO_LIMIT, we use this as a bound check. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-19 12:53:03 +02:00
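The bound check described in the eea033d075 entry above can be sketched in a few lines. This is a minimal Python illustration, not the driver code: the function name, the 100 MHz-per-ratio scaling (taken from the "43 = 4.3GHz" style of example used elsewhere in this log) and the sample values are all assumptions made for the example.

```python
# Hypothetical helper; names and constants are illustrative, not from the driver.
SCALING_KHZ = 100_000  # assumed: one ratio step corresponds to 100 MHz


def per_core_max_freq_khz(hwp_cap_max_ratio, one_core_turbo_ratio):
    """Return the maximum frequency (kHz) to report for a single core.

    The per-core HWP capability ratio is used when it is plausible.
    A value above the 1-core turbo ratio (for example 0xFF) is treated
    as invalid and clamped to the 1-core turbo ratio.
    """
    ratio = min(hwp_cap_max_ratio, one_core_turbo_ratio)
    return ratio * SCALING_KHZ


# A core whose HWP capability max is below the package-wide turbo ratio
# reports its own, lower maximum.
slower_core = per_core_max_freq_khz(43, 45)
# A bogus capability value falls back to the 1-core turbo bound.
bogus_core = per_core_max_freq_khz(0xFF, 45)
```

With these sample inputs the slower core reports 4.3 GHz while the core with the invalid capability value is bounded at 4.5 GHz.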
Rafael J. Wysocki	95d6c0857e	cpufreq: intel_pstate: Register when ACPI PCCH is present Currently, intel_pstate doesn't register if _PSS is not present on HP Proliant systems, because it expects the firmware to take over CPU performance scaling in that case. However, if ACPI PCCH is present, the firmware expects the kernel to use it for CPU performance scaling and the pcc-cpufreq driver is loaded for that. Unfortunately, the firmware interface used by that driver is not scalable for fundamental reasons, so pcc-cpufreq is way suboptimal on systems with more than just a few CPUs. In fact, it is better to avoid using it at all. For this reason, modify intel_pstate to look for ACPI PCCH if _PSS is not present and register if it is there. Also prevent the pcc-cpufreq driver from trying to initialize itself if intel_pstate has been registered already. Fixes: `fbbcdc0744` (intel_pstate: skip the driver if ACPI has power mgmt option) Reported-by: Andreas Herrmann <aherrmann@suse.com> Reviewed-by: Andreas Herrmann <aherrmann@suse.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Andreas Herrmann <aherrmann@suse.com> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-18 13:38:37 +02:00
Xie Yisheng	1111b7836c	cpufreq: intel_pstate: use match_string() helper match_string() returns the index of an array for a matching string, which can be used instead of open coded variant. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-07-02 11:25:40 +02:00
Srinivas Pandruvada	ff7c991714	cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0 When scaling max/min settings are changed, internally they are converted to a ratio using the max turbo 1 core turbo frequency. This works fine when 1 core max is same irrespective of the core. But under Turbo 3.0, this will not be the case. For example: Core 0: max turbo pstate: 43 (4.3GHz) Core 1: max turbo pstate: 45 (4.5GHz) In this case 1 core turbo ratio will be maximum of all, so it will be 45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores, then on Core 0 it will be = max_state * policy->max / max_freq; = 43 * (4000000/4500000) = 38 (3.8GHz) = 38 which is 200MHz less than the desired. On Core 1, it will be correctly set to ratio 40 (4GHz). Same holds true for scaling min frequency limit. So this requires usage of correct turbo max frequency for Core 0, which in this case is 4.3GHz. So we need to adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that core. This change uses the HWP capability of a core to adjust max turbo frequency. But since Broadwell HWP doesn't use ratios in the HWP capabilities, we have to use legacy max 1 core turbo ratio. This is not a problem as the HWP capabilities don't differ among cores in Broadwell. We need to check for non Broadwell CPU model for applying this change, though. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: 4.6+ <stable@vger.kernel.org> # 4.6+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-19 10:40:29 +02:00
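The arithmetic in the ff7c991714 entry above can be checked with a short worked example. This Python sketch only reproduces the formula quoted in the message (`max_state * policy->max / max_freq`); the function name is hypothetical, and truncating integer division is an assumption chosen because it matches the result of 38 given in the message.

```python
def limit_to_ratio(max_state, policy_max_khz, max_freq_khz):
    """Convert a frequency limit to a P-state ratio.

    Integer form of: max_state * policy->max / max_freq
    (truncating division is assumed here).
    """
    return max_state * policy_max_khz // max_freq_khz


# Core with max turbo P-state 43, limit 4 GHz, but scaled against the
# package-wide 1-core turbo frequency of 4.5 GHz: the result is too low.
with_global_turbo = limit_to_ratio(43, 4_000_000, 4_500_000)

# Same core, scaled against its own max turbo frequency of 4.3 GHz:
# the requested 4 GHz maps to ratio 40 as intended.
with_per_core_turbo = limit_to_ratio(43, 4_000_000, 4_300_000)

# Core with max turbo P-state 45 is unaffected either way.
faster_core = limit_to_ratio(45, 4_000_000, 4_500_000)
```

The first call yields ratio 38 (3.8 GHz, 200 MHz below the request), while the per-core variant yields ratio 40, which is the behavior the fix aims for.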
Linus Torvalds	d09fcecb0c	Additional power management updates for 4.18-rc1 - Revert a recent PM core change that attempted to fix an issue related to device links, but introduced a regression (Rafael Wysocki). - Fix build when the recently added cpufreq driver for Kryo processors is selected by making it possible to build that driver as a module (Arnd Bergmann). - Fix the long idle detection mechanism in the out-of-band (ondemand and conservative) cpufreq governors (Chen Yu). - Add support for devices in multiple power domains to the generic power domains (genpd) framework (Ulf Hansson). - Add support for iowait boosting on systems with hardware-managed P-states (HWP) enabled to the intel_pstate driver and make it use that feature on systems with Skylake Xeon processors as it is reported to improve performance significantly on those systems (Srinivas Pandruvada). - Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq drivers (Colin Ian King, Suman Anna, Sébastien Szymanski). - Change the behavior of the wakeup_count device attribute in sysfs to expose the number of events when the device might have aborted system suspend in progress (Ravi Chandra Sadineni). - Fix two minor issues in the cpupower utility (Abhishek Goel, Colin Ian King). 
Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These revert a recent PM core change that introduced a regression, fix the build when the recently added Kryo cpufreq driver is selected, add support for devices attached to multiple power domains to the generic power domains (genpd) framework, add support for iowait boosting on systems with hardware-managed P-states (HWP) enabled to the intel_pstate driver, modify the behavior of the wakeup_count device attribute in sysfs, fix a few issues and clean up some ugliness, mostly in cpufreq (core and drivers) and in the cpupower utility. 
Specifics: - Revert a recent PM core change that attempted to fix an issue related to device links, but introduced a regression (Rafael Wysocki) - Fix build when the recently added cpufreq driver for Kryo processors is selected by making it possible to build that driver as a module (Arnd Bergmann) - Fix the long idle detection mechanism in the out-of-band (ondemand and conservative) cpufreq governors (Chen Yu) - Add support for devices in multiple power domains to the generic power domains (genpd) framework (Ulf Hansson) - Add support for iowait boosting on systems with hardware-managed P-states (HWP) enabled to the intel_pstate driver and make it use that feature on systems with Skylake Xeon processors as it is reported to improve performance significantly on those systems (Srinivas Pandruvada) - Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq drivers (Colin Ian King, Suman Anna, Sébastien Szymanski) - Change the behavior of the wakeup_count device attribute in sysfs to expose the number of events when the device might have aborted system suspend in progress (Ravi Chandra Sadineni) - Fix two minor issues in the cpupower utility (Abhishek Goel, Colin Ian King)" * tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "PM / runtime: Fixup reference counting of device link suppliers at probe" cpufreq: imx6q: check speed grades for i.MX6ULL cpufreq: governors: Fix long idle detection logic in load calculation cpufreq: intel_pstate: enable boost for Skylake Xeon PM / wakeup: Export wakeup_count instead of event_count via sysfs PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains PM / Domains: Add support for multi PM domains per device to genpd PM / Domains: Split genpd_dev_pm_attach() PM / Domains: Don't attach devices in genpd with multi PM domains PM / Domains: dt: Allow power-domain property to be a list of specifiers cpufreq: intel_pstate: New sysfs entry to control HWP boost cpufreq: 
intel_pstate: HWP boost performance on IO wakeup cpufreq: intel_pstate: Add HWP boost utility and sched util hooks cpufreq: ti-cpufreq: Use devres managed API in probe() cpufreq: ti-cpufreq: Fix an incorrect error return value cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static cpufreq: kryo: allow building as a loadable module cpupower : Fix header name to read idle state name cpupower: fix spelling mistake: "logilename" -> "logfilename"	2018-06-13 07:24:18 -07:00
Kees Cook	fad953ce0b	treewide: Use array_size() in vzalloc() The vzalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: vzalloc(a * b) with: vzalloc(array_size(a, b)) as well as handling cases of: vzalloc(a * b * c) with: vzalloc(array3_size(a, b, c)) This does, however, attempt to ignore constant size factors like: vzalloc(4 * 1024) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( vzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| vzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( vzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| vzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| vzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| vzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| vzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| vzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| vzalloc( - sizeof(char) * COUNT + COUNT , ...) \| vzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( vzalloc( - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) 
\| vzalloc( - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) \| vzalloc( - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) \| vzalloc( - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ vzalloc( - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( vzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| vzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| vzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( vzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| vzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| vzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) 
\| vzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| vzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( vzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| vzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( vzalloc(C1 * C2 * C3, ...) \| vzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression E1, E2; constant C1, C2; @@ ( vzalloc(C1 * C2, ...) \| vzalloc( - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
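The motivation for wrapping the factors in array_size() in the fad953ce0b entry above is overflow safety: a plain `a * b` in C can wrap around to a small value, whereas the kernel's array_size() helper saturates to SIZE_MAX on overflow so that the allocation fails instead of succeeding with too little memory. The following Python sketch models that difference; the 64-bit SIZE_MAX and the `unchecked_mul` helper are assumptions made for illustration, and the real helper is a C inline function.

```python
SIZE_MAX = 2**64 - 1  # assumed: 64-bit size_t


def unchecked_mul(a, b):
    """Model of a plain C `a * b` on size_t: the result silently wraps."""
    return (a * b) & SIZE_MAX


def array_size(a, b):
    """Model of the saturating multiply: SIZE_MAX on overflow."""
    product = a * b
    return SIZE_MAX if product > SIZE_MAX else product


# Two large factors: the unchecked product wraps to a tiny value,
# the saturating form yields SIZE_MAX, which no allocator can satisfy.
wrapped = unchecked_mul(2**33, 2**33)
saturated = array_size(2**33, 2**33)

# Ordinary sizes are unchanged by the wrapper.
normal = array_size(4, 1024)
```

In this model an attacker-controlled count that overflows the product turns into a guaranteed allocation failure rather than an undersized buffer.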
Srinivas Pandruvada	41ab43c9c8	cpufreq: intel_pstate: enable boost for Skylake Xeon Enable HWP boost on Skylake server and workstations. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-08 11:10:26 +02:00
Srinivas Pandruvada	aaaece3de9	cpufreq: intel_pstate: New sysfs entry to control HWP boost A new attribute is added to intel_pstate sysfs to enable/disable HWP dynamic performance boost. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-06 08:37:52 +02:00
Srinivas Pandruvada	52ccc43142	cpufreq: intel_pstate: HWP boost performance on IO wakeup This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance. Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start boosting steps unless we see two consecutive flags in two ticks. This avoids boosting due to IO because of regular system activities. To avoid synchronization issues, the actual processing of the flag is done on the local CPU callback. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-06 08:37:52 +02:00
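The "two consecutive flags in two ticks" rule from the 52ccc43142 entry above can be sketched as a small state machine. This is a hedged Python illustration rather than the driver logic: the class name, the method name and the 4 ms tick length are assumptions, and only the rule itself (boost when a second iowait flag arrives within two ticks of the previous one) comes from the commit message.

```python
TICK_NS = 4_000_000  # assumed tick length (4 ms); illustrative only


class IoBoostDetector:
    """Hypothetical per-CPU tracker for iowait-triggered boosting."""

    def __init__(self):
        self.last_io_update_ns = None  # time of the previous iowait flag

    def should_boost(self, now_ns, iowait_flag):
        """Return True when this update should start or continue boosting."""
        if not iowait_flag:
            return False
        previous = self.last_io_update_ns
        self.last_io_update_ns = now_ns
        # A single flag is not enough: require a second one within two ticks.
        return previous is not None and now_ns - previous < 2 * TICK_NS


detector = IoBoostDetector()
first = detector.should_boost(0, True)             # isolated flag
second = detector.should_boost(3_000_000, True)    # 3 ms later, within two ticks
late = detector.should_boost(20_000_000, True)     # 17 ms later, too far apart
```

Requiring two closely spaced flags filters out the occasional iowait wakeups produced by regular system activity.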
Srinivas Pandruvada	e0efd5be63	cpufreq: intel_pstate: Add HWP boost utility and sched util hooks Added two utility functions to HWP boost up gradually and boost down to the default cached HWP request values. Boost up: Boost up updates HWP request minimum value in steps. This minimum value can reach up to the HWP request maximum value, depending on how frequently this boost up function is called. At max, boost up will take three steps to reach the maximum, depending on the current HWP request levels and HWP capabilities. For example, if the current settings are: If P0 (Turbo max) = P1 (Guaranteed max) = min: No boost at all. If P0 (Turbo max) > P1 (Guaranteed max) = min: Should result in one level boost only for P0. If P0 (Turbo max) = P1 (Guaranteed max) > min: Should result in two level boost: (min + p1)/2 and P1. If P0 (Turbo max) > P1 (Guaranteed max) > min: Should result in three level boost: (min + p1)/2, P1 and P0. We don't set any level between P0 and P1 as there is no guarantee that they will be honored. Boost down: After the system is idle for hold time of 3ms, the HWP request is reset to the default value from HWP init or user modified one via sysfs. Caching of HWP Request and Capabilities: Store the HWP request value last set using MSR_HWP_REQUEST and read MSR_HWP_CAPABILITIES. This avoids reading MSRs in the boost utility functions. The limits calculated by these boost utility functions are based on the latest HWP request value, which can be modified by setpolicy() callback. So if user space modifies the minimum perf value, that will be accounted for every time the boost up is called. There will be cases where there is contention with the user modified minimum perf; in that case the user value will gain precedence. For example just before HWP_REQUEST MSR is updated from setpolicy() callback, the boost up function is called via scheduler tick callback. 
Here the cached MSR value is already the latest and limits are updated based on the latest user limits, but on return the MSR write callback called from setpolicy() callback will update the HWP_REQUEST value. This will be used till next time the boost up function is called. In addition add a variable to control HWP dynamic boosting. When HWP dynamic boost is active then set the HWP specific update util hook. The contents in the utility hooks will be filled in the subsequent patches. Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Giovanni Gherdovich <ggherdovich@suse.cz> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-06-06 08:37:52 +02:00
Doug Smythies	50e9ffaba5	cpufreq: intel_pstate: allow trace in passive mode Allow use of the trace_pstate_sample trace function when the intel_pstate driver is in passive mode. Since the core_busy and scaled_busy fields are not used, and it might be desirable to know which path through the driver was used, either intel_cpufreq_target or intel_cpufreq_fast_switch, re-task the core_busy field as a flag indicator. The user can then use the intel_pstate_tracer.py utility to summarize and plot the trace. Note: The core_busy field still goes by that name in include/trace/events/power.h and within the intel_pstate_tracer.py script and csv file headers, but it is graphed as "performance", and called core_avg_perf now in the intel_pstate driver. Sometimes, in passive mode, the driver is not called for many tens or even hundreds of seconds. The user needs to understand, and not be confused by, this limitation. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-05-14 23:34:25 +02:00
Rafael J. Wysocki	b258dfeafb	cpufreq: intel_pstate: Do not include debugfs.h The intel_pstate driver doesn't use debugfs any more, so drop linux/debugfs.h from the list of included headers in it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2018-04-10 08:38:02 +02:00
Chen Yu	70f6bf2a3b	cpufreq: intel_pstate: Enable HWP during system resume on CPU0 When maxcpus=1 is in the kernel command line, the BP is responsible for re-enabling the HWP - because currently only the APs invoke intel_pstate_hwp_enable() during their online process - which might put the system into an unstable state after resume. Fix this by enabling the HWP explicitly on the BP during resume. Reported-by: Doug Smythies <dsmythies@telus.net> Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Yu Chen <yu.c.chen@intel.com> [ rjw: Subject/changelog, minor modifications ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-02-08 10:21:38 +01:00
Srinivas Pandruvada	d8de7a44e1	cpufreq: intel_pstate: Add Skylake servers support Currently intel_pstate can function only in HWP mode on Skylake servers. When the HWP feature is not enabled on the processor, the acpi-cpufreq driver is used. Based on the power and performance tests using the intel_pstate scaling algorithm, the results are comparable. But intel_pstate brings in additional features: - Display of turbo frequency range, which many users like to see - Place limits in the turbo frequency range when platform allows Since these tests were done only using the non-PID algorithm introduced in kernel version 4.14, this patch is not a backport candidate. So each user has to carefully weigh the benefits before backporting it. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-01-11 18:57:20 +01:00
Srinivas Pandruvada	dbd49b85ee	cpufreq: intel_pstate: Replace bxt_funcs with core_funcs Since core_funcs and bxt_funcs have same set of callbacks, replace bxt_funcs with core_funcs. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-01-11 18:57:20 +01:00
Linus Torvalds	53ac64aac9	ACPI updates for v4.14-rc1 - Update the ACPICA code in the kernel to upstream revision 20170728 including: * Alias operator handling update (Bob Moore). * Deferred resolution of reference package elements (Bob Moore). * Support for the _DMA method in walk resources (Bob Moore). * Tables handling update and support for deferred table verification (Lv Zheng). * Update of SMMU models for IORT (Robin Murphy). * Compiler and disassembler updates (Alex James, Erik Schmauss, Ganapatrao Kulkarni, James Morse). * Tools updates (Erik Schmauss, Lv Zheng). * Assorted minor fixes and cleanups (Bob Moore, Kees Cook, Lv Zheng, Shao Ming). - Rework the initialization of non-wakeup GPEs with method handlers in order to address a boot crash on some systems with Thunderbolt devices connected at boot time where we miss an early hotplug event due to a delay in GPE enabling (Rafael Wysocki). - Rework the handling of PCI bridges when setting up ACPI-based device wakeup in order to avoid disabling wakeup for bridges prematurely (Rafael Wysocki). - Consolidate Apple DMI checks throughout the tree, add support for Apple device properties to the device properties framework and use these properties for the handling of I2C and SPI devices on Apple systems (Lukas Wunner). - Add support for _DMA to the ACPI-based device properties lookup code and make it possible to use the information from there to configure DMA regions on ARM64 systems (Lorenzo Pieralisi). - Fix several issues in the APEI code, add support for exporting the BERT error region over sysfs and update APEI MAINTAINERS entry with reviewers information (Borislav Petkov, Dongjiu Geng, Loc Ho, Punit Agrawal, Tony Luck, Yazen Ghannam). - Fix a potential initialization ordering issue in the ACPI EC driver and clean it up somewhat (Lv Zheng). - Update the ACPI SPCR driver to extend the existing XGENE 8250 workaround in it to a new platform (m400) and to work around an Xgene UART clock issue (Graeme Gregory). 
- Add a new utility function to the ACPI core to support using ACPI OEM ID / OEM Table ID / Revision for system identification in blacklisting or similar and switch over the existing code already using this information to this new interface (Toshi Kani). - Fix an xpower PMIC issue related to GPADC reads that always return 0 without extra pin manipulations (Hans de Goede). - Add statements to print debug messages in a couple of places in the ACPI core for easier diagnostics (Rafael Wysocki). - Clean up the ACPI processor driver slightly (Colin Ian King, Hanjun Guo). - Clean up the ACPI x86 boot code somewhat (Andy Shevchenko). - Add a quirk for Dell OptiPlex 9020M to the ACPI backlight driver (Alex Hung). - Assorted fixes, cleanups and updates related to ACPI (Amitoj Kaur Chawla, Bhumika Goyal, Frank Rowand, Jean Delvare, Punit Agrawal, Ronald Tschalär, Sumeet Pawnikar). Merge tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These include a usual ACPICA code update (this time to upstream revision 20170728), a fix for a boot crash on some systems with Thunderbolt devices connected at boot
time, a rework of the handling of PCI bridges when setting up device wakeup, new support for Apple device properties, support for DMA configurations reported via ACPI on ARM64, APEI-related updates, ACPI EC driver updates and assorted minor modifications in several places." * tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (75 commits) ACPI / APEI: Suppress message if HEST not present intel_pstate: convert to use acpi_match_platform_list() ACPI / blacklist: add acpi_match_platform_list() ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources ACPI: make device_attribute const ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region ACPI: APEI: fix the wrong iteration of generic error status block ACPI / processor: make function acpi_processor_check_duplicates() static ACPI / EC: Clean up EC GPE mask flag ACPI: EC: Fix possible issues related to EC initialization order ACPI / PM: Add debug statements to acpi_pm_notify_handler() ACPI: Add debug statements to acpi_global_event_handler() ACPI / scan: Enable GPEs before scanning
the namespace ACPICA: Make it possible to enable runtime GPEs earlier ACPICA: Dispatch active GPEs at init time ACPI: SPCR: work around clock issue on xgene UART ACPI: SPCR: extend XGENE 8250 workaround to m400 ACPI / LPSS: Don't abort ACPI scan on missing mem resource mailbox: pcc: Drop uninformative output during boot ACPI/IORT: Add IORT named component memory address limits ...	2017-09-05 12:45:03 -07:00
Rafael J. Wysocki	ab271bc95b	Merge branch 'intel_pstate' * intel_pstate: cpufreq: intel_pstate: Shorten a couple of long names cpufreq: intel_pstate: Simplify intel_pstate_adjust_pstate() cpufreq: intel_pstate: Improve IO performance with per-core P-states cpufreq: intel_pstate: Drop INTEL_PSTATE_HWP_SAMPLING_INTERVAL cpufreq: intel_pstate: Drop ->update_util from pstate_funcs cpufreq: intel_pstate: Do not use PID-based P-state selection	2017-09-04 00:05:42 +02:00
Rafael J. Wysocki	08a10002be	Merge branch 'pm-cpufreq-sched' * pm-cpufreq-sched: cpufreq: schedutil: Always process remote callback with slow switching cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily cpufreq: Return 0 from ->fast_switch() on errors cpufreq: Simplify cpufreq_can_do_remote_dvfs() cpufreq: Process remote callbacks from any CPU if the platform permits sched: cpufreq: Allow remote cpufreq callbacks cpufreq: schedutil: Use unsigned int for iowait boost cpufreq: schedutil: Make iowait boost more energy efficient	2017-09-04 00:05:22 +02:00
Rafael J. Wysocki	bd87c8fb9d	Merge branch 'pm-cpufreq' * pm-cpufreq: (33 commits) cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpufreq: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms cpufreq: dbx500: Delete obsolete driver mfd: db8500-prcmu: Get rid of cpufreq dependency cpufreq: enable the DT cpufreq driver on the Ux500 cpufreq: Loongson2: constify platform_device_id cpufreq: dt: Add r8a7796 support to to use generic cpufreq driver cpufreq: remove setting of policy->cpu in policy->cpus during init cpufreq: mediatek: add support of cpufreq to MT7622 SoC cpufreq: mediatek: add cleanups with the more generic naming cpufreq: rcar: Add support for R8A7795 SoC cpufreq: dt: Add rk3328 compatible to use generic cpufreq driver cpufreq: s5pv210: add missing of_node_put() cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency ...	2017-09-04 00:05:13 +02:00
Toshi Kani	5e93232194	intel_pstate: convert to use acpi_match_platform_list() Convert to use acpi_match_platform_list() for the platform check. There is no change in functionality. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-29 01:42:49 +02:00
Rafael J. Wysocki	57ccaf3384	Merge back intel_pstate material for v4.14.	2017-08-21 01:50:20 +02:00
Sudeep Holla	b20a3f3d8a	cpufreq: remove setting of policy->cpu in policy->cpus during init policy->cpu is copied into policy->cpus in cpufreq_online() before calling into cpufreq_driver->init(). So there's no need to set the same in the individual driver init() functions again. This patch removes the redundant setting of policy->cpu in policy->cpus in intel_pstate and cppc drivers. Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-18 01:41:37 +02:00
Doug Smythies	c587c79f90	cpufreq: intel_pstate: report correct CPU frequencies during trace The intel_pstate CPU frequency scaling driver has always calculated CPU frequency incorrectly. Recent changes have eliminated most of the issues, however the frequency reported in the trace buffer, if used, is incorrect. It remains desirable that cpu->pstate.scaling still be a nice round number for things such as when setting max and min frequencies. So the proposal is to just fix the reported frequency in the trace data. Fixes what remains of [1]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=96521 # [1] Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-11 01:25:53 +02:00
Rafael J. Wysocki	d77d4888cb	cpufreq: intel_pstate: Shorten a couple of long names The names of the INTEL_PSTATE_DEFAULT_SAMPLING_INTERVAL symbol and the get_target_pstate_use_cpu_load() function don't need to be so long any more, so make them shorter. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-10 01:09:16 +02:00
Rafael J. Wysocki	a891283e56	cpufreq: intel_pstate: Simplify intel_pstate_adjust_pstate() Since there is only one P-state selection routine in intel_pstate now, make intel_pstate_adjust_pstate() call it directly and drop the target_pstate argument from that function. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-10 01:08:56 +02:00
Rafael J. Wysocki	3714c281c6	Merge v4.13 intel_pstate fixes.	2017-08-04 14:28:58 +02:00
Srinivas Pandruvada	7bde2d5001	cpufreq: intel_pstate: Improve IO performance with per-core P-states In the current implementation, the response latency between seeing SCHED_CPUFREQ_IOWAIT set and the actual P-state adjustment can be up to 10ms. It can be reduced by bumping up the P-state to the max at the time SCHED_CPUFREQ_IOWAIT is passed to intel_pstate_update_util(). With this change, the IO performance improves significantly. For a simple "grep -r . linux" (Here linux is the kernel source folder) with caches dropped every time on a Broadwell Xeon workstation with per-core P-states, the user and system time is shorter by as much as 30% - 40%. The same performance difference was not observed on clients that don't support per-core P-state. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-04 13:58:57 +02:00
Rafael J. Wysocki	8a05c3115d	Merge branches 'pm-cpufreq-x86', 'pm-cpufreq-docs' and 'intel_pstate' * pm-cpufreq-x86: cpufreq: x86: Make scaling_cur_freq behave more as expected * pm-cpufreq-docs: cpufreq: docs: Add missing cpuinfo_cur_freq description * intel_pstate: cpufreq: intel_pstate: Drop ->get from intel_pstate structure	2017-08-03 20:29:24 +02:00
Viresh Kumar	674e75411f	sched: cpufreq: Allow remote cpufreq callbacks With Android UI and benchmarks the latency of cpufreq response to certain scheduling events can become very critical. Currently, callbacks into cpufreq governors are only made from the scheduler if the target CPU of the event is the same as the current CPU. This means there are certain situations where a target CPU may not run the cpufreq governor for some time. One testcase to show this behavior is where a task starts running on CPU0, then a new task is also spawned on CPU0 by a task on CPU1. If the system is configured such that the new tasks should receive maximum demand initially, this should result in CPU0 increasing frequency immediately. But because of the above-mentioned limitation, this does not occur. This patch updates the scheduler core to call the cpufreq callbacks for remote CPUs as well. The schedutil, ondemand and conservative governors are updated to process cpufreq utilization update hooks called for remote CPUs where the remote CPU is managed by the cpufreq policy of the local CPU. The intel_pstate driver is updated to always reject remote callbacks. This is tested with a couple of usecases (Android: hackbench, recentfling, galleryfling, vellamo, Ubuntu: hackbench) on an ARM hikey board (64 bit octa-core, single policy). Only galleryfling showed minor improvements, while the others didn't have much deviation. The reason is that this patch only targets a corner case, where all of the following must be true to improve performance, and that doesn't happen too often with these tests: - Task is migrated to another CPU. - The task has high demand, and should take the target CPU to higher OPPs. - And the target CPU doesn't call into the cpufreq governor until the next tick. Based on initial work from Steve Muckle. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-01 14:24:53 +02:00
Rafael J. Wysocki	f5c13f44c7	cpufreq: intel_pstate: Drop INTEL_PSTATE_HWP_SAMPLING_INTERVAL After commit `62611cb912` (intel_pstate: delete scheduler hook in HWP mode) the INTEL_PSTATE_HWP_SAMPLING_INTERVAL is not used anywhere in the code, so drop it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-08-01 14:14:03 +02:00
Rafael J. Wysocki	22baebd489	cpufreq: intel_pstate: Drop ->get from intel_pstate structure The ->get callback in the intel_pstate structure was mostly there for the scaling_cur_freq sysfs attribute to work, but after commit `f8475cef90` (x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF) that attribute uses arch_freq_get_on_cpu() provided by the x86 arch code on all processors supported by intel_pstate, so it doesn't need the ->get callback from the driver any more. Moreover, the very presence of the ->get callback in the intel_pstate structure causes the cpuinfo_cur_freq attribute to be present when intel_pstate operates in the active mode, which is bogus, because the role of that attribute is to return the current CPU frequency as seen by the hardware. For intel_pstate, though, this is just an average frequency and not really current, but computed for the previous sampling interval (the actual current frequency may be way different at the point this value is obtained by reading from cpuinfo_cur_freq), and after commit `82b4e03e01` (intel_pstate: skip scheduler hook when in "performance" mode) the value in cpuinfo_cur_freq may be stale or just 0, depending on the driver's operation mode. In fact, however, on the hardware supported by intel_pstate there is no way to read the current CPU frequency from it, so the cpuinfo_cur_freq attribute should not be present at all when this driver is in use. For this reason, drop intel_pstate_get() and clear the ->get callback pointer pointing to it, so that the cpuinfo_cur_freq is not present for intel_pstate in the active mode any more. Fixes: `82b4e03e01` (intel_pstate: skip scheduler hook when in "performance" mode) Reported-by: Huaisheng Ye <yehs1@lenovo.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2017-07-27 23:51:58 +02:00
Rafael J. Wysocki	c4f3f70cac	cpufreq: intel_pstate: Drop ->update_util from pstate_funcs All systems use the same P-state selection "powersave" algorithm in the active mode if HWP is not used, so there's no need to provide a pointer for it in struct pstate_funcs any more. Drop ->update_util from struct pstate_funcs and make intel_pstate_set_update_util_hook() use intel_pstate_update_util() directly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 20:42:50 +02:00
Rafael J. Wysocki	9d0ef7af1f	cpufreq: intel_pstate: Do not use PID-based P-state selection All systems with a defined ACPI preferred profile that are not "servers" have been using the load-based P-state selection algorithm in intel_pstate since 4.12-rc1 (mobile systems and laptops have been using it since 4.10-rc1) and no problems with it have been reported to date. In particular, no regressions with respect to the PID-based P-state selection have been reported. Also testing indicates that the P-state selection algorithm based on CPU load is generally on par with the PID-based algorithm performance-wise, and for some workloads it turns out to be better than the other one, while being more straightforward and easier to understand at the same time. Moreover, the PID-based P-state selection algorithm in intel_pstate is known to be unstable in some situations and generally problematic, the issues with it are hard to address and it has become a significant maintenance burden. For these reasons, make intel_pstate use the "powersave" P-state selection algorithm based on CPU load in the active mode on all systems and drop the PID-based P-state selection code along with all things related to it from the driver. Also update the documentation accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 20:42:50 +02:00
Viresh Kumar	b8b78825a2	cpufreq: Don't set transition_latency for setpolicy drivers The transition_latency field isn't used for drivers with ->setpolicy() callback present and there is no point setting it from the drivers. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 00:15:43 +02:00
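
The boost-up levels described in commit e0efd5be63 above can be sketched roughly as follows. This is only an illustration of the four cases listed in that commit message, not the driver's code: the function name `hwp_boost_levels`, its parameters, and the assumption that min <= P1 <= P0 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from intel_pstate): fill `levels` with the
 * successive HWP minimum-performance values a gradual boost would step
 * through, and return how many steps there are (0 to 3).
 *
 * min_perf: current HWP request minimum
 * p1:       guaranteed maximum performance
 * p0:       turbo maximum performance
 */
static size_t hwp_boost_levels(int min_perf, int p1, int p0, int levels[3])
{
	size_t n = 0;

	/* Assumption made for this sketch only. */
	assert(min_perf <= p1 && p1 <= p0);

	if (p1 > min_perf) {
		levels[n++] = (min_perf + p1) / 2;	/* halfway between min and P1 */
		levels[n++] = p1;			/* guaranteed max */
	}
	if (p0 > p1)
		levels[n++] = p0;	/* turbo max; no level between P1 and P0 */

	return n;
}
```

With min = 8, P1 = 20 and P0 = 30, the sketch yields three levels (14, 20, 30); with P0 = P1 = min it yields none, matching the "no boost at all" case.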

348 Commits