From 2194324d8bbbad1b179c08b6095649b06abd62d5 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Fri, 14 Feb 2014 01:14:13 -0500 Subject: [PATCH 01/28] ACPI idle: permit sparse C-state sub-state numbers Linux uses CPUID.MWAIT.EDX to validate the C-states reported by ACPI, silently discarding states which are not supported by the HW. This test is too restrictive, as some HW now uses sparse sub-state numbering, so the sub-state number may be higher than the number of sub-states... Also, rather than silently ignoring an invalid state, we should complain about a firmware bug. In practice... Bay Trail systems originally supported C6-no-shrink as MWAIT sub-state 0x58, and in CPUID.MWAIT.EDX 0x03000000 indicated that there were 3 MWAIT-C6 sub-states. So acpi_idle would discard that C-state because 8 >= 3. Upon discovering this issue, the ucode was updated so that C6-no-shrink was also exported as 0x51, and the BIOS was updated to match. However, systems shipped with 0x58, will never get a BIOS update, and this patch allows Linux to see C6-no-shrink on early Bay Trail. Signed-off-by: Len Brown --- arch/x86/kernel/acpi/cstate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c index e69182fd01cf..4b28159e0421 100644 --- a/arch/x86/kernel/acpi/cstate.c +++ b/arch/x86/kernel/acpi/cstate.c @@ -87,7 +87,9 @@ static long acpi_processor_ffh_cstate_probe_cpu(void *_cx) num_cstate_subtype = edx_part & MWAIT_SUBSTATE_MASK; retval = 0; - if (num_cstate_subtype < (cx->address & MWAIT_SUBSTATE_MASK)) { + /* If the HW does not support any sub-states in this C-state */ + if (num_cstate_subtype == 0) { + pr_warn(FW_BUG "ACPI MWAIT C-state 0x%x not supported by HW (0x%x)\n", cx->address, edx_part); retval = -1; goto out; } From 24bfa950f114d5d770e482de73cf673a3b017f65 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Fri, 14 Feb 2014 00:50:34 -0500 Subject: [PATCH 02/28] intel_idle: allow sparse sub-state numbering, for Bay Trail Like acpi_idle, intel_idle compared sub-state numbers to the number of supported sub-states -- discarding sub-states numbers that were numbered >= the number of states. But some Bay Trail SOCs use sparse sub-state numbers, so we can't make such a comparison if we are going to access those states. So now we simply check that _some_ sub-states are supported for the given state, and assume that the sub-state number in our driver is valid. In practice, the driver is correct, and even if it were not, the hardware clips invalid sub-state requests to valid ones. No entries in the driver require this change, but Bay Trail will need it. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 8e1939f564f4..057ffef37b42 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -584,7 +584,7 @@ static int __init intel_idle_cpuidle_driver_init(void) drv->state_count = 1; for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) { - int num_substates, mwait_hint, mwait_cstate, mwait_substate; + int num_substates, mwait_hint, mwait_cstate; if (cpuidle_state_table[cstate].enter == NULL) break; @@ -597,14 +597,13 @@ static int __init intel_idle_cpuidle_driver_init(void) mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags); mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint); - mwait_substate = MWAIT_HINT2SUBSTATE(mwait_hint); - /* does the state exist in CPUID.MWAIT? */ + /* number of sub-states for this state in CPUID.MWAIT */ num_substates = (mwait_substates >> ((mwait_cstate + 1) * 4)) & MWAIT_SUBSTATE_MASK; - /* if sub-state in table is not enumerated by CPUID */ - if ((mwait_substate + 1) > num_substates) + /* if NO sub-states for this state in CPUID, skip it */ + if (num_substates == 0) continue; if (((mwait_cstate + 1) > 2) && From 718987d695adc991eb94501209fe5353136c8c16 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Fri, 14 Feb 2014 02:30:00 -0500 Subject: [PATCH 03/28] intel_idle: support Bay Trail Bay Trail (BYT) is a family of Silvermont-core Atom Processor SOCs, including the Intel Atom Processor Z36xxx and Z37xxx Series. Although it shares the Silvermont core with Avoton, BYT is optimized for mobile, and thus it supports different power saving CPU idle states. Note that not all versions of Bay Trail HW support all of the states listed in the driver. Signed-off-by: Len Brown Tested-by: Aubrey Li --- drivers/idle/intel_idle.c | 53 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 53 insertions(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 057ffef37b42..aeddfc46c831 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -196,6 +196,53 @@ static struct cpuidle_state snb_cstates[] = { .enter = NULL } }; +static struct cpuidle_state byt_cstates[] = { + { + .name = "C1-BYT", + .desc = "MWAIT 0x00", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 1, + .target_residency = 1, + .enter = &intel_idle }, + { + .name = "C1E-BYT", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 15, + .target_residency = 30, + .enter = &intel_idle }, + { + .name = "C6N-BYT", + .desc = "MWAIT 0x58", + .flags = MWAIT2flg(0x58) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 40, + .target_residency = 275, + .enter = &intel_idle }, + { + .name = "C6S-BYT", + .desc = "MWAIT 0x52", + .flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 140, + .target_residency = 560, + .enter = &intel_idle }, + { + .name = "C7-BYT", + .desc = "MWAIT 0x60", + .flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 1200, + .target_residency = 1500, + .enter = &intel_idle }, + { + .name = "C7S-BYT", + .desc = "MWAIT 0x64", + .flags = MWAIT2flg(0x64) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 10000, + .target_residency = 20000, + .enter = &intel_idle }, + { + .enter = NULL } +}; + static struct cpuidle_state ivb_cstates[] = { { .name = "C1-IVB", @@ -464,6 +511,11 @@ static const struct idle_cpu idle_cpu_snb = { .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_byt = { + .state_table = byt_cstates, + .disable_promotion_to_c1e = true, +}; + static const struct idle_cpu idle_cpu_ivb = { .state_table = ivb_cstates, .disable_promotion_to_c1e = true, @@ -494,6 +546,7 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x2f, idle_cpu_nehalem), ICPU(0x2a, idle_cpu_snb), ICPU(0x2d, idle_cpu_snb), + ICPU(0x37, idle_cpu_byt), ICPU(0x3a, idle_cpu_ivb), ICPU(0x3e, idle_cpu_ivb), ICPU(0x3c, idle_cpu_hsw), From acead1b0fac5b10d0ae3f1cc5f7820b9f9f924f5 Mon Sep 17 00:00:00 2001 From: Jan Kiszka Date: Sat, 25 Jan 2014 22:24:22 +0100 Subject: [PATCH 04/28] intel_idle: Add CPU model 54 (Atom N2000 series) Add CPU ID for Atom N2600/N2800 processors. Datasheets indicate support for this, detailed information about potential quirks or limitations are missing, though. So we just reuse the definition for the previous ATOM series. Tests on N2800 systems showed that this addition is fine an can reduce power consumption by about 0.25 W (personally confirmed on Intel DN2800MT). Signed-off-by: Jan Kiszka Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index aeddfc46c831..710ab54ab04b 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -546,6 +546,7 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x2f, idle_cpu_nehalem), ICPU(0x2a, idle_cpu_snb), ICPU(0x2d, idle_cpu_snb), + ICPU(0x36, idle_cpu_atom), ICPU(0x37, idle_cpu_byt), ICPU(0x3a, idle_cpu_ivb), ICPU(0x3e, idle_cpu_ivb), From fc04cc67ea8f44124f048832a745a24bc2fa12fa Mon Sep 17 00:00:00 2001 From: Len Brown Date: Thu, 6 Feb 2014 00:55:19 -0500 Subject: [PATCH 05/28] tools/power turbostat: simplify output, add Avg_MHz Use 8 columns for each number ouput. We don't fit into 80 columns on most machines, so keep the format simple. Print frequency in MHz instead of GHz. We've got 8 columns now, so use them to show low frequency in a more natural unit. Many users didn't understand what %c0 meant, so re-name it to be %Busy. Add Avg_MHz column, which is the frequency that many users expect to see -- the total number of cycles executed over the measurement interval. People found the previous GHz to be confusing, since it was the speed only over the non-idle interval. That measurement has been re-named Bzy_MHz. Suggested-by: Dirk J. Brandewie Signed-off-by: Len Brown --- tools/power/x86/turbostat/turbostat.8 | 127 ++++++-------- tools/power/x86/turbostat/turbostat.c | 230 ++++++++++++-------------- 2 files changed, 151 insertions(+), 206 deletions(-) diff --git a/tools/power/x86/turbostat/turbostat.8 b/tools/power/x86/turbostat/turbostat.8 index b4ddb748356c..56bfb523c5bb 100644 --- a/tools/power/x86/turbostat/turbostat.8 +++ b/tools/power/x86/turbostat/turbostat.8 @@ -47,21 +47,22 @@ displays the statistics gathered since it was forked. .PP .SH FIELD DESCRIPTIONS .nf -\fBpk\fP processor package number. -\fBcor\fP processor core number. +\fBPackage\fP processor package number. +\fBCore\fP processor core number. \fBCPU\fP Linux CPU (logical processor) number. Note that multiple CPUs per core indicate support for Intel(R) Hyper-Threading Technology. -\fB%c0\fP percent of the interval that the CPU retired instructions. -\fBGHz\fP average clock rate while the CPU was in c0 state. -\fBTSC\fP average GHz that the TSC ran during the entire interval. -\fB%c1, %c3, %c6, %c7\fP show the percentage residency in hardware core idle states. -\fBCTMP\fP Degrees Celsius reported by the per-core Digital Thermal Sensor. -\fBPTMP\fP Degrees Celsius reported by the per-package Package Thermal Monitor. -\fB%pc2, %pc3, %pc6, %pc7\fP percentage residency in hardware package idle states. -\fBPkg_W\fP Watts consumed by the whole package. -\fBCor_W\fP Watts consumed by the core part of the package. -\fBGFX_W\fP Watts consumed by the Graphics part of the package -- available only on client processors. -\fBRAM_W\fP Watts consumed by the DRAM DIMMS -- available only on server processors. +\fBAVG_MHz\fP number of cycles executed divided by time elapsed. +\fB%Buzy\fP percent of the interval that the CPU retired instructions, aka. % of time in "C0" state. +\fBBzy_MHz\fP average clock rate while the CPU was busy (in "c0" state). +\fBTSC_MHz\fP average MHz that the TSC ran during the entire interval. +\fBCPU%c1, CPU%c3, CPU%c6, CPU%c7\fP show the percentage residency in hardware core idle states. +\fBCoreTmp\fP Degrees Celsius reported by the per-core Digital Thermal Sensor. +\fBPkgTtmp\fP Degrees Celsius reported by the per-package Package Thermal Monitor. +\fBPkg%pc2, Pkg%pc3, Pkg%pc6, Pkg%pc7\fP percentage residency in hardware package idle states. +\fBPkgWatt\fP Watts consumed by the whole package. +\fBCorWatt\fP Watts consumed by the core part of the package. +\fBGFXWatt\fP Watts consumed by the Graphics part of the package -- available only on client processors. +\fBRAMWatt\fP Watts consumed by the DRAM DIMMS -- available only on server processors. \fBPKG_%\fP percent of the interval that RAPL throttling was active on the Package. \fBRAM_%\fP percent of the interval that RAPL throttling was active on DRAM. .fi @@ -78,29 +79,17 @@ For Watts columns, the summary is a system total. Subsequent rows show per-CPU statistics. .nf -[root@sandy]# ./turbostat -cor CPU %c0 GHz TSC %c1 %c3 %c6 %c7 CTMP PTMP %pc2 %pc3 %pc6 %pc7 Pkg_W Cor_W GFX_W - 0.06 0.80 2.29 0.11 0.00 0.00 99.83 47 40 0.26 0.01 0.44 98.78 3.49 0.12 0.14 - 0 0 0.07 0.80 2.29 0.07 0.00 0.00 99.86 40 40 0.26 0.01 0.44 98.78 3.49 0.12 0.14 - 0 4 0.03 0.80 2.29 0.12 - 1 1 0.04 0.80 2.29 0.25 0.01 0.00 99.71 40 - 1 5 0.16 0.80 2.29 0.13 - 2 2 0.05 0.80 2.29 0.06 0.01 0.00 99.88 40 - 2 6 0.03 0.80 2.29 0.08 - 3 3 0.05 0.80 2.29 0.08 0.00 0.00 99.87 47 - 3 7 0.04 0.84 2.29 0.09 -.fi -.SH SUMMARY EXAMPLE -The "-s" option prints the column headers just once, -and then the one line system summary for each sample interval. - -.nf -[root@wsm]# turbostat -S - %c0 GHz TSC %c1 %c3 %c6 CTMP %pc3 %pc6 - 1.40 2.81 3.38 10.78 43.47 44.35 42 13.67 2.09 - 1.34 2.90 3.38 11.48 58.96 28.23 41 19.89 0.15 - 1.55 2.72 3.38 26.73 37.66 34.07 42 2.53 2.80 - 1.37 2.83 3.38 16.95 60.05 21.63 42 5.76 0.20 +[root@ivy]# ./turbostat + Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt + - - 6 0.36 1596 3492 0 0.59 0.01 99.04 0.00 23 24 23.82 0.01 72.47 0.00 6.40 1.01 0.00 + 0 0 9 0.58 1596 3492 0 0.28 0.01 99.13 0.00 23 24 23.82 0.01 72.47 0.00 6.40 1.01 0.00 + 0 4 1 0.07 1596 3492 0 0.79 + 1 1 10 0.65 1596 3492 0 0.59 0.00 98.76 0.00 23 + 1 5 5 0.28 1596 3492 0 0.95 + 2 2 10 0.66 1596 3492 0 0.41 0.01 98.92 0.00 23 + 2 6 2 0.10 1597 3492 0 0.97 + 3 3 3 0.20 1596 3492 0 0.44 0.00 99.37 0.00 23 + 3 7 5 0.31 1596 3492 0 0.33 .fi .SH VERBOSE EXAMPLE The "-v" option adds verbosity to the output: @@ -154,55 +143,35 @@ eg. Here a cycle soaker is run on 1 CPU (see %c0) for a few seconds until ^C while the other CPUs are mostly idle: .nf -[root@x980 lenb]# ./turbostat cat /dev/zero > /dev/null +root@ivy: turbostat cat /dev/zero > /dev/null ^C -cor CPU %c0 GHz TSC %c1 %c3 %c6 %pc3 %pc6 - 8.86 3.61 3.38 15.06 31.19 44.89 0.00 0.00 - 0 0 1.46 3.22 3.38 16.84 29.48 52.22 0.00 0.00 - 0 6 0.21 3.06 3.38 18.09 - 1 2 0.53 3.33 3.38 2.80 46.40 50.27 - 1 8 0.89 3.47 3.38 2.44 - 2 4 1.36 3.43 3.38 9.04 23.71 65.89 - 2 10 0.18 2.86 3.38 10.22 - 8 1 0.04 2.87 3.38 99.96 0.01 0.00 - 8 7 99.72 3.63 3.38 0.27 - 9 3 0.31 3.21 3.38 7.64 56.55 35.50 - 9 9 0.08 2.95 3.38 7.88 - 10 5 1.42 3.43 3.38 2.14 30.99 65.44 - 10 11 0.16 2.88 3.38 3.40 + Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt + - - 496 12.75 3886 3492 0 13.16 0.04 74.04 0.00 36 36 0.00 0.00 0.00 0.00 23.15 17.65 0.00 + 0 0 22 0.57 3830 3492 0 0.83 0.02 98.59 0.00 27 36 0.00 0.00 0.00 0.00 23.15 17.65 0.00 + 0 4 9 0.24 3829 3492 0 1.15 + 1 1 4 0.09 3783 3492 0 99.91 0.00 0.00 0.00 36 + 1 5 3880 99.82 3888 3492 0 0.18 + 2 2 17 0.44 3813 3492 0 0.77 0.04 98.75 0.00 28 + 2 6 12 0.32 3823 3492 0 0.89 + 3 3 16 0.43 3844 3492 0 0.63 0.11 98.84 0.00 30 + 3 7 4 0.11 3827 3492 0 0.94 +30.372243 sec + .fi -Above the cycle soaker drives cpu7 up its 3.6 GHz turbo limit +Above the cycle soaker drives cpu5 up its 3.8 GHz turbo limit while the other processors are generally in various states of idle. -Note that cpu1 and cpu7 are HT siblings within core8. -As cpu7 is very busy, it prevents its sibling, cpu1, +Note that cpu1 and cpu5 are HT siblings within core1. +As cpu5 is very busy, it prevents its sibling, cpu1, from entering a c-state deeper than c1. -Note that turbostat reports average GHz of 3.63, while -the arithmetic average of the GHz column above is lower. -This is a weighted average, where the weight is %c0. ie. it is the total number of -un-halted cycles elapsed per time divided by the number of CPUs. -.SH SMI COUNTING EXAMPLE -On Intel Nehalem and newer processors, MSR 0x34 is a System Management Mode Interrupt (SMI) counter. -This counter is shown by default under the "SMI" column. -.nf -[root@x980 ~]# turbostat -cor CPU %c0 GHz TSC SMI %c1 %c3 %c6 CTMP %pc3 %pc6 - 0.11 1.91 3.38 0 1.84 0.26 97.79 29 0.82 83.87 - 0 0 0.40 1.63 3.38 0 10.27 0.12 89.20 20 0.82 83.88 - 0 6 0.06 1.63 3.38 0 10.61 - 1 2 0.37 2.63 3.38 0 0.02 0.10 99.51 22 - 1 8 0.01 1.62 3.38 0 0.39 - 2 4 0.07 1.62 3.38 0 0.04 0.07 99.82 23 - 2 10 0.02 1.62 3.38 0 0.09 - 8 1 0.23 1.64 3.38 0 0.10 1.07 98.60 24 - 8 7 0.02 1.64 3.38 0 0.31 - 9 3 0.03 1.62 3.38 0 0.03 0.05 99.89 29 - 9 9 0.02 1.62 3.38 0 0.05 - 10 5 0.07 1.62 3.38 0 0.08 0.12 99.73 27 - 10 11 0.03 1.62 3.38 0 0.13 -^C -.fi +Note that the Avg_MHz column reflects the total number of cycles executed +divided by the measurement interval. If the %Busy column is 100%, +then the processor was running at that speed the entire interval. +The Avg_MHz multiplied by the %Busy results in the Bzy_MHz -- +which is the average frequency while the processor was executing -- +not including any non-busy idle time. + .SH NOTES .B "turbostat " diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index 77eb130168da..18dab6ecc132 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -56,7 +56,7 @@ unsigned int do_slm_cstates; unsigned int use_c1_residency_msr; unsigned int has_aperf; unsigned int has_epb; -unsigned int units = 1000000000; /* Ghz etc */ +unsigned int units = 1000000; /* MHz etc */ unsigned int genuine_intel; unsigned int has_invariant_tsc; unsigned int do_nehalem_platform_info; @@ -264,88 +264,93 @@ int get_msr(int cpu, off_t offset, unsigned long long *msr) return 0; } +/* + * Example Format w/ field column widths: + * + * Package Core CPU Avg_MHz Bzy_MHz TSC_MHz SMI %Busy CPU_%c1 CPU_%c3 CPU_%c6 CPU_%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt CorWatt GFXWatt + * 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 1234567 + */ + void print_header(void) { if (show_pkg) - outp += sprintf(outp, "pk"); - if (show_pkg) - outp += sprintf(outp, " "); + outp += sprintf(outp, "Package "); if (show_core) - outp += sprintf(outp, "cor"); + outp += sprintf(outp, " Core "); if (show_cpu) - outp += sprintf(outp, " CPU"); - if (show_pkg || show_core || show_cpu) - outp += sprintf(outp, " "); - if (do_nhm_cstates) - outp += sprintf(outp, " %%c0"); + outp += sprintf(outp, " CPU "); if (has_aperf) - outp += sprintf(outp, " GHz"); - outp += sprintf(outp, " TSC"); + outp += sprintf(outp, "Avg_MHz "); + if (do_nhm_cstates) + outp += sprintf(outp, " %%Busy "); + if (has_aperf) + outp += sprintf(outp, "Bzy_MHz "); + outp += sprintf(outp, "TSC_MHz "); if (do_smi) - outp += sprintf(outp, " SMI"); + outp += sprintf(outp, " SMI "); if (extra_delta_offset32) - outp += sprintf(outp, " count 0x%03X", extra_delta_offset32); + outp += sprintf(outp, " count 0x%03X ", extra_delta_offset32); if (extra_delta_offset64) - outp += sprintf(outp, " COUNT 0x%03X", extra_delta_offset64); + outp += sprintf(outp, " COUNT 0x%03X ", extra_delta_offset64); if (extra_msr_offset32) - outp += sprintf(outp, " MSR 0x%03X", extra_msr_offset32); + outp += sprintf(outp, " MSR 0x%03X ", extra_msr_offset32); if (extra_msr_offset64) - outp += sprintf(outp, " MSR 0x%03X", extra_msr_offset64); + outp += sprintf(outp, " MSR 0x%03X ", extra_msr_offset64); if (do_nhm_cstates) - outp += sprintf(outp, " %%c1"); + outp += sprintf(outp, " CPU%%c1 "); if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %%c3"); + outp += sprintf(outp, " CPU%%c3 "); if (do_nhm_cstates) - outp += sprintf(outp, " %%c6"); + outp += sprintf(outp, " CPU%%c6 "); if (do_snb_cstates) - outp += sprintf(outp, " %%c7"); + outp += sprintf(outp, " CPU%%c7 "); if (do_dts) - outp += sprintf(outp, " CTMP"); + outp += sprintf(outp, "CoreTmp "); if (do_ptm) - outp += sprintf(outp, " PTMP"); + outp += sprintf(outp, " PkgTmp "); if (do_snb_cstates) - outp += sprintf(outp, " %%pc2"); + outp += sprintf(outp, "Pkg%%pc2 "); if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %%pc3"); + outp += sprintf(outp, "Pkg%%pc3 "); if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %%pc6"); + outp += sprintf(outp, "Pkg%%pc6 "); if (do_snb_cstates) - outp += sprintf(outp, " %%pc7"); + outp += sprintf(outp, "Pkg%%pc7 "); if (do_c8_c9_c10) { - outp += sprintf(outp, " %%pc8"); - outp += sprintf(outp, " %%pc9"); - outp += sprintf(outp, " %%pc10"); + outp += sprintf(outp, "Pkg%%pc8 "); + outp += sprintf(outp, "Pkg%%pc9 "); + outp += sprintf(outp, "Pk%%pc10 "); } if (do_rapl && !rapl_joules) { if (do_rapl & RAPL_PKG) - outp += sprintf(outp, " Pkg_W"); + outp += sprintf(outp, "PkgWatt "); if (do_rapl & RAPL_CORES) - outp += sprintf(outp, " Cor_W"); + outp += sprintf(outp, "CorWatt "); if (do_rapl & RAPL_GFX) - outp += sprintf(outp, " GFX_W"); + outp += sprintf(outp, "GFXWatt "); if (do_rapl & RAPL_DRAM) - outp += sprintf(outp, " RAM_W"); + outp += sprintf(outp, "RAMWatt "); if (do_rapl & RAPL_PKG_PERF_STATUS) - outp += sprintf(outp, " PKG_%%"); + outp += sprintf(outp, " PKG_%% "); if (do_rapl & RAPL_DRAM_PERF_STATUS) - outp += sprintf(outp, " RAM_%%"); + outp += sprintf(outp, " RAM_%% "); } else { if (do_rapl & RAPL_PKG) - outp += sprintf(outp, " Pkg_J"); + outp += sprintf(outp, " Pkg_J "); if (do_rapl & RAPL_CORES) - outp += sprintf(outp, " Cor_J"); + outp += sprintf(outp, " Cor_J "); if (do_rapl & RAPL_GFX) - outp += sprintf(outp, " GFX_J"); + outp += sprintf(outp, " GFX_J "); if (do_rapl & RAPL_DRAM) - outp += sprintf(outp, " RAM_W"); + outp += sprintf(outp, " RAM_W "); if (do_rapl & RAPL_PKG_PERF_STATUS) - outp += sprintf(outp, " PKG_%%"); + outp += sprintf(outp, " PKG_%% "); if (do_rapl & RAPL_DRAM_PERF_STATUS) - outp += sprintf(outp, " RAM_%%"); - outp += sprintf(outp, " time"); + outp += sprintf(outp, " RAM_%% "); + outp += sprintf(outp, " time "); } outp += sprintf(outp, "\n"); @@ -410,25 +415,12 @@ int dump_counters(struct thread_data *t, struct core_data *c, /* * column formatting convention & formats - * package: "pk" 2 columns %2d - * core: "cor" 3 columns %3d - * CPU: "CPU" 3 columns %3d - * Pkg_W: %6.2 - * Cor_W: %6.2 - * GFX_W: %5.2 - * RAM_W: %5.2 - * GHz: "GHz" 3 columns %3.2 - * TSC: "TSC" 3 columns %3.2 - * SMI: "SMI" 4 columns %4d - * percentage " %pc3" %6.2 - * Perf Status percentage: %5.2 - * "CTMP" 4 columns %4d */ int format_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p) { double interval_float; - char *fmt5, *fmt6; + char *fmt8; /* if showing only 1st thread in core and this isn't one, bail out */ if (show_core_only && !(t->flags & CPU_IS_FIRST_THREAD_IN_CORE)) @@ -443,65 +435,52 @@ int format_counters(struct thread_data *t, struct core_data *c, /* topo columns, print blanks on 1st (average) line */ if (t == &average.threads) { if (show_pkg) - outp += sprintf(outp, " "); - if (show_pkg && show_core) - outp += sprintf(outp, " "); + outp += sprintf(outp, " -"); if (show_core) - outp += sprintf(outp, " "); + outp += sprintf(outp, " -"); if (show_cpu) - outp += sprintf(outp, " " " "); + outp += sprintf(outp, " -"); } else { if (show_pkg) { if (p) - outp += sprintf(outp, "%2d", p->package_id); + outp += sprintf(outp, "%8d", p->package_id); else - outp += sprintf(outp, " "); + outp += sprintf(outp, " -"); } - if (show_pkg && show_core) - outp += sprintf(outp, " "); if (show_core) { if (c) - outp += sprintf(outp, "%3d", c->core_id); + outp += sprintf(outp, "%8d", c->core_id); else - outp += sprintf(outp, " "); + outp += sprintf(outp, " -"); } if (show_cpu) - outp += sprintf(outp, " %3d", t->cpu_id); + outp += sprintf(outp, "%8d", t->cpu_id); } + + /* AvgMHz */ + if (has_aperf) + outp += sprintf(outp, "%8.0f", + 1.0 / units * t->aperf / interval_float); + /* %c0 */ if (do_nhm_cstates) { - if (show_pkg || show_core || show_cpu) - outp += sprintf(outp, " "); if (!skip_c0) - outp += sprintf(outp, "%6.2f", 100.0 * t->mperf/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * t->mperf/t->tsc); else - outp += sprintf(outp, " ****"); + outp += sprintf(outp, "********"); } - /* GHz */ - if (has_aperf) { - if (!aperf_mperf_unstable) { - outp += sprintf(outp, " %3.2f", - 1.0 * t->tsc / units * t->aperf / - t->mperf / interval_float); - } else { - if (t->aperf > t->tsc || t->mperf > t->tsc) { - outp += sprintf(outp, " ***"); - } else { - outp += sprintf(outp, "%3.1f*", - 1.0 * t->tsc / - units * t->aperf / - t->mperf / interval_float); - } - } - } + /* BzyMHz */ + if (has_aperf) + outp += sprintf(outp, "%8.0f", + 1.0 * t->tsc / units * t->aperf / t->mperf / interval_float); /* TSC */ - outp += sprintf(outp, "%5.2f", 1.0 * t->tsc/units/interval_float); + outp += sprintf(outp, "%8.0f", 1.0 * t->tsc/units/interval_float); /* SMI */ if (do_smi) - outp += sprintf(outp, "%4d", t->smi_count); + outp += sprintf(outp, "%8d", t->smi_count); /* delta */ if (extra_delta_offset32) @@ -520,9 +499,9 @@ int format_counters(struct thread_data *t, struct core_data *c, if (do_nhm_cstates) { if (!skip_c1) - outp += sprintf(outp, " %6.2f", 100.0 * t->c1/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * t->c1/t->tsc); else - outp += sprintf(outp, " ****"); + outp += sprintf(outp, "********"); } /* print per-core data only for 1st thread in core */ @@ -530,79 +509,76 @@ int format_counters(struct thread_data *t, struct core_data *c, goto done; if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * c->c3/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * c->c3/t->tsc); if (do_nhm_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * c->c6/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * c->c6/t->tsc); if (do_snb_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * c->c7/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * c->c7/t->tsc); if (do_dts) - outp += sprintf(outp, " %4d", c->core_temp_c); + outp += sprintf(outp, "%8d", c->core_temp_c); /* print per-package data only for 1st core in package */ if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE)) goto done; if (do_ptm) - outp += sprintf(outp, " %4d", p->pkg_temp_c); + outp += sprintf(outp, "%8d", p->pkg_temp_c); if (do_snb_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * p->pc2/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc2/t->tsc); if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * p->pc3/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc3/t->tsc); if (do_nhm_cstates && !do_slm_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * p->pc6/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc6/t->tsc); if (do_snb_cstates) - outp += sprintf(outp, " %6.2f", 100.0 * p->pc7/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc7/t->tsc); if (do_c8_c9_c10) { - outp += sprintf(outp, " %6.2f", 100.0 * p->pc8/t->tsc); - outp += sprintf(outp, " %6.2f", 100.0 * p->pc9/t->tsc); - outp += sprintf(outp, " %6.2f", 100.0 * p->pc10/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc8/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc9/t->tsc); + outp += sprintf(outp, "%8.2f", 100.0 * p->pc10/t->tsc); } /* * If measurement interval exceeds minimum RAPL Joule Counter range, * indicate that results are suspect by printing "**" in fraction place. */ - if (interval_float < rapl_joule_counter_range) { - fmt5 = " %5.2f"; - fmt6 = " %6.2f"; - } else { - fmt5 = " %3.0f**"; - fmt6 = " %4.0f**"; - } + if (interval_float < rapl_joule_counter_range) + fmt8 = "%8.2f"; + else + fmt8 = " %6.0f**"; if (do_rapl && !rapl_joules) { if (do_rapl & RAPL_PKG) - outp += sprintf(outp, fmt6, p->energy_pkg * rapl_energy_units / interval_float); + outp += sprintf(outp, fmt8, p->energy_pkg * rapl_energy_units / interval_float); if (do_rapl & RAPL_CORES) - outp += sprintf(outp, fmt6, p->energy_cores * rapl_energy_units / interval_float); + outp += sprintf(outp, fmt8, p->energy_cores * rapl_energy_units / interval_float); if (do_rapl & RAPL_GFX) - outp += sprintf(outp, fmt5, p->energy_gfx * rapl_energy_units / interval_float); + outp += sprintf(outp, fmt8, p->energy_gfx * rapl_energy_units / interval_float); if (do_rapl & RAPL_DRAM) - outp += sprintf(outp, fmt5, p->energy_dram * rapl_energy_units / interval_float); + outp += sprintf(outp, fmt8, p->energy_dram * rapl_energy_units / interval_float); if (do_rapl & RAPL_PKG_PERF_STATUS) - outp += sprintf(outp, fmt5, 100.0 * p->rapl_pkg_perf_status * rapl_time_units / interval_float); + outp += sprintf(outp, fmt8, 100.0 * p->rapl_pkg_perf_status * rapl_time_units / interval_float); if (do_rapl & RAPL_DRAM_PERF_STATUS) - outp += sprintf(outp, fmt5, 100.0 * p->rapl_dram_perf_status * rapl_time_units / interval_float); + outp += sprintf(outp, fmt8, 100.0 * p->rapl_dram_perf_status * rapl_time_units / interval_float); } else { if (do_rapl & RAPL_PKG) - outp += sprintf(outp, fmt6, + outp += sprintf(outp, fmt8, p->energy_pkg * rapl_energy_units); if (do_rapl & RAPL_CORES) - outp += sprintf(outp, fmt6, + outp += sprintf(outp, fmt8, p->energy_cores * rapl_energy_units); if (do_rapl & RAPL_GFX) - outp += sprintf(outp, fmt5, + outp += sprintf(outp, fmt8, p->energy_gfx * rapl_energy_units); if (do_rapl & RAPL_DRAM) - outp += sprintf(outp, fmt5, + outp += sprintf(outp, fmt8, p->energy_dram * rapl_energy_units); if (do_rapl & RAPL_PKG_PERF_STATUS) - outp += sprintf(outp, fmt5, 100.0 * p->rapl_pkg_perf_status * rapl_time_units / interval_float); + outp += sprintf(outp, fmt8, 100.0 * p->rapl_pkg_perf_status * rapl_time_units / interval_float); if (do_rapl & RAPL_DRAM_PERF_STATUS) - outp += sprintf(outp, fmt5, 100.0 * p->rapl_dram_perf_status * rapl_time_units / interval_float); - outp += sprintf(outp, fmt5, interval_float); + outp += sprintf(outp, fmt8, 100.0 * p->rapl_dram_perf_status * rapl_time_units / interval_float); + outp += sprintf(outp, fmt8, interval_float); } done: @@ -2455,7 +2431,7 @@ int main(int argc, char **argv) cmdline(argc, argv); if (verbose) - fprintf(stderr, "turbostat v3.6 Dec 2, 2013" + fprintf(stderr, "turbostat v3.7 Feb 6, 2014" " - Len Brown \n"); turbostat_init(); From 4e8e863fed2e82278d29c6357de8251adb73acb9 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Thu, 27 Feb 2014 23:28:53 -0500 Subject: [PATCH 06/28] tools/power turbostat: Run on Broadwell Signed-off-by: Len Brown --- tools/power/x86/turbostat/turbostat.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index 18dab6ecc132..7c9d8e71eb9e 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -1492,6 +1492,9 @@ int has_nehalem_turbo_ratio_limit(unsigned int family, unsigned int model) case 0x46: /* HSW */ case 0x37: /* BYT */ case 0x4D: /* AVN */ + case 0x3D: /* BDW */ + case 0x4F: /* BDX */ + case 0x56: /* BDX-DE */ return 1; case 0x2E: /* Nehalem-EX Xeon - Beckton */ case 0x2F: /* Westmere-EX Xeon - Eagleton */ @@ -1605,9 +1608,12 @@ void rapl_probe(unsigned int family, unsigned int model) case 0x3C: /* HSW */ case 0x45: /* HSW */ case 0x46: /* HSW */ + case 0x3D: /* BDW */ do_rapl = RAPL_PKG | RAPL_CORES | RAPL_CORE_POLICY | RAPL_GFX | RAPL_PKG_POWER_INFO; break; case 0x3F: /* HSX */ + case 0x4F: /* BDX */ + case 0x56: /* BDX-DE */ do_rapl = RAPL_PKG | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_PKG_PERF_STATUS | RAPL_PKG_POWER_INFO; break; case 0x2D: @@ -1851,6 +1857,9 @@ int is_snb(unsigned int family, unsigned int model) case 0x3F: /* HSW */ case 0x45: /* HSW */ case 0x46: /* HSW */ + case 0x3D: /* BDW */ + case 0x4F: /* BDX */ + case 0x56: /* BDX-DE */ return 1; } return 0; @@ -1862,7 +1871,8 @@ int has_c8_c9_c10(unsigned int family, unsigned int model) return 0; switch (model) { - case 0x45: + case 0x45: /* HSW */ + case 0x3D: /* BDW */ return 1; } return 0; From 2d0acb4af981e20eb626c6ea1925e95056220b2a Mon Sep 17 00:00:00 2001 From: "jhbird.choi@samsung.com" Date: Thu, 20 Mar 2014 16:35:56 +0900 Subject: [PATCH 07/28] ACPI: Clean up memory allocations Use acpi_os_allocate_zeroed instead of acpi_os_allocate + memset. Signed-off-by: Jonghwan Choi [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki --- drivers/acpi/osl.c | 3 +-- drivers/acpi/utils.c | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c index f7fd72ac69cf..6776c599816f 100644 --- a/drivers/acpi/osl.c +++ b/drivers/acpi/osl.c @@ -1219,10 +1219,9 @@ acpi_os_create_semaphore(u32 max_units, u32 initial_units, acpi_handle * handle) { struct semaphore *sem = NULL; - sem = acpi_os_allocate(sizeof(struct semaphore)); + sem = acpi_os_allocate_zeroed(sizeof(struct semaphore)); if (!sem) return AE_NO_MEMORY; - memset(sem, 0, sizeof(struct semaphore)); sema_init(sem, initial_units); diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c index 0f5f78fa6545..bba526148583 100644 --- a/drivers/acpi/utils.c +++ b/drivers/acpi/utils.c @@ -164,11 +164,10 @@ acpi_extract_package(union acpi_object *package, * Validate output buffer. */ if (buffer->length == ACPI_ALLOCATE_BUFFER) { - buffer->pointer = ACPI_ALLOCATE(size_required); + buffer->pointer = ACPI_ALLOCATE_ZEROED(size_required); if (!buffer->pointer) return AE_NO_MEMORY; buffer->length = size_required; - memset(buffer->pointer, 0, size_required); } else { if (buffer->length < size_required) { buffer->length = size_required; From 0138d8f0755b5b28d0acdb0a758bcfcaf441fc58 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Fri, 4 Apr 2014 01:21:07 -0400 Subject: [PATCH 08/28] intel_idle: fine-tune IVT residency targets Ivy Town processors have slightly different properties than Ivy Bridge processors, particuarly as socket count grows. Here we add dedicated tables covering 1-2 socket, 3-4 socket, and > 4 socket IVT configurations. This reduces the frequency of deep transitions on those systems, which can impact throughput. Signed-off-by: Len Brown --- drivers/idle/intel_idle.c | 141 +++++++++++++++++++++++++++++++++++++- 1 file changed, 140 insertions(+), 1 deletion(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 710ab54ab04b..9ddbf550bafa 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -283,6 +283,105 @@ static struct cpuidle_state ivb_cstates[] = { .enter = NULL } }; +static struct cpuidle_state ivt_cstates[] = { + { + .name = "C1-IVT", + .desc = "MWAIT 0x00", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 1, + .target_residency = 1, + .enter = &intel_idle }, + { + .name = "C1E-IVT", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 80, + .enter = &intel_idle }, + { + .name = "C3-IVT", + .desc = "MWAIT 0x10", + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 59, + .target_residency = 156, + .enter = &intel_idle }, + { + .name = "C6-IVT", + .desc = "MWAIT 0x20", + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 82, + .target_residency = 300, + .enter = &intel_idle }, + { + .enter = NULL } +}; + +static struct cpuidle_state ivt_cstates_4s[] = { + { + .name = "C1-IVT-4S", + .desc = "MWAIT 0x00", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 1, + .target_residency = 1, + .enter = &intel_idle }, + { + .name = "C1E-IVT-4S", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 250, + .enter = &intel_idle }, + { + .name = "C3-IVT-4S", + .desc = "MWAIT 0x10", + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 59, + .target_residency = 300, + .enter = &intel_idle }, + { + .name = "C6-IVT-4S", + .desc = "MWAIT 0x20", + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 84, + .target_residency = 400, + .enter = &intel_idle }, + { + .enter = NULL } +}; + +static struct cpuidle_state ivt_cstates_8s[] = { + { + .name = "C1-IVT-8S", + .desc = "MWAIT 0x00", + .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 1, + .target_residency = 1, + .enter = &intel_idle }, + { + .name = "C1E-IVT-8S", + .desc = "MWAIT 0x01", + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID, + .exit_latency = 10, + .target_residency = 500, + .enter = &intel_idle }, + { + .name = "C3-IVT-8S", + .desc = "MWAIT 0x10", + .flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 59, + .target_residency = 600, + .enter = &intel_idle }, + { + .name = "C6-IVT-8S", + .desc = "MWAIT 0x20", + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED, + .exit_latency = 88, + .target_residency = 700, + .enter = &intel_idle }, + { + .enter = NULL } +}; + static struct cpuidle_state hsw_cstates[] = { { .name = "C1-HSW", @@ -521,6 +620,11 @@ static const struct idle_cpu idle_cpu_ivb = { .disable_promotion_to_c1e = true, }; +static const struct idle_cpu idle_cpu_ivt = { + .state_table = ivt_cstates, + .disable_promotion_to_c1e = true, +}; + static const struct idle_cpu idle_cpu_hsw = { .state_table = hsw_cstates, .disable_promotion_to_c1e = true, @@ -549,7 +653,7 @@ static const struct x86_cpu_id intel_idle_ids[] = { ICPU(0x36, idle_cpu_atom), ICPU(0x37, idle_cpu_byt), ICPU(0x3a, idle_cpu_ivb), - ICPU(0x3e, idle_cpu_ivb), + ICPU(0x3e, idle_cpu_ivt), ICPU(0x3c, idle_cpu_hsw), ICPU(0x3f, idle_cpu_hsw), ICPU(0x45, idle_cpu_hsw), @@ -626,6 +730,39 @@ static void intel_idle_cpuidle_devices_uninit(void) free_percpu(intel_idle_cpuidle_devices); return; } + +/* + * intel_idle_state_table_update() + * + * Update the default state_table for this CPU-id + * + * Currently used to access tuned IVT multi-socket targets + * Assumption: num_sockets == (max_package_num + 1) + */ +void intel_idle_state_table_update(void) +{ + /* IVT uses a different table for 1-2, 3-4, and > 4 sockets */ + if (boot_cpu_data.x86_model == 0x3e) { /* IVT */ + int cpu, package_num, num_sockets = 1; + + for_each_online_cpu(cpu) { + package_num = topology_physical_package_id(cpu); + if (package_num + 1 > num_sockets) { + num_sockets = package_num + 1; + + if (num_sockets > 4) + cpuidle_state_table = ivt_cstates_8s; + return; + } + } + + if (num_sockets > 2) + cpuidle_state_table = ivt_cstates_4s; + /* else, 1 and 2 socket systems use default ivt_cstates */ + } + return; +} + /* * intel_idle_cpuidle_driver_init() * allocate, initialize cpuidle_states @@ -635,6 +772,8 @@ static int __init intel_idle_cpuidle_driver_init(void) int cstate; struct cpuidle_driver *drv = &intel_idle_driver; + intel_idle_state_table_update(); + drv->state_count = 1; for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) { From ee17fdf24bf815650c87b3bb35fbab42681523a8 Mon Sep 17 00:00:00 2001 From: Zhihui Zhang Date: Sat, 29 Mar 2014 09:25:47 -0400 Subject: [PATCH 09/28] ACPI / thermal: Fix wrong variable usage in debug statement A debug statement in acpi_thermal_trips_update() uses a wrong trip point (tz->trips.critical instead of tz->trips.hot) to get the temperature value from. Fix that. Signed-off-by: Zhihui Zhang [rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki --- drivers/acpi/thermal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 964068553334..c1e31a41f949 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -344,7 +344,7 @@ static int acpi_thermal_trips_update(struct acpi_thermal *tz, int flag) tz->trips.hot.flags.valid = 1; ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found hot threshold [%lu]\n", - tz->trips.critical.temperature)); + tz->trips.hot.temperature)); } } From 997dd7fe546b33a4f86adc635c5ffc0e79966cf2 Mon Sep 17 00:00:00 2001 From: Stephen Chandler Paul Date: Fri, 4 Apr 2014 10:44:46 -0400 Subject: [PATCH 10/28] ACPI / video: Favor native backlight interface for ThinkPad Helix Like many other Windows 8 laptops the ThinkPad Helix's backlight has a broken ACPI interface and can only be properly adjusted by using the video card's native backlight control. This adds the ThinkPad Helix to the list of laptops affected with this issue so that video.use_native_backlight is enabled by default. Signed-off-by: Stephen Chandler Paul Signed-off-by: Rafael J. Wysocki --- drivers/acpi/video.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/acpi/video.c b/drivers/acpi/video.c index 48c7e8af9c96..8b6990e417ec 100644 --- a/drivers/acpi/video.c +++ b/drivers/acpi/video.c @@ -487,6 +487,14 @@ static struct dmi_system_id video_dmi_table[] __initdata = { DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo IdeaPad Yoga 13"), }, }, + { + .callback = video_set_use_native_backlight, + .ident = "Thinkpad Helix", + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"), + DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad Helix"), + }, + }, { .callback = video_set_use_native_backlight, .ident = "Dell Inspiron 7520", From 4e24ea2a2c942591ad3e843ec0efaca08f3cfa04 Mon Sep 17 00:00:00 2001 From: "Rafael J. Wysocki" Date: Mon, 7 Apr 2014 14:11:35 +0200 Subject: [PATCH 11/28] ACPI / dock: Drop dock_device_ids[] table There are no references to the dock_device_ids[] table anywhere in the code and it is not even useful for module autoloading, because the dock driver can only be built in, so drop it. Signed-off-by: Rafael J. Wysocki --- drivers/acpi/dock.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c index f0fc6260266b..d9339b442a4e 100644 --- a/drivers/acpi/dock.c +++ b/drivers/acpi/dock.c @@ -51,12 +51,6 @@ MODULE_PARM_DESC(immediate_undock, "1 (default) will cause the driver to " " the driver to wait for userspace to write the undock sysfs file " " before undocking"); -static const struct acpi_device_id dock_device_ids[] = { - {"LNXDOCK", 0}, - {"", 0}, -}; -MODULE_DEVICE_TABLE(acpi, dock_device_ids); - struct dock_station { acpi_handle handle; unsigned long last_dock_time; From 39ac5ba51b69a77a30d2e783aed02ec73c9f6d70 Mon Sep 17 00:00:00 2001 From: Tushar Behera Date: Fri, 28 Mar 2014 10:50:21 +0530 Subject: [PATCH 12/28] PM / domains: Add pd_ignore_unused to keep power domains enabled Keep all power-domains already enabled by bootloader on, even if no driver has claimed them. This is useful for debug and development, but should not be needed on a platform with proper driver support. Signed-off-by: Tushar Behera Signed-off-by: Rafael J. Wysocki --- Documentation/kernel-parameters.txt | 7 +++++++ drivers/base/power/domain.c | 13 +++++++++++++ 2 files changed, 20 insertions(+) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index bc3478581f67..d4aff4086952 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2558,6 +2558,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted. pcmv= [HW,PCMCIA] BadgePAD 4 + pd_ignore_unused + [PM] + Keep all power-domains already enabled by bootloader on, + even if no driver has claimed them. This is useful + for debug and development, but should not be + needed on a platform with proper driver support. + pd. [PARIDE] See Documentation/blockdev/paride.txt. diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 6f54962aae1d..ae098a261fcd 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -705,6 +705,14 @@ static int pm_genpd_runtime_resume(struct device *dev) return 0; } +static bool pd_ignore_unused; +static int __init pd_ignore_unused_setup(char *__unused) +{ + pd_ignore_unused = true; + return 1; +} +__setup("pd_ignore_unused", pd_ignore_unused_setup); + /** * pm_genpd_poweroff_unused - Power off all PM domains with no devices in use. */ @@ -712,6 +720,11 @@ void pm_genpd_poweroff_unused(void) { struct generic_pm_domain *genpd; + if (pd_ignore_unused) { + pr_warn("genpd: Not disabling unused power domains\n"); + return; + } + mutex_lock(&gpd_list_lock); list_for_each_entry(genpd, &gpd_list, gpd_list_node) From d0549801659747c46092fb22db1488100d7923e4 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Fri, 28 Mar 2014 11:15:14 +0100 Subject: [PATCH 13/28] PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes According to the documentation, the various wakeup_* attributes in sysfs are not present if the device is not enabled to wake up the system. This is not correct: the attributes are not present if the device is not capable to wake up the system. They are empty if the device is not enabled to wake up the system. Signed-off-by: Geert Uytterhoeven Signed-off-by: Rafael J. Wysocki --- Documentation/ABI/testing/sysfs-devices-power | 46 ++++++++++++------- 1 file changed, 30 insertions(+), 16 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-devices-power b/Documentation/ABI/testing/sysfs-devices-power index 7dbf96b724ed..676fdf5f2a99 100644 --- a/Documentation/ABI/testing/sysfs-devices-power +++ b/Documentation/ABI/testing/sysfs-devices-power @@ -83,8 +83,10 @@ Contact: Rafael J. Wysocki Description: The /sys/devices/.../wakeup_count attribute contains the number of signaled wakeup events associated with the device. This - attribute is read-only. If the device is not enabled to wake up + attribute is read-only. If the device is not capable to wake up the system from sleep states, this attribute is not present. + If the device is not enabled to wake up the system from sleep + states, this attribute is empty. What: /sys/devices/.../power/wakeup_active_count Date: September 2010 @@ -93,8 +95,10 @@ Description: The /sys/devices/.../wakeup_active_count attribute contains the number of times the processing of wakeup events associated with the device was completed (at the kernel level). This attribute - is read-only. If the device is not enabled to wake up the - system from sleep states, this attribute is not present. + is read-only. If the device is not capable to wake up the + system from sleep states, this attribute is not present. If + the device is not enabled to wake up the system from sleep + states, this attribute is empty. What: /sys/devices/.../power/wakeup_abort_count Date: February 2012 @@ -104,8 +108,9 @@ Description: number of times the processing of a wakeup event associated with the device might have aborted system transition into a sleep state in progress. This attribute is read-only. If the device - is not enabled to wake up the system from sleep states, this - attribute is not present. + is not capable to wake up the system from sleep states, this + attribute is not present. If the device is not enabled to wake + up the system from sleep states, this attribute is empty. What: /sys/devices/.../power/wakeup_expire_count Date: February 2012 @@ -114,8 +119,10 @@ Description: The /sys/devices/.../wakeup_expire_count attribute contains the number of times a wakeup event associated with the device has been reported with a timeout that expired. This attribute is - read-only. If the device is not enabled to wake up the system - from sleep states, this attribute is not present. + read-only. If the device is not capable to wake up the system + from sleep states, this attribute is not present. If the + device is not enabled to wake up the system from sleep states, + this attribute is empty. What: /sys/devices/.../power/wakeup_active Date: September 2010 @@ -124,8 +131,10 @@ Description: The /sys/devices/.../wakeup_active attribute contains either 1, or 0, depending on whether or not a wakeup event associated with the device is being processed (1). This attribute is read-only. - If the device is not enabled to wake up the system from sleep - states, this attribute is not present. + If the device is not capable to wake up the system from sleep + states, this attribute is not present. If the device is not + enabled to wake up the system from sleep states, this attribute + is empty. What: /sys/devices/.../power/wakeup_total_time_ms Date: September 2010 @@ -134,8 +143,9 @@ Description: The /sys/devices/.../wakeup_total_time_ms attribute contains the total time of processing wakeup events associated with the device, in milliseconds. This attribute is read-only. If the - device is not enabled to wake up the system from sleep states, - this attribute is not present. + device is not capable to wake up the system from sleep states, + this attribute is not present. If the device is not enabled to + wake up the system from sleep states, this attribute is empty. What: /sys/devices/.../power/wakeup_max_time_ms Date: September 2010 @@ -144,8 +154,10 @@ Description: The /sys/devices/.../wakeup_max_time_ms attribute contains the maximum time of processing a single wakeup event associated with the device, in milliseconds. This attribute is read-only. - If the device is not enabled to wake up the system from sleep - states, this attribute is not present. + If the device is not capable to wake up the system from sleep + states, this attribute is not present. If the device is not + enabled to wake up the system from sleep states, this attribute + is empty. What: /sys/devices/.../power/wakeup_last_time_ms Date: September 2010 @@ -156,7 +168,8 @@ Description: signaling the last wakeup event associated with the device, in milliseconds. This attribute is read-only. If the device is not enabled to wake up the system from sleep states, this - attribute is not present. + attribute is not present. If the device is not enabled to wake + up the system from sleep states, this attribute is empty. What: /sys/devices/.../power/wakeup_prevent_sleep_time_ms Date: February 2012 @@ -165,9 +178,10 @@ Description: The /sys/devices/.../wakeup_prevent_sleep_time_ms attribute contains the total time the device has been preventing opportunistic transitions to sleep states from occurring. - This attribute is read-only. If the device is not enabled to + This attribute is read-only. If the device is not capable to wake up the system from sleep states, this attribute is not - present. + present. If the device is not enabled to wake up the system + from sleep states, this attribute is empty. What: /sys/devices/.../power/autosuspend_delay_ms Date: September 2010 From 1fedc2f5cff650baaec6f90da0104157bcf10e96 Mon Sep 17 00:00:00 2001 From: Sachin Kamat Date: Wed, 2 Apr 2014 16:33:47 +0530 Subject: [PATCH 14/28] cpufreq: exynos: Disable on multiplatform build The current exynos cpufreq drivers are not multiplatform compliant and give build errors as they refer to header files from machine directory. Work to migrate them to generic cpufreq framework is under way. Till such time disable the build on multiplatform so that other multiplatform ready features get tested. Signed-off-by: Sachin Kamat Cc: Arnd Bergmann Acked-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/Kconfig.arm | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm index 9fb627046e17..33526ed8b9d4 100644 --- a/drivers/cpufreq/Kconfig.arm +++ b/drivers/cpufreq/Kconfig.arm @@ -30,7 +30,7 @@ config ARM_EXYNOS_CPUFREQ config ARM_EXYNOS4210_CPUFREQ bool "SAMSUNG EXYNOS4210" - depends on CPU_EXYNOS4210 + depends on CPU_EXYNOS4210 && !ARCH_MULTIPLATFORM default y select ARM_EXYNOS_CPUFREQ help @@ -41,7 +41,7 @@ config ARM_EXYNOS4210_CPUFREQ config ARM_EXYNOS4X12_CPUFREQ bool "SAMSUNG EXYNOS4x12" - depends on (SOC_EXYNOS4212 || SOC_EXYNOS4412) + depends on (SOC_EXYNOS4212 || SOC_EXYNOS4412) && !ARCH_MULTIPLATFORM default y select ARM_EXYNOS_CPUFREQ help @@ -52,7 +52,7 @@ config ARM_EXYNOS4X12_CPUFREQ config ARM_EXYNOS5250_CPUFREQ bool "SAMSUNG EXYNOS5250" - depends on SOC_EXYNOS5250 + depends on SOC_EXYNOS5250 && !ARCH_MULTIPLATFORM default y select ARM_EXYNOS_CPUFREQ help From b4ddad95020e65cfbbf9aee63d3bcdf682794ade Mon Sep 17 00:00:00 2001 From: Chen Gang Date: Mon, 7 Apr 2014 20:04:21 +0800 Subject: [PATCH 15/28] cpufreq: unicore32: fix typo issue for 'clk' MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Need use 'clk' instead of 'mclk', which is the original removed local variable. The related original commit: "652ed95 cpufreq: introduce cpufreq_generic_get() routine" The related error with allmodconfig for unicore32: CC drivers/cpufreq/unicore2-cpufreq.o drivers/cpufreq/unicore2-cpufreq.c: In function ‘ucv2_target’: drivers/cpufreq/unicore2-cpufreq.c:48: error: ‘struct cpufreq_policy’ has no member named ‘mclk’ make[2]: *** [drivers/cpufreq/unicore2-cpufreq.o] Error 1 make[1]: *** [drivers/cpufreq] Error 2 make: *** [drivers] Error 2 Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Signed-off-by: Chen Gang Acked-by: Viresh Kumar Cc: 3.14+ # 3.14+ Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/unicore2-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/unicore2-cpufreq.c b/drivers/cpufreq/unicore2-cpufreq.c index 13be802b6170..8d045afa7fb4 100644 --- a/drivers/cpufreq/unicore2-cpufreq.c +++ b/drivers/cpufreq/unicore2-cpufreq.c @@ -45,7 +45,7 @@ static int ucv2_target(struct cpufreq_policy *policy, freqs.new = target_freq; cpufreq_freq_transition_begin(policy, &freqs); - ret = clk_set_rate(policy->mclk, target_freq * 1000); + ret = clk_set_rate(policy->clk, target_freq * 1000); cpufreq_freq_transition_end(policy, &freqs, ret); return ret; From 05ed672292dc9d37db4fafdd49baa782158cd795 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Wed, 2 Apr 2014 10:14:24 +0530 Subject: [PATCH 16/28] cpufreq: loongson2_cpufreq: don't declare local variable as static Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva Signed-off-by: Viresh Kumar Cc: 3.14+ # 3.14+ Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/loongson2_cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/loongson2_cpufreq.c b/drivers/cpufreq/loongson2_cpufreq.c index a3588d61d933..f0bc31f5db27 100644 --- a/drivers/cpufreq/loongson2_cpufreq.c +++ b/drivers/cpufreq/loongson2_cpufreq.c @@ -69,7 +69,7 @@ static int loongson2_cpufreq_target(struct cpufreq_policy *policy, static int loongson2_cpufreq_cpu_init(struct cpufreq_policy *policy) { - static struct clk *cpuclk; + struct clk *cpuclk; int i; unsigned long rate; int ret; From 0ca97886fece9e1acd71ade4ca3a250945c8fc8b Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Thu, 3 Apr 2014 20:20:36 +0530 Subject: [PATCH 17/28] cpufreq: at32ap: don't declare local variable as static Earlier commit: commit 652ed95d5fa6074b3c4ea245deb0691f1acb6656 Author: Viresh Kumar Date: Thu Jan 9 20:38:43 2014 +0530 cpufreq: introduce cpufreq_generic_get() routine did some changes to driver and by mistake made cpuclk as a 'static' local variable, which wasn't actually required. Fix it. Fixes: 652ed95d5fa6 (cpufreq: introduce cpufreq_generic_get() routine) Reported-by: Alexandre Oliva Signed-off-by: Viresh Kumar Acked-by: Hans-Christian Egtvedt Cc: 3.14+ # 3.14+ Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/at32ap-cpufreq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/at32ap-cpufreq.c b/drivers/cpufreq/at32ap-cpufreq.c index a1c79f549edb..7b612c8bb09e 100644 --- a/drivers/cpufreq/at32ap-cpufreq.c +++ b/drivers/cpufreq/at32ap-cpufreq.c @@ -52,7 +52,7 @@ static int at32_set_target(struct cpufreq_policy *policy, unsigned int index) static int at32_cpufreq_driver_init(struct cpufreq_policy *policy) { unsigned int frequency, rate, min_freq; - static struct clk *cpuclk; + struct clk *cpuclk; int retval, steps, i; if (policy->cpu != 0) From b3d627a5f2bf1a9a486f65af6f7c2ce0e09b3d12 Mon Sep 17 00:00:00 2001 From: Vaidyanathan Srinivasan Date: Tue, 1 Apr 2014 12:43:26 +0530 Subject: [PATCH 18/28] cpufreq: powernv: cpufreq driver for powernv platform Backend driver to dynamically set voltage and frequency on IBM POWER non-virtualized platforms. Power management SPRs are used to set the required PState. This driver works in conjunction with cpufreq governors like 'ondemand' to provide a demand based frequency and voltage setting on IBM POWER non-virtualized platforms. PState table is obtained from OPAL v3 firmware through device tree. powernv_cpufreq back-end driver would parse the relevant device-tree nodes and initialise the cpufreq subsystem on powernv platform. The code was originally written by svaidy@linux.vnet.ibm.com. Over time it was modified to accomodate bug-fixes as well as updates to the the cpu-freq core. Relevant portions of the change logs corresponding to those modifications are noted below: * The policy->cpus needs to be populated in a hotplug-invariant manner instead of using cpu_sibling_mask() which varies with cpu-hotplug. This is because the cpufreq core code copies this content into policy->related_cpus mask which should not vary on cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com] * Create a helper routine that can return the cpu-frequency for the corresponding pstate_id. Also, cache the values of the pstate_max, pstate_min and pstate_nominal and nr_pstates in a static structure so that they can be reused in the future to perform any validations. [Authored by ego@linux.vnet.ibm.com] * Create a driver attribute named cpuinfo_nominal_freq which creates a sysfs read-only file named cpuinfo_nominal_freq. Export the frequency corresponding to the nominal_pstate through this interface. Nominal frequency is the highest non-turbo frequency for the platform. This is generally used for setting governor policies from user space for optimal energy efficiency. [Authored by ego@linux.vnet.ibm.com] * Implement a powernv_cpufreq_get(unsigned int cpu) method which will return the current operating frequency. Export this via the sysfs interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com] [Change log updated by ego@linux.vnet.ibm.com] Reviewed-by: Preeti U Murthy Signed-off-by: Vaidyanathan Srinivasan Signed-off-by: Srivatsa S. Bhat Signed-off-by: Anton Blanchard Signed-off-by: Gautham R. Shenoy Signed-off-by: Rafael J. Wysocki --- arch/powerpc/include/asm/reg.h | 4 + drivers/cpufreq/Kconfig.powerpc | 8 + drivers/cpufreq/Makefile | 1 + drivers/cpufreq/powernv-cpufreq.c | 342 ++++++++++++++++++++++++++++++ 4 files changed, 355 insertions(+) create mode 100644 drivers/cpufreq/powernv-cpufreq.c diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 1a36b8ede417..2189f8f2ca88 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -271,6 +271,10 @@ #define SPRN_HSRR1 0x13B /* Hypervisor Save/Restore 1 */ #define SPRN_IC 0x350 /* Virtual Instruction Count */ #define SPRN_VTB 0x351 /* Virtual Time Base */ +#define SPRN_PMICR 0x354 /* Power Management Idle Control Reg */ +#define SPRN_PMSR 0x355 /* Power Management Status Reg */ +#define SPRN_PMCR 0x374 /* Power Management Control Register */ + /* HFSCR and FSCR bit numbers are the same */ #define FSCR_TAR_LG 8 /* Enable Target Address Register */ #define FSCR_EBB_LG 7 /* Enable Event Based Branching */ diff --git a/drivers/cpufreq/Kconfig.powerpc b/drivers/cpufreq/Kconfig.powerpc index ca0021a96e19..72564b701b4a 100644 --- a/drivers/cpufreq/Kconfig.powerpc +++ b/drivers/cpufreq/Kconfig.powerpc @@ -54,3 +54,11 @@ config PPC_PASEMI_CPUFREQ help This adds the support for frequency switching on PA Semi PWRficient processors. + +config POWERNV_CPUFREQ + tristate "CPU frequency scaling for IBM POWERNV platform" + depends on PPC_POWERNV + default y + help + This adds support for CPU frequency switching on IBM POWERNV + platform diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index 74945652dd7a..0dbb963c1aef 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -86,6 +86,7 @@ obj-$(CONFIG_PPC_CORENET_CPUFREQ) += ppc-corenet-cpufreq.o obj-$(CONFIG_CPU_FREQ_PMAC) += pmac32-cpufreq.o obj-$(CONFIG_CPU_FREQ_PMAC64) += pmac64-cpufreq.o obj-$(CONFIG_PPC_PASEMI_CPUFREQ) += pasemi-cpufreq.o +obj-$(CONFIG_POWERNV_CPUFREQ) += powernv-cpufreq.o ################################################################################## # Other platform drivers diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c new file mode 100644 index 000000000000..e1e519703dfe --- /dev/null +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -0,0 +1,342 @@ +/* + * POWERNV cpufreq driver for the IBM POWER processors + * + * (C) Copyright IBM 2014 + * + * Author: Vaidyanathan Srinivasan + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2, or (at your option) + * any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#define pr_fmt(fmt) "powernv-cpufreq: " fmt + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#define POWERNV_MAX_PSTATES 256 + +static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1]; +static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1]; + +/* + * Note: The set of pstates consists of contiguous integers, the + * smallest of which is indicated by powernv_pstate_info.min, the + * largest of which is indicated by powernv_pstate_info.max. + * + * The nominal pstate is the highest non-turbo pstate in this + * platform. This is indicated by powernv_pstate_info.nominal. + */ +static struct powernv_pstate_info { + int min; + int max; + int nominal; + int nr_pstates; +} powernv_pstate_info; + +/* + * Initialize the freq table based on data obtained + * from the firmware passed via device-tree + */ +static int init_powernv_pstates(void) +{ + struct device_node *power_mgt; + int i, pstate_min, pstate_max, pstate_nominal, nr_pstates = 0; + const __be32 *pstate_ids, *pstate_freqs; + u32 len_ids, len_freqs; + + power_mgt = of_find_node_by_path("/ibm,opal/power-mgt"); + if (!power_mgt) { + pr_warn("power-mgt node not found\n"); + return -ENODEV; + } + + if (of_property_read_u32(power_mgt, "ibm,pstate-min", &pstate_min)) { + pr_warn("ibm,pstate-min node not found\n"); + return -ENODEV; + } + + if (of_property_read_u32(power_mgt, "ibm,pstate-max", &pstate_max)) { + pr_warn("ibm,pstate-max node not found\n"); + return -ENODEV; + } + + if (of_property_read_u32(power_mgt, "ibm,pstate-nominal", + &pstate_nominal)) { + pr_warn("ibm,pstate-nominal not found\n"); + return -ENODEV; + } + pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min, + pstate_nominal, pstate_max); + + pstate_ids = of_get_property(power_mgt, "ibm,pstate-ids", &len_ids); + if (!pstate_ids) { + pr_warn("ibm,pstate-ids not found\n"); + return -ENODEV; + } + + pstate_freqs = of_get_property(power_mgt, "ibm,pstate-frequencies-mhz", + &len_freqs); + if (!pstate_freqs) { + pr_warn("ibm,pstate-frequencies-mhz not found\n"); + return -ENODEV; + } + + WARN_ON(len_ids != len_freqs); + nr_pstates = min(len_ids, len_freqs) / sizeof(u32); + if (!nr_pstates) { + pr_warn("No PStates found\n"); + return -ENODEV; + } + + pr_debug("NR PStates %d\n", nr_pstates); + for (i = 0; i < nr_pstates; i++) { + u32 id = be32_to_cpu(pstate_ids[i]); + u32 freq = be32_to_cpu(pstate_freqs[i]); + + pr_debug("PState id %d freq %d MHz\n", id, freq); + powernv_freqs[i].frequency = freq * 1000; /* kHz */ + powernv_pstate_ids[i] = id; + } + /* End of list marker entry */ + powernv_freqs[i].frequency = CPUFREQ_TABLE_END; + + powernv_pstate_info.min = pstate_min; + powernv_pstate_info.max = pstate_max; + powernv_pstate_info.nominal = pstate_nominal; + powernv_pstate_info.nr_pstates = nr_pstates; + + return 0; +} + +/* Returns the CPU frequency corresponding to the pstate_id. */ +static unsigned int pstate_id_to_freq(int pstate_id) +{ + int i; + + i = powernv_pstate_info.max - pstate_id; + BUG_ON(i >= powernv_pstate_info.nr_pstates || i < 0); + + return powernv_freqs[i].frequency; +} + +/* + * cpuinfo_nominal_freq_show - Show the nominal CPU frequency as indicated by + * the firmware + */ +static ssize_t cpuinfo_nominal_freq_show(struct cpufreq_policy *policy, + char *buf) +{ + return sprintf(buf, "%u\n", + pstate_id_to_freq(powernv_pstate_info.nominal)); +} + +struct freq_attr cpufreq_freq_attr_cpuinfo_nominal_freq = + __ATTR_RO(cpuinfo_nominal_freq); + +static struct freq_attr *powernv_cpu_freq_attr[] = { + &cpufreq_freq_attr_scaling_available_freqs, + &cpufreq_freq_attr_cpuinfo_nominal_freq, + NULL, +}; + +/* Helper routines */ + +/* Access helpers to power mgt SPR */ + +static inline unsigned long get_pmspr(unsigned long sprn) +{ + switch (sprn) { + case SPRN_PMCR: + return mfspr(SPRN_PMCR); + + case SPRN_PMICR: + return mfspr(SPRN_PMICR); + + case SPRN_PMSR: + return mfspr(SPRN_PMSR); + } + BUG(); +} + +static inline void set_pmspr(unsigned long sprn, unsigned long val) +{ + switch (sprn) { + case SPRN_PMCR: + mtspr(SPRN_PMCR, val); + return; + + case SPRN_PMICR: + mtspr(SPRN_PMICR, val); + return; + } + BUG(); +} + +/* + * Use objects of this type to query/update + * pstates on a remote CPU via smp_call_function. + */ +struct powernv_smp_call_data { + unsigned int freq; + int pstate_id; +}; + +/* + * powernv_read_cpu_freq: Reads the current frequency on this CPU. + * + * Called via smp_call_function. + * + * Note: The caller of the smp_call_function should pass an argument of + * the type 'struct powernv_smp_call_data *' along with this function. + * + * The current frequency on this CPU will be returned via + * ((struct powernv_smp_call_data *)arg)->freq; + */ +static void powernv_read_cpu_freq(void *arg) +{ + unsigned long pmspr_val; + s8 local_pstate_id; + struct powernv_smp_call_data *freq_data = arg; + + pmspr_val = get_pmspr(SPRN_PMSR); + + /* + * The local pstate id corresponds bits 48..55 in the PMSR. + * Note: Watch out for the sign! + */ + local_pstate_id = (pmspr_val >> 48) & 0xFF; + freq_data->pstate_id = local_pstate_id; + freq_data->freq = pstate_id_to_freq(freq_data->pstate_id); + + pr_debug("cpu %d pmsr %016lX pstate_id %d frequency %d kHz\n", + raw_smp_processor_id(), pmspr_val, freq_data->pstate_id, + freq_data->freq); +} + +/* + * powernv_cpufreq_get: Returns the CPU frequency as reported by the + * firmware for CPU 'cpu'. This value is reported through the sysfs + * file cpuinfo_cur_freq. + */ +unsigned int powernv_cpufreq_get(unsigned int cpu) +{ + struct powernv_smp_call_data freq_data; + + smp_call_function_any(cpu_sibling_mask(cpu), powernv_read_cpu_freq, + &freq_data, 1); + + return freq_data.freq; +} + +/* + * set_pstate: Sets the pstate on this CPU. + * + * This is called via an smp_call_function. + * + * The caller must ensure that freq_data is of the type + * (struct powernv_smp_call_data *) and the pstate_id which needs to be set + * on this CPU should be present in freq_data->pstate_id. + */ +static void set_pstate(void *freq_data) +{ + unsigned long val; + unsigned long pstate_ul = + ((struct powernv_smp_call_data *) freq_data)->pstate_id; + + val = get_pmspr(SPRN_PMCR); + val = val & 0x0000FFFFFFFFFFFFULL; + + pstate_ul = pstate_ul & 0xFF; + + /* Set both global(bits 56..63) and local(bits 48..55) PStates */ + val = val | (pstate_ul << 56) | (pstate_ul << 48); + + pr_debug("Setting cpu %d pmcr to %016lX\n", + raw_smp_processor_id(), val); + set_pmspr(SPRN_PMCR, val); +} + +/* + * powernv_cpufreq_target_index: Sets the frequency corresponding to + * the cpufreq table entry indexed by new_index on the cpus in the + * mask policy->cpus + */ +static int powernv_cpufreq_target_index(struct cpufreq_policy *policy, + unsigned int new_index) +{ + struct powernv_smp_call_data freq_data; + + freq_data.pstate_id = powernv_pstate_ids[new_index]; + + /* + * Use smp_call_function to send IPI and execute the + * mtspr on target CPU. We could do that without IPI + * if current CPU is within policy->cpus (core) + */ + smp_call_function_any(policy->cpus, set_pstate, &freq_data, 1); + + return 0; +} + +static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy) +{ + int base, i; + + base = cpu_first_thread_sibling(policy->cpu); + + for (i = 0; i < threads_per_core; i++) + cpumask_set_cpu(base + i, policy->cpus); + + return cpufreq_table_validate_and_show(policy, powernv_freqs); +} + +static struct cpufreq_driver powernv_cpufreq_driver = { + .name = "powernv-cpufreq", + .flags = CPUFREQ_CONST_LOOPS, + .init = powernv_cpufreq_cpu_init, + .verify = cpufreq_generic_frequency_table_verify, + .target_index = powernv_cpufreq_target_index, + .get = powernv_cpufreq_get, + .attr = powernv_cpu_freq_attr, +}; + +static int __init powernv_cpufreq_init(void) +{ + int rc = 0; + + /* Discover pstates from device tree and init */ + rc = init_powernv_pstates(); + if (rc) { + pr_info("powernv-cpufreq disabled. System does not support PState control\n"); + return rc; + } + + return cpufreq_register_driver(&powernv_cpufreq_driver); +} +module_init(powernv_cpufreq_init); + +static void __exit powernv_cpufreq_exit(void) +{ + cpufreq_unregister_driver(&powernv_cpufreq_driver); +} +module_exit(powernv_cpufreq_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Vaidyanathan Srinivasan "); From 0692c69138355fdbf32ecf70a2cde9c1fc3d7bb2 Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Tue, 1 Apr 2014 12:43:27 +0530 Subject: [PATCH 19/28] cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids The .driver_data field in the cpufreq_frequency_table was supposed to be private to the drivers. However at some later point, it was being used to indicate if the particular frequency in the table is the BOOST_FREQUENCY. After patches [1] and [2], the .driver_data is once again private to the driver. Thus we can safely use cpufreq_frequency_table.driver_data to store pstate_ids instead of having to maintain a separate array powernv_pstate_ids[] for this purpose. [1]: Subject: cpufreq: don't print value of .driver_data from core From : Viresh Kumar url : http://marc.info/?l=linux-pm&m=139601421504709&w=2 [2]: Subject: cpufreq: create another field .flags in cpufreq_frequency_table From : Viresh Kumar url : http://marc.info/?l=linux-pm&m=139601416804702&w=2 Signed-off-by: Gautham R. Shenoy Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/powernv-cpufreq.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c index e1e519703dfe..9edccc63245d 100644 --- a/drivers/cpufreq/powernv-cpufreq.c +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -33,7 +33,6 @@ #define POWERNV_MAX_PSTATES 256 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1]; -static int powernv_pstate_ids[POWERNV_MAX_PSTATES+1]; /* * Note: The set of pstates consists of contiguous integers, the @@ -112,7 +111,7 @@ static int init_powernv_pstates(void) pr_debug("PState id %d freq %d MHz\n", id, freq); powernv_freqs[i].frequency = freq * 1000; /* kHz */ - powernv_pstate_ids[i] = id; + powernv_freqs[i].driver_data = id; } /* End of list marker entry */ powernv_freqs[i].frequency = CPUFREQ_TABLE_END; @@ -283,7 +282,7 @@ static int powernv_cpufreq_target_index(struct cpufreq_policy *policy, { struct powernv_smp_call_data freq_data; - freq_data.pstate_id = powernv_pstate_ids[new_index]; + freq_data.pstate_id = powernv_freqs[new_index].driver_data; /* * Use smp_call_function to send IPI and execute the From 81f359027a3a383cc1dc79499ce97f1074861e5b Mon Sep 17 00:00:00 2001 From: "Gautham R. Shenoy" Date: Tue, 1 Apr 2014 12:43:28 +0530 Subject: [PATCH 20/28] cpufreq: powernv: Select CPUFreq related Kconfig options for powernv Enable CPUFreq for PowerNV. Select "performance", "powersave", "userspace" and "ondemand" governors. Choose "ondemand" to be the default governor. Signed-off-by: Srivatsa S. Bhat Signed-off-by: Gautham R. Shenoy Signed-off-by: Rafael J. Wysocki --- arch/powerpc/configs/pseries_defconfig | 1 + arch/powerpc/configs/pseries_le_defconfig | 1 + arch/powerpc/platforms/powernv/Kconfig | 6 ++++++ 3 files changed, 8 insertions(+) diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig index 9ea8342bd219..a905063281cc 100644 --- a/arch/powerpc/configs/pseries_defconfig +++ b/arch/powerpc/configs/pseries_defconfig @@ -306,3 +306,4 @@ CONFIG_KVM_BOOK3S_64=m CONFIG_KVM_BOOK3S_64_HV=y CONFIG_TRANSPARENT_HUGEPAGE=y CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/powerpc/configs/pseries_le_defconfig b/arch/powerpc/configs/pseries_le_defconfig index 3c84f9d87980..58e3dbf43ca4 100644 --- a/arch/powerpc/configs/pseries_le_defconfig +++ b/arch/powerpc/configs/pseries_le_defconfig @@ -301,3 +301,4 @@ CONFIG_CRYPTO_LZO=m # CONFIG_CRYPTO_ANSI_CPRNG is not set CONFIG_CRYPTO_DEV_NX=y CONFIG_CRYPTO_DEV_NX_ENCRYPT=m +CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig index 895e8a20a3fc..c252ee95bddf 100644 --- a/arch/powerpc/platforms/powernv/Kconfig +++ b/arch/powerpc/platforms/powernv/Kconfig @@ -11,6 +11,12 @@ config PPC_POWERNV select PPC_UDBG_16550 select PPC_SCOM select ARCH_RANDOM + select CPU_FREQ + select CPU_FREQ_GOV_PERFORMANCE + select CPU_FREQ_GOV_POWERSAVE + select CPU_FREQ_GOV_USERSPACE + select CPU_FREQ_GOV_ONDEMAND + select CPU_FREQ_GOV_CONSERVATIVE default y config PPC_POWERNV_RTAS From 1ea7d77b099434916756ac0920a0af6e9a408251 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 28 Mar 2014 19:11:44 +0530 Subject: [PATCH 21/28] cpufreq: ia64: don't set .driver_data to index .driver_data field is only required to be filled if drivers want to preserve some data in there which they can use according to the value of .frequency. But this driver isn't using this field at all, but just setting it equal to the index value. Which isn't required. Fix it. Signed-off-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/ia64-acpi-cpufreq.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c index a22b5d182e0e..beb191b589d4 100644 --- a/drivers/cpufreq/ia64-acpi-cpufreq.c +++ b/drivers/cpufreq/ia64-acpi-cpufreq.c @@ -275,7 +275,6 @@ acpi_cpufreq_cpu_init ( /* table init */ for (i = 0; i <= data->acpi_data.state_count; i++) { - data->freq_table[i].driver_data = i; if (i < data->acpi_data.state_count) { data->freq_table[i].frequency = data->acpi_data.states[i].core_frequency * 1000; From ae87f10f35f75deb8f74dbd92d062200932c2f26 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 28 Mar 2014 19:11:45 +0530 Subject: [PATCH 22/28] cpufreq: don't print value of .driver_data from core CPUFreq core doesn't control value of .driver_data and this field is completely driver specific. This can contain any value and not only indexes. For most of the drivers, which aren't using this field, its value is zero. So, printing this from core doesn't make any sense. Don't print it. Signed-off-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/freq_table.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c index 65a477075b3f..53ff3c209606 100644 --- a/drivers/cpufreq/freq_table.c +++ b/drivers/cpufreq/freq_table.c @@ -36,8 +36,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, && table[i].driver_data == CPUFREQ_BOOST_FREQ) continue; - pr_debug("table entry %u: %u kHz, %u driver_data\n", - i, freq, table[i].driver_data); + pr_debug("table entry %u: %u kHz\n", i, freq); if (freq < min_freq) min_freq = freq; if (freq > max_freq) @@ -175,8 +174,8 @@ int cpufreq_frequency_table_target(struct cpufreq_policy *policy, } else *index = optimal.driver_data; - pr_debug("target is %u (%u kHz, %u)\n", *index, table[*index].frequency, - table[*index].driver_data); + pr_debug("target index is %u, freq is:%u kHz\n", *index, + table[*index].frequency); return 0; } From 71508a1f4f2286eea728a5994f1fb14b77340b47 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 28 Mar 2014 19:11:46 +0530 Subject: [PATCH 23/28] cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table Few drivers are using kmalloc() to allocate memory for frequency tables and since we will have an additional field '.flags' in 'struct cpufreq_frequency_table', these might become unstable. Better get these fixed by replacing kmalloc() by kzalloc() instead. Along with that we also remove use of .driver_data from SPEAr driver as it doesn't use it at all. Also, writing zero to .driver_data is not required for powernow-k8 as it is already zero. Reported-and-reviewed-by: Gautham R. Shenoy Signed-off-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/acpi-cpufreq.c | 2 +- drivers/cpufreq/ia64-acpi-cpufreq.c | 2 +- drivers/cpufreq/longhaul.c | 2 +- drivers/cpufreq/powernow-k8.c | 5 ++--- drivers/cpufreq/s3c24xx-cpufreq.c | 4 ++-- drivers/cpufreq/spear-cpufreq.c | 7 ++----- 6 files changed, 9 insertions(+), 13 deletions(-) diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c index 822ca03a87f7..c91ef5785dfa 100644 --- a/drivers/cpufreq/acpi-cpufreq.c +++ b/drivers/cpufreq/acpi-cpufreq.c @@ -754,7 +754,7 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy) goto err_unreg; } - data->freq_table = kmalloc(sizeof(*data->freq_table) * + data->freq_table = kzalloc(sizeof(*data->freq_table) * (perf->state_count+1), GFP_KERNEL); if (!data->freq_table) { result = -ENOMEM; diff --git a/drivers/cpufreq/ia64-acpi-cpufreq.c b/drivers/cpufreq/ia64-acpi-cpufreq.c index beb191b589d4..c30aaa6a54e8 100644 --- a/drivers/cpufreq/ia64-acpi-cpufreq.c +++ b/drivers/cpufreq/ia64-acpi-cpufreq.c @@ -254,7 +254,7 @@ acpi_cpufreq_cpu_init ( } /* alloc freq_table */ - data->freq_table = kmalloc(sizeof(*data->freq_table) * + data->freq_table = kzalloc(sizeof(*data->freq_table) * (data->acpi_data.state_count + 1), GFP_KERNEL); if (!data->freq_table) { diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c index 5c440f87ba8a..d00e5d1abd25 100644 --- a/drivers/cpufreq/longhaul.c +++ b/drivers/cpufreq/longhaul.c @@ -475,7 +475,7 @@ static int longhaul_get_ranges(void) return -EINVAL; } - longhaul_table = kmalloc((numscales + 1) * sizeof(*longhaul_table), + longhaul_table = kzalloc((numscales + 1) * sizeof(*longhaul_table), GFP_KERNEL); if (!longhaul_table) return -ENOMEM; diff --git a/drivers/cpufreq/powernow-k8.c b/drivers/cpufreq/powernow-k8.c index 770a9e1b3468..1b6ae6b57c11 100644 --- a/drivers/cpufreq/powernow-k8.c +++ b/drivers/cpufreq/powernow-k8.c @@ -623,7 +623,7 @@ static int fill_powernow_table(struct powernow_k8_data *data, if (check_pst_table(data, pst, maxvid)) return -EINVAL; - powernow_table = kmalloc((sizeof(*powernow_table) + powernow_table = kzalloc((sizeof(*powernow_table) * (data->numps + 1)), GFP_KERNEL); if (!powernow_table) { printk(KERN_ERR PFX "powernow_table memory alloc failure\n"); @@ -793,7 +793,7 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data) } /* fill in data->powernow_table */ - powernow_table = kmalloc((sizeof(*powernow_table) + powernow_table = kzalloc((sizeof(*powernow_table) * (data->acpi_data.state_count + 1)), GFP_KERNEL); if (!powernow_table) { pr_debug("powernow_table memory alloc failure\n"); @@ -810,7 +810,6 @@ static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data) powernow_table[data->acpi_data.state_count].frequency = CPUFREQ_TABLE_END; - powernow_table[data->acpi_data.state_count].driver_data = 0; data->powernow_table = powernow_table; if (cpumask_first(cpu_core_mask(data->cpu)) == data->cpu) diff --git a/drivers/cpufreq/s3c24xx-cpufreq.c b/drivers/cpufreq/s3c24xx-cpufreq.c index a3dc192d21f9..be1b2b5c9753 100644 --- a/drivers/cpufreq/s3c24xx-cpufreq.c +++ b/drivers/cpufreq/s3c24xx-cpufreq.c @@ -586,7 +586,7 @@ static int s3c_cpufreq_build_freq(void) size = cpu_cur.info->calc_freqtable(&cpu_cur, NULL, 0); size++; - ftab = kmalloc(sizeof(*ftab) * size, GFP_KERNEL); + ftab = kzalloc(sizeof(*ftab) * size, GFP_KERNEL); if (!ftab) { printk(KERN_ERR "%s: no memory for tables\n", __func__); return -ENOMEM; @@ -664,7 +664,7 @@ int __init s3c_plltab_register(struct cpufreq_frequency_table *plls, size = sizeof(*vals) * (plls_no + 1); - vals = kmalloc(size, GFP_KERNEL); + vals = kzalloc(size, GFP_KERNEL); if (vals) { memcpy(vals, plls, size); pll_reg = vals; diff --git a/drivers/cpufreq/spear-cpufreq.c b/drivers/cpufreq/spear-cpufreq.c index 4cfdcff8a310..38678396636d 100644 --- a/drivers/cpufreq/spear-cpufreq.c +++ b/drivers/cpufreq/spear-cpufreq.c @@ -195,18 +195,15 @@ static int spear_cpufreq_probe(struct platform_device *pdev) cnt = prop->length / sizeof(u32); val = prop->value; - freq_tbl = kmalloc(sizeof(*freq_tbl) * (cnt + 1), GFP_KERNEL); + freq_tbl = kzalloc(sizeof(*freq_tbl) * (cnt + 1), GFP_KERNEL); if (!freq_tbl) { ret = -ENOMEM; goto out_put_node; } - for (i = 0; i < cnt; i++) { - freq_tbl[i].driver_data = i; + for (i = 0; i < cnt; i++) freq_tbl[i].frequency = be32_to_cpup(val++); - } - freq_tbl[i].driver_data = i; freq_tbl[i].frequency = CPUFREQ_TABLE_END; spear_cpufreq.freq_tbl = freq_tbl; From 7f4b04614a273089ad65654f53771c033fadc65e Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 28 Mar 2014 19:11:47 +0530 Subject: [PATCH 24/28] cpufreq: create another field .flags in cpufreq_frequency_table Currently cpufreq frequency table has two fields: frequency and driver_data. driver_data is only for drivers' internal use and cpufreq core shouldn't use it at all. But with the introduction of BOOST frequencies, this assumption was broken and we started using it as a flag instead. There are two problems due to this: - It is against the description of this field, as driver's data is used by the core now. - if drivers fill it with -3 for any frequency, then those frequencies are never considered by cpufreq core as it is exactly same as value of CPUFREQ_BOOST_FREQ, i.e. ~2. The best way to get this fixed is by creating another field flags which will be used for such flags. This patch does that. Along with that various drivers need modifications due to the change of struct cpufreq_frequency_table. Reviewed-by: Gautham R Shenoy Signed-off-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- arch/mips/loongson/lemote-2f/clock.c | 20 +++++++-------- drivers/cpufreq/cris-artpec3-cpufreq.c | 6 ++--- drivers/cpufreq/cris-etraxfs-cpufreq.c | 6 ++--- drivers/cpufreq/elanfreq.c | 18 +++++++------- drivers/cpufreq/exynos4210-cpufreq.c | 12 ++++----- drivers/cpufreq/exynos4x12-cpufreq.c | 30 +++++++++++------------ drivers/cpufreq/exynos5250-cpufreq.c | 34 +++++++++++++------------- drivers/cpufreq/freq_table.c | 4 +-- drivers/cpufreq/kirkwood-cpufreq.c | 6 ++--- drivers/cpufreq/maple-cpufreq.c | 6 ++--- drivers/cpufreq/p4-clockmod.c | 20 +++++++-------- drivers/cpufreq/pasemi-cpufreq.c | 12 ++++----- drivers/cpufreq/pmac32-cpufreq.c | 6 ++--- drivers/cpufreq/pmac64-cpufreq.c | 6 ++--- drivers/cpufreq/powernow-k6.c | 18 +++++++------- drivers/cpufreq/ppc_cbe_cpufreq.c | 18 +++++++------- drivers/cpufreq/s3c2416-cpufreq.c | 20 +++++++-------- drivers/cpufreq/s3c64xx-cpufreq.c | 26 ++++++++++---------- drivers/cpufreq/s5pv210-cpufreq.c | 12 ++++----- drivers/cpufreq/sc520_freq.c | 6 ++--- drivers/cpufreq/speedstep-ich.c | 6 ++--- drivers/cpufreq/speedstep-smi.c | 6 ++--- include/linux/cpufreq.h | 9 ++++--- 23 files changed, 155 insertions(+), 152 deletions(-) diff --git a/arch/mips/loongson/lemote-2f/clock.c b/arch/mips/loongson/lemote-2f/clock.c index aed32b88576c..e1f427f4f5f3 100644 --- a/arch/mips/loongson/lemote-2f/clock.c +++ b/arch/mips/loongson/lemote-2f/clock.c @@ -28,16 +28,16 @@ enum { }; struct cpufreq_frequency_table loongson2_clockmod_table[] = { - {DC_RESV, CPUFREQ_ENTRY_INVALID}, - {DC_ZERO, CPUFREQ_ENTRY_INVALID}, - {DC_25PT, 0}, - {DC_37PT, 0}, - {DC_50PT, 0}, - {DC_62PT, 0}, - {DC_75PT, 0}, - {DC_87PT, 0}, - {DC_DISABLE, 0}, - {DC_RESV, CPUFREQ_TABLE_END}, + {0, DC_RESV, CPUFREQ_ENTRY_INVALID}, + {0, DC_ZERO, CPUFREQ_ENTRY_INVALID}, + {0, DC_25PT, 0}, + {0, DC_37PT, 0}, + {0, DC_50PT, 0}, + {0, DC_62PT, 0}, + {0, DC_75PT, 0}, + {0, DC_87PT, 0}, + {0, DC_DISABLE, 0}, + {0, DC_RESV, CPUFREQ_TABLE_END}, }; EXPORT_SYMBOL_GPL(loongson2_clockmod_table); diff --git a/drivers/cpufreq/cris-artpec3-cpufreq.c b/drivers/cpufreq/cris-artpec3-cpufreq.c index d4573032cbbc..601b88c490cf 100644 --- a/drivers/cpufreq/cris-artpec3-cpufreq.c +++ b/drivers/cpufreq/cris-artpec3-cpufreq.c @@ -15,9 +15,9 @@ static struct notifier_block cris_sdram_freq_notifier_block = { }; static struct cpufreq_frequency_table cris_freq_table[] = { - {0x01, 6000}, - {0x02, 200000}, - {0, CPUFREQ_TABLE_END}, + {0, 0x01, 6000}, + {0, 0x02, 200000}, + {0, 0, CPUFREQ_TABLE_END}, }; static unsigned int cris_freq_get_cpu_frequency(unsigned int cpu) diff --git a/drivers/cpufreq/cris-etraxfs-cpufreq.c b/drivers/cpufreq/cris-etraxfs-cpufreq.c index 13c3361437f7..22b2cdde74d9 100644 --- a/drivers/cpufreq/cris-etraxfs-cpufreq.c +++ b/drivers/cpufreq/cris-etraxfs-cpufreq.c @@ -15,9 +15,9 @@ static struct notifier_block cris_sdram_freq_notifier_block = { }; static struct cpufreq_frequency_table cris_freq_table[] = { - {0x01, 6000}, - {0x02, 200000}, - {0, CPUFREQ_TABLE_END}, + {0, 0x01, 6000}, + {0, 0x02, 200000}, + {0, 0, CPUFREQ_TABLE_END}, }; static unsigned int cris_freq_get_cpu_frequency(unsigned int cpu) diff --git a/drivers/cpufreq/elanfreq.c b/drivers/cpufreq/elanfreq.c index c987e94708f5..7f5d2a68c353 100644 --- a/drivers/cpufreq/elanfreq.c +++ b/drivers/cpufreq/elanfreq.c @@ -56,15 +56,15 @@ static struct s_elan_multiplier elan_multiplier[] = { }; static struct cpufreq_frequency_table elanfreq_table[] = { - {0, 1000}, - {1, 2000}, - {2, 4000}, - {3, 8000}, - {4, 16000}, - {5, 33000}, - {6, 66000}, - {7, 99000}, - {0, CPUFREQ_TABLE_END}, + {0, 0, 1000}, + {0, 1, 2000}, + {0, 2, 4000}, + {0, 3, 8000}, + {0, 4, 16000}, + {0, 5, 33000}, + {0, 6, 66000}, + {0, 7, 99000}, + {0, 0, CPUFREQ_TABLE_END}, }; diff --git a/drivers/cpufreq/exynos4210-cpufreq.c b/drivers/cpufreq/exynos4210-cpufreq.c index 40d84c43d8f4..6384e5b9a347 100644 --- a/drivers/cpufreq/exynos4210-cpufreq.c +++ b/drivers/cpufreq/exynos4210-cpufreq.c @@ -29,12 +29,12 @@ static unsigned int exynos4210_volt_table[] = { }; static struct cpufreq_frequency_table exynos4210_freq_table[] = { - {L0, 1200 * 1000}, - {L1, 1000 * 1000}, - {L2, 800 * 1000}, - {L3, 500 * 1000}, - {L4, 200 * 1000}, - {0, CPUFREQ_TABLE_END}, + {0, L0, 1200 * 1000}, + {0, L1, 1000 * 1000}, + {0, L2, 800 * 1000}, + {0, L3, 500 * 1000}, + {0, L4, 200 * 1000}, + {0, 0, CPUFREQ_TABLE_END}, }; static struct apll_freq apll_freq_4210[] = { diff --git a/drivers/cpufreq/exynos4x12-cpufreq.c b/drivers/cpufreq/exynos4x12-cpufreq.c index 7c11ace3b3fc..466c76ad335b 100644 --- a/drivers/cpufreq/exynos4x12-cpufreq.c +++ b/drivers/cpufreq/exynos4x12-cpufreq.c @@ -30,21 +30,21 @@ static unsigned int exynos4x12_volt_table[] = { }; static struct cpufreq_frequency_table exynos4x12_freq_table[] = { - {CPUFREQ_BOOST_FREQ, 1500 * 1000}, - {L1, 1400 * 1000}, - {L2, 1300 * 1000}, - {L3, 1200 * 1000}, - {L4, 1100 * 1000}, - {L5, 1000 * 1000}, - {L6, 900 * 1000}, - {L7, 800 * 1000}, - {L8, 700 * 1000}, - {L9, 600 * 1000}, - {L10, 500 * 1000}, - {L11, 400 * 1000}, - {L12, 300 * 1000}, - {L13, 200 * 1000}, - {0, CPUFREQ_TABLE_END}, + {CPUFREQ_BOOST_FREQ, L0, 1500 * 1000}, + {0, L1, 1400 * 1000}, + {0, L2, 1300 * 1000}, + {0, L3, 1200 * 1000}, + {0, L4, 1100 * 1000}, + {0, L5, 1000 * 1000}, + {0, L6, 900 * 1000}, + {0, L7, 800 * 1000}, + {0, L8, 700 * 1000}, + {0, L9, 600 * 1000}, + {0, L10, 500 * 1000}, + {0, L11, 400 * 1000}, + {0, L12, 300 * 1000}, + {0, L13, 200 * 1000}, + {0, 0, CPUFREQ_TABLE_END}, }; static struct apll_freq *apll_freq_4x12; diff --git a/drivers/cpufreq/exynos5250-cpufreq.c b/drivers/cpufreq/exynos5250-cpufreq.c index 5f90b82a4082..363a0b3fe1b1 100644 --- a/drivers/cpufreq/exynos5250-cpufreq.c +++ b/drivers/cpufreq/exynos5250-cpufreq.c @@ -34,23 +34,23 @@ static unsigned int exynos5250_volt_table[] = { }; static struct cpufreq_frequency_table exynos5250_freq_table[] = { - {L0, 1700 * 1000}, - {L1, 1600 * 1000}, - {L2, 1500 * 1000}, - {L3, 1400 * 1000}, - {L4, 1300 * 1000}, - {L5, 1200 * 1000}, - {L6, 1100 * 1000}, - {L7, 1000 * 1000}, - {L8, 900 * 1000}, - {L9, 800 * 1000}, - {L10, 700 * 1000}, - {L11, 600 * 1000}, - {L12, 500 * 1000}, - {L13, 400 * 1000}, - {L14, 300 * 1000}, - {L15, 200 * 1000}, - {0, CPUFREQ_TABLE_END}, + {0, L0, 1700 * 1000}, + {0, L1, 1600 * 1000}, + {0, L2, 1500 * 1000}, + {0, L3, 1400 * 1000}, + {0, L4, 1300 * 1000}, + {0, L5, 1200 * 1000}, + {0, L6, 1100 * 1000}, + {0, L7, 1000 * 1000}, + {0, L8, 900 * 1000}, + {0, L9, 800 * 1000}, + {0, L10, 700 * 1000}, + {0, L11, 600 * 1000}, + {0, L12, 500 * 1000}, + {0, L13, 400 * 1000}, + {0, L14, 300 * 1000}, + {0, L15, 200 * 1000}, + {0, 0, CPUFREQ_TABLE_END}, }; static struct apll_freq apll_freq_5250[] = { diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c index 53ff3c209606..08e7bbcf6d73 100644 --- a/drivers/cpufreq/freq_table.c +++ b/drivers/cpufreq/freq_table.c @@ -33,7 +33,7 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, continue; } if (!cpufreq_boost_enabled() - && table[i].driver_data == CPUFREQ_BOOST_FREQ) + && (table[i].flags & CPUFREQ_BOOST_FREQ)) continue; pr_debug("table entry %u: %u kHz\n", i, freq); @@ -229,7 +229,7 @@ static ssize_t show_available_freqs(struct cpufreq_policy *policy, char *buf, * show_boost = false and driver_data != BOOST freq * display NON BOOST freqs */ - if (show_boost ^ (table[i].driver_data == CPUFREQ_BOOST_FREQ)) + if (show_boost ^ (table[i].flags & CPUFREQ_BOOST_FREQ)) continue; count += sprintf(&buf[count], "%d ", table[i].frequency); diff --git a/drivers/cpufreq/kirkwood-cpufreq.c b/drivers/cpufreq/kirkwood-cpufreq.c index 3d114bc5a97a..37a480680cd0 100644 --- a/drivers/cpufreq/kirkwood-cpufreq.c +++ b/drivers/cpufreq/kirkwood-cpufreq.c @@ -43,9 +43,9 @@ static struct priv * table. */ static struct cpufreq_frequency_table kirkwood_freq_table[] = { - {STATE_CPU_FREQ, 0}, /* CPU uses cpuclk */ - {STATE_DDR_FREQ, 0}, /* CPU uses ddrclk */ - {0, CPUFREQ_TABLE_END}, + {0, STATE_CPU_FREQ, 0}, /* CPU uses cpuclk */ + {0, STATE_DDR_FREQ, 0}, /* CPU uses ddrclk */ + {0, 0, CPUFREQ_TABLE_END}, }; static unsigned int kirkwood_cpufreq_get_cpu_frequency(unsigned int cpu) diff --git a/drivers/cpufreq/maple-cpufreq.c b/drivers/cpufreq/maple-cpufreq.c index c4dfa42a75ac..cc3408fc073f 100644 --- a/drivers/cpufreq/maple-cpufreq.c +++ b/drivers/cpufreq/maple-cpufreq.c @@ -59,9 +59,9 @@ #define CPUFREQ_LOW 1 static struct cpufreq_frequency_table maple_cpu_freqs[] = { - {CPUFREQ_HIGH, 0}, - {CPUFREQ_LOW, 0}, - {0, CPUFREQ_TABLE_END}, + {0, CPUFREQ_HIGH, 0}, + {0, CPUFREQ_LOW, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; /* Power mode data is an array of the 32 bits PCR values to use for diff --git a/drivers/cpufreq/p4-clockmod.c b/drivers/cpufreq/p4-clockmod.c index 74f593e70e19..529cfd92158f 100644 --- a/drivers/cpufreq/p4-clockmod.c +++ b/drivers/cpufreq/p4-clockmod.c @@ -92,16 +92,16 @@ static int cpufreq_p4_setdc(unsigned int cpu, unsigned int newstate) static struct cpufreq_frequency_table p4clockmod_table[] = { - {DC_RESV, CPUFREQ_ENTRY_INVALID}, - {DC_DFLT, 0}, - {DC_25PT, 0}, - {DC_38PT, 0}, - {DC_50PT, 0}, - {DC_64PT, 0}, - {DC_75PT, 0}, - {DC_88PT, 0}, - {DC_DISABLE, 0}, - {DC_RESV, CPUFREQ_TABLE_END}, + {0, DC_RESV, CPUFREQ_ENTRY_INVALID}, + {0, DC_DFLT, 0}, + {0, DC_25PT, 0}, + {0, DC_38PT, 0}, + {0, DC_50PT, 0}, + {0, DC_64PT, 0}, + {0, DC_75PT, 0}, + {0, DC_88PT, 0}, + {0, DC_DISABLE, 0}, + {0, DC_RESV, CPUFREQ_TABLE_END}, }; diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6a2b7d3e85a7..84c84b5f0f3a 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -60,12 +60,12 @@ static int current_astate; /* We support 5(A0-A4) power states excluding turbo(A5-A6) modes */ static struct cpufreq_frequency_table pas_freqs[] = { - {0, 0}, - {1, 0}, - {2, 0}, - {3, 0}, - {4, 0}, - {0, CPUFREQ_TABLE_END}, + {0, 0, 0}, + {0, 1, 0}, + {0, 2, 0}, + {0, 3, 0}, + {0, 4, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; /* diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c index cf55d202f332..7615180d7ee3 100644 --- a/drivers/cpufreq/pmac32-cpufreq.c +++ b/drivers/cpufreq/pmac32-cpufreq.c @@ -81,9 +81,9 @@ static int is_pmu_based; #define CPUFREQ_LOW 1 static struct cpufreq_frequency_table pmac_cpu_freqs[] = { - {CPUFREQ_HIGH, 0}, - {CPUFREQ_LOW, 0}, - {0, CPUFREQ_TABLE_END}, + {0, CPUFREQ_HIGH, 0}, + {0, CPUFREQ_LOW, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; static inline void local_delay(unsigned long ms) diff --git a/drivers/cpufreq/pmac64-cpufreq.c b/drivers/cpufreq/pmac64-cpufreq.c index 6a338f8c3860..8bc422977b5b 100644 --- a/drivers/cpufreq/pmac64-cpufreq.c +++ b/drivers/cpufreq/pmac64-cpufreq.c @@ -65,9 +65,9 @@ #define CPUFREQ_LOW 1 static struct cpufreq_frequency_table g5_cpu_freqs[] = { - {CPUFREQ_HIGH, 0}, - {CPUFREQ_LOW, 0}, - {0, CPUFREQ_TABLE_END}, + {0, CPUFREQ_HIGH, 0}, + {0, CPUFREQ_LOW, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; /* Power mode data is an array of the 32 bits PCR values to use for diff --git a/drivers/cpufreq/powernow-k6.c b/drivers/cpufreq/powernow-k6.c index 62c6f2e5afce..49f120e1bc7b 100644 --- a/drivers/cpufreq/powernow-k6.c +++ b/drivers/cpufreq/powernow-k6.c @@ -37,15 +37,15 @@ MODULE_PARM_DESC(bus_frequency, "Bus frequency in kHz"); /* Clock ratio multiplied by 10 - see table 27 in AMD#23446 */ static struct cpufreq_frequency_table clock_ratio[] = { - {60, /* 110 -> 6.0x */ 0}, - {55, /* 011 -> 5.5x */ 0}, - {50, /* 001 -> 5.0x */ 0}, - {45, /* 000 -> 4.5x */ 0}, - {40, /* 010 -> 4.0x */ 0}, - {35, /* 111 -> 3.5x */ 0}, - {30, /* 101 -> 3.0x */ 0}, - {20, /* 100 -> 2.0x */ 0}, - {0, CPUFREQ_TABLE_END} + {0, 60, /* 110 -> 6.0x */ 0}, + {0, 55, /* 011 -> 5.5x */ 0}, + {0, 50, /* 001 -> 5.0x */ 0}, + {0, 45, /* 000 -> 4.5x */ 0}, + {0, 40, /* 010 -> 4.0x */ 0}, + {0, 35, /* 111 -> 3.5x */ 0}, + {0, 30, /* 101 -> 3.0x */ 0}, + {0, 20, /* 100 -> 2.0x */ 0}, + {0, 0, CPUFREQ_TABLE_END} }; static const u8 index_to_register[8] = { 6, 3, 1, 0, 2, 7, 5, 4 }; diff --git a/drivers/cpufreq/ppc_cbe_cpufreq.c b/drivers/cpufreq/ppc_cbe_cpufreq.c index af7b1cabd1e7..5be8a48dba74 100644 --- a/drivers/cpufreq/ppc_cbe_cpufreq.c +++ b/drivers/cpufreq/ppc_cbe_cpufreq.c @@ -32,15 +32,15 @@ /* the CBE supports an 8 step frequency scaling */ static struct cpufreq_frequency_table cbe_freqs[] = { - {1, 0}, - {2, 0}, - {3, 0}, - {4, 0}, - {5, 0}, - {6, 0}, - {8, 0}, - {10, 0}, - {0, CPUFREQ_TABLE_END}, + {0, 1, 0}, + {0, 2, 0}, + {0, 3, 0}, + {0, 4, 0}, + {0, 5, 0}, + {0, 6, 0}, + {0, 8, 0}, + {0, 10, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; /* diff --git a/drivers/cpufreq/s3c2416-cpufreq.c b/drivers/cpufreq/s3c2416-cpufreq.c index 826b8be23099..4626f90559b5 100644 --- a/drivers/cpufreq/s3c2416-cpufreq.c +++ b/drivers/cpufreq/s3c2416-cpufreq.c @@ -72,19 +72,19 @@ static struct s3c2416_dvfs s3c2416_dvfs_table[] = { #endif static struct cpufreq_frequency_table s3c2416_freq_table[] = { - { SOURCE_HCLK, FREQ_DVS }, - { SOURCE_ARMDIV, 133333 }, - { SOURCE_ARMDIV, 266666 }, - { SOURCE_ARMDIV, 400000 }, - { 0, CPUFREQ_TABLE_END }, + { 0, SOURCE_HCLK, FREQ_DVS }, + { 0, SOURCE_ARMDIV, 133333 }, + { 0, SOURCE_ARMDIV, 266666 }, + { 0, SOURCE_ARMDIV, 400000 }, + { 0, 0, CPUFREQ_TABLE_END }, }; static struct cpufreq_frequency_table s3c2450_freq_table[] = { - { SOURCE_HCLK, FREQ_DVS }, - { SOURCE_ARMDIV, 133500 }, - { SOURCE_ARMDIV, 267000 }, - { SOURCE_ARMDIV, 534000 }, - { 0, CPUFREQ_TABLE_END }, + { 0, SOURCE_HCLK, FREQ_DVS }, + { 0, SOURCE_ARMDIV, 133500 }, + { 0, SOURCE_ARMDIV, 267000 }, + { 0, SOURCE_ARMDIV, 534000 }, + { 0, 0, CPUFREQ_TABLE_END }, }; static unsigned int s3c2416_cpufreq_get_speed(unsigned int cpu) diff --git a/drivers/cpufreq/s3c64xx-cpufreq.c b/drivers/cpufreq/s3c64xx-cpufreq.c index c4226de079ab..ff7d3ecb85f0 100644 --- a/drivers/cpufreq/s3c64xx-cpufreq.c +++ b/drivers/cpufreq/s3c64xx-cpufreq.c @@ -37,19 +37,19 @@ static struct s3c64xx_dvfs s3c64xx_dvfs_table[] = { }; static struct cpufreq_frequency_table s3c64xx_freq_table[] = { - { 0, 66000 }, - { 0, 100000 }, - { 0, 133000 }, - { 1, 200000 }, - { 1, 222000 }, - { 1, 266000 }, - { 2, 333000 }, - { 2, 400000 }, - { 2, 532000 }, - { 2, 533000 }, - { 3, 667000 }, - { 4, 800000 }, - { 0, CPUFREQ_TABLE_END }, + { 0, 0, 66000 }, + { 0, 0, 100000 }, + { 0, 0, 133000 }, + { 0, 1, 200000 }, + { 0, 1, 222000 }, + { 0, 1, 266000 }, + { 0, 2, 333000 }, + { 0, 2, 400000 }, + { 0, 2, 532000 }, + { 0, 2, 533000 }, + { 0, 3, 667000 }, + { 0, 4, 800000 }, + { 0, 0, CPUFREQ_TABLE_END }, }; #endif diff --git a/drivers/cpufreq/s5pv210-cpufreq.c b/drivers/cpufreq/s5pv210-cpufreq.c index 72421534fff5..ab2c1a40d437 100644 --- a/drivers/cpufreq/s5pv210-cpufreq.c +++ b/drivers/cpufreq/s5pv210-cpufreq.c @@ -64,12 +64,12 @@ enum s5pv210_dmc_port { }; static struct cpufreq_frequency_table s5pv210_freq_table[] = { - {L0, 1000*1000}, - {L1, 800*1000}, - {L2, 400*1000}, - {L3, 200*1000}, - {L4, 100*1000}, - {0, CPUFREQ_TABLE_END}, + {0, L0, 1000*1000}, + {0, L1, 800*1000}, + {0, L2, 400*1000}, + {0, L3, 200*1000}, + {0, L4, 100*1000}, + {0, 0, CPUFREQ_TABLE_END}, }; static struct regulator *arm_regulator; diff --git a/drivers/cpufreq/sc520_freq.c b/drivers/cpufreq/sc520_freq.c index 69371bf0886d..ac84e4818014 100644 --- a/drivers/cpufreq/sc520_freq.c +++ b/drivers/cpufreq/sc520_freq.c @@ -33,9 +33,9 @@ static __u8 __iomem *cpuctl; #define PFX "sc520_freq: " static struct cpufreq_frequency_table sc520_freq_table[] = { - {0x01, 100000}, - {0x02, 133000}, - {0, CPUFREQ_TABLE_END}, + {0, 0x01, 100000}, + {0, 0x02, 133000}, + {0, 0, CPUFREQ_TABLE_END}, }; static unsigned int sc520_freq_get_cpu_frequency(unsigned int cpu) diff --git a/drivers/cpufreq/speedstep-ich.c b/drivers/cpufreq/speedstep-ich.c index 394ac159312a..1a07b5904ed5 100644 --- a/drivers/cpufreq/speedstep-ich.c +++ b/drivers/cpufreq/speedstep-ich.c @@ -49,9 +49,9 @@ static u32 pmbase; * are in kHz for the time being. */ static struct cpufreq_frequency_table speedstep_freqs[] = { - {SPEEDSTEP_HIGH, 0}, - {SPEEDSTEP_LOW, 0}, - {0, CPUFREQ_TABLE_END}, + {0, SPEEDSTEP_HIGH, 0}, + {0, SPEEDSTEP_LOW, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; diff --git a/drivers/cpufreq/speedstep-smi.c b/drivers/cpufreq/speedstep-smi.c index db5d274dc13a..8635eec96da5 100644 --- a/drivers/cpufreq/speedstep-smi.c +++ b/drivers/cpufreq/speedstep-smi.c @@ -42,9 +42,9 @@ static enum speedstep_processor speedstep_processor; * are in kHz for the time being. */ static struct cpufreq_frequency_table speedstep_freqs[] = { - {SPEEDSTEP_HIGH, 0}, - {SPEEDSTEP_LOW, 0}, - {0, CPUFREQ_TABLE_END}, + {0, SPEEDSTEP_HIGH, 0}, + {0, SPEEDSTEP_LOW, 0}, + {0, 0, CPUFREQ_TABLE_END}, }; #define GET_SPEEDSTEP_OWNER 0 diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index c48e595f623e..5ae5100c1f24 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -455,11 +455,14 @@ extern struct cpufreq_governor cpufreq_gov_conservative; * FREQUENCY TABLE HELPERS * *********************************************************************/ -#define CPUFREQ_ENTRY_INVALID ~0 -#define CPUFREQ_TABLE_END ~1 -#define CPUFREQ_BOOST_FREQ ~2 +/* Special Values of .frequency field */ +#define CPUFREQ_ENTRY_INVALID ~0 +#define CPUFREQ_TABLE_END ~1 +/* Special Values of .flags field */ +#define CPUFREQ_BOOST_FREQ (1 << 0) struct cpufreq_frequency_table { + unsigned int flags; unsigned int driver_data; /* driver specific data, not used by core */ unsigned int frequency; /* kHz - doesn't need to be in ascending * order */ From f334a1e8434c49b6f9c947d3eb7caf712fbcc3fa Mon Sep 17 00:00:00 2001 From: Sachin Kamat Date: Tue, 8 Apr 2014 10:27:58 +0530 Subject: [PATCH 25/28] cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h fsl_soc.h was included twice. Signed-off-by: Sachin Kamat Acked-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki --- drivers/cpufreq/ppc-corenet-cpufreq.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/cpufreq/ppc-corenet-cpufreq.c b/drivers/cpufreq/ppc-corenet-cpufreq.c index 3bd9123e7026..b7e677be1df0 100644 --- a/drivers/cpufreq/ppc-corenet-cpufreq.c +++ b/drivers/cpufreq/ppc-corenet-cpufreq.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include #include From 9bc0482feae8763447746c89fd971704c819b52e Mon Sep 17 00:00:00 2001 From: Daniel Lezcano Date: Mon, 17 Mar 2014 12:17:30 +0100 Subject: [PATCH 26/28] cpuidle: sysfs: Export target residency information From user space, there is no way to know the target residency for each idle state. If we want to write tools to measure the accuracy of the idle state selection from the governor, we need this info. As the exit latency is exported through sysfs, exporting the target residency in the same place makes sense. Signed-off-by: Daniel Lezcano Signed-off-by: Rafael J. Wysocki --- drivers/cpuidle/sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index e918b6d0caf7..efe2f175168f 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -293,6 +293,7 @@ static ssize_t show_state_##_name(struct cpuidle_state *state, \ } define_show_state_function(exit_latency) +define_show_state_function(target_residency) define_show_state_function(power_usage) define_show_state_ull_function(usage) define_show_state_ull_function(time) @@ -304,6 +305,7 @@ define_store_state_ull_function(disable) define_one_state_ro(name, show_state_name); define_one_state_ro(desc, show_state_desc); define_one_state_ro(latency, show_state_exit_latency); +define_one_state_ro(residency, show_state_target_residency); define_one_state_ro(power, show_state_power_usage); define_one_state_ro(usage, show_state_usage); define_one_state_ro(time, show_state_time); @@ -313,6 +315,7 @@ static struct attribute *cpuidle_state_default_attrs[] = { &attr_name.attr, &attr_desc.attr, &attr_latency.attr, + &attr_residency.attr, &attr_power.attr, &attr_usage.attr, &attr_time.attr, From 553f809e23f00976caea7a1ebdabaa58a6383e7d Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Mon, 7 Apr 2014 01:36:08 +0800 Subject: [PATCH 27/28] arm, kvm: fix double lock on cpu_add_remove_lock Commit 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration) holds the lock before calling the two functions: kvm_vgic_hyp_init() kvm_timer_hyp_init() and both the two functions are calling register_cpu_notifier() to register cpu notifier, so cause double lock on cpu_add_remove_lock. Considered that both two functions are only called inside kvm_arch_init() with holding cpu_add_remove_lock, so simply use __register_cpu_notifier() to fix the problem. Fixes: 8146875de7d4 (arm, kvm: Fix CPU hotplug callback registration) Signed-off-by: Ming Lei Reviewed-by: Srivatsa S. Bhat Signed-off-by: Rafael J. Wysocki --- virt/kvm/arm/arch_timer.c | 2 +- virt/kvm/arm/vgic.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c index 5081e809821f..22fa819a9b6a 100644 --- a/virt/kvm/arm/arch_timer.c +++ b/virt/kvm/arm/arch_timer.c @@ -277,7 +277,7 @@ int kvm_timer_hyp_init(void) host_vtimer_irq = ppi; - err = register_cpu_notifier(&kvm_timer_cpu_nb); + err = __register_cpu_notifier(&kvm_timer_cpu_nb); if (err) { kvm_err("Cannot register timer CPU notifier\n"); goto out_free; diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index 8ca405cd7c1a..47b29834a6b6 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -1496,7 +1496,7 @@ int kvm_vgic_hyp_init(void) goto out; } - ret = register_cpu_notifier(&vgic_cpu_nb); + ret = __register_cpu_notifier(&vgic_cpu_nb); if (ret) { kvm_err("Cannot register vgic CPU notifier\n"); goto out_free_irq; From c7f5220d0c65ce15872dc119d7188ad8fb339dff Mon Sep 17 00:00:00 2001 From: Hanjun Guo Date: Tue, 8 Apr 2014 20:59:48 +0800 Subject: [PATCH 28/28] ACPI: Update the ACPI spec information in Kconfig The UEFI Forum included the ACPI spec in its portfolio in October 2013 and will host future spec iterations, following the ACPI v5.0a release. A UEFI Forum working group named ACPI Specification Working Group (ASWG) has been established to handle future ACPI developments, any UEFI member can join the group and contribute to ACPI specification. So update the ownership and developers for ACPI in Kconfig accordingly, and add another website link to ACPI specification too. Signed-off-by: Hanjun Guo Signed-off-by: Rafael J. Wysocki --- drivers/acpi/Kconfig | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index c205653e9644..ab686b310100 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -31,10 +31,14 @@ menuconfig ACPI ACPI CA, see: - ACPI is an open industry specification co-developed by - Hewlett-Packard, Intel, Microsoft, Phoenix, and Toshiba. + ACPI is an open industry specification originally co-developed by + Hewlett-Packard, Intel, Microsoft, Phoenix, and Toshiba. Currently, + it is developed by the ACPI Specification Working Group (ASWG) under + the UEFI Forum and any UEFI member can join the ASWG and contribute + to the ACPI specification. The specification is available at: + if ACPI