None of the callers check the return value, so it might as
well not have one at all.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This reverts commit b2b5efcf20.
This code was way too board specific, there are quirks as to how
the PERST line is wired on different boards, we'll have to revisit
this using/creating appropriate firmware interfaces.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This series adds support for building the powerpc 64-bit
LE kernel using the new ABI v2. We already supported
running ABI v2 userspace programs but this adds support
for building the kernel itself using the new ABI.
Unaligned stores take alignment exceptions on POWER7 running in little-endian.
This is a dumb little-endian base memcpy that prevents unaligned stores.
Once booted the feature fixup code switches over to the VMX copy loops
(which are already endian safe).
The question is what we do before that switch over. The base 64bit
memcpy takes alignment exceptions on POWER7 so we can't use it as is.
Fixing the causes of alignment exception would slow it down, because
we'd need to ensure all loads and stores are aligned either through
rotate tricks or bytewise loads and stores. Either would be bad for
all other 64bit platforms.
[ I simplified the loop a bit - Anton ]
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If we do a treclaim and we are not in TM suspend mode, it results in a TM bad
thing (ie. a 0x700 program check). Similarly if we do a trechkpt and we have
an active transaction or TEXASR Failure Summary (FS) is not set, we also take a
TM bad thing.
This should never happen, but if it does (ie. a kernel bug), the cause is
almost impossible to debug as the GPR state is mostly userspace and hence we
don't get a call chain.
This adds some checks in these cases case a BUG_ON() (in asm) in case we ever
hit these cases. It moves the register saving around to preserve r1 till later
also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, the code in setup-common.c for powerpc assumes that all
clock rates are same in a smp system. This value is cached in the
variable named ppc_proc_freq and is the value that is reported in
/proc/cpuinfo.
However on the PowerNV platform, the clock rate is same only across
the threads of the same core. Hence the value that is reported in
/proc/cpuinfo is incorrect on PowerNV platforms. We need a better way
to query and report the correct value of the processor clock in
/proc/cpuinfo.
The patch achieves this by creating a machdep_call named
get_proc_freq() which is expected to returns the frequency in Hz. The
code in show_cpuinfo() can invoke this method to display the correct
clock rate on platforms that have implemented this method. On the
other powerpc platforms it can use the value cached in ppc_proc_freq.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Firmware update on PowerNV platform takes several minutes. During
this time one CPU is stuck in FW and the kernel complains about "soft
lockups".
This patch returns all secondary CPUs to firmware before starting
firmware update process.
[ Reworked a bit and cleaned up -- BenH ]
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch intends to support fundamental reset on PLX downstream
ports. If the PCI device matches any one of the internal table,
which includes PLX vendor ID, bridge device ID, register offset
for fundamental reset and bit, fundamental reset will be done
accordingly. Otherwise, it will fail back to hot reset.
Additional flag (EEH_DEV_FRESET) is introduced to record the last
reset type on the PCI bridge.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The problem was initially reported by Wendy who tried pass through
IPR adapter, which was connected to PHB root port directly, to KVM
based guest. When doing that, pci_reset_bridge_secondary_bus() was
called by VFIO driver and linkDown was detected by the root port.
That caused all PEs to be frozen.
The patch fixes the issue by routing the reset for the secondary bus
of root port to underly firmware. For that, one more weak function
pci_reset_secondary_bus() is introduced so that the individual platforms
can override that and do specific reset for bridge's secondary bus.
Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Basically, we have 3 types of resets to fulfil PE reset: fundamental,
hot and PHB reset. For the later 2 cases, we need PCI bus reset hold
and settlement delay as specified by PCI spec. PowerNV and pSeries
platforms are running on top of different firmware and some of the
delays have been covered by underly firmware (PowerNV).
The patch makes the delays unified to be done in backend, instead of
EEH core.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The issue was detected in a bit complicated test case where
we have multiple hierarchical PEs shown as following figure:
+-----------------+
| PE#3 p2p#0 |
| p2p#1 |
+-----------------+
|
+-----------------+
| PE#4 pdev#0 |
| pdev#1 |
+-----------------+
PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p
bridges. We accidentally had less-known scenario: PE#4 was removed
permanently from the system because of permanent failure (e.g.
exceeding the max allowd failure times in last hour), then we detects
EEH errors on PE#3 and tried to recover it. However, eeh_dev instances
for pdev#0/1 were not detached from PE#4, which was still connected to
PE#3. All of that was because of the fact that we rely on count-based
pcibios_release_device(), which isn't reliable enough. When doing
recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which
are not valid any more. Eventually, we run into kernel crash.
The patch fixes above issue from two aspects. For unplug, we simply
skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
&& !EEH_PE_STATE_RECOVERING) and its frozen count should be greater
than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently
removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read
its PCI config so that PCI core will omit them.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There're 2 EEH subsystem variables: eeh_subsystem_enabled and
eeh_probe_mode. We needn't maintain 2 variables and we can just
have one variable and introduce different flags. The patch also
introduces additional flag EEH_FORCE_DISABLE, which will be used
to disable EEH subsystem via boot parameter ("eeh=off") in future.
Besides, the patch also introduces flag EEH_ENABLED, which is
changed to disable or enable EEH functionality on the fly through
debugfs entry in future.
With the patch applied, the creteria to check the enabled EEH
functionality is changed to:
!EEH_FORCE_DISABLED && EEH_ENABLED : Enabled
Other cases : Disabled
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When calling into eeh_gather_pci_data() on pSeries platform, we
possiblly don't have pci_dev instance yet, but eeh_dev is always
ready. So we use cached capability from eeh_dev instead of pci_dev
for log dump there. In order to keep things unified, we also cache
PCI capability positions to eeh_dev for PowerNV as well.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We've observed multiple PE reset failures because of PCI-CFG
access during that period. Potentially, some device drivers
can't support EEH very well and they can't put the device to
motionless state before PE reset. So those device drivers might
produce PCI-CFG accesses during PE reset. Also, we could have
PCI-CFG access from user space (e.g. "lspci"). Since access to
frozen PE should return 0xFF's, we can block PCI-CFG access
during the period of PE reset so that we won't get recrusive EEH
errors.
The patch adds flag EEH_PE_RESET, which is kept during PE reset.
The PowerNV/pSeries PCI-CFG accessors reuse the flag to block
PCI-CFG accordingly.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to
EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD
would be removed from the system. However, it's safe to replace
that with EEH_PE_ISOLATED.
The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled,
either failure or success. It makes the PHB PE state consistent with:
PHB functions normally NONE
PHB has been removed EEH_PE_ISOLATED
PHB fenced, recovery in progress EEH_PE_ISOLATED | RECOVERING
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.
The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix little endian issues with the OPAL error log code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We had some duplication of the internal OPAL functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
ftrace has way too much knowledge of our kernel module trampoline
layout hidden inside it. Create module_trampoline_target() that gives
the target address of a kernel module trampoline.
Signed-off-by: Anton Blanchard <anton@samba.org>
ftrace has way too much knowledge of our kernel module trampoline
layout hidden inside it. Create is_module_trampoline() that can
abstract this away inside the module loader code.
Signed-off-by: Anton Blanchard <anton@samba.org>
If an assembly function that calls back into c code is exported to
modules, we need to ensure r2 is setup correctly. There are only
two places crazy enough to do it (two of which are my fault).
Signed-off-by: Anton Blanchard <anton@samba.org>
The kernel resolved the '.TOC.' to a fake symbol, so we need to fix it up
to point to our .toc section plus 0x8000.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
TRACE_WITH_FRAME_BUFFER creates 32 byte stack frames. On ppc64
ABIv1 this is too small and a callee could corrupt the stack by
writing to the parameter save area (starting at offset 48).
Signed-off-by: Anton Blanchard <anton@samba.org>
ABIv2 doesn't have function descriptors or dot symbols. One
new thing it does add is a function global and a local entry
point, so add that to our _GLOBAL macro.
Signed-off-by: Anton Blanchard <anton@samba.org>
There are a few places we have to use dot symbols with the
current ABI - the syscall table and the kvm hcall table.
Wrap both of these with a new macro called DOTSYM so it will
be easy to transition away from dot symbols in a future ABI.
Signed-off-by: Anton Blanchard <anton@samba.org>
binutils is smart enough to know that a branch to a function
descriptor is actually a branch to the functions text address.
Alan tells me that binutils has been doing this for 9 years.
Signed-off-by: Anton Blanchard <anton@samba.org>
- Fix for a recently introduced CPU hotplug regression in ARM KVM
from Ming Lei.
- Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
cpufreq drivers introduced during the 3.14 cycle (-stable material)
from Chen Gang and Viresh Kumar.
- New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
from Gautham R Shenoy and Srivatsa S Bhat.
- Exynos cpufreq driver fix preventing it from being included into
multiplatform builds that aren't supported by it from Sachin Kamat.
- cpufreq cleanups related to the usage of the driver_data field in
struct cpufreq_frequency_table from Viresh Kumar.
- cpufreq ppc driver cleanup from Sachin Kamat.
- Intel BayTrail support for intel_idle and ACPI idle from Len Brown.
- Intel CPU model 54 (Atom N2000 series) support for intel_idle from
Jan Kiszka.
- intel_idle fix for Intel Ivy Town residency targets from Len Brown.
- turbostat updates (Intel Broadwell support and output cleanups)
from Len Brown.
- New cpuidle sysfs attribute for exporting C-states' target residency
information to user space from Daniel Lezcano.
- New kernel command line argument to prevent power domains enabled
by the bootloader from being turned off even if they are not in use
(for diagnostics purposes) from Tushar Behera.
- Fixes for wakeup sysfs attributes documentation from Geert Uytterhoeven.
- New ACPI video blacklist entry for ThinkPad Helix from Stephen Chandler
Paul.
- Assorted ACPI cleanups and a Kconfig help update from Jonghwan Choi,
Zhihui Zhang, Hanjun Guo.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTRxLAAAoJEILEb/54YlRxnCwP/16UwO/eVE8SIi0TqQboikFC
k8u7F3zgDYG+xPSzXlCR+J7thTxGueQlrb+aM18PYuMVgaw2rpy7U7SIqEk8s6oR
uFnzZCWKA5ZebbZn+NlodnQaJmbgJxwsVJDuuechUka/e67CaIc54JULi2ynZ0lz
Kg/nU3NJhu4S81cT5SOTkJ9xE63oxHcCwKbNqEmxn7x7ddFzGK/DThG67NMEnW1F
vHbBTSyI6vmXXg1f9aobUtuo3PfJkkx5jD+nR1H2e6wmB64tW7JPVKV3mi6LJfYM
ui/8/gNb3PUMHMX1QbL9EFbPxl9miQx2NJ7dgFKa1HZ/WPyiXpJjz7uGr9O3Fau3
cFVREdaW8p2TAYWOEgH8luohhdK0j8UEpR/sEm0TrTjsK8wqczVf/hz6RraVJZiN
ck6eVHjY6m3/bFQauZQ/r+DNeeNcdr+iLejgjbh/MXuF3j0kx+1dkKkzCEU2TgEZ
9etF0uzjlgyXySyxNKBeSW13+ssVA6kF5/BHns7LHoxTfGu7Y4oVaWUi+j74i66O
bc+2ileNal71mS4v9gomnj6Ffj8oH8KXFA7k0sEsAdwLZNgThB5bTppmY/U7Y5Ce
hTS81tcGe2vOVQzF9iFOF7LNKKussAVAtrgkkrA8lJLeOTfQbIo4+fMhORxf3X/p
3O7R/jc4cT+IXK8a2xRt
=hGKg
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI and power management fixes and updates from Rafael Wysocki:
"This is PM and ACPI material that has emerged over the last two weeks
and one fix for a CPU hotplug regression introduced by the recent CPU
hotplug notifiers registration series.
Included are intel_idle and turbostat updates from Len Brown (these
have been in linux-next for quite some time), a new cpufreq driver for
powernv (that might spend some more time in linux-next, but BenH was
asking me so nicely to push it for 3.15 that I couldn't resist), some
cpufreq fixes and cleanups (including fixes for some silly breakage in
a couple of cpufreq drivers introduced during the 3.14 cycle),
assorted ACPI cleanups, wakeup framework documentation fixes, a new
sysfs attribute for cpuidle and a new command line argument for power
domains diagnostics.
Specifics:
- Fix for a recently introduced CPU hotplug regression in ARM KVM
from Ming Lei.
- Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
cpufreq drivers introduced during the 3.14 cycle (-stable material)
from Chen Gang and Viresh Kumar.
- New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
from Gautham R Shenoy and Srivatsa S Bhat.
- Exynos cpufreq driver fix preventing it from being included into
multiplatform builds that aren't supported by it from Sachin Kamat.
- cpufreq cleanups related to the usage of the driver_data field in
struct cpufreq_frequency_table from Viresh Kumar.
- cpufreq ppc driver cleanup from Sachin Kamat.
- Intel BayTrail support for intel_idle and ACPI idle from Len Brown.
- Intel CPU model 54 (Atom N2000 series) support for intel_idle from
Jan Kiszka.
- intel_idle fix for Intel Ivy Town residency targets from Len Brown.
- turbostat updates (Intel Broadwell support and output cleanups)
from Len Brown.
- New cpuidle sysfs attribute for exporting C-states' target
residency information to user space from Daniel Lezcano.
- New kernel command line argument to prevent power domains enabled
by the bootloader from being turned off even if they are not in use
(for diagnostics purposes) from Tushar Behera.
- Fixes for wakeup sysfs attributes documentation from Geert
Uytterhoeven.
- New ACPI video blacklist entry for ThinkPad Helix from Stephen
Chandler Paul.
- Assorted ACPI cleanups and a Kconfig help update from Jonghwan
Choi, Zhihui Zhang, Hanjun Guo"
* tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
ACPI: Update the ACPI spec information in Kconfig
arm, kvm: fix double lock on cpu_add_remove_lock
cpuidle: sysfs: Export target residency information
cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
cpufreq: create another field .flags in cpufreq_frequency_table
cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
cpufreq: don't print value of .driver_data from core
cpufreq: ia64: don't set .driver_data to index
cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
cpufreq: powernv: cpufreq driver for powernv platform
cpufreq: at32ap: don't declare local variable as static
cpufreq: loongson2_cpufreq: don't declare local variable as static
cpufreq: unicore32: fix typo issue for 'clk'
cpufreq: exynos: Disable on multiplatform build
PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes
PM / domains: Add pd_ignore_unused to keep power domains enabled
ACPI / dock: Drop dock_device_ids[] table
ACPI / video: Favor native backlight interface for ThinkPad Helix
ACPI / thermal: Fix wrong variable usage in debug statement
...
Pull more powerpc updates from Ben Herrenschmidt:
"Here are a few more powerpc things for you.
So you'll find here the conversion of the two new firmware sysfs
interfaces to the new API for self-removing files that Greg and Tejun
introduced, so they can finally remove the old one.
I'm also reverting the hwmon driver for powernv. I shouldn't have
merged it, I got a bit carried away here. I hadn't realized it was
never CCed to the relevant maintainer(s) and list(s), and happens to
have some issues so I'm taking it out and it will come back via the
proper channels.
The rest is a bunch of LE fixes (argh, some of the new stuff was
broken on LE, I really need to start testing LE myself !) and various
random fixes here and there.
Finally one bit that's not strictly a fix, which is the HVC OPAL
change to "kick" the HVC thread when the firmware tells us there is
new incoming data. I don't feel like waiting for this one, it's
simple enough, and it makes a big difference in console responsiveness
which is good for my nerves"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (26 commits)
powerpc/powernv Adapt opal-elog and opal-dump to new sysfs_remove_file_self
Revert "powerpc/powernv: hwmon driver for power values, fan rpm and temperature"
power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update
powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules.
arch/powerpc: Use RCU_INIT_POINTER(x, NULL) in platforms/cell/spu_syscalls.c
powerpc/opal: Add missing include
powerpc: Convert last uses of __FUNCTION__ to __func__
powerpc: Add lq/stq emulation
powerpc/powernv: Add invalid OPAL call
powerpc/powernv: Add OPAL message log interface
powerpc/book3s: Fix mc_recoverable_range buffer overrun issue.
powerpc: Remove dead code in sycall entry
powerpc: Use of_node_init() for the fakenode in msi_bitmap.c
powerpc/mm: NUMA pte should be handled via slow path in get_user_pages_fast()
powerpc/powernv: Fix endian issues with sensor code
powerpc/powernv: Fix endian issues with OPAL async code
tty/hvc_opal: Kick the HVC thread on OPAL console events
powerpc/powernv: Add opal_notifier_unregister() and export to modules
powerpc/ppc64: Do not turn AIL (reloc-on interrupts) too early
powerpc/ppc64: Gracefully handle early interrupts
...
next-20140324 currently fails compiling celleb_defconfig with:
arch/powerpc/include/asm/opal.h:894:42: error: 'struct notifier_block' declared inside parameter list [-Werror]
arch/powerpc/include/asm/opal.h:894:42: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
arch/powerpc/include/asm/opal.h:896:14: error: 'struct notifier_block' declared inside parameter list [-Werror]
This is due to a missing include which is added here.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Recent CPUs support quad word load and store instructions. Add
support to the alignment handler for them.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This call will not be understood by OPAL, and cause it to add an error
to it's log. Among other things, this is useful for testing the
behaviour of the log as it fills up.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
OPAL provides an in-memory circular buffer containing a message log
populated with various runtime messages produced by the firmware.
Provide a sysfs interface /sys/firmware/opal/msglog for userspace to
view the messages.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
One OPAL call and one device tree property needed byte swapping.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* pm-cpufreq:
cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
cpufreq: create another field .flags in cpufreq_frequency_table
cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
cpufreq: don't print value of .driver_data from core
cpufreq: ia64: don't set .driver_data to index
cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
cpufreq: powernv: cpufreq driver for powernv platform
cpufreq: at32ap: don't declare local variable as static
cpufreq: loongson2_cpufreq: don't declare local variable as static
cpufreq: unicore32: fix typo issue for 'clk'
cpufreq: exynos: Disable on multiplatform build
Backend driver to dynamically set voltage and frequency on
IBM POWER non-virtualized platforms. Power management SPRs
are used to set the required PState.
This driver works in conjunction with cpufreq governors
like 'ondemand' to provide a demand based frequency and
voltage setting on IBM POWER non-virtualized platforms.
PState table is obtained from OPAL v3 firmware through device
tree.
powernv_cpufreq back-end driver would parse the relevant device-tree
nodes and initialise the cpufreq subsystem on powernv platform.
The code was originally written by svaidy@linux.vnet.ibm.com. Over
time it was modified to accomodate bug-fixes as well as updates to the
the cpu-freq core. Relevant portions of the change logs corresponding
to those modifications are noted below:
* The policy->cpus needs to be populated in a hotplug-invariant
manner instead of using cpu_sibling_mask() which varies with
cpu-hotplug. This is because the cpufreq core code copies this
content into policy->related_cpus mask which should not vary on
cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com]
* Create a helper routine that can return the cpu-frequency for the
corresponding pstate_id. Also, cache the values of the pstate_max,
pstate_min and pstate_nominal and nr_pstates in a static structure
so that they can be reused in the future to perform any
validations. [Authored by ego@linux.vnet.ibm.com]
* Create a driver attribute named cpuinfo_nominal_freq which creates
a sysfs read-only file named cpuinfo_nominal_freq. Export the
frequency corresponding to the nominal_pstate through this
interface.
Nominal frequency is the highest non-turbo frequency for the
platform. This is generally used for setting governor policies
from user space for optimal energy efficiency. [Authored by
ego@linux.vnet.ibm.com]
* Implement a powernv_cpufreq_get(unsigned int cpu) method which will
return the current operating frequency. Export this via the sysfs
interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to
powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com]
[Change log updated by ego@linux.vnet.ibm.com]
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
OPAL defines opal_msg as a big endian struct so we have to
byte swap it on little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
opal_notifier_register() is missing a pending "unregister" variant
and should be exposed to modules.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current kernel code assumes big endian and parses RTAS events all
wrong. The most visible effect is that we cannot honor EPOW events,
meaning, for example, we cannot shut down a guest properly from the
hypervisor.
This new patch is largely inspired by Nathan's work: we get rid of all
the bit fields in the RTAS event structures (even the unused ones, for
consistency). We also introduce endian safe accessors for the fields used
by the kernel (trivial rtas_error_type() accessor added for consistency).
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
will be showing up in Intel Broadwell CPU's.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTPaDyAAoJENNvdpvBGATwIVkP/Ao77ATdIBKr7kJJ+7q/jroo
hfg2EOBzYwFSTZ4EUCwTwCNOqcAnr8xBYRWDDTxoiT0ORQN+4QkIXlukTBZbeCUg
5ZCvRYhNcrieCBX69X/0aQ7yAkMR5/KSjccww3Tr6bLEItEBS7/gkeGv0H/E1DOg
/1C3vbDDRPchM2qGM/TS/2Wd3WU5jWJk4cURJUlweBWHGXItOk6yHIY/+h3t4Bj1
eJfxwG21BLJ1t159uvMj5Sw9Ketwnzgc4hsU6Ai/o37+Q12LYJ6xRCHsAatr6aA8
dsObrfnD15T3gH/elVzB0tpBkor9pESkJAvUxWdMNMNGVU01ONuBz6rbSUqLdKu4
GXA+9iHq4Ti43f4M886PweSweFr1+lFIiQrdAQaWZdrooLCMJCgjMNM+7wnlu8ho
Eit65rq4k6y5Fb5GTB8Am/TCb3HNYhXnrMERA0meKx+/K8yaMygzHtPoT+DHdn8z
GWDwNiMYrtTeiJ5nduljO3M2lYJDIWLkCD9SwupXe5HEj3qSHS/aiT5ng6YP+oOK
mmUiWWgpVv3unO6pJw49pbzuAWXnbHVWMV2VDILVxn7mZZ0NAOL3OGAmK7US+Z/J
qww9s89A2oba8vTbdrW7JuiP9ESlujpfrLQQgnW6WInMHY2dfo5hSo84ULLeLMXQ
SqT1yywl4U888jqBYioB
=RtMT
-----END PGP SIGNATURE-----
Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull /dev/random changes from Ted Ts'o:
"A number of cleanups plus support for the RDSEED instruction, which
will be showing up in Intel Broadwell CPU's"
* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: Add arch_has_random[_seed]()
random: If we have arch_get_random_seed*(), try it before blocking
random: Use arch_get_random_seed*() at init time and once a second
x86, random: Enable the RDSEED instruction
random: use the architectural HWRNG for the SHA's IV in extract_buf()
random: clarify bits/bytes in wakeup thresholds
random: entropy_bytes is actually bits
random: simplify accounting code
random: tighten bound on random_read_wakeup_thresh
random: forget lock in lockless accounting
random: simplify accounting logic
random: fix comment on "account"
random: simplify loop in random_read
random: fix description of get_random_bytes
random: fix comment on proc_do_uuid
random: fix typos / spelling errors in comments
Pull kvm updates from Paolo Bonzini:
"PPC and ARM do not have much going on this time. Most of the cool
stuff, instead, is in s390 and (after a few releases) x86.
ARM has some caching fixes and PPC has transactional memory support in
guests. MIPS has some fixes, with more probably coming in 3.16 as
QEMU will soon get support for MIPS KVM.
For x86 there are optimizations for debug registers, which trigger on
some Windows games, and other important fixes for Windows guests. We
now expose to the guest Broadwell instruction set extensions and also
Intel MPX. There's also a fix/workaround for OS X guests, nested
virtualization features (preemption timer), and a couple kvmclock
refinements.
For s390, the main news is asynchronous page faults, together with
improvements to IRQs (floating irqs and adapter irqs) that speed up
virtio devices"
* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
KVM: PPC: Book3S HV: Add transactional memory support
KVM: Specify byte order for KVM_EXIT_MMIO
KVM: vmx: fix MPX detection
KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
KVM: s390: clear local interrupts at cpu initial reset
KVM: s390: Fix possible memory leak in SIGP functions
KVM: s390: fix calculation of idle_mask array size
KVM: s390: randomize sca address
KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
KVM: Bump KVM_MAX_IRQ_ROUTES for s390
KVM: s390: irq routing for adapter interrupts.
KVM: s390: adapter interrupt sources
...
Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt:
"This is the branch I mentioned in my other pull request which contains
our improved cpuidle support for the "powernv" platform
(non-virtualized).
It adds support for the "fast sleep" feature of the processor which
provides higher power savings than our usual "nap" mode but at the
cost of losing the timers while asleep, and thus exploits the new
timer broadcast framework to work around that limitation.
It's based on a tip timer tree that you seem to have already merged"
* 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
cpuidle/powernv: Parse device tree to setup idle states
cpuidle/powernv: Add "Fast-Sleep" CPU idle state
powerpc/powernv: Add OPAL call to resync timebase on wakeup
powerpc/powernv: Add context management for Fast Sleep
powerpc: Split timer_interrupt() into timer handling and interrupt handling routines
powerpc: Implement tick broadcast IPI as a fixed IPI message
powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message