GFP_THISNODE is for callers that implement their own clever fallback to
remote nodes. It restricts the allocation to the specified node and
does not invoke reclaim, assuming that the caller will take care of it
when the fallback fails, e.g. through a subsequent allocation request
without GFP_THISNODE set.
However, many current GFP_THISNODE users only want the node exclusive
aspect of the flag, without actually implementing their own fallback or
triggering reclaim if necessary. This results in things like page
migration failing prematurely even when there is easily reclaimable
memory available, unless kswapd happens to be running already or a
concurrent allocation attempt triggers the necessary reclaim.
Convert all callsites that don't implement their own fallback strategy
to __GFP_THISNODE. This restricts the allocation a single node too, but
at the same time allows the allocator to enter the slowpath, wake
kswapd, and invoke direct reclaim if necessary, to make the allocation
happen when memory is full.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to unmangle the full address, not just the register
number, and we also need to support the real indirect bit
being set for in-kernel uses.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.13]
As Ben suggested, the patch prints PHB diag-data with multiple
fields in one line and omits the line if the fields of that
line are all zero.
With the patch applied, the PHB3 diag-data dump looks like:
PHB3 PHB#3 Diag-data (Version: 1)
brdgCtl: 00000002
RootSts: 0000000f 00400000 b0830008 00100147 00002000
nFir: 0000000000000000 0030006e00000000 0000000000000000
PhbSts: 0000001c00000000 0000000000000000
Lem: 0000000000100000 42498e327f502eae 0000000000000000
InAErr: 8000000000000000 8000000000000000 0402030000000000 0000000000000000
PE[ 8] A/B: 8480002b00000000 8000000000000000
[ The current diag data is so big that it overflows the printk
buffer pretty quickly in cases when we get a handful of errors
at once which can happen. --BenH
]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The PHB diag-data is important to help locating the root cause for
EEH errors such as frozen PE or fenced PHB. However, the EEH core
enables IO path by clearing part of HW registers before collecting
this data causing it to be corrupted.
This patch fixes this by dumping the PHB diag-data immediately when
frozen/fenced state on PE or PHB is detected for the first time in
eeh_ops::get_state() or next_error() backend.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently we're storing a host endian RTAS token in
rtas_stop_self_args.token. We then pass that directly to rtas. This is
fine on big endian however on little endian the token is not what we
expect.
This will typically result in hitting:
panic("Alas, I survived.\n");
To fix this we always use the stop-self token in host order and always
convert it to be32 before passing this to rtas.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We possiblly detect EEH errors during reboot, particularly in kexec
path, but it's impossible for device drivers and EEH core to handle
or recover them properly.
The patch registers one reboot notifier for EEH and disable EEH
subsystem during reboot. That means the EEH errors is going to be
cleared by hardware reset or second kernel during early stage of
PCI probe.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch cleans up variable eeh_subsystem_enabled so that we needn't
refer the variable directly from external. Instead, we will use
function eeh_enabled() and eeh_set_enable() to operate the variable.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When doing reset in order to recover the affected PE, we issue
hot reset on PE primary bus if it's not root bus. Otherwise, we
issue hot or fundamental reset on root port or PHB accordingly.
For the later case, we didn't cover the situation where PE only
includes root port and it potentially causes kernel crash upon
EEH error to the PE.
The patch reworks the logic of EEH reset to improve the code
readability and also avoid the kernel crash.
Cc: stable@vger.kernel.org
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rev3 of the PCI Express Base Specification defines a Supported Link
Speeds Vector where the bit definitions within this field are:
Bit 0 - 2.5 GT/s
Bit 1 - 5.0 GT/s
Bit 2 - 8.0 GT/s
This vector definition is used by the platform firmware to export the
maximum and current link speeds of the PCI bus via the
"ibm,pcie-link-speed-stats" device-tree property.
This patch updates pseries_root_bridge_prepare() to detect Gen3
speed buses (defined by 0x04).
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 5091f0c (powerpc/pseries: Fix PCIE link speed endian issue)
introduced a regression on the PCI link speed detection using the
device-tree property. The ibm,pcie-link-speed-stats property is composed
of two 32-bit integers, the first one being the maxinum link speed and
the second the current link speed. The changes introduced by the
aforementioned commit are considering just the first integer.
Fix this issue by changing how the property is accessed, using the
helper functions to properly access the array of values. The explicit
byte swapping is not needed anymore here, since it's done by the helper
functions.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds the support for to create a direct iommu "bypass"
window on IODA2 bridges (such as Power8) allowing to bypass iommu
page translation completely for 64-bit DMA capable devices, thus
significantly improving DMA performances.
Additionally, this adds a hook to the struct iommu_table so that
the IOMMU API / VFIO can disable the bypass when external ownership
is requested, since in that case, the device will be used by an
environment such as userspace or a KVM guest which must not be
allowed to bypass translations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have a driver for the ARCH_RANDOM hook in rng.c, so we should select
ARCH_RANDOM on pseries.
Without this the build breaks if you turn ARCH_RANDOM off.
This hasn't broken the build because pseries_defconfig doesn't specify a
value for PPC_POWERNV, which is default y, and selects ARCH_RANDOM.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Disable relocation on exception while going down even in kdump case. This
is because we are about clear htab mappings while kexec-ing into kdump
kernel and we may run into issues if we still have AIL ON.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Following patch ports the cpuidle framework for powernv
platform and also implements a cpuidle back-end powernv
idle driver calling on to power7_nap and snooze idle states.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move the file from arch specific pseries/processor_idle.c
to drivers/cpuidle/cpuidle-pseries.c
Make the relevant Makefile and Kconfig changes.
Also, introduce Kconfig.powerpc in drivers/cpuidle
for all powerpc cpuidle drivers.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
<<
Switch mpc512x to the common clock framework and adapt mpc512x
drivers to use the new clock driver. Old PPC_CLOCK code is
removed entirely since there are no users any more.
>>
Pull powerpc updates from Ben Herrenschmidt:
"So here's my next branch for powerpc. A bit late as I was on vacation
last week. It's mostly the same stuff that was in next already, I
just added two patches today which are the wiring up of lockref for
powerpc, which for some reason fell through the cracks last time and
is trivial.
The highlights are, in addition to a bunch of bug fixes:
- Reworked Machine Check handling on kernels running without a
hypervisor (or acting as a hypervisor). Provides hooks to handle
some errors in real mode such as TLB errors, handle SLB errors,
etc...
- Support for retrieving memory error information from the service
processor on IBM servers running without a hypervisor and routing
them to the memory poison infrastructure.
- _PAGE_NUMA support on server processors
- 32-bit BookE relocatable kernel support
- FSL e6500 hardware tablewalk support
- A bunch of new/revived board support
- FSL e6500 deeper idle states and altivec powerdown support
You'll notice a generic mm change here, it has been acked by the
relevant authorities and is a pre-req for our _PAGE_NUMA support"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (121 commits)
powerpc: Implement arch_spin_is_locked() using arch_spin_value_unlocked()
powerpc: Add support for the optimised lockref implementation
powerpc/powernv: Call OPAL sync before kexec'ing
powerpc/eeh: Escalate error on non-existing PE
powerpc/eeh: Handle multiple EEH errors
powerpc: Fix transactional FP/VMX/VSX unavailable handlers
powerpc: Don't corrupt transactional state when using FP/VMX in kernel
powerpc: Reclaim two unused thread_info flag bits
powerpc: Fix races with irq_work
Move precessing of MCE queued event out from syscall exit path.
pseries/cpuidle: Remove redundant call to ppc64_runlatch_off() in cpu idle routines
powerpc: Make add_system_ram_resources() __init
powerpc: add SATA_MV to ppc64_defconfig
powerpc/powernv: Increase candidate fw image size
powerpc: Add debug checks to catch invalid cpu-to-node mappings
powerpc: Fix the setup of CPU-to-Node mappings during CPU online
powerpc/iommu: Don't detach device without IOMMU group
powerpc/eeh: Hotplug improvement
powerpc/eeh: Call opal_pci_reinit() on powernv for restoring config space
powerpc/eeh: Add restore_config operation
...
- ACPI core changes to make it create a struct acpi_device object for every
device represented in the ACPI tables during all namespace scans regardless
of the current status of that device. In accordance with this, ACPI hotplug
operations will not delete those objects, unless the underlying ACPI tables
go away.
- On top of the above, new sysfs attribute for ACPI device objects allowing
user space to check device status by triggering the execution of _STA for
its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating the
PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the code
"glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for the
DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug
facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization earlier.
That should allow us to use ACPI during the timekeeping initialization
and possibly to simplify the EFI initialization too. From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over from
Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in drivers
that uses _DSM to execute it via the new helper. From Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo,
Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria,
Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support, from
Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias,
Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled
during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa,
Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a cpupower
tool cleanup from One Thousand Gnomes.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ
zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a
TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0
cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn
DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs
ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs
2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/
uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC
uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT
NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj
32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo
IYvE6WhQZU6L6fptGHFC
=dRf6
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"As far as the number of commits goes, the top spot belongs to ACPI
this time with cpufreq in the second position and a handful of PM
core, PNP and cpuidle updates. They are fixes and cleanups mostly, as
usual, with a couple of new features in the mix.
The most visible change is probably that we will create struct
acpi_device objects (visible in sysfs) for all devices represented in
the ACPI tables regardless of their status and there will be a new
sysfs attribute under those objects allowing user space to check that
status via _STA.
Consequently, ACPI device eject or generally hot-removal will not
delete those objects, unless the table containing the corresponding
namespace nodes is unloaded, which is extremely rare. Also ACPI
container hotplug will be handled quite a bit differently and cpufreq
will support CPU boost ("turbo") generically and not only in the
acpi-cpufreq driver.
Specifics:
- ACPI core changes to make it create a struct acpi_device object for
every device represented in the ACPI tables during all namespace
scans regardless of the current status of that device. In
accordance with this, ACPI hotplug operations will not delete those
objects, unless the underlying ACPI tables go away.
- On top of the above, new sysfs attribute for ACPI device objects
allowing user space to check device status by triggering the
execution of _STA for its ACPI object. From Srinivas Pandruvada.
- ACPI core hotplug changes reducing code duplication, integrating
the PCI root hotplug with the core and reworking container hotplug.
- ACPI core simplifications making it use ACPI_COMPANION() in the
code "glueing" ACPI device objects to "physical" devices.
- ACPICA update to upstream version 20131218. This adds support for
the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves
debug facilities. From Bob Moore, Lv Zheng and Betty Dall.
- Init code change to carry out the early ACPI initialization
earlier. That should allow us to use ACPI during the timekeeping
initialization and possibly to simplify the EFI initialization too.
From Chun-Yi Lee.
- Clenups of the inclusions of ACPI headers in many places all over
from Lv Zheng and Rashika Kheria (work in progress).
- New helper for ACPI _DSM execution and rework of the code in
drivers that uses _DSM to execute it via the new helper. From
Jiang Liu.
- New Win8 OSI blacklist entries from Takashi Iwai.
- Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun
Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava,
Rashika Kheria, Tang Chen, Zhang Rui.
- intel_pstate driver updates, including proper Baytrail support,
from Dirk Brandewie and intel_pstate documentation from Ramkumar
Ramachandra.
- Generic CPU boost ("turbo") support for cpufreq from Lukasz
Majewski.
- powernow-k6 cpufreq driver fixes from Mikulas Patocka.
- cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark
Brown.
- Assorted cpufreq drivers fixes and cleanups from Anson Huang, John
Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh
Kumar.
- cpuidle cleanups from Bartlomiej Zolnierkiewicz.
- Support for hibernation APM events from Bin Shi.
- Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC
disabled during thaw transitions from Bjørn Mork.
- PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf
Hansson.
- PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente
Kurusa, Rashika Kheria.
- New tool for profiling system suspend from Todd E Brandt and a
cpupower tool cleanup from One Thousand Gnomes"
* tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits)
thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412)
cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
Documentation: cpufreq / boost: Update BOOST documentation
cpufreq: exynos: Extend Exynos cpufreq driver to support boost
cpufreq / boost: Kconfig: Support for software-managed BOOST
acpi-cpufreq: Adjust the code to use the common boost attribute
cpufreq: Add boost frequency support in core
intel_pstate: Add trace point to report internal state.
cpufreq: introduce cpufreq_generic_get() routine
ARM: SA1100: Create dummy clk_get_rate() to avoid build failures
cpufreq: stats: create sysfs entries when cpufreq_stats is a module
cpufreq: stats: free table and remove sysfs entry in a single routine
cpufreq: stats: remove hotplug notifiers
cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
cpufreq: speedstep: remove unused speedstep_get_state
platform: introduce OF style 'modalias' support for platform bus
PM / tools: new tool for suspend/resume performance optimization
ACPI: fix module autoloading for ACPI enumerated devices
ACPI: add module autoloading support for ACPI enumerated devices
ACPI: fix create_modalias() return value handling
...
Pull trivial tree updates from Jiri Kosina:
"Usual rocket science stuff from trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
neighbour.h: fix comment
sched: Fix warning on make htmldocs caused by wait.h
slab: struct kmem_cache is protected by slab_mutex
doc: Fix typo in USB Gadget Documentation
of/Kconfig: Spelling s/one/once/
mkregtable: Fix sscanf handling
lp5523, lp8501: comment improvements
thermal: rcar: comment spelling
treewide: fix comments and printk msgs
IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart()
Documentation: update /proc/uptime field description
Documentation: Fix size parameter for snprintf
arm: fix comment header and macro name
asm-generic: uaccess: Spelling s/a ny/any/
mtd: onenand: fix comment header
doc: driver-model/platform.txt: fix a typo
drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text
doc: Fix typo (acces_process_vm -> access_process_vm)
treewide: Fix typos in printk
drivers/gpu/drm/qxl/Kconfig: reformat the help text
...
Pull core debug changes from Ingo Molnar:
"Currently there are two methods to set the panic_timeout: via
'panic=X' boot commandline option, or via /proc/sys/kernel/panic.
This tree adds a third panic_timeout configuration method:
configuration via Kconfig, via CONFIG_PANIC_TIMEOUT=X - useful to
distros that generally want their kernel defaults to come with the
.config.
CONFIG_PANIC_TIMEOUT defaults to 0, which was the previous default
value of panic_timeout.
Doing that unearthed a few arch trickeries regarding arch-special
panic_timeout values and related complications - hopefully all
resolved to the satisfaction of everyone"
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
powerpc: Clean up panic_timeout usage
MIPS: Remove panic_timeout settings
panic: Make panic_timeout configurable
Its possible that OPAL may be writing to host memory during
kexec (like dump retrieve scenario). In this situation we might
end up corrupting host memory.
This patch makes OPAL sync call to make sure OPAL stops
writing to host memory before kexec'ing.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Sometimes, especially in sinario of loading another kernel with kdump,
we got EEH error on non-existing PE. That means the PEEV / PEST in
the corresponding PHB would be messy and we can't handle that case.
The patch escalates the error to fenced PHB so that the PHB could be
rested in order to revoer the errors on non-existing PEs.
Reported-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For one PCI error relevant OPAL event, we possibly have multiple
EEH errors for that. For example, multiple frozen PEs detected on
different PHBs. Unfortunately, we didn't cover the case. The patch
enumarates the return value from eeh_ops::next_error() and change
eeh_handle_special_event() and eeh_ops::next_error() to handle all
existing EEH errors.
As Ben pointed out, we needn't list_for_each_entry_safe() since we
are not deleting any PHB from the hose_list and the EEH serialized
lock should be held while purging EEH events. The patch covers those
suggestions as well.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit fbd7740fdfdf9475f(powerpc: Simplify pSeries idle loop) switched pseries cpu
idle handling from complete idle loops to ppc_md.powersave functions. Earlier to
this switch, ppc64_runlatch_off() had to be called in each of the idle routines.
But after the switch, this call is handled in arch_cpu_idle(),just before the call
to ppc_md.powersave, where platform specific idle routines are called.
As a consequence, the call to ppc64_runlatch_off() got duplicated in the
arch_cpu_idle() routine as well as in the some of the idle routines in
pseries and commit fbd7740fdf missed to get rid of these redundant
calls. These calls were carried over subsequent enhancements to the pseries
cpuidle routines.
Although multiple calls to ppc64_runlatch_off() is harmless, there is still some
overhead due to it. Besides that, these calls could also make way for a
misunderstanding that it is *necessary* to call ppc64_runlatch_off() multiple
times, when that is not the case. Hence this patch takes care of eliminating
this redundancy.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At present we assume candidate image is <= 256MB. But in P8,
candidate image size can go up to 750MB. Hence increasing
candidate image max size to 1GB.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch implements the EEH operation backend restore_config()
for PowerNV platform. That relies on OPAL API opal_pci_reinit()
where we reinitialize the error reporting properly after PE or
PHB reset.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After reset on the specific PE or PHB, we never configure AER
correctly on PowerNV platform. We needn't care it on pSeries
platform. The patch introduces additional EEH operation eeh_ops::
restore_config() so that we have chance to configure AER correctly
for PowerNV platform.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We don't have IO ports on PHB3 and the assignment of variable
"iomap_off" on PHB3 is meaningless. The patch just removes the
unnecessary assignment to the variable. The code change should
have been part of commit c35d2a8c ("powerpc/powernv: Needn't IO
segment map for PHB3").
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After reverting 25ebc45b93
("powerpc/pseries/iommu: remove default window before attempting DDW
manipulation"), we no longer remove the base window in enable_ddw.
Therefore, we no longer need to reset the DMA window state in
find_existing_ddw_windows(). We can instead go back to what was done
before, which simply reuses the previous configuration, if any. Further,
this removes the final caller of the reset-pe-dma-windows call, so
remove those functions.
This fixes an EEH on kdump with the ipr driver. The EEH occurs, because
the initcall removes the DDW configuration (64-bit DMA window), but
doesn't ensure the ops are via the IOMMU -- a DMA operation occurs
during probe (still investigating this) and we EEH.
This reverts commit 14b6f00f8a.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Ben rightfully pointed out that there is a race in the "newer" DDW code.
Presuming we are running on recent enough firmware that supports the
"reset" DDW manipulation call, we currently always remove the base
32-bit DMA window in order to maximize the resources for Phyp when
creating the 64-bit window. However, this can be problematic for the
case where multiple functions are in the same PE (partitionable
endpoint), where some funtions might be 32-bit DMA only. All of a
sudden, the only functional DMA window for such functions is gone. We
will have serious errors in such situations. The best solution is simply
to revert the extension to the DDW code where we ever remove the base
DMA window.
This reverts commit 25ebc45b93.
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>. Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.
The one instance where we add an include for init.h covers off
a case where that file was implicitly getting it from another
header which itself didn't need it.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
improve the common clock support code for MPC512x
- expand the CCM register set declaration with MPC5125 related registers
(which reside in the previously "reserved" area)
- tell the MPC5121, MPC5123, and MPC5125 SoC variants apart, and derive
the availability of components and their clocks from the detected SoC
(MBX, AXE, VIU, SPDIF, PATA, SATA, PCI, second FEC, second SDHC,
number of PSC components, type of NAND flash controller,
interpretation of the CPMF bitfield, PSC/CAN mux0 stage input clocks,
output clocks on SoC pins)
- add backwards compatibility (allow operation against a device tree
which lacks clock related specs) for MPC5125 FECs, too
telling SoC variants apart and adjusting the clock tree's generation
occurs at runtime, a common generic binary supports all of the chips
the MPC5125 approach to the NFC clock (one register with two counters
for the high and low periods of the clock) is not implemented, as there
are no users and there is no common implementation which supports this
kind of clock -- the new implementation would be unused and could not
get verified, so it shall wait until there is demand
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
the SDHC clock is derived from CSB with a fractional divider which can
address "quarters"; the implementation multiplies CSB by 4 and divides
it by the (integer) divider value
a bug in the clock domain synchronisation requires that only even
divider values get setup; we achieve this by
- multiplying CSB by 2 only instead of 4
- registering with CCF the divider's bit field without bit0
- the divider's lowest bit remains clear as this is the reset value
and later operations won't touch it
this change keeps fully utilizing common clock primitives (needs no
additional support logic, and avoids an excessive divider table) and
satisfies the hardware's constraint of only supporting even divider
values
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
adjust (expand on or move) a few comments,
add markers for easier navigation around helpers
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
this change removes workarounds which have become obsolete after
migration to common clock support has completed
- remove clkdev registration calls (compatibility clock item aliases)
after all peripheral drivers were adjusted for device tree based
clock lookup
- remove pre-enable workarounds after all peripheral drivers were
adjusted to acquire their respective clock items
workarounds for these clock items get removed: FEC (ethernet), I2C,
PSC (UART, SPI), PSC FIFO, USB, NFC (NAND flash), VIU (video capture),
BDLC (CAN), CAN MCLK, DIU (video output)
these clkdev registered names won't be provided any longer by the
MPC512x platform's clock driver: "psc%d_mclk", "mscan%d_mclk",
"usb%d_clk", "nfc_clk", "viu_clk", "sys_clk", "ref_clk"
the pre-enable workaround for PCI remains, but depends on the presence
of PCI related device tree nodes (disables the PCI clock in the absence
of PCI nodes, keeps the PCI clock enabled in the presence of nodes) --
moving clock acquisition into the peripheral driver isn't possible for
PCI because its initialization takes place before the platform clock
driver gets initialized, thus the clock provider isn't available then
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
adapt the DIU clock initialization to the COMMON_CLK approach:
device tree based clock lookup, prepare and unprepare for clocks,
work with frequencies not dividers, call the appropriate clk_*()
routines and don't access CCM registers
the "best clock" determination now completely relies on the
platform's clock driver to pick a frequency close to what the
caller requests, and merely checks whether the desired frequency
was met (fits the tolerance of the monitor)
this approach shall succeed upon first try in the usual case,
will test a few less desirable yet acceptable frequencies in
edge cases, and will fallback to "best effort" if none of the
previously tried frequencies pass the test
provide a fallback clock lookup approach in case the OF based clock
lookup for the DIU fails, this allows for successful operation in
the presence of an outdated device tree which lacks clock specs
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
the setup before the change was
- arch/powerpc/Kconfig had the PPC_CLOCK option, off by default
- depending on the PPC_CLOCK option the arch/powerpc/kernel/clock.c file
was built, which implements the clk.h API but always returns -ENOSYS
unless a platform registers specific callbacks
- the MPC52xx platform selected PPC_CLOCK but did not register any
callbacks, thus all clk.h API calls keep resulting in -ENOSYS errors
(which is OK, all peripheral drivers deal with the situation)
- the MPC512x platform selected PPC_CLOCK and registered specific
callbacks implemented in arch/powerpc/platforms/512x/clock.c, thus
provided real support for the clock API
- no other powerpc platform did select PPC_CLOCK
the situation after the change is
- the MPC512x platform implements the COMMON_CLK interface, and thus the
PPC_CLOCK approach in arch/powerpc/platforms/512x/clock.c has become
obsolete
- the MPC52xx platform still lacks genuine support for the clk.h API
while this is not a change against the previous situation (the error
code returned from COMMON_CLK stubs differs but every call still
results in an error)
- with all references gone, the arch/powerpc/kernel/clock.c wrapper and
the PPC_CLOCK option have become obsolete, as did the clk_interface.h
header file
the switch from PPC_CLOCK to COMMON_CLK is done for all platforms within
the same commit such that multiplatform kernels (the combination of 512x
and 52xx within one executable) keep working
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
extend the recently added COMMON_CLK platform support for MPC512x such
that it works with incomplete device tree data which lacks clock specs
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
[agust@denx.de: moved node macro definitions out of the function body]
Signed-off-by: Anatolij Gustschin <agust@denx.de>
this change implements a clock driver for the MPC512x PowerPC platform
which follows the COMMON_CLK approach and uses common clock drivers
shared with other platforms
this driver implements the publicly announced set of clocks (those
listed in the dt-bindings header file), as well as generates additional
'struct clk' items where the SoC hardware cannot easily get mapped to
the common primitives (shared code) of the clock API, or requires
"intermediate clock nodes" to represent clocks that have both gates and
dividers
the previous PPC_CLOCK implementation is kept in place and remains
active for the moment, the newly introduced CCF clock driver will
receive additional support for backwards compatibility in a subsequent
patch before it gets enabled and will replace the PPC_CLOCK approach
some of the clock items get pre-enabled in the clock driver to not have
them automatically disabled by the underlying clock subsystem because of
their being unused -- this approach is desirable because
- some of the clocks are useful to have for diagnostics and information
despite their not getting claimed by any drivers (CPU, internal and
external RAM, internal busses, boot media)
- some of the clocks aren't claimed by their peripheral drivers yet,
either because of missing driver support or because device tree specs
aren't available yet (but the workarounds will get removed as the
drivers get adjusted and the device tree provides the clock specs)
clkdev registration provides "alias names" for few clock items
- to not break those peripheral drivers which encode their component
index into the name that is used for clock lookup (UART, SPI, USB)
- to not break those drivers which use names for the clock lookup which
were encoded in the previous PPC_CLOCK implementation (NFC, VIU, CAN)
this workaround will get removed as these drivers get adjusted after
device tree based clock lookup has become available
the COMMON_CLK implementation copes with device trees which lack an
oscillator node (backwards compat), the REF clock is then derived from
the IPS bus frequency and multiplier values fetched from hardware
Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
It is now possible to use the common cpuidle_[un]register() routines
(instead of open-coding them) so do it.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
pseries cpuidle driver sets dev->state_count to drv->state_count so
the default dev->state_count initialization in cpuidle_enable_device()
(called from cpuidle_register_device()) can be used instead.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add support for the Motorola/Emerson MVME5100 Single Board Computer.
The MVME5100 is a 6U form factor VME64 computer with:
- A single MPC7410 or MPC750 CPU
- A HAWK Processor Host Bridge (CPU to PCI) and
MultiProcessor Interrupt Controller (MPIC)
- Up to 500Mb of onboard memory
- A M48T37 Real Time Clock (RTC) and Non-Volatile Memory chip
- Two 16550 compatible UARTS
- Two Intel E100 Fast Ethernets
- Two PCI Mezzanine Card (PMC) Slots
- PPCBug Firmware
The HAWK PHB/MPIC is compatible with the MPC10x devices.
There is no onboard disk support. This is usually provided by installing a PMC
in first PMC slot.
This patch revives the board support, it was present in early 2.6
series kernels. The board support in those days was by Matt Porter of
MontaVista Software.
CSC Australia has around 31 of these boards in service. The kernel in use
for the boards is based on 2.6.31. The boards are operated without disks
from a file server.
This patch is based on linux-3.13-rc2 and has been boot tested.
Only boards with 512 Mb of memory are known to work.
Signed-off-by: Stephen Chivers <schivers@csc.com>
Tested-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Signed-off-by: Scott Wood <scottwood@freescale.com>
TWR-P1025 Overview
-----------------
512Mbyte DDR3 (on board DDR)
64MB Nor Flash
eTSEC1: Connected to RGMII PHY AR8035
eTSEC3: Connected to RGMII PHY AR8035
Two USB2.0 Type A
One microSD Card slot
One mini-PCIe slot
One mini-USB TypeB dual UART
Signed-off-by: Michael Johnston <michael.johnston@freescale.com>
Signed-off-by: Xie Xiaobo <X.Xie@freescale.com>
[scottwood@freescale.com: use pr_info rather than KERN_INFO]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Define a QE init function in common file, and avoid
the same codes being duplicated in board files.
Signed-off-by: Xie Xiaobo <X.Xie@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
It makes no sense to initialize the mpic ipi for the SoC which has
doorbell support. So set the smp_85xx_ops.probe to NULL for this
case. Since the smp_85xx_ops.probe is also used in function
smp_85xx_setup_cpu() to check if we need to invoke
mpic_setup_this_cpu(), we introduce a new setup_cpu function
smp_85xx_basic_setup() to remove this dependency.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
This patch fixed several typo in printk from various
part of kernel source.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Merge a pile of fixes that went into the "merge" branch (3.13-rc's) such
as Anton Little Endian fixes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch updates the generic iommu backend code to use the
it_page_shift field to determine the iommu page size instead of
using hardcoded values.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds a it_page_shift field to struct iommu_table and
initiliases it to 4K for all platforms.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>